diff --git a/docs/geopandas-interop.ipynb b/docs/geopandas-interop.ipynb index ce111166e..cef23d963 100644 --- a/docs/geopandas-interop.ipynb +++ b/docs/geopandas-interop.ipynb @@ -14,15 +14,15 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": null, "id": "0434bead-2628-4844-a3f6-2f9c15a21899", "metadata": {}, "outputs": [], "source": [ - "import sedonadb\n", + "import sedona.db\n", "import geopandas as gpd\n", "\n", - "sd = sedonadb.connect()" + "sd = sedona.db.connect()" ] }, { diff --git a/docs/index.md b/docs/index.md index 3a398a745..45b2119b2 100644 --- a/docs/index.md +++ b/docs/index.md @@ -1,6 +1,8 @@ --- hide: - navigation + +title: Introducing SedonaDB --- -# SedonaDB +SedonaDB is a high-performance, dependency-free geospatial compute engine designed for single-node processing, making it ideal for smaller datasets on local machines or cloud instances. + +The initial `0.1` release supports a core set of vector operations, with comprehensive vector and raster computation capabilities planned for the near future. + +## Key features + +SedonaDB has several advantages: + +* **Exceptional Performance:** Built in Rust to process massive geospatial datasets with exceptional speed. +* **Unified Geospatial Toolkit:** Access a comprehensive suite of functions for both vector and raster data in a single, powerful library. +* **Seamless Ecosystem Integration:** Built on Apache Arrow for smooth interoperability with popular data science libraries like GeoPandas, DuckDB, and Polars. +* **Flexible APIs:** Effortlessly switch between Python and SQL interfaces to match your preferred workflow and skill set. +* **Guaranteed CRS Propagation:** Automatically manages coordinate reference systems (CRS) to ensure spatial accuracy and prevent common errors. +* **Broad File Format Support:** Work with a wide range of both modern and legacy geospatial file formats like geoparquet. +* **Highly Extensible:** Easily customize and extend the library's functionality to meet your project's unique requirements. + +## Run a query in SQL, Python, or Rust -SedonaDB is a high-performance, dependency-free geospatial compute engine. +SedonaDB offers a flexible query interface in SQL, Python, or Rust. -You can easily run SedonaDB locally or in the cloud. The first release supports a core set of vector operations, but the full-suite of common vector and raster computations will be supported soon. +Engineered for speed, SedonaDB provides performant geospatial processing on a single machine. This makes it perfect for the rapid analysis of smaller datasets, whether you're working locally or on a cloud server. While the initial release focuses on core vector operations, a full suite of vector and raster computations is on the roadmap. -SedonaDB only runs on a single machine, so it’s perfect for processing smaller datasets. You can use SedonaSpark, SedonaFlink, or SedonaSnow for operations on larger datasets. +For massive, distributed workloads, you can leverage the power of SedonaSpark, +SedonaFlink, or SedonaSnow. === "SQL" @@ -67,21 +86,9 @@ SedonaDB only runs on a single machine, so it’s perfect for processing smaller sd_sql("SELECT ST_Point(0, 1) as geom") ``` -## Key features - -SedonaDB has several advantages: - -* **Blazing-Fast Performance:** Built in Rust to process massive geospatial datasets with exceptional speed. -* **Unified Geospatial Toolkit:** Access a comprehensive suite of functions for both vector and raster data in a single, powerful library. -* **Seamless Ecosystem Integration:** Built on Apache Arrow for smooth interoperability with popular data science libraries like GeoPandas, DuckDB, and Polars. -* **Flexible APIs:** Effortlessly switch between Python and SQL interfaces to match your preferred workflow and skillset. -* **Guaranteed CRS Propagation:** Automatically manages coordinate reference systems (CRS) to ensure spatial accuracy and prevent common errors. -* **Broad File Format Support:** Work with a wide range of both modern and legacy geospatial file formats like geoparquet. -* **Highly Extensible:** Easily customize and extend the library's functionality to meet your project's unique requirements. +## Install SedonaDB -## Installation - -Here’s how to install SedonaDB with various build tools: +Here's how to install SedonaDB with various build tools: === "pip" @@ -95,12 +102,8 @@ Here’s how to install SedonaDB with various build tools: install.packages("sedonadb", repos = "https://community.r-multiverse.org") ``` -## SedonaDB example with vector data - -TODO - ## Have questions? -Feel free to start a GitHub Discussion or join the Discord community to ask the developers any questions you may have. +Start a [GitHub Discussion](https://github.com/apache/sedona-db/issues) or join the [Discord community](https://discord.com/invite/9A3k5dEBsY) and ask the developers any questions you may have. We look forward to collaborating with you! diff --git a/docs/programming-guide.ipynb b/docs/programming-guide.ipynb index 392fdbd93..93c7208d2 100644 --- a/docs/programming-guide.ipynb +++ b/docs/programming-guide.ipynb @@ -11,11 +11,11 @@ "\n", "You will learn how to create SedonaDB DataFrames, run spatial queries, and perform I/O operations with various types of files.\n", "\n", - "Let’s start by establishing a SedonaDB connection.\n", + "Let's start by establishing a SedonaDB connection.\n", "\n", "## Establish SedonaDB connection\n", "\n", - "Here’s how to create the SedonaDB connection:" + "Here's how to create the SedonaDB connection:" ] }, { @@ -25,9 +25,9 @@ "metadata": {}, "outputs": [], "source": [ - "import sedonadb\n", + "import sedona.db\n", "\n", - "sd = sedonadb.connect()" + "sd = sedona.db.connect()" ] }, { @@ -35,13 +35,13 @@ "id": "7aeaa60f-2325-418c-8e72-4344bd4a75fe", "metadata": {}, "source": [ - "Now let’s see how to create SedonaDB DataFrames.\n", + "Now, let's see how to create SedonaDB dataframes.\n", "\n", "## Create SedonaDB DataFrame\n", "\n", "**Manually creating SedonaDB DataFrame**\n", "\n", - "Here’s how to manually create a SedonaDB DataFrame:" + "Here's how to manually create a SedonaDB DataFrame:" ] }, { @@ -95,7 +95,7 @@ "source": [ "**Create SedonaDB DataFrame from files in S3**\n", "\n", - "For most production applications, you will create SedonaDB DataFrames by reading data from a file. Let’s see how to read GeoParquet files in AWS S3 into a SedonaDB DataFrame." + "For most production applications, you will create SedonaDB DataFrames by reading data from a file. Let's see how to read GeoParquet files in AWS S3 into a SedonaDB DataFrame." ] }, { @@ -116,7 +116,7 @@ "id": "858fcc66-816d-4c71-8875-82b74169eccd", "metadata": {}, "source": [ - "Let’s now run some spatial queries.\n", + "Now, let's run some spatial queries.\n", "\n", "**Read from GeoPandas DataFrame**\n", "\n", @@ -181,11 +181,11 @@ "source": [ "## Spatial queries\n", "\n", - "Let’s see how to run spatial operations like filtering, joins, and clustering algorithms.\n", + "Let's see how to run spatial operations like filtering, joins, and clustering algorithms.\n", "\n", - "***Spatial filtering***\n", + "**Spatial filtering**\n", "\n", - "Let’s run a spatial filtering operation to fetch all the objects in the following polygon:" + "Let's run a spatial filtering operation to fetch all the objects in the following polygon:" ] }, { @@ -232,21 +232,21 @@ "source": [ "You can see it only includes the divisions in the Nova Scotia area. Skip to the visualization section to see how this data can be graphed on a map.\n", "\n", - "***K-nearest neighbors (KNN) joins***\n", + "**K-nearest neighbors (KNN) joins**\n", "\n", "Create `restaurants` and `customers` tables so we can demonstrate the KNN join functionality." ] }, { "cell_type": "code", - "execution_count": 9, + "execution_count": null, "id": "deaa36db-2fee-4ba2-ab79-1dc756cb1655", "metadata": {}, "outputs": [], "source": [ "df = sd.sql(\"\"\"\n", "SELECT name, ST_Point(lng, lat) AS location\n", - "FROM (VALUES \n", + "FROM (VALUES\n", " (101, -74.0, 40.7, 'Pizza Palace'),\n", " (102, -73.99, 40.69, 'Burger Barn'),\n", " (103, -74.02, 40.72, 'Taco Town'),\n", @@ -259,7 +259,7 @@ "\n", "df = sd.sql(\"\"\"\n", "SELECT name, ST_Point(lng, lat) AS location\n", - "FROM (VALUES \n", + "FROM (VALUES\n", " (1, -74.0, 40.7, 'Alice'),\n", " (2, -73.9, 40.8, 'Bob'),\n", " (3, -74.1, 40.6, 'Carol')\n", @@ -349,17 +349,23 @@ "id": "2e93fe6a-b0a7-4ec0-952c-dde9edcacdc4", "metadata": {}, "source": [ - "Notice how each customer has two rows - one for each of the two closest restaurants.\n", - "\n", - "## Files\n", + "Notice how each customer has two rows - one for each of the two closest restaurants." + ] + }, + { + "cell_type": "markdown", + "id": "3cb1e53b", + "metadata": {}, + "source": [ + "## GeoParquet support\n", "\n", - "You can read GeoParquet files with SedonaDB, see the following example:\n", + "You can also read GeoParquet files with SedonaDB with `read_parquet()`\n", "\n", "```python\n", - "df = sd.read_parquet(\"some_file.parquet\")\n", + "df = sd.read_parquet(\"DATA_FILE.parquet\")\n", "```\n", "\n", - "Once you read the file, you can easily expose it as a view and query it with spatial SQL, as we demonstrated in the example above." + "Once you read the file, you can easily expose it as a view and query it with spatial SQL, as we demonstrated in the example above.\n" ] } ], diff --git a/docs/quickstart-cli.md b/docs/quickstart-cli.md deleted file mode 100644 index 26d0bd4f7..000000000 --- a/docs/quickstart-cli.md +++ /dev/null @@ -1,94 +0,0 @@ - - -# CLI Quickstart - -SedonaDB's command-line interface provides an interactive SQL shell that can be used to -leverage the SedonaDB engine for SQL-only/shell-centric workflows. SedonaDB's CLI is -based on the [DataFusion CLI](https://datafusion.apache.org/user-guide/cli/index.html), -whose documentation may be useful for advanced features not covered in detail here. - -## Installation - -You can install `sedona-cli` using Cargo: - -```shell -cargo install sedona-cli -``` - -## Usage - -Running `sedona-cli` from a terminal will start an interactive SQL shell. Queries must end -in a semicolon (`;`) and can be cleared with `Control-C`. - -``` -Sedona CLI v0.0.1 -> SELECT ST_Point(0, 1) as geom; -┌────────────┐ -│ geom │ -│ wkb │ -╞════════════╡ -│ POINT(0 1) │ -└────────────┘ - -1 row(s)/1 column(s) fetched. -Elapsed 0.024 seconds. -``` - -See the [SQL Reference]() for details on the SQL functions and features available to the CLI. - -## Help - -From the interactive shell, use `\?` for special command help: - -``` -> \? -Command,Description -\d,list tables -\d name,describe table -\q,quit datafusion-cli -\?,help -\h,function list -\h function,search function -\quiet (true|false)?,print or set quiet mode -\pset [NAME [VALUE]],"set table output option -(format)" -``` - -From the command line, use `--help` to list launch options and/or options for interacting -with the CLI in a non-interactive context. - -``` -Command Line Client for Sedona's DataFusion-based query engine. - -Usage: sedona-cli [OPTIONS] - -Options: - -p, --data-path Path to your data, default to current directory - -c, --command [...] Execute the given command string(s), then exit. Commands are expected to be non empty. - -f, --file [...] Execute commands from file(s), then exit - -r, --rc [...] Run the provided files on startup instead of ~/.datafusionrc - --format [default: automatic] [possible values: csv, tsv, table, json, nd-json, automatic] - -q, --quiet Reduce printing other than the results and work quietly - --maxrows The max number of rows to display for 'Table' format - [possible values: numbers(0/10/...), inf(no limit)] [default: 40] - --color Enables console syntax highlighting - -h, --help Print help - -V, --version Print version -``` diff --git a/docs/quickstart-python.ipynb b/docs/quickstart-python.ipynb index e243a5c4d..931b35cfd 100644 --- a/docs/quickstart-python.ipynb +++ b/docs/quickstart-python.ipynb @@ -10,7 +10,7 @@ "SedonaDB for Python can be installed from [PyPI](https://pypi.org):\n", "\n", "```shell\n", - "pip install apache-sedona[db]\n", + "pip install \"apache-sedona[db]\"\n", "```\n", "\n", "If you can import the module and connect to a new session, you're good to go!" @@ -28,7 +28,7 @@ "text": [ "┌────────────┐\n", "│ geom │\n", - "│ wkb │\n", + "│ geometry │\n", "╞════════════╡\n", "│ POINT(0 1) │\n", "└────────────┘\n" @@ -36,9 +36,9 @@ } ], "source": [ - "import sedonadb\n", + "import sedona.db\n", "\n", - "sd = sedonadb.connect()\n", + "sd = sedona.db.connect()\n", "sd.sql(\"SELECT ST_Point(0, 1) as geom\").show()" ] }, @@ -74,7 +74,7 @@ "text": [ "┌──────────────┬───────────────────────────────┐\n", "│ name ┆ geometry │\n", - "│ utf8view ┆ wkb_view │\n", + "│ utf8view ┆ geometry │\n", "╞══════════════╪═══════════════════════════════╡\n", "│ Vatican City ┆ POINT(12.4533865 41.9032822) │\n", "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n", @@ -127,7 +127,7 @@ "text": [ "┌─────────────────────────────┬───────────────┬────────────────────────────────────────────────────┐\n", "│ name ┆ continent ┆ geometry │\n", - "│ utf8view ┆ utf8view ┆ wkb_view │\n", + "│ utf8view ┆ utf8view ┆ geometry │\n", "╞═════════════════════════════╪═══════════════╪════════════════════════════════════════════════════╡\n", "│ Fiji ┆ Oceania ┆ MULTIPOLYGON(((180 -16.067132663642447,180 -16.55… │\n", "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n", @@ -179,7 +179,7 @@ "text": [ "┌───────────────┬──────────────────────┬─────────────────────┬───────────────┬─────────────────────┐\n", "│ name ┆ geometry ┆ name ┆ continent ┆ geometry │\n", - "│ utf8view ┆ wkb_view ┆ utf8view ┆ utf8view ┆ wkb_view