59 changes: 59 additions & 0 deletions dev/generate-release-docs.sh
@@ -0,0 +1,59 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# This script generates documentation content for a release branch.
# It should be run once when creating a new release branch to "freeze"
# the generated docs (configs, compatibility matrices) into the branch.
#
# Usage: ./dev/generate-release-docs.sh
#
# This script:
#   1. Compiles the common and spark modules to access CometConf and CometCast
#   2. Runs GenerateDocs to populate the template markers in the docs
#
# The resulting changes should then be committed to the release branch.
#
# Example workflow when cutting release 0.13.0:
#   git checkout -b branch-0.13 main
#   ./dev/generate-release-docs.sh
#   git add docs/source/user-guide/latest/
#   git commit -m "Generate docs for 0.13.0 release"
#   git push origin branch-0.13

set -e

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"

cd "${PROJECT_ROOT}"

echo "Compiling common and spark modules..."
./mvnw -q compile -pl common,spark -DskipTests

echo "Generating documentation content..."
./mvnw -q exec:java -pl spark \
-Dexec.mainClass=org.apache.comet.GenerateDocs \
-Dexec.arguments="${PROJECT_ROOT}/docs/source/user-guide/latest/" \
-Dexec.classpathScope=compile

echo ""
echo "Done! Generated documentation content in docs/source/user-guide/latest/"
echo ""
echo "Next steps:"
echo " git add docs/source/user-guide/latest/"
echo " git commit -m 'Generate docs for release'"
echo " git push"
18 changes: 17 additions & 1 deletion dev/release/README.md
@@ -42,13 +42,29 @@ origin git@github.com:yourgithubid/datafusion-comet.git (push)
Create a release branch from the latest commit in main and push to the `apache` repo:

```shell
-get fetch apache
+git fetch apache
git checkout main
git reset --hard apache/main
git checkout -b branch-0.1
git push apache branch-0.1
```

### Generate Release Documentation

Generate the documentation content for this release. The docs on `main` contain only template markers,
so we need to generate the actual content (config tables, compatibility matrices) for the release branch:

```shell
./dev/generate-release-docs.sh
git add docs/source/user-guide/latest/
git commit -m "Generate docs for 0.1.0 release"
git push apache branch-0.1
```

This freezes the documentation to reflect the configs and expressions available in this release.
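
To illustrate what "populating the template markers" means, here is a minimal Python sketch. It assumes marker pairs of the form `<!--BEGIN:NAME-->` / `<!--END:NAME-->`, as used in `compatibility.md`. The function name `fill_marker` is hypothetical, and this is not the actual `GenerateDocs` implementation, which is run from the spark module.

```python
import re


def fill_marker(text: str, name: str, content: str) -> str:
    """Replace whatever sits between the BEGIN and END tags for `name` with `content`.

    Hypothetical helper for illustration only; the tag format is taken from
    the docs, the function itself is an assumption.
    """
    pattern = re.compile(
        rf"(<!--BEGIN:{re.escape(name)}-->)(.*?)(<!--END:{re.escape(name)}-->)",
        re.DOTALL,  # let `.` span the multi-line block between the tags
    )
    # A function replacement avoids backslash-escape processing of `content`.
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}\n{content}\n{m.group(3)}", text
    )
    if count == 0:
        raise ValueError(f"marker {name} not found")
    return new_text
```

Because the tags themselves are kept, running the substitution again on already populated text simply overwrites the old content, which is why the same marker file can serve both `main` and a release branch.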

### Update Maven Version

Update the `pom.xml` files in the release branch to change the Maven version from `0.1.0-SNAPSHOT` to `0.1.0`.

There is no need to update the Rust crate versions because they will already be `0.1.0`.
11 changes: 11 additions & 0 deletions docs/build.sh
@@ -36,4 +36,15 @@ python3 generate-versions.py
rm temp/user-guide/0.9/overview.md 2> /dev/null
rm temp/user-guide/0.8/overview.md 2> /dev/null

# Generate dynamic content (configs, compatibility matrices) for latest docs
# This runs GenerateDocs against the temp copy, not source files
echo "Generating dynamic documentation content..."
cd ..
./mvnw -q compile -pl spark -DskipTests -am
./mvnw -q exec:java -pl spark \
-Dexec.mainClass=org.apache.comet.GenerateDocs \
-Dexec.arguments="$(pwd)/docs/temp/user-guide/latest/" \
-Dexec.classpathScope=compile
cd docs

make SOURCEDIR=`pwd`/temp html
2 changes: 1 addition & 1 deletion docs/generate-versions.py
@@ -105,5 +105,5 @@ def generate_docs(snapshot_version: str, latest_released_version: str, previous_
print("Generating versioned user guide docs...")
snapshot_version = get_version_from_pom()
latest_released_version = "0.12.0"
-previous_versions = ["0.8.0", "0.9.1", "0.10.1", "0.11.0"]
+previous_versions = ["0.10.1", "0.11.0"]
generate_docs(snapshot_version, latest_released_version, previous_versions)
41 changes: 0 additions & 41 deletions docs/source/user-guide/0.8/index.rst

This file was deleted.

41 changes: 0 additions & 41 deletions docs/source/user-guide/0.9/index.rst

This file was deleted.

90 changes: 0 additions & 90 deletions docs/source/user-guide/latest/compatibility.md
@@ -84,107 +84,17 @@ Cast operations in Comet fall into three levels of support:

### Legacy Mode

<!-- WARNING! DO NOT MANUALLY MODIFY CONTENT BETWEEN THE BEGIN AND END TAGS -->

<!--BEGIN:CAST_LEGACY_TABLE-->
<!-- prettier-ignore-start -->
| | binary | boolean | byte | date | decimal | double | float | integer | long | short | string | timestamp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| binary | - | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | C | N/A |
| boolean | N/A | - | C | N/A | U | C | C | C | C | C | C | U |
| byte | U | C | - | N/A | C | C | C | C | C | C | C | U |
| date | N/A | U | U | - | U | U | U | U | U | U | C | U |
| decimal | N/A | C | C | N/A | - | C | C | C | C | C | C | U |
| double | N/A | C | C | N/A | I | - | C | C | C | C | C | U |
| float | N/A | C | C | N/A | I | C | - | C | C | C | C | U |
| integer | U | C | C | N/A | C | C | C | - | C | C | C | U |
| long | U | C | C | N/A | C | C | C | C | - | C | C | U |
| short | U | C | C | N/A | C | C | C | C | C | - | C | U |
| string | C | C | C | C | I | C | C | C | C | C | - | I |
| timestamp | N/A | U | U | C | U | U | U | U | C | U | C | - |
<!-- prettier-ignore-end -->

**Notes:**

- **decimal -> string**: There can be formatting differences in some case due to Spark using scientific notation where Comet does not
- **double -> decimal**: There can be rounding differences
- **double -> string**: There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45
- **float -> decimal**: There can be rounding differences
- **float -> string**: There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45
- **string -> date**: Only supports years between 262143 BC and 262142 AD
- **string -> decimal**: Does not support fullwidth unicode digits (e.g \\uFF10)
or strings containing null bytes (e.g \\u0000)
- **string -> timestamp**: Not all valid formats are supported
<!--END:CAST_LEGACY_TABLE-->

### Try Mode

<!-- WARNING! DO NOT MANUALLY MODIFY CONTENT BETWEEN THE BEGIN AND END TAGS -->

<!--BEGIN:CAST_TRY_TABLE-->
<!-- prettier-ignore-start -->
| | binary | boolean | byte | date | decimal | double | float | integer | long | short | string | timestamp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| binary | - | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | C | N/A |
| boolean | N/A | - | C | N/A | U | C | C | C | C | C | C | U |
| byte | U | C | - | N/A | C | C | C | C | C | C | C | U |
| date | N/A | U | U | - | U | U | U | U | U | U | C | U |
| decimal | N/A | C | C | N/A | - | C | C | C | C | C | C | U |
| double | N/A | C | C | N/A | I | - | C | C | C | C | C | U |
| float | N/A | C | C | N/A | I | C | - | C | C | C | C | U |
| integer | U | C | C | N/A | C | C | C | - | C | C | C | U |
| long | U | C | C | N/A | C | C | C | C | - | C | C | U |
| short | U | C | C | N/A | C | C | C | C | C | - | C | U |
| string | C | C | C | C | I | C | C | C | C | C | - | I |
| timestamp | N/A | U | U | C | U | U | U | U | C | U | C | - |
<!-- prettier-ignore-end -->

**Notes:**

- **decimal -> string**: There can be formatting differences in some case due to Spark using scientific notation where Comet does not
- **double -> decimal**: There can be rounding differences
- **double -> string**: There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45
- **float -> decimal**: There can be rounding differences
- **float -> string**: There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45
- **string -> date**: Only supports years between 262143 BC and 262142 AD
- **string -> decimal**: Does not support fullwidth unicode digits (e.g \\uFF10)
or strings containing null bytes (e.g \\u0000)
- **string -> timestamp**: Not all valid formats are supported
<!--END:CAST_TRY_TABLE-->

### ANSI Mode

<!-- WARNING! DO NOT MANUALLY MODIFY CONTENT BETWEEN THE BEGIN AND END TAGS -->

<!--BEGIN:CAST_ANSI_TABLE-->
<!-- prettier-ignore-start -->
| | binary | boolean | byte | date | decimal | double | float | integer | long | short | string | timestamp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| binary | - | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | C | N/A |
| boolean | N/A | - | C | N/A | U | C | C | C | C | C | C | U |
| byte | U | C | - | N/A | C | C | C | C | C | C | C | U |
| date | N/A | U | U | - | U | U | U | U | U | U | C | U |
| decimal | N/A | C | C | N/A | - | C | C | C | C | C | C | U |
| double | N/A | C | C | N/A | I | - | C | C | C | C | C | U |
| float | N/A | C | C | N/A | I | C | - | C | C | C | C | U |
| integer | U | C | C | N/A | C | C | C | - | C | C | C | U |
| long | U | C | C | N/A | C | C | C | C | - | C | C | U |
| short | U | C | C | N/A | C | C | C | C | C | - | C | U |
| string | C | C | C | C | I | C | C | C | C | C | - | I |
| timestamp | N/A | U | U | C | U | U | U | U | C | U | C | - |
<!-- prettier-ignore-end -->

**Notes:**

- **decimal -> string**: There can be formatting differences in some case due to Spark using scientific notation where Comet does not
- **double -> decimal**: There can be rounding differences
- **double -> string**: There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45
- **float -> decimal**: There can be rounding differences
- **float -> string**: There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45
- **string -> date**: Only supports years between 262143 BC and 262142 AD
- **string -> decimal**: Does not support fullwidth unicode digits (e.g \\uFF10)
or strings containing null bytes (e.g \\u0000)
- **string -> timestamp**: ANSI mode not supported
<!--END:CAST_ANSI_TABLE-->

See the [tracking issue](https://github.com/apache/datafusion-comet/issues/286) for more details.
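
With this change the source file keeps only the empty `<!--BEGIN:...-->` / `<!--END:...-->` pairs, so a check along the following lines could flag generated content that was committed to the source docs by accident. This is a hypothetical sketch and not part of the project's tooling; `populated_markers` is an illustrative name.

```python
import re

# Matches one marker block; the backreference \1 requires that the END tag
# carries the same name as the BEGIN tag.
MARKER_BLOCK = re.compile(r"<!--BEGIN:(\w+)-->(.*?)<!--END:\1-->", re.DOTALL)


def populated_markers(text: str) -> list[str]:
    """Return the names of marker blocks that contain non-whitespace content."""
    return [m.group(1) for m in MARKER_BLOCK.finditer(text) if m.group(2).strip()]
```

An empty result for a file on `main` would indicate that it still holds only template markers, while on a release branch the same call would be expected to list every marker name.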