Add index.md for PPL functions documentation #5033
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
📝 Walkthrough
Summary by CodeRabbit
Adds a new documentation index page.
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
🚥 Pre-merge checks: ✅ 3 passed
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/user/ppl/functions/index.md (1)
6-242: Remove duplicate function entries in the Date and Time Functions section.
The datetime section contains duplicate entries for [DAY], [DAYNAME], and [DAYOFMONTH] on lines 88-90, immediately after their first occurrence on lines 85-87. Additionally, fix the inconsistent link format in the Expressions section: [Arithmetic Operators](expressions.md#arithmetic-operators) uses no forward slash before the #, while [Predicate Operators](expressions.md/#predicate-operators) includes one. Standardize the format across all links.
🤖 Fix all issues with AI agents
In @docs/user/ppl/functions/index.md:
- Around line 85-90: The functions list in this markdown contains duplicate
consecutive entries for DAY, DAYNAME, and DAYOFMONTH; remove the second set of
duplicated lines so each function appears only once by editing the list in the
functions index (delete the repeated entries for DAY, DAYNAME, and DAYOFMONTH).
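A check like the one described above can be sketched in a few lines of Python. This is only an illustration, not part of the PR: the helper name `find_duplicate_entries` and the sample list are hypothetical, and the sample reuses the three entries named in the comment.

```python
from collections import defaultdict


def find_duplicate_entries(lines):
    """Map each repeated list entry to the 1-based line numbers where it occurs."""
    seen = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        entry = line.strip()
        # Only Markdown list items that start with a link are considered.
        if entry.startswith("- ["):
            seen[entry].append(number)
    return {entry: numbers for entry, numbers in seen.items() if len(numbers) > 1}


# Hypothetical sample mirroring the repeated DAY/DAYNAME/DAYOFMONTH block.
sample = [
    "- [DAY](datetime.md/#day)",
    "- [DAYNAME](datetime.md/#dayname)",
    "- [DAYOFMONTH](datetime.md/#dayofmonth)",
    "- [DAY](datetime.md/#day)",
    "- [DAYNAME](datetime.md/#dayname)",
    "- [DAYOFMONTH](datetime.md/#dayofmonth)",
    "- [DAY_OF_MONTH](datetime.md/#day_of_month)",
]

duplicates = find_duplicate_entries(sample)
for entry, numbers in duplicates.items():
    print(entry, numbers)
```

Because the comparison is on the whole stripped line, the same function name linked to two different files is not reported here; that case is a separate question.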
🧹 Nitpick comments (1)
docs/user/ppl/functions/index.md (1)
8-11: Consolidate cross-category function documentation duplicates.
Functions are documented in multiple documentation files, creating potential confusion:
- EXISTS: Collection Functions → collection.md (line 103) and Condition Functions → condition.md (line 155)
- EARLIEST: Aggregations → aggregations.md (line 335) and Condition Functions → condition.md (line 625)
- LATEST: Aggregations → aggregations.md (line 383) and Condition Functions → condition.md (line 691)
- SUM: Aggregations → aggregations.md (line 63) and Mathematical → math.md (line 137)
- AVG: Aggregations → aggregations.md (line 87) and Mathematical → math.md (line 187)
- MAX: Aggregations → aggregations.md (line 111) and Statistical → statistical.md (line 3)
- MIN: Aggregations → aggregations.md (line 153) and Statistical → statistical.md (line 73)
Consider consolidating these entries to a single canonical location or clarifying the different use cases for each function across categories.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/user/ppl/functions/index.md
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @docs/user/ppl/functions/index.md:
- Around line 142-143: Update the inconsistent link by changing
"expressions.md#arithmetic-operators" to use the same format as the other links
(i.e., "expressions.md/#arithmetic-operators") so it matches the existing
"expressions.md/#predicate-operators" pattern; apply the same file/#anchor
format consistently throughout the document.
- Around line 6-238: The expressions link for "Arithmetic Operators" uses an
inconsistent anchor format; update the link target in the functions index (the
line with "Arithmetic Operators") from expressions.md#arithmetic-operators to
expressions.md/#arithmetic-operators so it matches the standard anchor format
used by "Predicate Operators" and other entries, ensuring the change is applied
to the entry labeled "Arithmetic Operators" in the list.
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-linux (21, unit)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (25, doc)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: build-linux (25, unit)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (1)
docs/user/ppl/functions/index.md (1)
8-11: Verify duplicate function entries across categories.
Several functions appear in multiple categories, which may confuse users:
- EXISTS: Collection (line 32) and Condition (line 48)
- EARLIEST: Aggregation (line 18) and Condition (line 57)
- LATEST: Aggregation (line 19) and Condition (line 58)
- MAX: Aggregation (line 10) and Statistical (line 215)
- MIN: Aggregation (line 11) and Statistical (line 216)
- SUM: Aggregation (line 8) and Mathematical (line 167)
- AVG: Aggregation (line 9) and Mathematical (line 168)
Given that the commit message mentions "remove duplicate entries," please verify whether these remaining duplicates are intentional cross-references or should be consolidated.
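One way to list such cross-category entries mechanically is sketched below. It is an assumption-laden illustration rather than anything from the PR: the regular expression, the helper name `functions_in_multiple_files`, and the shortened sample index are all hypothetical, and category header links (which carry no anchor) are deliberately skipped.

```python
import re
from collections import defaultdict

# Matches [NAME](file.md#anchor) and [NAME](file.md/#anchor); links without
# an anchor, such as category headers, do not match.
LINK = re.compile(r"\[([^\]]+)\]\(([^)#/]+\.md)/?#([^)]+)\)")


def functions_in_multiple_files(markdown):
    """Return {function name: sorted target files} for names linked to 2+ files."""
    files_by_name = defaultdict(set)
    for name, target, _anchor in LINK.findall(markdown):
        files_by_name[name].add(target)
    return {
        name: sorted(files)
        for name, files in files_by_name.items()
        if len(files) > 1
    }


# Hypothetical excerpt of an index with two cross-category names.
sample_index = """
- [Aggregation Functions](aggregations.md)
  - [COUNT](aggregations.md/#count)
  - [SUM](aggregations.md/#sum)
- [Collection Functions](collection.md)
  - [EXISTS](collection.md/#exists)
- [Condition Functions](condition.md)
  - [EXISTS](condition.md/#exists)
- [Mathematical Functions](math.md)
  - [SUM](math.md/#sum)
"""

shared = functions_in_multiple_files(sample_index)
for name, files in sorted(shared.items()):
    print(name, files)
```

The output only shows where a name is linked from; whether each pair is an intentional cross-reference still needs a human decision.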
- [Aggregation Functions](aggregations.md)
  - [COUNT](aggregations.md/#count)
  - [SUM](aggregations.md/#sum)
  - [AVG](aggregations.md/#avg)
  - [MAX](aggregations.md/#max)
  - [MIN](aggregations.md/#min)
  - [VAR_SAMP](aggregations.md/#var_samp)
  - [VAR_POP](aggregations.md/#var_pop)
  - [STDDEV_SAMP](aggregations.md/#stddev_samp)
  - [STDDEV_POP](aggregations.md/#stddev_pop)
  - [DISTINCT_COUNT, DC](aggregations.md/#distinct_count-dc)
  - [DISTINCT_COUNT_APPROX](aggregations.md/#distinct_count_approx)
  - [EARLIEST](aggregations.md/#earliest)
  - [LATEST](aggregations.md/#latest)
  - [TAKE](aggregations.md/#take)
  - [PERCENTILE, PERCENTILE_APPROX](aggregations.md/#percentile-or-percentile_approx)
  - [MEDIAN](aggregations.md/#median)
  - [FIRST](aggregations.md/#first)
  - [LAST](aggregations.md/#last)
  - [LIST](aggregations.md/#list)
  - [VALUES](aggregations.md/#values)

- [Collection Functions](collection.md)
  - [ARRAY](collection.md/#array)
  - [ARRAY_LENGTH](collection.md/#array_length)
  - [FORALL](collection.md/#forall)
  - [EXISTS](collection.md/#exists)
  - [FILTER](collection.md/#filter)
  - [TRANSFORM](collection.md/#transform)
  - [REDUCE](collection.md/#reduce)
  - [MVJOIN](collection.md/#mvjoin)
  - [MVAPPEND](collection.md/#mvappend)
  - [SPLIT](collection.md/#split)
  - [MVDEDUP](collection.md/#mvdedup)
  - [MVFIND](collection.md/#mvfind)
  - [MVINDEX](collection.md/#mvindex)
  - [MVMAP](collection.md/#mvmap)
  - [MVZIP](collection.md/#mvzip)

- [Condition Functions](condition.md)
  - [ISNULL](condition.md/#isnull)
  - [ISNOTNULL](condition.md/#isnotnull)
  - [EXISTS](condition.md/#exists)
  - [IFNULL](condition.md/#ifnull)
  - [NULLIF](condition.md/#nullif)
  - [IF](condition.md/#if)
  - [CASE](condition.md/#case)
  - [COALESCE](condition.md/#coalesce)
  - [ISPRESENT](condition.md/#ispresent)
  - [ISBLANK](condition.md/#isblank)
  - [ISEMPTY](condition.md/#isempty)
  - [EARLIEST](condition.md/#earliest)
  - [LATEST](condition.md/#latest)
  - [REGEXP_MATCH](condition.md/#regexp_match)

- [Type Conversion Functions](conversion.md)
  - [CAST](conversion.md/#cast)
  - [TOSTRING](conversion.md/#tostring)
  - [TONUMBER](conversion.md/#tonumber)

- [Cryptographic Functions](cryptographic.md)
  - [SHA1](cryptographic.md/#sha1)
  - [SHA2](cryptographic.md/#sha2)

- [Date and Time Functions](datetime.md)
  - [ADDDATE](datetime.md/#adddate)
  - [ADDTIME](datetime.md/#addtime)
  - [CONVERT_TZ](datetime.md/#convert_tz)
  - [CURDATE](datetime.md/#curdate)
  - [CURRENT_DATE](datetime.md/#current_date)
  - [CURRENT_TIME](datetime.md/#current_time)
  - [CURRENT_TIMESTAMP](datetime.md/#current_timestamp)
  - [CURTIME](datetime.md/#curtime)
  - [DATE](datetime.md/#date)
  - [DATE_ADD](datetime.md/#date_add)
  - [DATE_FORMAT](datetime.md/#date_format)
  - [DATETIME](datetime.md/#datetime)
  - [DATE_SUB](datetime.md/#date_sub)
  - [DATEDIFF](datetime.md/#datediff)
  - [DAY](datetime.md/#day)
  - [DAYNAME](datetime.md/#dayname)
  - [DAYOFMONTH](datetime.md/#dayofmonth)
  - [DAY_OF_MONTH](datetime.md/#day_of_month)
  - [DAYOFWEEK](datetime.md/#dayofweek)
  - [DAY_OF_WEEK](datetime.md/#day_of_week)
  - [DAYOFYEAR](datetime.md/#dayofyear)
  - [DAY_OF_YEAR](datetime.md/#day_of_year)
  - [EXTRACT](datetime.md/#extract)
  - [FROM_DAYS](datetime.md/#from_days)
  - [FROM_UNIXTIME](datetime.md/#from_unixtime)
  - [GET_FORMAT](datetime.md/#get_format)
  - [HOUR](datetime.md/#hour)
  - [HOUR_OF_DAY](datetime.md/#hour_of_day)
  - [LAST_DAY](datetime.md/#last_day)
  - [LOCALTIMESTAMP](datetime.md/#localtimestamp)
  - [LOCALTIME](datetime.md/#localtime)
  - [MAKEDATE](datetime.md/#makedate)
  - [MAKETIME](datetime.md/#maketime)
  - [MICROSECOND](datetime.md/#microsecond)
  - [MINUTE](datetime.md/#minute)
  - [MINUTE_OF_HOUR](datetime.md/#minute_of_hour)
  - [MONTH](datetime.md/#month)
  - [MONTH_OF_YEAR](datetime.md/#month_of_year)
  - [MONTHNAME](datetime.md/#monthname)
  - [NOW](datetime.md/#now)
  - [PERIOD_ADD](datetime.md/#period_add)
  - [PERIOD_DIFF](datetime.md/#period_diff)
  - [QUARTER](datetime.md/#quarter)
  - [SEC_TO_TIME](datetime.md/#sec_to_time)
  - [SECOND](datetime.md/#second)
  - [SECOND_OF_MINUTE](datetime.md/#second_of_minute)
  - [STRFTIME](datetime.md/#strftime)
  - [STR_TO_DATE](datetime.md/#str_to_date)
  - [SUBDATE](datetime.md/#subdate)
  - [SUBTIME](datetime.md/#subtime)
  - [SYSDATE](datetime.md/#sysdate)
  - [TIME](datetime.md/#time)
  - [TIME_FORMAT](datetime.md/#time_format)
  - [TIME_TO_SEC](datetime.md/#time_to_sec)
  - [TIMEDIFF](datetime.md/#timediff)
  - [TIMESTAMP](datetime.md/#timestamp)
  - [TIMESTAMPADD](datetime.md/#timestampadd)
  - [TIMESTAMPDIFF](datetime.md/#timestampdiff)
  - [TO_DAYS](datetime.md/#to_days)
  - [TO_SECONDS](datetime.md/#to_seconds)
  - [UNIX_TIMESTAMP](datetime.md/#unix_timestamp)
  - [UTC_DATE](datetime.md/#utc_date)
  - [UTC_TIME](datetime.md/#utc_time)
  - [UTC_TIMESTAMP](datetime.md/#utc_timestamp)
  - [WEEK](datetime.md/#week)
  - [WEEKDAY](datetime.md/#weekday)
  - [WEEK_OF_YEAR](datetime.md/#week_of_year)
  - [YEAR](datetime.md/#year)
  - [YEARWEEK](datetime.md/#yearweek)

- [Expressions](expressions.md)
  - [Arithmetic Operators](expressions.md#arithmetic-operators)
  - [Predicate Operators](expressions.md/#predicate-operators)

- [IP Address Functions](ip.md)
  - [CIDRMATCH](ip.md/#cidrmatch)
  - [GEOIP](ip.md/#geoip)

- [JSON Functions](json.md)
  - [JSON](json.md/#json)
  - [JSON_VALID](json.md/#json_valid)
  - [JSON_OBJECT](json.md/#json_object)
  - [JSON_ARRAY](json.md/#json_array)
  - [JSON_ARRAY_LENGTH](json.md/#json_array_length)
  - [JSON_EXTRACT](json.md/#json_extract)
  - [JSON_DELETE](json.md/#json_delete)
  - [JSON_SET](json.md/#json_set)
  - [JSON_APPEND](json.md/#json_append)
  - [JSON_EXTEND](json.md/#json_extend)
  - [JSON_KEYS](json.md/#json_keys)

- [Mathematical Functions](math.md)
  - [ADD](math.md/#add)
  - [SUBTRACT](math.md/#subtract)
  - [MULTIPLY](math.md/#multiply)
  - [DIVIDE](math.md/#divide)
  - [SUM](math.md/#sum)
  - [AVG](math.md/#avg)
  - [ACOS](math.md/#acos)
  - [ASIN](math.md/#asin)
  - [ATAN](math.md/#atan)
  - [ATAN2](math.md/#atan2)
  - [CEIL](math.md/#ceil)
  - [CEILING](math.md/#ceiling)
  - [CONV](math.md/#conv)
  - [COS](math.md/#cos)
  - [COSH](math.md/#cosh)
  - [COT](math.md/#cot)
  - [CRC32](math.md/#crc32)
  - [DEGREES](math.md/#degrees)
  - [E](math.md/#e)
  - [EXP](math.md/#exp)
  - [EXPM1](math.md/#expm1)
  - [FLOOR](math.md/#floor)
  - [LN](math.md/#ln)
  - [LOG](math.md/#log)
  - [LOG2](math.md/#log2)
  - [LOG10](math.md/#log10)
  - [MOD](math.md/#mod)
  - [MODULUS](math.md/#modulus)
  - [PI](math.md/#pi)
  - [POW](math.md/#pow)
  - [POWER](math.md/#power)
  - [RADIANS](math.md/#radians)
  - [RAND](math.md/#rand)
  - [ROUND](math.md/#round)
  - [SIGN](math.md/#sign)
  - [SIGNUM](math.md/#signum)
  - [SIN](math.md/#sin)
  - [SINH](math.md/#sinh)
  - [SQRT](math.md/#sqrt)
  - [CBRT](math.md/#cbrt)
  - [RINT](math.md/#rint)

- [Relevance Functions](relevance.md)
  - [MATCH](relevance.md/#match)
  - [MATCH_PHRASE](relevance.md/#match_phrase)
  - [MATCH_PHRASE_PREFIX](relevance.md/#match_phrase_prefix)
  - [MULTI_MATCH](relevance.md/#multi_match)
  - [SIMPLE_QUERY_STRING](relevance.md/#simple_query_string)
  - [MATCH_BOOL_PREFIX](relevance.md/#match_bool_prefix)
  - [QUERY_STRING](relevance.md/#query_string)

- [Statistical Functions](statistical.md)
  - [MAX](statistical.md/#max)
  - [MIN](statistical.md/#min)

- [String Functions](string.md)
  - [CONCAT](string.md/#concat)
  - [CONCAT_WS](string.md/#concat_ws)
  - [LENGTH](string.md/#length)
  - [LIKE](string.md/#like)
  - [ILIKE](string.md/#ilike)
  - [LOCATE](string.md/#locate)
  - [LOWER](string.md/#lower)
  - [LTRIM](string.md/#ltrim)
  - [POSITION](string.md/#position)
  - [REPLACE](string.md/#replace)
  - [REVERSE](string.md/#reverse)
  - [RIGHT](string.md/#right)
  - [RTRIM](string.md/#rtrim)
  - [SUBSTRING](string.md/#substring)
  - [TRIM](string.md/#trim)
  - [UPPER](string.md/#upper)
  - [REGEXP_REPLACE](string.md/#regexp_replace)

- [System Functions](system.md)
  - [TYPEOF](system.md/#typeof)
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify all target documentation files exist and contain the expected anchors
echo "=== Checking if target documentation files exist ==="
target_files=("aggregations.md" "collection.md" "condition.md" "conversion.md" "cryptographic.md" "datetime.md" "expressions.md" "ip.md" "json.md" "math.md" "relevance.md" "statistical.md" "string.md" "system.md")
docs_dir="docs/user/ppl/functions"
missing_files=()
for file in "${target_files[@]}"; do
  if [ -f "$docs_dir/$file" ]; then
    echo "✓ $file exists"
  else
    echo "✗ $file MISSING"
    missing_files+=("$file")
  fi
done
echo ""
echo "=== Checking headers in target files ==="
# Extract all links from index.md and verify anchors exist
# Get the index file
index_file="$docs_dir/index.md"
if [ -f "$index_file" ]; then
  echo "✓ index.md exists"
  echo ""
  # Extract links and check each one
  echo "=== Verifying sample anchors from each target file ==="
  # Check a sample from each file
  echo "Checking aggregations.md for COUNT..."
  rg -A 2 "^\[COUNT\]" "$docs_dir/aggregations.md" | head -5 || echo "COUNT not found as heading"
  echo "Checking datetime.md for ADDDATE..."
  rg -A 2 "^\[ADDDATE\]|^## ADDDATE" "$docs_dir/datetime.md" | head -5 || echo "ADDDATE not found as heading"
  echo "Checking string.md for CONCAT..."
  rg -A 2 "^\[CONCAT\]|^## CONCAT" "$docs_dir/string.md" | head -5 || echo "CONCAT not found as heading"
  echo ""
  echo "=== Checking markdown header patterns ==="
  # Let's see what the actual header format is
  echo "Sample headers from aggregations.md:"
  rg "^#+" "$docs_dir/aggregations.md" | head -10
else
  echo "✗ index.md not found"
fi
Repository: opensearch-project/sql
Length of output: 956
🏁 Script executed:
#!/bin/bash
# Detailed anchor verification
docs_dir="docs/user/ppl/functions"
echo "=== Checking expressions.md link format issue ==="
# The index has "expressions.md#arithmetic-operators" without "/"
# Should be "expressions.md/#arithmetic-operators"
grep -n "expressions.md" "$docs_dir/index.md"
echo ""
echo "=== Verifying functions with underscores ==="
# Check VAR_SAMP
echo "VAR_SAMP in aggregations.md:"
rg "^### VAR_SAMP|^## VAR_SAMP" "$docs_dir/aggregations.md"
# Check DISTINCT_COUNT_APPROX
echo "DISTINCT_COUNT_APPROX in aggregations.md:"
rg "^### DISTINCT_COUNT_APPROX|^## DISTINCT_COUNT_APPROX" "$docs_dir/aggregations.md"
echo ""
echo "=== Verifying PERCENTILE anchor ==="
# The index uses #percentile-or-percentile_approx which is unusual
echo "PERCENTILE function definitions:"
rg "^### PERCENTILE" "$docs_dir/aggregations.md"
echo ""
echo "=== Checking duplicate functions ==="
# EXISTS should be in both collection.md and condition.md
echo "EXISTS in collection.md:"
rg "^### EXISTS|^## EXISTS" "$docs_dir/collection.md"
echo "EXISTS in condition.md:"
rg "^### EXISTS|^## EXISTS" "$docs_dir/condition.md"
# EARLIEST should be in both aggregations.md and condition.md
echo "EARLIEST in aggregations.md:"
rg "^### EARLIEST|^## EARLIEST" "$docs_dir/aggregations.md"
echo "EARLIEST in condition.md:"
rg "^### EARLIEST|^## EARLIEST" "$docs_dir/condition.md"
# LATEST should be in both aggregations.md and condition.md
echo "LATEST in aggregations.md:"
rg "^### LATEST|^## LATEST" "$docs_dir/aggregations.md"
echo "LATEST in condition.md:"
rg "^### LATEST|^## LATEST" "$docs_dir/condition.md"
echo ""
echo "=== Checking all headers in aggregations.md for anchor match ==="
rg "^### " "$docs_dir/aggregations.md" | head -20
Repository: opensearch-project/sql
Length of output: 1197
🏁 Script executed:
#!/bin/bash
# Verify markdown anchor generation for special characters
docs_dir="docs/user/ppl/functions"
echo "=== Testing actual anchor format from files ==="
# For DISTINCT_COUNT, DC - check the actual anchor slug
echo "Checking aggregations.md for exact header format and potential anchor:"
rg -B 2 -A 5 "^### DISTINCT_COUNT, DC" "$docs_dir/aggregations.md"
echo ""
echo "=== Check PERCENTILE or PERCENTILE_APPROX anchor format ==="
rg -B 2 -A 5 "^### PERCENTILE or PERCENTILE_APPROX" "$docs_dir/aggregations.md"
echo ""
echo "=== Verify expressions.md has ARITHMETIC OPERATORS section ==="
rg "^## |^### " "$docs_dir/expressions.md"
echo ""
echo "=== Check if JSON_* functions have correct anchor naming ==="
rg "^### JSON_" "$docs_dir/json.md" | head -15
echo ""
echo "=== Verify all headers exist for math.md functions ==="
# Check for MOD vs MODULUS duplication
rg "^### (MOD|MODULUS)" "$docs_dir/math.md"
echo ""
echo "=== Final check: Verify expressions.md inconsistency ==="
echo "Line 142 (should have /): "
sed -n '142p' "$docs_dir/index.md"
echo ""
echo "Line 143 (already has /):"
sed -n '143p' "$docs_dir/index.md"
Repository: opensearch-project/sql
Length of output: 1494
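The anchor checks in the script above depend on how a heading is turned into a URL fragment. The sketch below approximates the GitHub-style rule (lowercase, drop punctuation other than hyphens and underscores, turn spaces into hyphens). That rule is an assumption here: the renderer used for the published docs may slug headings differently, and the function name `github_style_anchor` is hypothetical. The two headings are the ones the script greps for in aggregations.md.

```python
import re


def github_style_anchor(heading):
    """Approximate a GitHub-style heading anchor (assumed rule, not verified
    against the docs site renderer)."""
    text = heading.strip().lower()
    # Keep word characters (letters, digits, underscore), hyphens and spaces.
    text = re.sub(r"[^\w\- ]", "", text)
    return text.replace(" ", "-")


for heading in ("DISTINCT_COUNT, DC", "PERCENTILE or PERCENTILE_APPROX"):
    print(heading, "->", "#" + github_style_anchor(heading))
```

Under this rule both headings map to the fragments used in index.md, which suggests the unusual-looking `#percentile-or-percentile_approx` anchor is simply the slug of the heading text.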
🏁 Script executed:
#!/bin/bash
docs_dir="docs/user/ppl/functions"
echo "=== Check JSON functions headers ==="
rg "^### JSON" "$docs_dir/json.md"
echo ""
echo "=== Verify critical finding: expressions.md link format ==="
echo "According to markdown conventions, both links should use the '/' separator:"
echo "Current line 142: expressions.md#arithmetic-operators (INCORRECT - missing /)"
echo "Current line 143: expressions.md/#predicate-operators (CORRECT)"
echo ""
echo "Markdown anchor links should use format: filename/#anchor"
echo ""
echo "=== Summary of all target files and function counts ==="
for file in aggregations.md collection.md condition.md conversion.md cryptographic.md datetime.md expressions.md ip.md json.md math.md relevance.md statistical.md string.md system.md; do
  count=$(rg "^### |^## " "$docs_dir/$file" | grep -E "^[^/]*:###|^[^/]*:## [A-Z]" | wc -l)
  echo "$file: $count header entries"
done
Repository: opensearch-project/sql
Length of output: 920
🏁 Script executed:
#!/bin/bash
docs_dir="docs/user/ppl/functions"
echo "=== Check JSON functions directly ==="
cat "$docs_dir/json.md" | grep "^### JSON"
echo ""
echo "=== Check math.md for MOD and MODULUS ==="
cat "$docs_dir/math.md" | grep "^### MOD"
echo ""
echo "=== Verify all aggregations functions exist ==="
echo "Expected functions from index:"
grep "aggregations.md/#" "$docs_dir/index.md" | sed 's/.*aggregations.md\/#//g' | sed 's/).*//' | sort
echo ""
echo "Actual headers in aggregations.md:"
cat "$docs_dir/aggregations.md" | grep "^### " | sed 's/^### //g' | sed 's/ *$//' | sort
Repository: opensearch-project/sql
Length of output: 633
🏁 Script executed:
#!/bin/bash
docs_dir="docs/user/ppl/functions"
echo "=== Check json.md file size and first 50 lines ==="
wc -l "$docs_dir/json.md"
echo ""
head -50 "$docs_dir/json.md"
echo ""
echo "=== Check math.md for MOD/MODULUS ==="
head -100 "$docs_dir/math.md" | tail -50
Repository: opensearch-project/sql
Length of output: 2979
🏁 Script executed:
#!/bin/bash
docs_dir="docs/user/ppl/functions"
echo "=== Get all headers from json.md ==="
grep "^## \|^### " "$docs_dir/json.md"
echo ""
echo "=== Get all headers from math.md ==="
grep "^## \|^### " "$docs_dir/math.md"
echo ""
echo "=== Compare: Index expects these JSON anchors ==="
grep "json.md/#" "$docs_dir/index.md" | sed 's/.*json.md\/#//g' | sed 's/).*//' | sort
echo ""
echo "=== Actual JSON function headers in file ==="
grep "^## \|^### " "$docs_dir/json.md" | grep -i json | sort
Repository: opensearch-project/sql
Length of output: 2139
🏁 Script executed:
#!/bin/bash
docs_dir="docs/user/ppl/functions"
echo "=== Critical finding: expressions.md link inconsistency ==="
sed -n '141,143p' "$docs_dir/index.md"
echo ""
echo "=== Verify all target files for expected functions ==="
echo "Collection functions in index vs actual:"
echo "Expected:"
grep "collection.md/#" "$docs_dir/index.md" | sed 's/.*collection.md\/#//g' | sed 's/).*//' | sort | head -10
echo "Actual headers:"
grep "^## " "$docs_dir/collection.md" | grep -v "^## Description" | sed 's/^## //g' | sort | head -10
echo ""
echo "=== Check condition.md functions ==="
echo "Expected from index:"
grep "condition.md/#" "$docs_dir/index.md" | sed 's/.*condition.md\/#//g' | sed 's/).*//' | sort
echo "Actual headers:"
grep "^## " "$docs_dir/condition.md" | grep -v "^## Description" | sed 's/^## //g' | sort
echo ""
echo "=== Final comprehensive link verification ==="
python3 << 'EOF'
import re

with open('docs/user/ppl/functions/index.md', 'r') as f:
    content = f.read()

# Find all markdown links with patterns like [Text](file.md/#anchor)
links = re.findall(r'\[([^\]]+)\]\(([^)]+)\)', content)
issues = []
for text, link in links:
    if '.md' in link:
        if '.md#' in link and '.md/#' not in link:
            # Check if this is the expressions.md case
            if 'expressions.md#' in link:
                issues.append(f"Line with '{text}': '{link}' - Missing '/' after .md")

if issues:
    print("CRITICAL ISSUES FOUND:")
    for issue in issues:
        print(f"  - {issue}")
else:
    print("No critical link format issues found")
EOF
Repository: opensearch-project/sql
Length of output: 152
🏁 Script executed:
#!/bin/bash
docs_dir="docs/user/ppl/functions"
echo "=== Verify expressions.md link format issue ==="
sed -n '141,143p' "$docs_dir/index.md"
echo ""
echo "=== Check if this is a real broken link ==="
# Both formats should work in markdown, but the index is inconsistent
echo "Current formats in index:"
echo "Line 142: expressions.md#arithmetic-operators (missing / after .md)"
echo "Line 143: expressions.md/#predicate-operators (correct / after .md)"
echo ""
echo "Markdown standard uses: filename/#anchor"
Repository: opensearch-project/sql
Length of output: 521
Fix inconsistent link format on line 142.
All target documentation files exist and contain the referenced functions. However, there is a formatting inconsistency in the expressions.md links: line 142 uses expressions.md#arithmetic-operators (missing /), while line 143 correctly uses expressions.md/#predicate-operators. Change line 142 to expressions.md/#arithmetic-operators to match the standard markdown anchor link format.
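If more than one link needed this change, the rewrite could be automated. The following is a minimal sketch, assuming the `file.md/#anchor` form is the convention this index should follow (that is the reviewer's position, not something verified here); the function name `normalize_anchor_links` is hypothetical.

```python
import re

# Matches the start of a Markdown link target that ends in ".md" and is
# followed directly by "#", i.e. the form without a slash.
BARE_ANCHOR = re.compile(r"(\]\([^)#]*\.md)#")


def normalize_anchor_links(markdown):
    """Rewrite (file.md#anchor) link targets to (file.md/#anchor)."""
    return BARE_ANCHOR.sub(r"\1/#", markdown)


before = "- [Arithmetic Operators](expressions.md#arithmetic-operators)"
after = normalize_anchor_links(before)
print(after)
```

Links that already use the slash form are left unchanged, so the function can be run over the whole file repeatedly without side effects.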
- [Arithmetic Operators](expressions.md#arithmetic-operators)
- [Predicate Operators](expressions.md/#predicate-operators)
Fix link format inconsistency.
Line 142 uses expressions.md#arithmetic-operators (no / before #), while line 143 and all other links in the document consistently use the file.md/#anchor format. This inconsistency should be corrected for uniformity.
📝 Proposed fix
-- [Arithmetic Operators](expressions.md#arithmetic-operators)
+- [Arithmetic Operators](expressions.md/#arithmetic-operators)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-- [Arithmetic Operators](expressions.md#arithmetic-operators)
-- [Predicate Operators](expressions.md/#predicate-operators)
+- [Arithmetic Operators](expressions.md/#arithmetic-operators)
+- [Predicate Operators](expressions.md/#predicate-operators)
Description
Add index.md under the PPL functions directory to make clear which functions PPL supports.
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
--signoff or -s.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.