
Add index.md for PPL functions documentation #5033

Merged
penghuo merged 3 commits into opensearch-project:main from ritvibhatt:add-functions-index
Jan 15, 2026

Conversation

@ritvibhatt

@ritvibhatt ritvibhatt commented Jan 9, 2026

Contributor

Description

Add index.md under the PPL functions directory to make clear which functions PPL supports.

  • Lists all categories of functions and all functions in each category

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@coderabbitai

coderabbitai Bot commented Jan 9, 2026

Contributor
📝 Walkthrough

Summary by CodeRabbit

  • Documentation
    • Added a navigable index of PPL built-in functions, grouping functions by category (Aggregation, Collection, Condition, Type Conversion, Cryptographic, Date & Time, Expressions, IP Address, JSON, Mathematical, Relevance, Statistical, String, System) with links to category pages and anchors for individual functions for easier reference.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Adds a new documentation index page at docs/user/ppl/functions/index.md that catalogs PPL built-in functions, grouping them into categories (Aggregation, Collection, Condition, Type Conversion, Cryptographic, Date and Time, Expressions, IP Address, JSON, Mathematical, Relevance, Statistical, String, System) with navigational links and anchors.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| PPL Functions Documentation Index (docs/user/ppl/functions/index.md) | New navigational reference page (≈238 lines added) listing built-in PPL functions grouped by category, with links to category pages and anchors for individual functions. Purely documentation content; review for link correctness and category completeness. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: adding an index.md file for PPL functions documentation. |
| Description check | ✅ Passed | The description is clearly related to the changeset, explaining the purpose of adding index.md and what it contains. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/user/ppl/functions/index.md (1)

6-242: Remove duplicate function entries in the Date and Time Functions section.

The datetime section contains duplicate entries for [DAY], [DAYNAME], and [DAYOFMONTH] on lines 88-90, immediately after their first occurrence on lines 85-87. Additionally, fix the inconsistent link format in the Expressions section: [Arithmetic Operators](expressions.md#arithmetic-operators) uses no forward slash before the #, while [Predicate Operators](expressions.md/#predicate-operators) includes one. Standardize the format across all links.

🤖 Fix all issues with AI agents
In @docs/user/ppl/functions/index.md:
- Around line 85-90: The functions list in this markdown contains duplicate
consecutive entries for DAY, DAYNAME, and DAYOFMONTH; remove the second set of
duplicated lines so each function appears only once by editing the list in the
functions index (delete the repeated entries for DAY, DAYNAME, and DAYOFMONTH).
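A check of this kind can be sketched in Python. The following is a minimal, illustrative sketch and not part of the PR: the function name `repeated_targets` and the sample text are hypothetical, and it assumes the index keeps each category in its own block separated by blank lines, as in the diff shown in this conversation.

```python
import re
from collections import Counter

# Matches Markdown links of the form [text](target).
LINK = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def repeated_targets(markdown: str) -> list[str]:
    """Return link targets that occur more than once inside one
    blank-line-delimited block (one block per category)."""
    repeats = []
    for block in markdown.split("\n\n"):
        counts = Counter(target for _, target in LINK.findall(block))
        repeats.extend(t for t, n in counts.items() if n > 1)
    return repeats


# Hypothetical excerpt reproducing the kind of repetition described above.
sample = """\
- [Date and Time Functions](datetime.md)
- [DAY](datetime.md/#day)
- [DAYNAME](datetime.md/#dayname)
- [DAY](datetime.md/#day)
- [DAYNAME](datetime.md/#dayname)

- [Statistical Functions](statistical.md)
- [MAX](statistical.md/#max)
"""

print(repeated_targets(sample))
# ['datetime.md/#day', 'datetime.md/#dayname']
```

Because the count is taken per block, a name that legitimately appears in two different categories is not reported here.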
🧹 Nitpick comments (1)
docs/user/ppl/functions/index.md (1)

8-11: Consolidate cross-category function documentation duplicates.

Functions are documented in multiple documentation files, creating potential confusion:

  • EXISTS: Collection Functions → collection.md (line 103) and Condition Functions → condition.md (line 155)
  • EARLIEST: Aggregations → aggregations.md (line 335) and Condition Functions → condition.md (line 625)
  • LATEST: Aggregations → aggregations.md (line 383) and Condition Functions → condition.md (line 691)
  • SUM: Aggregations → aggregations.md (line 63) and Mathematical → math.md (line 137)
  • AVG: Aggregations → aggregations.md (line 87) and Mathematical → math.md (line 187)
  • MAX: Aggregations → aggregations.md (line 111) and Statistical → statistical.md (line 3)
  • MIN: Aggregations → aggregations.md (line 153) and Statistical → statistical.md (line 73)

Consider consolidating these entries to a single canonical location or clarifying the different use cases for each function across categories.
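To see which names are shared across categories without reading the index by hand, a short script along these lines may help. It is a sketch under assumptions: `shared_names` and the sample entries are hypothetical, and the pattern only recognizes entries written as `[NAME](page.md/#anchor)` or `[NAME](page.md#anchor)`.

```python
import re
from collections import defaultdict

# Captures the link text and the target page of entries that carry an anchor.
ENTRY = re.compile(r"\[([^\]]+)\]\(([\w.-]+\.md)/?#[^)]*\)")


def shared_names(markdown: str) -> dict[str, list[str]]:
    """Map each function name to the pages it links to, keeping only
    names that point at more than one page."""
    pages = defaultdict(set)
    for name, page in ENTRY.findall(markdown):
        pages[name].add(page)
    return {name: sorted(p) for name, p in pages.items() if len(p) > 1}


# Hypothetical excerpt of an index with cross-category names.
sample = """\
- [SUM](aggregations.md/#sum)
- [MAX](aggregations.md/#max)
- [SUM](math.md/#sum)
- [MAX](statistical.md/#max)
- [CEIL](math.md/#ceil)
"""

for name, where in shared_names(sample).items():
    print(name, where)
```

The output only shows where a name recurs; deciding whether each recurrence is an intentional cross-reference still needs a human reviewer.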

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74b2fb3 and e8fcdd2.

📒 Files selected for processing (1)
  • docs/user/ppl/functions/index.md
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist

Comment thread docs/user/ppl/functions/index.md Outdated
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @docs/user/ppl/functions/index.md:
- Around line 142-143: Update the inconsistent link by changing
"expressions.md#arithmetic-operators" to use the same format as the other links
(i.e., "expressions.md/#arithmetic-operators") so it matches the existing
"expressions.md/#predicate-operators" pattern; apply the same file/#anchor
format consistently throughout the document.
- Around line 6-238: The expressions link for "Arithmetic Operators" uses an
inconsistent anchor format; update the link target in the functions index (the
line with "Arithmetic Operators") from expressions.md#arithmetic-operators to
expressions.md/#arithmetic-operators so it matches the standard anchor format
used by "Predicate Operators" and other entries, ensuring the change is applied
to the entry labeled "Arithmetic Operators" in the list.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e8fcdd2 and 6410ea5.

📒 Files selected for processing (1)
  • docs/user/ppl/functions/index.md
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (1)
docs/user/ppl/functions/index.md (1)

8-11: Verify duplicate function entries across categories.

Several functions appear in multiple categories, which may confuse users:

  • EXISTS: Collection (line 32) and Condition (line 48)
  • EARLIEST: Aggregation (line 18) and Condition (line 57)
  • LATEST: Aggregation (line 19) and Condition (line 58)
  • MAX: Aggregation (line 10) and Statistical (line 215)
  • MIN: Aggregation (line 11) and Statistical (line 216)
  • SUM: Aggregation (line 8) and Mathematical (line 167)
  • AVG: Aggregation (line 9) and Mathematical (line 168)

Given that the commit message mentions "remove duplicate entries," please verify whether these remaining duplicates are intentional cross-references or should be consolidated.

Comment on lines +6 to +238
- [Aggregation Functions](aggregations.md)
- [COUNT](aggregations.md/#count)
- [SUM](aggregations.md/#sum)
- [AVG](aggregations.md/#avg)
- [MAX](aggregations.md/#max)
- [MIN](aggregations.md/#min)
- [VAR_SAMP](aggregations.md/#var_samp)
- [VAR_POP](aggregations.md/#var_pop)
- [STDDEV_SAMP](aggregations.md/#stddev_samp)
- [STDDEV_POP](aggregations.md/#stddev_pop)
- [DISTINCT_COUNT, DC](aggregations.md/#distinct_count-dc)
- [DISTINCT_COUNT_APPROX](aggregations.md/#distinct_count_approx)
- [EARLIEST](aggregations.md/#earliest)
- [LATEST](aggregations.md/#latest)
- [TAKE](aggregations.md/#take)
- [PERCENTILE, PERCENTILE_APPROX](aggregations.md/#percentile-or-percentile_approx)
- [MEDIAN](aggregations.md/#median)
- [FIRST](aggregations.md/#first)
- [LAST](aggregations.md/#last)
- [LIST](aggregations.md/#list)
- [VALUES](aggregations.md/#values)

- [Collection Functions](collection.md)
- [ARRAY](collection.md/#array)
- [ARRAY_LENGTH](collection.md/#array_length)
- [FORALL](collection.md/#forall)
- [EXISTS](collection.md/#exists)
- [FILTER](collection.md/#filter)
- [TRANSFORM](collection.md/#transform)
- [REDUCE](collection.md/#reduce)
- [MVJOIN](collection.md/#mvjoin)
- [MVAPPEND](collection.md/#mvappend)
- [SPLIT](collection.md/#split)
- [MVDEDUP](collection.md/#mvdedup)
- [MVFIND](collection.md/#mvfind)
- [MVINDEX](collection.md/#mvindex)
- [MVMAP](collection.md/#mvmap)
- [MVZIP](collection.md/#mvzip)

- [Condition Functions](condition.md)
- [ISNULL](condition.md/#isnull)
- [ISNOTNULL](condition.md/#isnotnull)
- [EXISTS](condition.md/#exists)
- [IFNULL](condition.md/#ifnull)
- [NULLIF](condition.md/#nullif)
- [IF](condition.md/#if)
- [CASE](condition.md/#case)
- [COALESCE](condition.md/#coalesce)
- [ISPRESENT](condition.md/#ispresent)
- [ISBLANK](condition.md/#isblank)
- [ISEMPTY](condition.md/#isempty)
- [EARLIEST](condition.md/#earliest)
- [LATEST](condition.md/#latest)
- [REGEXP_MATCH](condition.md/#regexp_match)

- [Type Conversion Functions](conversion.md)
- [CAST](conversion.md/#cast)
- [TOSTRING](conversion.md/#tostring)
- [TONUMBER](conversion.md/#tonumber)

- [Cryptographic Functions](cryptographic.md)
- [SHA1](cryptographic.md/#sha1)
- [SHA2](cryptographic.md/#sha2)

- [Date and Time Functions](datetime.md)
- [ADDDATE](datetime.md/#adddate)
- [ADDTIME](datetime.md/#addtime)
- [CONVERT_TZ](datetime.md/#convert_tz)
- [CURDATE](datetime.md/#curdate)
- [CURRENT_DATE](datetime.md/#current_date)
- [CURRENT_TIME](datetime.md/#current_time)
- [CURRENT_TIMESTAMP](datetime.md/#current_timestamp)
- [CURTIME](datetime.md/#curtime)
- [DATE](datetime.md/#date)
- [DATE_ADD](datetime.md/#date_add)
- [DATE_FORMAT](datetime.md/#date_format)
- [DATETIME](datetime.md/#datetime)
- [DATE_SUB](datetime.md/#date_sub)
- [DATEDIFF](datetime.md/#datediff)
- [DAY](datetime.md/#day)
- [DAYNAME](datetime.md/#dayname)
- [DAYOFMONTH](datetime.md/#dayofmonth)
- [DAY_OF_MONTH](datetime.md/#day_of_month)
- [DAYOFWEEK](datetime.md/#dayofweek)
- [DAY_OF_WEEK](datetime.md/#day_of_week)
- [DAYOFYEAR](datetime.md/#dayofyear)
- [DAY_OF_YEAR](datetime.md/#day_of_year)
- [EXTRACT](datetime.md/#extract)
- [FROM_DAYS](datetime.md/#from_days)
- [FROM_UNIXTIME](datetime.md/#from_unixtime)
- [GET_FORMAT](datetime.md/#get_format)
- [HOUR](datetime.md/#hour)
- [HOUR_OF_DAY](datetime.md/#hour_of_day)
- [LAST_DAY](datetime.md/#last_day)
- [LOCALTIMESTAMP](datetime.md/#localtimestamp)
- [LOCALTIME](datetime.md/#localtime)
- [MAKEDATE](datetime.md/#makedate)
- [MAKETIME](datetime.md/#maketime)
- [MICROSECOND](datetime.md/#microsecond)
- [MINUTE](datetime.md/#minute)
- [MINUTE_OF_HOUR](datetime.md/#minute_of_hour)
- [MONTH](datetime.md/#month)
- [MONTH_OF_YEAR](datetime.md/#month_of_year)
- [MONTHNAME](datetime.md/#monthname)
- [NOW](datetime.md/#now)
- [PERIOD_ADD](datetime.md/#period_add)
- [PERIOD_DIFF](datetime.md/#period_diff)
- [QUARTER](datetime.md/#quarter)
- [SEC_TO_TIME](datetime.md/#sec_to_time)
- [SECOND](datetime.md/#second)
- [SECOND_OF_MINUTE](datetime.md/#second_of_minute)
- [STRFTIME](datetime.md/#strftime)
- [STR_TO_DATE](datetime.md/#str_to_date)
- [SUBDATE](datetime.md/#subdate)
- [SUBTIME](datetime.md/#subtime)
- [SYSDATE](datetime.md/#sysdate)
- [TIME](datetime.md/#time)
- [TIME_FORMAT](datetime.md/#time_format)
- [TIME_TO_SEC](datetime.md/#time_to_sec)
- [TIMEDIFF](datetime.md/#timediff)
- [TIMESTAMP](datetime.md/#timestamp)
- [TIMESTAMPADD](datetime.md/#timestampadd)
- [TIMESTAMPDIFF](datetime.md/#timestampdiff)
- [TO_DAYS](datetime.md/#to_days)
- [TO_SECONDS](datetime.md/#to_seconds)
- [UNIX_TIMESTAMP](datetime.md/#unix_timestamp)
- [UTC_DATE](datetime.md/#utc_date)
- [UTC_TIME](datetime.md/#utc_time)
- [UTC_TIMESTAMP](datetime.md/#utc_timestamp)
- [WEEK](datetime.md/#week)
- [WEEKDAY](datetime.md/#weekday)
- [WEEK_OF_YEAR](datetime.md/#week_of_year)
- [YEAR](datetime.md/#year)
- [YEARWEEK](datetime.md/#yearweek)

- [Expressions](expressions.md)
- [Arithmetic Operators](expressions.md#arithmetic-operators)
- [Predicate Operators](expressions.md/#predicate-operators)

- [IP Address Functions](ip.md)
- [CIDRMATCH](ip.md/#cidrmatch)
- [GEOIP](ip.md/#geoip)

- [JSON Functions](json.md)
- [JSON](json.md/#json)
- [JSON_VALID](json.md/#json_valid)
- [JSON_OBJECT](json.md/#json_object)
- [JSON_ARRAY](json.md/#json_array)
- [JSON_ARRAY_LENGTH](json.md/#json_array_length)
- [JSON_EXTRACT](json.md/#json_extract)
- [JSON_DELETE](json.md/#json_delete)
- [JSON_SET](json.md/#json_set)
- [JSON_APPEND](json.md/#json_append)
- [JSON_EXTEND](json.md/#json_extend)
- [JSON_KEYS](json.md/#json_keys)

- [Mathematical Functions](math.md)
- [ADD](math.md/#add)
- [SUBTRACT](math.md/#subtract)
- [MULTIPLY](math.md/#multiply)
- [DIVIDE](math.md/#divide)
- [SUM](math.md/#sum)
- [AVG](math.md/#avg)
- [ACOS](math.md/#acos)
- [ASIN](math.md/#asin)
- [ATAN](math.md/#atan)
- [ATAN2](math.md/#atan2)
- [CEIL](math.md/#ceil)
- [CEILING](math.md/#ceiling)
- [CONV](math.md/#conv)
- [COS](math.md/#cos)
- [COSH](math.md/#cosh)
- [COT](math.md/#cot)
- [CRC32](math.md/#crc32)
- [DEGREES](math.md/#degrees)
- [E](math.md/#e)
- [EXP](math.md/#exp)
- [EXPM1](math.md/#expm1)
- [FLOOR](math.md/#floor)
- [LN](math.md/#ln)
- [LOG](math.md/#log)
- [LOG2](math.md/#log2)
- [LOG10](math.md/#log10)
- [MOD](math.md/#mod)
- [MODULUS](math.md/#modulus)
- [PI](math.md/#pi)
- [POW](math.md/#pow)
- [POWER](math.md/#power)
- [RADIANS](math.md/#radians)
- [RAND](math.md/#rand)
- [ROUND](math.md/#round)
- [SIGN](math.md/#sign)
- [SIGNUM](math.md/#signum)
- [SIN](math.md/#sin)
- [SINH](math.md/#sinh)
- [SQRT](math.md/#sqrt)
- [CBRT](math.md/#cbrt)
- [RINT](math.md/#rint)

- [Relevance Functions](relevance.md)
- [MATCH](relevance.md/#match)
- [MATCH_PHRASE](relevance.md/#match_phrase)
- [MATCH_PHRASE_PREFIX](relevance.md/#match_phrase_prefix)
- [MULTI_MATCH](relevance.md/#multi_match)
- [SIMPLE_QUERY_STRING](relevance.md/#simple_query_string)
- [MATCH_BOOL_PREFIX](relevance.md/#match_bool_prefix)
- [QUERY_STRING](relevance.md/#query_string)

- [Statistical Functions](statistical.md)
- [MAX](statistical.md/#max)
- [MIN](statistical.md/#min)

- [String Functions](string.md)
- [CONCAT](string.md/#concat)
- [CONCAT_WS](string.md/#concat_ws)
- [LENGTH](string.md/#length)
- [LIKE](string.md/#like)
- [ILIKE](string.md/#ilike)
- [LOCATE](string.md/#locate)
- [LOWER](string.md/#lower)
- [LTRIM](string.md/#ltrim)
- [POSITION](string.md/#position)
- [REPLACE](string.md/#replace)
- [REVERSE](string.md/#reverse)
- [RIGHT](string.md/#right)
- [RTRIM](string.md/#rtrim)
- [SUBSTRING](string.md/#substring)
- [TRIM](string.md/#trim)
- [UPPER](string.md/#upper)
- [REGEXP_REPLACE](string.md/#regexp_replace)

- [System Functions](system.md)
- [TYPEOF](system.md/#typeof)
No newline at end of file

Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify all target documentation files exist and contain the expected anchors

echo "=== Checking if target documentation files exist ==="
target_files=("aggregations.md" "collection.md" "condition.md" "conversion.md" "cryptographic.md" "datetime.md" "expressions.md" "ip.md" "json.md" "math.md" "relevance.md" "statistical.md" "string.md" "system.md")

docs_dir="docs/user/ppl/functions"
missing_files=()

for file in "${target_files[@]}"; do
  if [ -f "$docs_dir/$file" ]; then
    echo "$file exists"
  else
    echo "$file MISSING"
    missing_files+=("$file")
  fi
done

echo ""
echo "=== Checking headers in target files ==="

# Extract all links from index.md and verify anchors exist
# Get the index file
index_file="$docs_dir/index.md"

if [ -f "$index_file" ]; then
  echo "✓ index.md exists"
  echo ""
  
  # Extract links and check each one
  echo "=== Verifying sample anchors from each target file ==="
  
  # Check a sample from each file
  echo "Checking aggregations.md for COUNT..."
  rg -A 2 "^\[COUNT\]" "$docs_dir/aggregations.md" | head -5 || echo "COUNT not found as heading"
  
  echo "Checking datetime.md for ADDDATE..."
  rg -A 2 "^\[ADDDATE\]|^## ADDDATE" "$docs_dir/datetime.md" | head -5 || echo "ADDDATE not found as heading"
  
  echo "Checking string.md for CONCAT..."
  rg -A 2 "^\[CONCAT\]|^## CONCAT" "$docs_dir/string.md" | head -5 || echo "CONCAT not found as heading"
  
  echo ""
  echo "=== Checking markdown header patterns ==="
  # Let's see what the actual header format is
  echo "Sample headers from aggregations.md:"
  rg "^#+" "$docs_dir/aggregations.md" | head -10
else
  echo "✗ index.md not found"
fi

Repository: opensearch-project/sql

Length of output: 956


🏁 Script executed:

#!/bin/bash
# Detailed anchor verification

docs_dir="docs/user/ppl/functions"

echo "=== Checking expressions.md link format issue ==="
# The index has "expressions.md#arithmetic-operators" without "/" 
# Should be "expressions.md/#arithmetic-operators"
grep -n "expressions.md" "$docs_dir/index.md"

echo ""
echo "=== Verifying functions with underscores ==="
# Check VAR_SAMP
echo "VAR_SAMP in aggregations.md:"
rg "^### VAR_SAMP|^## VAR_SAMP" "$docs_dir/aggregations.md"

# Check DISTINCT_COUNT_APPROX
echo "DISTINCT_COUNT_APPROX in aggregations.md:"
rg "^### DISTINCT_COUNT_APPROX|^## DISTINCT_COUNT_APPROX" "$docs_dir/aggregations.md"

echo ""
echo "=== Verifying PERCENTILE anchor ==="
# The index uses #percentile-or-percentile_approx which is unusual
echo "PERCENTILE function definitions:"
rg "^### PERCENTILE" "$docs_dir/aggregations.md"

echo ""
echo "=== Checking duplicate functions ==="
# EXISTS should be in both collection.md and condition.md
echo "EXISTS in collection.md:"
rg "^### EXISTS|^## EXISTS" "$docs_dir/collection.md"

echo "EXISTS in condition.md:"
rg "^### EXISTS|^## EXISTS" "$docs_dir/condition.md"

# EARLIEST should be in both aggregations.md and condition.md
echo "EARLIEST in aggregations.md:"
rg "^### EARLIEST|^## EARLIEST" "$docs_dir/aggregations.md"

echo "EARLIEST in condition.md:"
rg "^### EARLIEST|^## EARLIEST" "$docs_dir/condition.md"

# LATEST should be in both aggregations.md and condition.md
echo "LATEST in aggregations.md:"
rg "^### LATEST|^## LATEST" "$docs_dir/aggregations.md"

echo "LATEST in condition.md:"
rg "^### LATEST|^## LATEST" "$docs_dir/condition.md"

echo ""
echo "=== Checking all headers in aggregations.md for anchor match ==="
rg "^### " "$docs_dir/aggregations.md" | head -20

Repository: opensearch-project/sql

Length of output: 1197


🏁 Script executed:

#!/bin/bash
# Verify markdown anchor generation for special characters

docs_dir="docs/user/ppl/functions"

echo "=== Testing actual anchor format from files ==="

# For DISTINCT_COUNT, DC - check the actual anchor slug
echo "Checking aggregations.md for exact header format and potential anchor:"
rg -B 2 -A 5 "^### DISTINCT_COUNT, DC" "$docs_dir/aggregations.md"

echo ""
echo "=== Check PERCENTILE or PERCENTILE_APPROX anchor format ==="
rg -B 2 -A 5 "^### PERCENTILE or PERCENTILE_APPROX" "$docs_dir/aggregations.md"

echo ""
echo "=== Verify expressions.md has ARITHMETIC OPERATORS section ==="
rg "^## |^### " "$docs_dir/expressions.md"

echo ""
echo "=== Check if JSON_* functions have correct anchor naming ==="
rg "^### JSON_" "$docs_dir/json.md" | head -15

echo ""
echo "=== Verify all headers exist for math.md functions ==="
# Check for MOD vs MODULUS duplication
rg "^### (MOD|MODULUS)" "$docs_dir/math.md"

echo ""
echo "=== Final check: Verify expressions.md inconsistency ==="
echo "Line 142 (should have /): "
sed -n '142p' "$docs_dir/index.md"
echo ""
echo "Line 143 (already has /):"
sed -n '143p' "$docs_dir/index.md"

Repository: opensearch-project/sql

Length of output: 1494


🏁 Script executed:

#!/bin/bash
docs_dir="docs/user/ppl/functions"

echo "=== Check JSON functions headers ==="
rg "^### JSON" "$docs_dir/json.md"

echo ""
echo "=== Verify critical finding: expressions.md link format ==="
echo "According to markdown conventions, both links should use the '/' separator:"
echo "Current line 142: expressions.md#arithmetic-operators (INCORRECT - missing /)"
echo "Current line 143: expressions.md/#predicate-operators (CORRECT)"
echo ""
echo "Markdown anchor links should use format: filename/#anchor"

echo ""
echo "=== Summary of all target files and function counts ==="
for file in aggregations.md collection.md condition.md conversion.md cryptographic.md datetime.md expressions.md ip.md json.md math.md relevance.md statistical.md string.md system.md; do
  count=$(rg "^### |^## " "$docs_dir/$file" | grep -E "^[^/]*:###|^[^/]*:## [A-Z]" | wc -l)
  echo "$file: $count header entries"
done

Repository: opensearch-project/sql

Length of output: 920


🏁 Script executed:

#!/bin/bash
docs_dir="docs/user/ppl/functions"

echo "=== Check JSON functions directly ==="
cat "$docs_dir/json.md" | grep "^### JSON"

echo ""
echo "=== Check math.md for MOD and MODULUS ==="
cat "$docs_dir/math.md" | grep "^### MOD"

echo ""
echo "=== Verify all aggregations functions exist ==="
echo "Expected functions from index:"
grep "aggregations.md/#" "$docs_dir/index.md" | sed 's/.*aggregations.md\/#//g' | sed 's/).*//' | sort

echo ""
echo "Actual headers in aggregations.md:"
cat "$docs_dir/aggregations.md" | grep "^### " | sed 's/^### //g' | sed 's/ *$//' | sort

Repository: opensearch-project/sql

Length of output: 633


🏁 Script executed:

#!/bin/bash
docs_dir="docs/user/ppl/functions"

echo "=== Check json.md file size and first 50 lines ==="
wc -l "$docs_dir/json.md"
echo ""
head -50 "$docs_dir/json.md"

echo ""
echo "=== Check math.md for MOD/MODULUS ==="
head -100 "$docs_dir/math.md" | tail -50

Repository: opensearch-project/sql

Length of output: 2979


🏁 Script executed:

#!/bin/bash
docs_dir="docs/user/ppl/functions"

echo "=== Get all headers from json.md ==="
grep "^## \|^### " "$docs_dir/json.md"

echo ""
echo "=== Get all headers from math.md ==="
grep "^## \|^### " "$docs_dir/math.md"

echo ""
echo "=== Compare: Index expects these JSON anchors ==="
grep "json.md/#" "$docs_dir/index.md" | sed 's/.*json.md\/#//g' | sed 's/).*//' | sort

echo ""
echo "=== Actual JSON function headers in file ==="
grep "^## \|^### " "$docs_dir/json.md" | grep -i json | sort

Repository: opensearch-project/sql

Length of output: 2139


🏁 Script executed:

#!/bin/bash
docs_dir="docs/user/ppl/functions"

echo "=== Critical finding: expressions.md link inconsistency ==="
sed -n '141,143p' "$docs_dir/index.md"

echo ""
echo "=== Verify all target files for expected functions ==="

echo "Collection functions in index vs actual:"
echo "Expected:"
grep "collection.md/#" "$docs_dir/index.md" | sed 's/.*collection.md\/#//g' | sed 's/).*//' | sort | head -10
echo "Actual headers:"
grep "^## " "$docs_dir/collection.md" | grep -v "^## Description" | sed 's/^## //g' | sort | head -10

echo ""
echo "=== Check condition.md functions ==="
echo "Expected from index:"
grep "condition.md/#" "$docs_dir/index.md" | sed 's/.*condition.md\/#//g' | sed 's/).*//' | sort
echo "Actual headers:"
grep "^## " "$docs_dir/condition.md" | grep -v "^## Description" | sed 's/^## //g' | sort

echo ""
echo "=== Final comprehensive link verification ==="
python3 << 'EOF'
import re

with open('docs/user/ppl/functions/index.md', 'r') as f:
    content = f.read()

# Find all markdown links with patterns like [Text](file.md/#anchor)
links = re.findall(r'\[([^\]]+)\]\(([^)]+)\)', content)

issues = []
for text, link in links:
    if '.md' in link:
        if '.md#' in link and '.md/#' not in link:
            # Check if this is the expressions.md case
            if 'expressions.md#' in link:
                issues.append(f"Line with '{text}': '{link}' - Missing '/' after .md")

if issues:
    print("CRITICAL ISSUES FOUND:")
    for issue in issues:
        print(f"  - {issue}")
else:
    print("No critical link format issues found")
EOF

Repository: opensearch-project/sql

Length of output: 152


🏁 Script executed:

#!/bin/bash
docs_dir="docs/user/ppl/functions"

echo "=== Verify expressions.md link format issue ==="
sed -n '141,143p' "$docs_dir/index.md"

echo ""
echo "=== Check if this is a real broken link ==="
# Both formats should work in markdown, but the index is inconsistent
echo "Current formats in index:"
echo "Line 142: expressions.md#arithmetic-operators (missing / after .md)"
echo "Line 143: expressions.md/#predicate-operators (correct / after .md)"
echo ""
echo "Markdown standard uses: filename/#anchor"

Repository: opensearch-project/sql

Length of output: 521


Fix inconsistent link format on line 142.

All target documentation files exist and contain the referenced functions. However, the expressions.md links are formatted inconsistently: line 142 uses expressions.md#arithmetic-operators (no / before the #), while line 143 uses expressions.md/#predicate-operators. Change line 142 to expressions.md/#arithmetic-operators so that it matches the file.md/#anchor format used by every other link in the index.
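The anchor checks in the scripts above depend on how a heading becomes an anchor. The sketch below approximates one common rule (lowercase, drop punctuation, turn spaces into hyphens). It is an assumption, not the behavior of any specific renderer used by this repository, and `slugify` is a hypothetical helper name.

```python
import re


def slugify(heading: str) -> str:
    """Approximate a heading-to-anchor conversion: lowercase the text,
    drop characters other than word characters, hyphens and spaces,
    then replace each space with a hyphen."""
    text = heading.strip().lower()
    text = re.sub(r"[^\w\- ]", "", text)
    return text.replace(" ", "-")


# Headings taken from the rg patterns in the scripts above.
print(slugify("DISTINCT_COUNT, DC"))
print(slugify("PERCENTILE or PERCENTILE_APPROX"))
```

Under this rule the two headings map to `distinct_count-dc` and `percentile-or-percentile_approx`, which are the anchors the index links to. Actual slug rules vary between renderers, so the result should be confirmed against the published pages.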

🤖 Prompt for AI Agents
In @docs/user/ppl/functions/index.md around lines 6 - 238, The expressions link
for "Arithmetic Operators" uses an inconsistent anchor format; update the link
target in the functions index (the line with "Arithmetic Operators") from
expressions.md#arithmetic-operators to expressions.md/#arithmetic-operators so
it matches the standard anchor format used by "Predicate Operators" and other
entries, ensuring the change is applied to the entry labeled "Arithmetic
Operators" in the list.

Comment on lines +142 to +143
- [Arithmetic Operators](expressions.md#arithmetic-operators)
- [Predicate Operators](expressions.md/#predicate-operators)

Contributor


⚠️ Potential issue | 🟡 Minor

Fix link format inconsistency.

Line 142 uses expressions.md#arithmetic-operators (no / before #), while line 143 and all other links in the document consistently use the file.md/#anchor format. This inconsistency should be corrected for uniformity.

📝 Proposed fix
-- [Arithmetic Operators](expressions.md#arithmetic-operators)
+- [Arithmetic Operators](expressions.md/#arithmetic-operators)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-- [Arithmetic Operators](expressions.md#arithmetic-operators)
+- [Arithmetic Operators](expressions.md/#arithmetic-operators)
 - [Predicate Operators](expressions.md/#predicate-operators)
🤖 Prompt for AI Agents
In @docs/user/ppl/functions/index.md around lines 142 - 143, Update the
inconsistent link by changing "expressions.md#arithmetic-operators" to use the
same format as the other links (i.e., "expressions.md/#arithmetic-operators") so
it matches the existing "expressions.md/#predicate-operators" pattern; apply the
same file/#anchor format consistently throughout the document.
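Applying one link style mechanically could look like the sketch below. It only enforces the index's own `file.md/#anchor` convention; whether the slash is required depends on the documentation tooling, which this conversation does not state. The helper name `normalize_links` is hypothetical.

```python
import re

# A link target ending in ".md" that is followed directly by "#",
# i.e. written without the "/" used elsewhere in the index.
BARE_ANCHOR = re.compile(r"\]\(([^)#]*\.md)#")


def normalize_links(markdown: str) -> str:
    """Rewrite [text](file.md#anchor) as [text](file.md/#anchor);
    links already in the target form are left unchanged."""
    return BARE_ANCHOR.sub(r"](\1/#", markdown)


before = (
    "- [Arithmetic Operators](expressions.md#arithmetic-operators)\n"
    "- [Predicate Operators](expressions.md/#predicate-operators)\n"
)
print(normalize_links(before))
```

The substitution is idempotent, so running it again on an already normalized file changes nothing.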

@penghuo penghuo added the PPL (Piped processing language) and documentation (Improvements or additions to documentation) labels Jan 13, 2026
@penghuo penghuo enabled auto-merge (squash) January 13, 2026 16:39
@penghuo penghuo merged commit 389971f into opensearch-project:main Jan 15, 2026
38 of 42 checks passed

Labels

documentation (Improvements or additions to documentation), PPL (Piped processing language)


4 participants