1 change: 1 addition & 0 deletions .gitignore
@@ -58,3 +58,4 @@ nitin_docs/
# Jupyter notebooks
*.ipynb
.ipynb_checkpoints/
.ipynb_checkpoints/
76 changes: 73 additions & 3 deletions README.md
@@ -22,8 +22,8 @@ executor = Executor(client, registry)

# Simple query
result = executor.execute("""
SELECT title, price
FROM products
SELECT title, price
FROM products
WHERE category = 'electronics' AND price < 500
ORDER BY price ASC
LIMIT 10
@@ -156,16 +156,86 @@ The layered approach emerged from TDD — writing tests first revealed natural b
- [x] Hybrid search (filters + vector)
- [x] Full-text search: `LIKE 'prefix%'` (prefix), `fulltext(field, 'terms')` function
- [x] GEO field queries with full operator support (see below)
- [x] Date functions: `YEAR()`, `MONTH()`, `DAY()`, `DATE_FORMAT()`, etc. (see below)

## What's Not Implemented (Yet...)

- [ ] JOINs (Redis doesn't support cross-index joins)
- [ ] Subqueries
- [ ] HAVING clause
- [ ] DISTINCT
- [ ] DATE/DATETIME support (use NUMERIC with Unix timestamps)
- [ ] Index creation from SQL (CREATE INDEX)

### DATE/DATETIME Handling

Redis does not have a native DATE field type. Dates are stored as **NUMERIC fields** with Unix timestamps.

**sql-redis automatically converts ISO 8601 date literals to Unix timestamps:**

```sql
-- Date literal (automatically converted to timestamp 1704067200)
SELECT * FROM events WHERE created_at > '2024-01-01'

-- Datetime literal with time
SELECT * FROM events WHERE created_at > '2024-01-01T12:00:00'

-- Date range with BETWEEN
SELECT * FROM events WHERE created_at BETWEEN '2024-01-01' AND '2024-01-31'

-- Multiple date conditions
SELECT * FROM events WHERE created_at > '2024-01-01' AND created_at < '2024-12-31'
```

**Supported date formats:**
- Date: `'2024-01-01'` (interpreted as midnight UTC)
- Datetime: `'2024-01-01T12:00:00'` or `'2024-01-01 12:00:00'`
- Datetime with timezone: `'2024-01-01T12:00:00Z'`, `'2024-01-01T12:00:00+00:00'`

**Note:** Dates and datetimes without a timezone are interpreted as UTC. You can also use raw Unix timestamps if preferred:

```sql
SELECT * FROM events WHERE created_at > 1704067200
```
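
The literal-to-timestamp conversion described above can be sketched in plain Python. This is a minimal illustration, not sql-redis's actual implementation: the helper name `iso_to_unix` is hypothetical, and it assumes only the rules stated in this section (values without a timezone are UTC, and a trailing `Z` means UTC).

```python
from datetime import datetime, timezone


def iso_to_unix(literal: str) -> int:
    """Convert an ISO 8601 date/datetime literal to a Unix timestamp in seconds.

    Hypothetical helper for illustration; not part of the sql-redis API.
    """
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    text = literal.strip()
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"

    parsed = datetime.fromisoformat(text)

    # Values without a timezone are interpreted as UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)

    return int(parsed.timestamp())


print(iso_to_unix("2024-01-01"))            # 1704067200
print(iso_to_unix("2024-01-01T12:00:00Z"))  # 1704110400
```

Computing the timestamp yourself this way is one option when you prefer to pass raw numbers in the query instead of date literals.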

### Date Functions

Extract date parts using SQL functions that map to Redis `APPLY` expressions:

| SQL Function | Redis Function | Description |
|--------------|----------------|-------------|
| `YEAR(field)` | `year(@field)` | Extract year (e.g., 2024) |
| `MONTH(field)` | `monthofyear(@field)` | Extract month (0-11) |
| `DAY(field)` | `dayofmonth(@field)` | Extract day of month (1-31) |
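| `HOUR(field)` | `hour(@field)` | Round timestamp down to the start of the hour (returns a Unix timestamp, not 0-23) |
| `MINUTE(field)` | `minute(@field)` | Round timestamp down to the start of the minute (returns a Unix timestamp, not 0-59) |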
| `DAYOFWEEK(field)` | `dayofweek(@field)` | Day of week (0=Sunday) |
| `DAYOFYEAR(field)` | `dayofyear(@field)` | Day of year (0-365) |
| `DATE_FORMAT(field, fmt)` | `timefmt(@field, fmt)` | Format timestamp |

**Examples:**

```sql
-- Extract year and month
SELECT name, YEAR(created_at) AS year, MONTH(created_at) AS month FROM events

-- Filter by year
SELECT name FROM events WHERE YEAR(created_at) = 2024

-- Group by date parts
SELECT YEAR(created_at) AS year, COUNT(*) FROM events GROUP BY year

-- Format dates
SELECT name, DATE_FORMAT(created_at, '%Y-%m-%d') AS date FROM events
```
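
To show how the function table maps onto Redis `APPLY` arguments, here is a rough Python sketch. The dictionary copies the table above; the helper `to_apply_clause` and its return shape are assumptions made for illustration, not the translator sql-redis actually uses. `DATE_FORMAT` is left out because it takes an extra format argument.

```python
# Mapping copied from the table above: SQL function name -> Redis function name.
SQL_TO_REDIS_DATE_FUNCS = {
    "YEAR": "year",
    "MONTH": "monthofyear",
    "DAY": "dayofmonth",
    "HOUR": "hour",
    "MINUTE": "minute",
    "DAYOFWEEK": "dayofweek",
    "DAYOFYEAR": "dayofyear",
}


def to_apply_clause(sql_func: str, field: str, alias: str) -> list[str]:
    """Build FT.AGGREGATE APPLY arguments for one date function.

    Hypothetical helper for illustration; not part of the sql-redis API.
    """
    redis_func = SQL_TO_REDIS_DATE_FUNCS[sql_func.upper()]
    return ["APPLY", f"{redis_func}(@{field})", "AS", alias]


print(to_apply_clause("YEAR", "created_at", "year"))
# ['APPLY', 'year(@created_at)', 'AS', 'year']
```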

**Note:** Redis's `monthofyear()` returns 0-11 (not 1-12), and `dayofweek()` returns 0 for Sunday.
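
To make the zero-based conventions concrete, the sketch below reproduces them in plain Python for a UTC timestamp. It does not call Redis; `redis_style_parts` is a hypothetical helper that mirrors only the ranges documented in the table above.

```python
from datetime import datetime, timezone


def redis_style_parts(timestamp: int) -> dict[str, int]:
    """Mimic the documented date-part ranges for a UTC Unix timestamp.

    Hypothetical helper for illustration; it does not query Redis.
    """
    dt = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return {
        "year": dt.year,
        "monthofyear": dt.month - 1,              # 0-11, January is 0
        "dayofmonth": dt.day,                     # 1-31
        "dayofweek": (dt.weekday() + 1) % 7,      # Python: Monday=0; here: Sunday=0
        "dayofyear": dt.timetuple().tm_yday - 1,  # 0-365
    }


parts = redis_style_parts(1704067200)  # 2024-01-01T00:00:00Z, a Monday
print(parts["monthofyear"] + 1)  # 1 (add 1 to get the calendar month)
print(parts["dayofweek"])        # 1 (Monday, since Sunday is 0)
```

In practice this means adding 1 to a `MONTH()` result before displaying it as a calendar month.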

#### Limitations

- `NOT YEAR(field) = 2024` is not supported (raises `ValueError`)
- `DATE_FORMAT()` is only supported in SELECT, not in WHERE (raises `ValueError`)
- Date functions combined with `OR` are not supported (raises `ValueError`)

### GEO Field Support

GEO fields are **fully implemented** with standard SQL-like syntax:
25 changes: 21 additions & 4 deletions sql_redis/analyzer.py
@@ -4,7 +4,13 @@

from dataclasses import dataclass, field

from sql_redis.parser import AggregationSpec, ComputedField, Condition, ParsedQuery
from sql_redis.parser import (
    AggregationSpec,
    ComputedField,
    Condition,
    DateFunctionSpec,
    ParsedQuery,
)


@dataclass
@@ -24,6 +30,7 @@ class AnalyzedQuery:
    field_types: dict[str, str] = field(default_factory=dict)
    aggregations: list[AggregationSpec] = field(default_factory=list)
    computed_fields: list[ComputedField] = field(default_factory=list)
    date_functions: list[DateFunctionSpec] = field(default_factory=list)
    groupby_fields: list[str] = field(default_factory=list)
    is_global_aggregation: bool = False
    vector_search: VectorSearchAnalysis | None = None
@@ -108,19 +115,29 @@ def analyze(self, parsed: ParsedQuery) -> AnalyzedQuery:
        if parsed.vector_search:
            referenced_fields.add(parsed.vector_search.field)

        # Fields from GROUP BY
        # Fields from date functions (YEAR, MONTH, etc.)
        for date_func in parsed.date_functions:
            referenced_fields.add(date_func.field)

        # Collect aliases from date functions and computed fields (for GROUP BY)
        alias_names = {df.alias for df in parsed.date_functions}
        alias_names.update(cf.alias for cf in parsed.computed_fields)

        # Fields from GROUP BY (exclude aliases since they're computed)
        for field_name in parsed.groupby_fields:
            referenced_fields.add(field_name)
            if field_name not in alias_names:
                referenced_fields.add(field_name)

        # Resolve field types
        for field_name in referenced_fields:
            if field_name not in schema:
                raise ValueError(f"Unknown field: {field_name}")
            result.field_types[field_name] = schema[field_name]

        # Copy aggregations and computed fields
        # Copy aggregations, computed fields, and date functions
        result.aggregations = parsed.aggregations
        result.computed_fields = parsed.computed_fields
        result.date_functions = parsed.date_functions
        result.groupby_fields = parsed.groupby_fields

        # Determine if this is a global aggregation