Commit bde3640

docs: clarify filter pipeline and lua hook scope
1 parent 70258b0 commit bde3640

6 files changed

Lines changed: 272 additions & 111 deletions

docs/src/en/filter/filter.md

Lines changed: 60 additions & 25 deletions
@@ -2,47 +2,82 @@
22
outline: deep
33
---
44
# Built-in Filter Rules
5-
RedisShake provides various built-in filter rules that users can choose from according to their needs.
65

7-
## Filtering Keys
8-
RedisShake supports filtering by key name, key name prefixes, and suffixes. You can set the following options in the configuration file, for example:
6+
RedisShake evaluates filter rules after commands are parsed but before anything is sent to the destination. The filter therefore controls which commands ever leave RedisShake, and only the commands that pass this stage are eligible for further processing by the optional [function](./function.md) hook.
7+
8+
## Where filtering happens
9+
10+
```
11+
source reader --> filter rules --> (optional Lua function) --> writer / target
12+
```
13+
14+
* Commands enter the filter after RedisShake has parsed the RESP payload from the reader. At this point the request is already considered valid and would be forwarded if no filters were configured.
15+
* Filtering happens before any other transformation stage, so blocked commands never reach the optional Lua function or the writer.
16+
* The stage operates on the same command representation that writers use, which keeps behaviour consistent for all readers.
17+
18+
## How Filter Evaluation Works
19+
20+
1. **Block rules run first.** If a key, database, command, or command group matches a `block_*` rule, the entire entry is dropped immediately.
21+
2. **Allow lists are optional.** When no `allow_*` rule is configured for a category, everything is permitted by default. As soon as you define an allow list, only the explicitly listed items will pass.
22+
3. **Multi-key consistency.** A command with multiple keys (for example, `MSET`) passes only if every key passes; otherwise the entire entry is discarded. RedisShake logs mixed results to help you troubleshoot your patterns.
23+
24+
Combining allow and block lists lets you quickly express exceptions such as “allow user keys except temporary cache variants.” Block rules take precedence, so avoid listing the same pattern in both allow and block lists.
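
The evaluation order described above can be modelled in a few lines of Lua. This is a simplified sketch for reasoning about rule combinations, not RedisShake's actual implementation: it covers prefix rules only, and the names `key_passes` and `entry_passes` are hypothetical.

```lua
-- Simplified model of the documented rule order (prefix rules only).
-- key_passes and entry_passes are illustrative names, not RedisShake APIs.

local function starts_with(s, prefix)
  return string.sub(s, 1, #prefix) == prefix
end

local function matches_any(key, prefixes)
  for _, prefix in ipairs(prefixes) do
    if starts_with(key, prefix) then
      return true
    end
  end
  return false
end

-- Block rules are checked first; an empty allow list permits everything.
local function key_passes(key, allow, block)
  if matches_any(key, block) then
    return false
  end
  if #allow == 0 then
    return true
  end
  return matches_any(key, allow)
end

-- A multi-key entry passes only when every key passes.
local function entry_passes(keys, allow, block)
  for _, key in ipairs(keys) do
    if not key_passes(key, allow, block) then
      return false
    end
  end
  return true
end

local allow = { "user:" }
local block = { "user:tmp:" }

print(entry_passes({ "user:1001" }, allow, block))               -- true
print(entry_passes({ "user:tmp:1" }, allow, block))              -- false
print(entry_passes({ "user:1001", "cache:2001" }, allow, block)) -- false
```

The second call shows the block rule overriding a matching allow rule; the third shows one failing key discarding a multi-key entry.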
25+
26+
## Key Filtering
27+
28+
RedisShake supports filtering by key names, prefixes, suffixes, and regular expressions. For example:
29+
930
```toml
1031
[filter]
11-
allow_keys = ["user:1001", "product:2001"] # allowed key names
12-
allow_key_prefix = ["user:", "product:"] # allowed key name prefixes
13-
allow_key_suffix = [":active", ":valid"] # allowed key name suffixes
14-
allow_key_regex = [":\\d{11}:"] # allowed key name regex, 11-digit mobile phone number
15-
block_keys = ["temp:1001", "cache:2001"] # blocked key names
16-
block_key_prefix = ["temp:", "cache:"] # blocked key name prefixes
17-
block_key_suffix = [":tmp", ":old"] # blocked key name suffixes
18-
block_key_regex = [":test:\\d{11}:"] # blocked key name regex, 11-digit mobile phone number with "test" prefix
32+
allow_keys = ["user:1001", "product:2001"] # allow-listed key names
33+
allow_key_prefix = ["user:", "product:"] # allow-listed key prefixes
34+
allow_key_suffix = [":active", ":valid"] # allow-listed key suffixes
35+
allow_key_regex = [":\\d{11}:"] # allow-listed key regex (11-digit phone numbers)
36+
block_keys = ["temp:1001", "cache:2001"] # block-listed key names
37+
block_key_prefix = ["temp:", "cache:"] # block-listed key prefixes
38+
block_key_suffix = [":tmp", ":old"] # block-listed key suffixes
39+
block_key_regex = [":test:\\d{11}:"] # block-listed key regex with "test" prefix
1940
```
20-
If these options are not set, all keys are allowed by default.
2141

22-
## Filtering Databases
23-
You can specify allowed or blocked database numbers, for example:
42+
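Regular expressions follow Go’s regular expression syntax. In a double-quoted TOML string every backslash must be doubled (`\\d`); a single-quoted TOML literal string takes the pattern as written (`'\d{11}'`). Regex rules support tenant-isolation scenarios that prefix and suffix rules cannot express, such as matching embedded phone numbers or shard identifiers.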
43+
44+
## Database Filtering
45+
46+
Limit synchronization to specific logical databases or skip known noisy ones:
47+
2448
```toml
2549
[filter]
2650
allow_db = [0, 1, 2]
2751
block_db = [3, 4, 5]
2852
```
29-
If these options are not set, all databases are allowed by default.
3053

31-
## Filtering Commands
32-
RedisShake allows you to filter specific Redis commands, for example:
54+
If neither `allow_db` nor `block_db` is set, all databases are synchronized.
55+
56+
## Command and Command-Group Filtering
57+
58+
Restrict the traffic by command name or by the Redis command group. This is useful when the destination lacks support for scripting or cluster administration commands.
59+
3360
```toml
3461
[filter]
3562
allow_command = ["GET", "SET"]
3663
block_command = ["DEL", "FLUSHDB"]
37-
```
38-
39-
## Filtering Command Groups
4064

41-
You can also filter by command groups. Available command groups include:
42-
SERVER, STRING, CLUSTER, CONNECTION, BITMAP, LIST, SORTED_SET, GENERIC, TRANSACTIONS, SCRIPTING, TAIRHASH, TAIRSTRING, TAIRZSET, GEO, HASH, HYPERLOGLOG, PUBSUB, SET, SENTINEL, STREAM
43-
For example:
44-
```toml
45-
[filter]
4665
allow_command_group = ["STRING", "HASH"]
4766
block_command_group = ["SCRIPTING", "PUBSUB"]
4867
```
68+
69+
Command groups follow the [Redis command key specifications](https://redis.io/docs/reference/key-specs/). Use groups to exclude whole families of commands at once (for example, block `SCRIPTING` to avoid unsupported Lua scripts when synchronizing to a cluster).
70+
71+
## Configuration Reference
72+
73+
| Option | Type | Description |
74+
| --- | --- | --- |
75+
| `allow_keys` / `block_keys` | `[]string` | Exact key names to allow or block. |
76+
| `allow_key_prefix` / `block_key_prefix` | `[]string` | Filter keys by prefix. |
77+
| `allow_key_suffix` / `block_key_suffix` | `[]string` | Filter keys by suffix. |
78+
| `allow_key_regex` / `block_key_regex` | `[]string` | Regular expressions evaluated against the full key. |
79+
| `allow_db` / `block_db` | `[]int` | Logical database numbers to include or exclude. |
80+
| `allow_command` / `block_command` | `[]string` | Redis command names. |
81+
| `allow_command_group` / `block_command_group` | `[]string` | Redis command groups such as `STRING`, `HASH`, `SCRIPTING`. |
82+
83+
All options are optional. When both an allow and block rule apply to the same category, block rules win. Keep configurations symmetrical across active/standby clusters to avoid asymmetric data drops during failover.

docs/src/en/filter/function.md

Lines changed: 69 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -4,15 +4,28 @@ outline: deep
44

55
# What is function
66

7-
RedisShake provides a function feature that implements the `transform` capability in [ETL (Extract-Transform-Load)](https://en.wikipedia.org/wiki/Extract,_transform,_load). By utilizing functions, you can achieve similar functionalities:
8-
* Change the `db` to which data belongs, for example, writing data from source `db 0` to destination `db 1`.
9-
* Filter data, for instance, only writing source data with keys starting with `user:` to the destination.
10-
* Modify key prefixes, such as writing a source key `prefix_old_key` to a destination key `prefix_new_key`.
11-
* ...
7+
The **function** option extends the `[filter]` section with a Lua hook. Built-in filter rules run first to decide whether a command should leave RedisShake; only the surviving commands enter the Lua function, where you can reshape, split, or enrich them before they reach the destination. This hook is intended for lightweight adjustments that are difficult to express with static allow/block lists.
128

13-
To use the function feature, you only need to write a Lua script. After RedisShake retrieves data from the source, it converts the data into Redis commands. Then, it processes these commands, parsing information such as `KEYS`, `ARGV`, `SLOTS`, `GROUP`, and passes this information to the Lua script. The Lua script processes this data and returns the processed commands. Finally, RedisShake writes the processed data to the destination.
9+
With the function feature you can:
10+
11+
* Change the database (`db`) to which data belongs (for example, write source `db 0` into destination `db 1`).
12+
* Filter or drop specific data, keeping only keys that match custom business rules.
13+
* Rewrite commands, such as expanding `MSET` into multiple `SET` commands or adding new key prefixes.
14+
* Emit additional commands (for metrics or cache warming) derived from the incoming data stream.
15+
16+
## Execution Flow
17+
18+
1. RedisShake retrieves commands from the reader and parses metadata such as command name, keys, key slots, and group.
19+
2. Built-in filter rules evaluate the command. Anything blocked here never reaches Lua or the writer.
20+
3. For the remaining entries, RedisShake creates a Lua state and exposes read-only context variables (`DB`, `CMD`, `KEYS`, and so on) plus helper functions under the `shake` table.
21+
4. Your Lua code decides which commands to send downstream by calling `shake.call` zero or more times.
22+
23+
If your script does not invoke `shake.call`, the original command is suppressed. This makes it easy to implement drop-and-replace logic, but also means forgetting a `shake.call` will silently discard data. Always add logging while testing.
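
One way to catch a forgotten `shake.call` before deployment is to run the script logic against a stand-in for the hook environment. The sketch below assumes a plain Lua interpreter outside RedisShake; the `fake_shake` table and the `handle` function are hypothetical test scaffolding, and only the `call` and `log` names mirror the documented API.

```lua
-- Offline harness: records what the script logic would emit.
-- fake_shake and handle are hypothetical names used only for this sketch.
local emitted = {}
local logs = {}

local fake_shake = {
  call = function(db, argv)
    emitted[#emitted + 1] = { db = db, argv = argv }
  end,
  log = function(msg)
    logs[#logs + 1] = msg
  end,
}

-- Script logic under test: drop db 0, forward everything else.
local function handle(shake, db, argv)
  if db == 0 then
    shake.log("dropped command from db 0: " .. argv[1])
    return
  end
  shake.call(db, argv)
end

handle(fake_shake, 0, { "SET", "k", "v" })
handle(fake_shake, 1, { "SET", "k", "v" })

print(#emitted) -- 1: only the db 1 command was forwarded
print(#logs)    -- 1: the dropped command was logged
```

Passing the `shake` table as a parameter keeps the logic testable; in the real hook the same body would use the `shake`, `DB`, and `ARGV` values that RedisShake provides.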
24+
25+
## Quick Start
26+
27+
Place the Lua script inline in the `[filter]` section of the configuration file:
1428

15-
Here's a specific example:
1629
```toml
1730
[filter]
1831
function = """
@@ -30,47 +43,52 @@ address = "127.0.0.1:6379"
3043
[redis_writer]
3144
address = "127.0.0.1:6380"
3245
```
33-
`DB` is information provided by RedisShake, indicating the db to which the current data belongs. `shake.log` is used for logging, and `shake.call` is used to call Redis commands. The purpose of the above script is to discard data from source `db 0` and write data from other `db`s to the destination.
3446

35-
In addition to `DB`, there is other information such as `KEYS`, `ARGV`, `SLOTS`, `GROUP`, and available functions include `shake.log` and `shake.call`. For details, please refer to [function API](#function-api).
47+
`DB` is a variable provided by RedisShake that holds the database of the current command. `shake.log` writes a log message, and `shake.call` emits a Redis command to the destination. The script above discards data from source `db 0` and forwards data from the other databases.
3648

3749
## function API
3850

3951
### Variables
4052

41-
Because some commands contain multiple keys, such as the `mset` command, the variables `KEYS`, `KEY_INDEXES`, and `SLOTS` are all array types. If you are certain that a command has only one key, you can directly use `KEYS[1]`, `KEY_INDEXES[1]`, `SLOTS[1]`.
53+
Because some commands contain multiple keys, such as `MSET`, the variables `KEYS`, `KEY_INDEXES`, and `SLOTS` are all array types. If you are certain that a command has only one key, you can directly use `KEYS[1]`, `KEY_INDEXES[1]`, and `SLOTS[1]`.
4254

4355
| Variable | Type | Example | Description |
44-
|-|-|-|-----|
45-
| DB | number | 1 | The `db` to which the command belongs |
46-
| GROUP | string | "LIST" | The `group` to which the command belongs, conforming to [Command key specifications](https://redis.io/docs/reference/key-specs/). You can check the `group` field for each command in [commands](https://github.com/tair-opensource/RedisShake/tree/v4/scripts/commands) |
47-
| CMD | string | "XGROUP-DELCONSUMER" | The name of the command |
48-
| KEYS | table | {"key1", "key2"} | All keys of the command |
49-
| KEY_INDEXES | table | {2, 4} | The indexes of all keys in `ARGV` |
50-
| SLOTS | table | {9189, 4998} | The [slots](https://redis.io/docs/reference/cluster-spec/#key-distribution-model) to which all keys of the current command belong |
51-
| ARGV | table | {"mset", "key1", "value1", "key2", "value2"} | All parameters of the command |
56+
| --- | --- | --- | --- |
57+
| `DB` | number | `1` | The database to which the command belongs. |
58+
| `CMD` | string | `"XGROUP-DELCONSUMER"` | The name of the command. |
59+
| `GROUP` | string | `"LIST"` | The command group, conforming to [Command key specifications](https://redis.io/docs/reference/key-specs/). You can check the `group` field for each command in [commands](https://github.com/tair-opensource/RedisShake/tree/v4/scripts/commands). |
60+
| `KEYS` | table | `{"key1", "key2"}` | All keys of the command. |
61+
| `KEY_INDEXES` | table | `{2, 4}` | Indexes of all keys inside `ARGV`. |
62+
| `SLOTS` | table | `{9189, 4998}` | Hash slots of the keys (cluster mode). |
63+
| `ARGV` | table | `{"mset", "key1", "value1", "key2", "value2"}` | All command arguments, including the command name at index `1`. |
5264

5365
### Functions
54-
* `shake.call(DB, ARGV)`: Returns a Redis command that RedisShake will write to the destination.
55-
* `shake.log(msg)`: Prints logs.
66+
67+
* `shake.call(db, argv_table)`: Emits a command to the writer. The first element of `argv_table` must be the command name. You can call `shake.call` multiple times to split one input into several outputs (for example, expand `MSET` into multiple `SET`).
68+
* `shake.log(msg)`: Writes `msg` to the RedisShake log file (`shake.log`), prefixed with `lua log:`. Use this to verify script behaviour during testing.
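
As an illustration of calling `shake.call` more than once per input, the following sketch forwards each command unchanged and also writes a copy into a second database. The helper `plan_outputs` and the mirror database number `15` are assumptions made for this example, not part of RedisShake.

```lua
-- Builds the list of commands to emit for one input command.
-- plan_outputs and the mirror database are hypothetical; only shake.call
-- is part of the documented API.
local function plan_outputs(db, argv, mirror_db)
  local outputs = { { db = db, argv = argv } }
  if db ~= mirror_db then
    outputs[#outputs + 1] = { db = mirror_db, argv = argv }
  end
  return outputs
end

-- Inside RedisShake the hook provides shake, DB, and ARGV. Outside it,
-- shake is nil and this block is skipped.
if shake then
  for _, out in ipairs(plan_outputs(DB, ARGV, 15)) do
    shake.call(out.db, out.argv)
  end
end
```

Each emitted table keeps the command name as its first element, as `shake.call` requires.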
5669

5770
## Best Practices
5871

72+
### General Recommendations
73+
74+
* **Keep scripts idempotent.** RedisShake may retry commands, so make sure every emitted command produces the same result when it is applied more than once.
75+
* **Guard against missing keys.** Always check whether `KEYS[1]` exists before slicing to avoid runtime errors with keyless commands such as `PING`.
76+
* **Prefer simple logic.** Complex loops increase Lua VM time and can slow down synchronization. Offload heavy transformations to upstream processes when possible.
5977

6078
### Filtering Keys
6179

6280
```lua
6381
local prefix = "user:"
6482
local prefix_len = #prefix
6583

66-
if string.sub(KEYS[1], 1, prefix_len) ~= prefix then
84+
if not KEYS[1] or string.sub(KEYS[1], 1, prefix_len) ~= prefix then
6785
return
6886
end
6987

7088
shake.call(DB, ARGV)
7189
```
7290

73-
The effect is to only write source data with keys starting with `user:` to the destination. This doesn't consider cases of multi-key commands like `mset`.
91+
This script writes only commands whose first key starts with `user:` to the destination. It inspects `KEYS[1]` alone, so a multi-key command such as `MSET` is forwarded or dropped based on its first key.
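
A variant that handles multi-key commands could require every key to match before forwarding. This is a minimal sketch, assuming that dropping the whole command is the desired outcome when any key falls outside the prefix; `all_keys_have_prefix` is a hypothetical helper name.

```lua
-- Returns true only when there is at least one key and every key starts
-- with the prefix. all_keys_have_prefix is an illustrative helper name.
local function all_keys_have_prefix(keys, prefix)
  if keys == nil or #keys == 0 then
    return false -- keyless commands such as PING are dropped
  end
  for _, key in ipairs(keys) do
    if string.sub(key, 1, #prefix) ~= prefix then
      return false
    end
  end
  return true
end

-- Inside RedisShake the hook provides shake, KEYS, DB, and ARGV. Outside
-- it, shake is nil and nothing is emitted.
if shake and all_keys_have_prefix(KEYS, "user:") then
  shake.call(DB, ARGV)
end
```

Like the single-key version, this drops keyless commands; return `true` for an empty key list instead if those commands should be forwarded.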
7492

7593
### Filtering DB
7694

@@ -85,12 +103,12 @@ shake.call(DB, ARGV)
85103

86104
The effect is to discard data from source `db 0` and write data from other `db`s to the destination.
87105

88-
89106
### Filtering Certain Data Structures
90107

91-
You can use the `GROUP` variable to determine the data structure type. Supported data structure types include: `STRING`, `LIST`, `SET`, `ZSET`, `HASH`, `SCRIPTING`, etc.
108+
You can use the `GROUP` variable to determine the data structure type. Supported groups include `STRING`, `LIST`, `SET`, `SORTED_SET`, `HASH`, `SCRIPTING`, and more.
92109

93110
#### Filtering Hash Type Data
111+
94112
```lua
95113
if GROUP == "HASH" then
96114
return
@@ -100,7 +118,7 @@ shake.call(DB, ARGV)
100118

101119
The effect is to discard `hash` type data from the source and write other data to the destination.
102120

103-
#### Filtering [LUA Scripts](https://redis.io/docs/interact/programmability/eval-intro/)
121+
#### Filtering [Lua Scripts](https://redis.io/docs/interact/programmability/eval-intro/)
104122

105123
```lua
106124
if GROUP == "SCRIPTING" then
@@ -109,7 +127,22 @@ end
109127
shake.call(DB, ARGV)
110128
```
111129

112-
The effect is to discard `lua` scripts from the source and write other data to the destination. This is common when synchronizing from master-slave to cluster, where there are LUA scripts not supported by the cluster.
130+
The effect is to discard Lua scripts from the source and write other data to the destination. This is common when synchronizing from master-slave to cluster, where there are Lua scripts not supported by the cluster.
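
When several groups need to be dropped, a lookup table avoids a chain of comparisons. The sketch below is one possible arrangement; `blocked_groups` and `group_allowed` are hypothetical names, and the same result can be achieved statically with `block_command_group` in the `[filter]` section.

```lua
-- Groups to discard, stored as a set for constant-time lookup.
-- blocked_groups and group_allowed are illustrative names.
local blocked_groups = { HASH = true, SCRIPTING = true }

local function group_allowed(group)
  return not blocked_groups[group]
end

-- Inside RedisShake the hook provides shake, GROUP, DB, and ARGV. Outside
-- it, shake is nil and nothing is emitted.
if shake and group_allowed(GROUP) then
  shake.call(DB, ARGV)
end
```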
131+
132+
### Splitting Commands
133+
134+
```lua
135+
if CMD == "MSET" then
136+
for i = 2, #ARGV, 2 do
137+
shake.call(DB, {"SET", ARGV[i], ARGV[i + 1]})
138+
end
139+
return
140+
end
141+
142+
shake.call(DB, ARGV)
143+
```
144+
145+
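This pattern expands one `MSET` into several `SET` commands, which helps with destinations that handle single-key writes better (for example, a cluster in which the keys of one `MSET` hash to different slots). The resulting `SET` commands are no longer applied as one atomic operation.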
113146

114147
### Modifying Key Prefixes
115148

@@ -119,20 +152,21 @@ local prefix_new = "prefix_new_"
119152

120153
shake.log("old=" .. table.concat(ARGV, " "))
121154

122-
for i, index in ipairs(KEY_INDEXES) do
155+
for _, index in ipairs(KEY_INDEXES) do
123156
local key = ARGV[index]
124-
if string.sub(key, 1, #prefix_old) == prefix_old then
157+
if key and string.sub(key, 1, #prefix_old) == prefix_old then
125158
ARGV[index] = prefix_new .. string.sub(key, #prefix_old + 1)
126159
end
127160
end
128161

129162
shake.log("new=" .. table.concat(ARGV, " "))
130163
shake.call(DB, ARGV)
131164
```
165+
132166
The effect is to write the source key `prefix_old_key` to the destination key `prefix_new_key`.
133167

134168
### Swapping DBs
135-
169+
136170
```lua
137171
local db1 = 1
138172
local db2 = 2
@@ -146,3 +180,9 @@ shake.call(DB, ARGV)
146180
```
147181

148182
The effect is to write source `db 1` to destination `db 2`, write source `db 2` to destination `db 1`, and leave other `db`s unchanged.
183+
184+
## Troubleshooting
185+
186+
* **Script fails to compile:** RedisShake validates the Lua code during startup and panics on syntax errors. Check the startup log for the reported line number.
187+
* **No data reaches the destination:** Ensure that `shake.call` is invoked for every branch. Adding `shake.log` statements helps confirm which code path runs.
188+
* **Performance drops:** Heavy scripts may become CPU-bound. Consider narrowing the scope with filters or moving expensive operations out of RedisShake.

docs/src/en/guide/config.md

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -39,7 +39,12 @@ RedisShake provides different Writers to interface with different targets, see t
3939

4040
## filter Configuration
4141

42-
You can set filter rules through the configuration file. Refer to [Filter and Processing](../filter/filter.md) and [function](../filter/function.md).
42+
The `[filter]` section contains two layers:
43+
44+
* **Rule engine:** Configure `allow_*` and `block_*` lists to keep or drop keys, databases, commands, and command groups. See [Filter and Processing](../filter/filter.md) for detailed semantics and examples.
45+
* **Lua function hook:** Provide inline Lua code via the `function` option to rewrite commands after they pass the rule engine. See [function](../filter/function.md) for API details and best practices.
46+
47+
Filters always run before the Lua hook. Commands blocked by the rule engine never enter the script or reach the writer, so you can reserve the Lua layer for the smaller, approved subset of traffic.
4348

4449
## advanced Configuration
4550
