All configuration for URI Drain is done in the uri_drain.ini file. Here is a specific demo.
Snapshot serializes and stores the analysis results held by the current system. Currently, snapshots can only be saved to the file system.
| Name | Type(Unit) | Environment Key | Default | Description |
|---|---|---|---|---|
| file_dir | string | SNAPSHOT_FILE_PATH | /tmp/ | The directory in which to save the snapshot. Persistence is disabled when the value is empty. |
| snapshot_interval_minutes | int(minute) | SNAPSHOT_INTERVAL_MINUTES | 10 | The interval at which the snapshot is saved. |
| compress_state | bool | SNAPSHOT_COMPRESS_STATE | True | Whether to compress the snapshot with zlib and encode it as base64. |
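
As an illustration of what compress_state implies, the following is a minimal sketch of a zlib plus base64 round trip. The JSON serialization, the function names `encode_state` and `decode_state`, and the sample state are assumptions made for this example; the source does not describe the format URI Drain actually uses internally.

```python
import base64
import json
import zlib


def encode_state(state: dict) -> str:
    """Serialize to JSON (an assumption), compress with zlib, then base64-encode."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decode_state(blob: str) -> dict:
    """Reverse of encode_state: base64-decode, decompress, parse JSON."""
    raw = zlib.decompress(base64.b64decode(blob))
    return json.loads(raw.decode("utf-8"))


# Hypothetical state: repetitive content compresses well.
state = {"service": "demo", "patterns": ["/api/users/{var}"] * 50}
blob = encode_state(state)
restored = decode_state(blob)
```

The base64 step makes the compressed bytes safe to store as text, at the cost of roughly a third more space than the raw compressed bytes.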
When aggregation methods are detected, Masking determines how to generate the aggregation information.
Currently, all similar content is replaced with {var} by default.
| Name | Type(Unit) | Environment Key | Default | Description |
|---|---|---|---|---|
| mask_prefix | string | MASKING_PREFIX | { | The prefix placed before a masked parameter. |
| mask_suffix | string | MASKING_SUFFIX | } | The suffix placed after a masked parameter. |
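
The effect of mask_prefix and mask_suffix can be sketched as follows. The helper names `mask_token` and `mask_segment` and the choice of which segment to mask are hypothetical; in URI Drain the variable positions are decided by the algorithm, not passed in by hand.

```python
def mask_token(name: str = "var", prefix: str = "{", suffix: str = "}") -> str:
    """Build the placeholder that replaces a variable URI segment."""
    return f"{prefix}{name}{suffix}"


def mask_segment(uri: str, index: int, prefix: str = "{", suffix: str = "}") -> str:
    """Replace the segment at `index` (after splitting on '/') with the mask."""
    parts = uri.split("/")
    parts[index] = mask_token("var", prefix, suffix)
    return "/".join(parts)


# Index 0 is the empty string before the leading slash, so "123" is index 3.
default_masked = mask_segment("/api/users/123", 3)
custom_masked = mask_segment("/api/users/123", 3, prefix="<", suffix=">")
```

With the defaults this yields `/api/users/{var}`; changing the prefix and suffix to `<` and `>` yields `/api/users/<var>`.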
Drain is the core algorithm of URI Drain.
| Name | Type(Unit) | Environment Key | Default | Description |
|---|---|---|---|---|
| sim_th | float | DRAIN_SIM_TH | 0.4 | The similarity threshold to decide if a new sequence should be merged into an existing cluster. |
| depth | int | DRAIN_DEPTH | 4 | Max depth levels of pattern. Minimum is 2. |
| max_children | int | DRAIN_MAX_CHILDREN | 100 | Max number of children of an internal node. |
| max_clusters | int | DRAIN_MAX_CLUSTERS | 1024 | Max number of tracked clusters. When this number is reached, the model starts replacing old clusters with new ones according to the LRU policy. |
| extra_delimiters | string | DRAIN_EXTRA_DELIMITERS | ["/"] | The extra delimiters to split the sequence. |
| analysis_min_url_count | int | DRAIN_ANALYSIS_MIN_URL_COUNT | 20 | The minimum number of unique URLs (per service) required to trigger the analysis. |
| combine_min_url_count | int | DRAIN_COMBINE_MIN_URL_COUNT | 3 | The minimum number of unique URLs (per candidate of each service) required to mask a segment as a variable URL, in case some similar URLs are not RESTful, such as /test/one and /test/two. |
| customized_words_file | string | DRAIN_CUSTOMIZED_WORDS_FILE | | The file path of customized words for analysis. Each line is a customized word. |
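
To make sim_th and extra_delimiters more concrete, here is a deliberately simplified sketch: it splits URIs on the delimiter and scores two token sequences by the fraction of positions that match. This is an assumption-laden illustration, not the Drain implementation, which also uses a prefix tree and treats wildcard tokens specially. The names `split_uri` and `similarity` are hypothetical.

```python
from typing import List, Sequence


def split_uri(uri: str, extra_delimiters: Sequence[str] = ("/",)) -> List[str]:
    """Split a URI on every configured delimiter and drop empty tokens."""
    tokens = [uri]
    for delimiter in extra_delimiters:
        tokens = [part for token in tokens for part in token.split(delimiter)]
    return [token for token in tokens if token]


def similarity(template: Sequence[str], tokens: Sequence[str]) -> float:
    """Fraction of positions with identical tokens; 0.0 if lengths differ."""
    if len(template) != len(tokens):
        return 0.0
    if not tokens:
        return 1.0
    same = sum(1 for a, b in zip(template, tokens) if a == b)
    return same / len(tokens)


SIM_TH = 0.4  # default from the table above

a = split_uri("/api/users/123")
b = split_uri("/api/users/456")
c = split_uri("/health/live/now")

score_ab = similarity(a, b)          # two of three tokens match
should_merge_ab = score_ab >= SIM_TH
should_merge_ac = similarity(a, c) >= SIM_TH
```

In this sketch, raising the threshold makes merging stricter, so fewer URIs end up in the same cluster.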
Profiling controls whether the algorithm is profiled and how often the results are reported.
| Name | Type(Unit) | Environment Key | Default | Description |
|---|---|---|---|---|
| enabled | bool | PROFILING_ENABLED | False | Whether to enable the profiling. |
| report_sec | int(second) | PROFILING_REPORT_SEC | 30 | The interval to report the profiling information. |
Logging configuration controls the verbosity of application logs.
| Name | Type | Environment Key | Default | Description |
|---|---|---|---|---|
| log_level | string | LOG_LEVEL | INFO | The logging level for the application. Valid values: DEBUG, INFO, WARNING, ERROR, CRITICAL. Use ERROR in production to reduce log volume. |
Note: In production environments, setting LOG_LEVEL=ERROR is recommended to prevent excessive log accumulation, which can lead to significant disk space consumption in Docker containers over time.
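
A minimal sketch of how a LOG_LEVEL value could be mapped onto Python's standard logging module is shown below. The function name `resolve_log_level` and the fallback to INFO for unrecognized values are assumptions for this example; the source does not say how URI Drain handles invalid values.

```python
import logging
import os
from typing import Mapping, Optional

VALID_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


def resolve_log_level(env: Optional[Mapping[str, str]] = None,
                      default: str = "INFO") -> int:
    """Map LOG_LEVEL to a logging constant, falling back to `default`."""
    env = os.environ if env is None else env
    name = str(env.get("LOG_LEVEL", default)).strip().upper()
    if name not in VALID_LEVELS:
        name = default  # assumption: unknown values fall back to the default
    return getattr(logging, name)


# Configure the root logger from the process environment.
logging.basicConfig(level=resolve_log_level())
```

With LOG_LEVEL=ERROR set in the container environment, only ERROR and CRITICAL records are emitted.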
In the configuration file, most values are written in the format ${xxx:config_value}.
This means that when the program starts, it first looks up xxx in the system environment variables at runtime.
If xxx is not found, config_value is used as the value.
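
The lookup rule above can be sketched in a few lines. The function name `resolve` and the regular expression are assumptions made for illustration and are not taken from the URI Drain source.

```python
import re
from typing import Mapping

# Matches ${KEY:default}; the default may be empty or contain braces.
_PLACEHOLDER = re.compile(
    r"^\$\{(?P<key>[A-Za-z_][A-Za-z0-9_]*):(?P<default>.*)\}$"
)


def resolve(raw: str, env: Mapping[str, str]) -> str:
    """Return the environment value for KEY if set, otherwise the default."""
    match = _PLACEHOLDER.match(raw.strip())
    if match is None:
        return raw  # not a placeholder, use the literal value
    return env.get(match.group("key"), match.group("default"))


from_default = resolve("${DRAIN_SIM_TH:0.4}", {})
from_env = resolve("${DRAIN_SIM_TH:0.4}", {"DRAIN_SIM_TH": "0.6"})
brace_default = resolve("${MASKING_PREFIX:{}", {})
empty_default = resolve("${DRAIN_CUSTOMIZED_WORDS_FILE:}", {})
```

In real use, `os.environ` would be passed as `env`. The resolved value is always a string, so numeric and boolean options still need to be converted afterwards.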