dead_letter_queue.flush_check_interval new config for flushing staled segment files.#19036
Conversation
…new config for flushing staled segment files.
🤖 GitHub commentsJust comment with:
|
|
This pull request does not have a backport label. Could you fix it @mashhurs? 🙏
|
🔍 Preview links for changed docs |
✅ Vale Linting ResultsNo issues found on modified lines! The Vale linter checks documentation changes against the Elastic Docs style guide. To use Vale locally or report issues, refer to Elastic style guide for Vale. |
…in the threads API results. Add suggestions from the docs review. Re-organize the duration clam logic in a way for better maintainable and fix the unit tests.
andsel
left a comment
There was a problem hiding this comment.
Left some comment on how to make the changes more readable and testable.
| Setting::BooleanSetting.new("dead_letter_queue.enable", false), | ||
| Setting::BytesSetting.new("dead_letter_queue.max_bytes", "1024mb"), | ||
| Setting::NumericSetting.new("dead_letter_queue.flush_interval", 5000), | ||
| Setting::NumericSetting.new("dead_letter_queue.flush_check_interval", 1000), |
There was a problem hiding this comment.
There is a specific reason to do not use TimeValueSetting which should be preferred for time intervals?
There was a problem hiding this comment.
Don't we need to keep the consistency with flush_interval ? I initially tried and in the docs it looked strange that flush_interval guides in milliseconds like 5000 and flush_check_interval in string format like 1s.
There was a problem hiding this comment.
I understand your concern, but the setting is an interval and there is a specific setting type for this case, would be nice to use that because is less error prone. If we use a numeric value that are milliseconds, we have to specify in the doc the unit of time, while with TimeValueSetting is explicit. However, like you said, it created inconsistency with the existing, but this is a new setting.
Prefer consistency or use the appropriate setting? I don't have a strong motivation for one or the other.
There was a problem hiding this comment.
Prefer consistency or use the appropriate setting?
I am also okay with either one. I was thinking from the user experience point of view that they are belong to the same domain, same type but in a different format.
There was a problem hiding this comment.
@jsvd WDYT? about this thread? Prefer consistency or use the setting type that's more aligned with the use (it's an interval => TimeValue) ?
There was a problem hiding this comment.
++ on using the more aligned one. if we want we can change the type for 10.x
…ueueWriter.java Remove unused method. Co-authored-by: Andrea Selva <selva.andre@gmail.com>
…emove confusing scheduler from the docs explanations. unit tests for the only newly introduced conditions.
andsel
left a comment
There was a problem hiding this comment.
Left some suggestion to improve test code and a little rewording of the doc to avoid using "scheduler" without introducing the concept first.
Doc consistency and test rename suggestions accepted. Co-authored-by: Andrea Selva <selva.andre@gmail.com>
💛 Build succeeded, but was flaky
Failed CI StepsHistory
cc @mashhurs |
|
@Mergifyio backport 9.4 |
✅ Backports have been createdDetails
|
…ed segment files. (#19036) (#19090) * Validates to be min 1s to keep consistency with the docs. Introduces new config for flushing staled segment files. * Add pipeline name to the DLQ flush thread name for better visibility in the threads API results. Add suggestions from the docs review. Re-organize the duration clam logic in a way for better maintainable and fix the unit tests. * Update logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueWriter.java Remove unused method. * Move the flush chech interval to the DeadLetterQueueWriter.Builder. Remove confusing scheduler from the docs explanations. unit tests for the only newly introduced conditions. * Apply suggestions from code review Doc consistency and test rename suggestions accepted. * Keep the interval type as a Duration, rename and simplify test suites. --------- (cherry picked from commit f2f0d3f) Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> Co-authored-by: Andrea Selva <selva.andre@gmail.com>
…config for flushing staled segment files. (#19089) * `dead_letter_queue.flush_check_interval` new config for flushing staled segment files. (#19036) * Validates to be min 1s to keep consistency with the docs. Introduces new config for flushing staled segment files. * Add pipeline name to the DLQ flush thread name for better visibility in the threads API results. Add suggestions from the docs review. Re-organize the duration clam logic in a way for better maintainable and fix the unit tests. * Update logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueWriter.java Remove unused method. Co-authored-by: Andrea Selva <selva.andre@gmail.com> * Move the flush chech interval to the DeadLetterQueueWriter.Builder. Remove confusing scheduler from the docs explanations. unit tests for the only newly introduced conditions. * Apply suggestions from code review Doc consistency and test rename suggestions accepted. Co-authored-by: Andrea Selva <selva.andre@gmail.com> * Keep the interval type as a Duration, rename and simplify test suites. --------- Co-authored-by: Andrea Selva <selva.andre@gmail.com> (cherry picked from commit f2f0d3f) # Conflicts: # logstash-core/src/main/java/org/logstash/execution/AbstractPipelineExt.java * Resolve the conflict. --------- Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> Co-authored-by: Mashhur <mashhur.sattorov@elastic.co>
Release notes
Introduces new
dead_letter_queue.flush_check_intervalconfig for flushing the staled segment files scheduler which can reduce frequent check overhead.What does this PR do?
dead_letter_queue.flush_check_intervalparam for the segment file stale check scheduler. See the problem description - Introduce a period for file flushing staled segment files #19037dead_letter_queue.flush_intervaldead_letter_queue.flush_intervalfor min 1s to keep consistency with the docs - https://www.elastic.co/docs/reference/logstash/dead-letter-queues: "Note that this value cannot be set to lower than 1000ms."High level results
5-pipelines runs with 1-worker (to get comparable output), their configurations:
flush_interval: 5000andflush_check_interval: 1000flush_interval: 5000andflush_check_interval: 2000flush_interval: 5000andflush_check_interval: 5000flush_interval: 10000andflush_check_interval: 5000flush_interval: 10000andflush_check_interval: 7000Following is the result table which shows the efficiency:
Why is it important/What is the impact to the user?
The users who are using intensive DLQ operations (write/read), the frequent flush check scheduler might give overhead to the pipeline, means uses much CPU. Introducing configurable scheduler cadence improves the pipeline efficiency by removing frequent operations.
Checklist
Author's Checklist
How to test this PR locally
flush_intervalandflush_check_intervalparams, below is the example.ES configured in
config/tests/elasticsearch-output.confneeds to reject events either 400 or 404 to be routed to the DLQ. Used the following config:Related issues
Use cases
Screenshots
Logs