feat(sinks): add dlq logic on sink and support for elasticsearch #24930
tanganellilore wants to merge 1 commit into vectordotdev:master
All contributors have signed the CLA ✍️ ✅
I have read the CLA Document and I hereby sign the CLA
For the reviewer: I am not a Rust expert, but I am familiar with the logic behind Elasticsearch and DLQs. I have maintained the original behavior and logic for sinks when the DLQ is disabled.

However, I have some concerns regarding the error logs: when an error occurs in the Elastic sink without a DLQ, a log line is emitted multiple times, once when retries are exhausted and again when events are discarded. I am concerned this might be highly verbose for the user.

Generally speaking, I believe the metrics implementation is slightly misleading for Elasticsearch. Currently, if a bulk request fails without a DLQ, vector top behaves as follows:

The metrics with DLQ follow this same logic. For example, if mapping errors occur while the DLQ is enabled, events are correctly routed to the DLQ; the Elastic sink does not show an error because the failure is handled by the DLQ forwarder. I haven't changed this metrics logic in this PR as it would require a significant code refactoring, but I wanted to highlight it: if you test this, you might notice discrepancies that could look like bugs, but they are actually inherited from the current implementation.
Summary
This PR introduces a Dead Letter Queue (DLQ) output port for sinks, allowing failed/rejected events to be routed to other components instead of being silently dropped.
The implementation includes:
- SinkDlq utility (src/sinks/util/dlq.rs) that can be adopted by any sink
- <sink_id>.dlq output port

The request_retry_partial flag controls behavior: retryable failures (429, 5xx) are retried normally; only truly non-retryable failures are sent to the DLQ.
Vector configuration
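For readers skimming the description, the retry-versus-DLQ rule stated in the summary (429 and 5xx are retried, other failures go to the DLQ) could be sketched roughly as below. This is only an illustration and not the code in this PR: the `FailureRoute` enum, the `route_failure` function, and the `dlq_enabled` parameter are hypothetical names, and the sketch assumes the decision is made from an HTTP-style status code per failed item.

```rust
/// Hypothetical outcome for one failed bulk item (not a type from this PR).
#[derive(Debug, PartialEq, Eq)]
enum FailureRoute {
    /// Retryable failure: hand the item back to the normal retry logic.
    Retry,
    /// Non-retryable failure with the DLQ enabled: emit on `<sink_id>.dlq`.
    DeadLetter,
    /// Non-retryable failure with the DLQ disabled: discard, as before.
    Drop,
}

/// Decide where a failed item goes, given its status code.
/// `dlq_enabled` stands in for whatever switch the real sink uses (assumption).
fn route_failure(status: u16, dlq_enabled: bool) -> FailureRoute {
    // 429 (too many requests) and any 5xx are treated as retryable.
    let retryable = status == 429 || (500..=599).contains(&status);
    if retryable {
        FailureRoute::Retry
    } else if dlq_enabled {
        FailureRoute::DeadLetter
    } else {
        FailureRoute::Drop
    }
}
```

Under these assumptions, `route_failure(503, true)` yields `Retry`, while a 400-style rejection yields `DeadLetter` when the DLQ is enabled and `Drop` when it is not, which matches the "original behavior when the DLQ is disabled" described above.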
How did you test this PR?
Unit tests in src/sinks/elasticsearch/service.rs covering:
I have tested in two ways:
Change Type
Is this a breaking change?
Does this PR include user facing changes?
no-changelog label to this PR.
References
Notes
N/A