The OpenTelemetry Arrow project is currently in Phase 2, where we are building an end-to-end dataflow engine in Rust. This architecture is expected to have substantially lower overhead than traditional row-oriented pipelines.
We run two types of automated benchmark tests for Phase 2:
- Continuous benchmarks - run with each commit to main
- Nightly benchmarks - comprehensive test suites run each night
Both provide performance metrics for the OTAP dataflow engine for various scenarios. Unless otherwise specified, all tests run on a single CPU core.
URL: https://open-telemetry.github.io/otel-arrow/benchmarks/nightly/filter/
Tests a filtering scenario in which a filter processor drops 95% of logs. Processes approximately 100k logs/sec input with ~5k logs/sec output. The benchmark page includes a direct comparison with the equivalent OTel Collector performing the same filtering operation.
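The drop behavior in this scenario can be illustrated with a minimal sketch. The `LogRecord` type, the severity-threshold predicate, and the 95/5 split below are illustrative assumptions, not the actual filter processor configuration used by the benchmark:

```rust
// Hypothetical sketch of a filter stage: a predicate drops records below
// a severity threshold, keeping roughly 5% of the input.
struct LogRecord {
    severity: u8, // illustrative severity number, higher = more severe
}

/// Keep only records at or above the threshold; everything else is dropped.
fn filter_logs(batch: Vec<LogRecord>, min_severity: u8) -> Vec<LogRecord> {
    batch
        .into_iter()
        .filter(|r| r.severity >= min_severity)
        .collect()
}

fn main() {
    // 100 records with severities cycling 0..19; threshold 19 keeps 5 of 100,
    // mirroring the ~95% drop rate in the benchmark.
    let batch: Vec<LogRecord> = (0..100)
        .map(|i| LogRecord { severity: (i % 20) as u8 })
        .collect();
    let kept = filter_logs(batch, 19);
    println!("{}", kept.len()); // prints 5
}
```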
URL: https://open-telemetry.github.io/otel-arrow/benchmarks/nightly/backpressure/
Measures backpressure impact with wait_for_result set to true on the dataflow
engine receivers. Processes approximately 100k logs/sec input and output. The
pipeline includes an attribute processor configured to rename an attribute,
which forces the dataflow engine to materialize and convert its in-memory
representation rather than operating in pass-through mode.
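The attribute-rename step is what prevents pass-through here. A minimal sketch of what such a rename amounts to is shown below; the map-based record model and the `rename_attribute` helper are illustrative assumptions, not the engine's actual processor API:

```rust
use std::collections::HashMap;

// Hypothetical sketch of an attribute-rename processor: moving a value
// from one key to another requires the record to be materialized and
// rewritten, so the engine cannot simply forward the original bytes.
fn rename_attribute(attrs: &mut HashMap<String, String>, from: &str, to: &str) {
    if let Some(value) = attrs.remove(from) {
        attrs.insert(to.to_string(), value);
    }
}

fn main() {
    let mut attrs = HashMap::from([("host.name".to_string(), "web-1".to_string())]);
    rename_attribute(&mut attrs, "host.name", "host.id");
    println!("{:?}", attrs.get("host.id")); // prints Some("web-1")
}
```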
URL: https://open-telemetry.github.io/otel-arrow/benchmarks/nightly/syslog/
Tests syslog ingestion via UDP with two variations:
- Basic syslog message format
- CEF (Common Event Format) formatted messages
Processes approximately 5k logs/sec input and output.
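One fixed piece of both syslog variants is the `<PRI>` prefix, which encodes `facility * 8 + severity` per RFC 3164/5424. The sketch below shows that decoding step only; it is a simplified illustration, not the benchmark receiver's parser:

```rust
// Decode the syslog priority prefix: "<34>..." encodes
// facility * 8 + severity, so 34 splits into facility 4, severity 2.
fn parse_priority(msg: &str) -> Option<(u8, u8)> {
    let rest = msg.strip_prefix('<')?;
    let end = rest.find('>')?;
    let pri: u8 = rest[..end].parse().ok()?;
    Some((pri / 8, pri % 8)) // (facility, severity)
}

fn main() {
    // <34> = facility 4 (auth), severity 2 (critical).
    println!("{:?}", parse_priority("<34>Oct 11 22:14:15 host app: message"));
}
```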
URL: https://open-telemetry.github.io/otel-arrow/benchmarks/continuous/
Standard load test processing 100k records/sec input and output on a single CPU core. This test runs with each commit to main.
URL: https://open-telemetry.github.io/otel-arrow/benchmarks/nightly/standardload-batch-size/
Standard load test (100k logs/sec) with varying input batch sizes: 10, 100, 512, 1024, 4096, and 8192 records per request. Uses power-of-2 values that align with OTel SDK defaults (512 is the standard SDK batch size). Tests both OTAP->OTAP (native protocol) and OTLP->OTLP (standard protocol) configurations to evaluate the impact of batch size on CPU, memory, and network efficiency.
URL: https://open-telemetry.github.io/otel-arrow/benchmarks/nightly/saturation/
Tests performance at saturation across different CPU configurations: 1, 2, 4, 8, and 16 cores. Runs nightly to validate scaling characteristics.
TODO: Update test output to include scalability ratios in addition to raw throughput numbers.
URL: https://open-telemetry.github.io/otel-arrow/benchmarks/continuous-passthrough/
Tests maximum throughput in pass-through mode where the engine forwards data without transformation. This scenario represents the minimum engine overhead for load balancing and routing use cases. Unlike the saturation tests which include an attribute processor, pass-through mode allows the engine to forward data without materializing the internal representation, achieving significantly higher throughput.
URL: https://open-telemetry.github.io/otel-arrow/benchmarks/continuous-idle-state/
Measures resource consumption in idle state across multiple core configurations (1, 2, 4, 8, 16, 32 cores) to validate the linear memory scaling model:
Memory (MiB) = C + N * R

Where:
- C = Constant overhead (shared infrastructure)
- N = Number of cores
- R = Per-core memory overhead
This validates the share-nothing, thread-per-core architecture where each additional core adds a predictable amount of memory overhead.
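The model above is a straight line in N, as the sketch below shows. The values of C and R used here are illustrative placeholders, not measured results from the benchmark:

```rust
// The linear memory model the idle-state benchmark validates:
// Memory (MiB) = C + N * R.
fn predicted_memory_mib(constant_overhead: f64, cores: u32, per_core_overhead: f64) -> f64 {
    constant_overhead + cores as f64 * per_core_overhead
}

fn main() {
    // Hypothetical figures: 30 MiB shared infrastructure + 5 MiB per core.
    let (c, r) = (30.0, 5.0);
    for n in [1u32, 2, 4, 8, 16, 32] {
        println!("{n} cores -> {} MiB", predicted_memory_mib(c, n, r));
    }
}
```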
URL: https://open-telemetry.github.io/otel-arrow/benchmarks/binary-size/
Tracks the binary size of the dataflow engine for Linux ARM64 and AMD64 architectures over time.
All benchmark tests measure the following metrics:
- Logs/sec input - Input throughput
- Logs/sec output - Output throughput
- RAM - Average and maximum memory usage
- Normalized CPU - Average and maximum CPU usage, normalized to 0-100% where 100% represents full utilization of all available cores. For example, in a 4-core test, 80% means 3.2 cores are being used (0.8 × 4 cores)
- Network bytes/sec - Input and output network bandwidth
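The normalized-CPU figure is a simple rescaling of raw core usage, as in this sketch (the function name is illustrative, not part of the benchmark tooling):

```rust
// Normalized CPU: raw usage measured in "cores consumed" rescaled to
// 0-100% of the cores available to the test.
fn normalized_cpu_percent(cores_used: f64, cores_available: u32) -> f64 {
    100.0 * cores_used / cores_available as f64
}

fn main() {
    // The example from the text: 3.2 cores used in a 4-core test is 80%.
    println!("{}", normalized_cpu_percent(3.2, 4));
}
```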
For historical benchmark results from Phase 1 (the collector-to-collector traffic reduction implementation), please see Phase 1 Benchmark Results.
Phase 1 focused on facilitating traffic reduction between OpenTelemetry Collectors and is now complete. These components are available in the OpenTelemetry Collector-Contrib distribution.