The OTel-Arrow project is building a high-performance, end-to-end column-oriented telemetry pipeline for OpenTelemetry data based on Apache Arrow.
The OpenTelemetry Protocol with Apache Arrow (OTAP) is designed as the column-oriented equivalent of the OpenTelemetry Protocol (OTLP). It dramatically reduces network usage, and our protocol specification ensures that OTAP and OTLP are always convertible, without loss, in both directions.
The OTAP Dataflow Engine is our new Rust OpenTelemetry code base, a pipeline engine that dramatically reduces network, memory, and CPU usage, gaining efficiency through a number of optimizations, including the shared-nothing and thread-per-core design patterns and extensive use of zero-copy data types.
The OTAP Dataflow Engine is embeddable software. Our repo builds a demonstration df_engine artifact that includes the core nodes and supports YAML configuration, but the engine is primarily designed for safely embedding OpenTelemetry features and capabilities in other programs, anywhere Rust can be compiled, with fine-grained control over memory and CPU resources.
The OTAP Dataflow Engine has built-in OTAP and OTLP receivers and exporters, plus built-in processors for batching, fanout, failover, retry, and routing by signal type. It has processors for common forms of filtering, transformation, sampling, and temporal aggregation. A durable buffer processor, based on the Arrow IPC format, introduces disk-based storage into pipelines for reliable delivery, and there are other components such as a Syslog receiver and a Console exporter.
Our transform processor is built using Apache DataFusion, the industry-leading embedded query engine, itself based on Apache Arrow, and our Parquet exporter for OTAP makes OpenTelemetry data directly accessible to a wide range of tools through the Apache Parquet ecosystem.
We self-instrument with an experimental OpenTelemetry SDK that emits OTAP directly, giving us an end-to-end column-oriented telemetry pipeline in Rust.
Our Golang Collector components otelarrowreceiver and
otelarrowexporter have been included in the
OpenTelemetry Collector-Contrib distribution since the July 2024
release of v0.104.0.
Our project is growing, and new contributors are welcome. Join us in #otel-arrow on the CNCF Slack!
Apache Arrow is a major open-source project for in-memory and on-wire data exchange using a column-oriented representation. For OpenTelemetry readers, Apache Arrow is a lot like us: the project encompasses a data format, a set of libraries, and an ecosystem.
The Apache Arrow format is a specification for the in-memory layout of a record batch, including details about the schema, the column names and types, and a length, and then a set of Arrays, one per column of the correct type and matching length. Arrow record batches support a number of types, including scalars of various widths, strings and binary data, arrays, lists, structs, maps, as well as dictionary encodings over the other types.
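To make that layout concrete, here is a minimal Python sketch of the record-batch invariant described above: a schema naming each column's type, plus one equal-length array per column. This is a conceptual model only, not the Apache Arrow API, and all names in it are invented for illustration.

```python
# Conceptual sketch of a record batch: a schema plus one equal-length
# column array per field. This models the invariant described above;
# it is NOT the Apache Arrow API, and the names are hypothetical.
from dataclasses import dataclass


@dataclass
class RecordBatch:
    schema: dict   # column name -> type, e.g. {"ts": int, "severity": str}
    columns: dict  # column name -> list of values for that column

    def __post_init__(self):
        # Every column must be declared in the schema...
        assert self.columns.keys() == self.schema.keys(), "schema/column mismatch"
        # ...all columns must share a single length...
        lengths = {len(v) for v in self.columns.values()}
        assert len(lengths) <= 1, "all columns must share one length"
        # ...and every value must match its column's declared type.
        for name, typ in self.schema.items():
            assert all(isinstance(v, typ) for v in self.columns[name])

    def __len__(self):
        return len(next(iter(self.columns.values()), []))


batch = RecordBatch(
    schema={"ts": int, "severity": str},
    columns={"ts": [1, 2, 3], "severity": ["INFO", "WARN", "INFO"]},
)
assert len(batch) == 3
```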
With Apache Arrow, we can build a record batch in one language and pass it to another language using shared memory. Apache Arrow specifies Arrow IPC, an encoding for column-oriented data that extends zero-copy to network and file-based communications.
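The zero-copy idea can be illustrated with Python's stdlib memoryview. This is only an analogy (Arrow's buffer management is more elaborate), but it shows the core property: a view over a buffer shares the underlying bytes rather than copying them.

```python
# Analogy for zero-copy sharing using Python's stdlib memoryview.
# A slice of a memoryview references the original buffer rather than
# copying it, which is the same idea Arrow applies to column buffers.
buf = bytearray(b"columnar bytes")
view = memoryview(buf)[0:8]        # no bytes are copied here

buf[0:8] = b"COLUMNAR"             # mutate the underlying buffer in place
assert bytes(view) == b"COLUMNAR"  # the view sees the change: shared memory
```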
OTAP is formally the OpenTelemetry Protocol with Apache Arrow, abbreviated OTAP for OTel-Arrow Protocol: a column-oriented representation for OpenTelemetry data supporting efficient in-memory and on-wire telemetry exchange. Where OpenTelemetry's OTLP is a row-oriented protocol, OTel-Arrow's OTAP protocol uses Apache Arrow to encode telemetry in a columnar format that is more efficient for CPUs to process, thanks to vectorization, and compresses dramatically better, especially over long-lived streams using the Arrow IPC stream encoding.
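The compression advantage of columnar layouts is easy to demonstrate with stdlib tools alone. The toy below (synthetic log records, not the OTAP encoding) serializes the same data row-by-row and column-by-column, then compresses both with zlib: grouping each field's values together places similar bytes near each other, which a general-purpose compressor exploits.

```python
# Toy demonstration (NOT the OTAP encoding): the same synthetic log
# records serialized row-by-row versus column-by-column, then
# compressed with stdlib zlib. Columnar layouts group similar values
# together, which compresses better.
import zlib

severities = ["INFO", "WARN", "ERROR"]
records = [
    {"ts": 1700000000 + i, "severity": severities[i % 3],
     "service": "checkout", "msg": "request handled"}
    for i in range(1000)
]

# Row-oriented: one line per record, fields interleaved.
row_bytes = "\n".join(
    f'{r["ts"]}|{r["severity"]}|{r["service"]}|{r["msg"]}' for r in records
).encode()

# Column-oriented: all values of each field grouped together.
col_bytes = "\n".join(
    "|".join(str(r[field]) for r in records)
    for field in ("ts", "severity", "service", "msg")
).encode()

row_size = len(zlib.compress(row_bytes))
col_size = len(zlib.compress(col_bytes))
assert col_size < row_size  # columnar compresses better on this data
```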
OTAP maintains 100% compatibility with the OpenTelemetry data model for logs, traces, and metrics, with a straightforward, lossless round trip from OTLP to OTAP and back; support for the OpenTelemetry Profiles signal in OTAP is important future work.
In the OTAP Dataflow Engine, batches of OTAP data are represented using multiple record batches in an arrangement referred to as a "star schema". The number of record batches varies by OpenTelemetry signal type, see our data model documentation for details. OTAP Dataflow Engine also transports OTLP protocol bytes directly and efficiently by avoiding protocol message objects.
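As a rough sketch of the star-schema idea (all field names here are invented for illustration; the real Arrow schemas are in our data model documentation), a main record batch holds one entry per log record, and an auxiliary attributes batch links back to it through a parent ID column:

```python
# Rough sketch of the "star schema" arrangement. Field names are
# invented for illustration; the real Arrow schemas are in the
# project's data model docs. A main batch holds one entry per log
# record; an attributes batch links back via a parent_id column.
logs = {                 # main record batch (columns as parallel lists)
    "id":       [0, 1, 2],
    "severity": ["INFO", "WARN", "INFO"],
    "body":     ["started", "slow request", "stopped"],
}

log_attrs = {            # auxiliary batch: one row per attribute
    "parent_id": [0, 0, 2],
    "key":       ["host", "region", "host"],
    "value":     ["a1", "us-east", "a2"],
}


def attrs_for(log_id: int) -> dict:
    """Join the attributes batch back to one log record."""
    return {
        k: v
        for pid, k, v in zip(
            log_attrs["parent_id"], log_attrs["key"], log_attrs["value"]
        )
        if pid == log_id
    }


assert attrs_for(0) == {"host": "a1", "region": "us-east"}
assert attrs_for(1) == {}
```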
Adapter libraries for conversion between OTAP and OTLP representations are provided in Rust and Golang in this repository.
OTel-Arrow components ship in the OpenTelemetry
Collector-Contrib distribution. The two components
extend the configuration model and settings of the core OTLP receiver
and exporter. By design, you can swap otlp for otelarrow in your
Collector configuration. To locate a release that includes them, see OpenTelemetry Collector releases.
See the Exporter and Receiver docs for complete and up-to-date configuration details and Collector-specific examples.
We are not at this time providing pre-built OTAP Dataflow Engine releases. Developers can build the OTAP Dataflow Engine in a minimal configuration with the following:
```shell
git clone https://github.com/open-telemetry/otel-arrow.git
cd otel-arrow/rust/otap-dataflow
cargo build --bin df_engine --no-default-features
```

A directory of example configurations is provided (e.g., syslog-console.yaml). For example, to receive syslog messages with our Syslog/CEF receiver on port 5140 and print them to the console:
```shell
./target/debug/df_engine -c ./configs/syslog-console.yaml
```

Linux/macOS users can test this with:
```shell
logger -n 127.0.0.1 -P 5140 -d --rfc3164 "hello world"
```

PowerShell users can test this with:
```powershell
$t=Get-Date -Format 'MMM dd HH:mm:ss';$u=New-Object Net.Sockets.UdpClient;$b=[Text.Encoding]::ASCII.GetBytes("<14>$t powershell test: hello world");$u.Send($b,$b.Length,'127.0.0.1',5140);$u.Close()
```

See the admin console on port 8080, or visit http://localhost:8080/metrics to see engine metrics in Prometheus format.
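A cross-platform alternative to the logger and PowerShell commands above is a few lines of stdlib Python, which assumes the engine built from configs/syslog-console.yaml is listening for UDP syslog on 127.0.0.1:5140:

```python
# Send one RFC 3164-style test message over UDP using only the Python
# stdlib. Assumes a syslog receiver (e.g. the engine run with
# configs/syslog-console.yaml) is listening on 127.0.0.1:5140.
import socket
import time

# <14> = facility 1 (user) * 8 + severity 6 (informational)
timestamp = time.strftime("%b %d %H:%M:%S")
message = f"<14>{timestamp} python test: hello world".encode()

with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
    sent = sock.sendto(message, ("127.0.0.1", 5140))
assert sent == len(message)
```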
See the OTAP Dataflow Engine documentation for more details.
See our project phases document for project goals and history. Phase 1 established the OTAP representation and proved that a column-oriented representation for OpenTelemetry is good for compression performance.
We are currently completing Phase 2, delivering the OTAP Dataflow Engine. Phase 2 has demonstrated that a column-oriented, Arrow-based pipeline delivers new levels of performance for OpenTelemetry. See our live continuous benchmarks and nightly benchmark suite.
As a community, we are planning Phase 3; see the links below to join us.
We meet weekly, alternating between Tuesday at 4:00 PM PT and Thursday at 8:00 AM PT. Check the OpenTelemetry community calendar for dates and Zoom links.
Whether you're a seasoned OpenTelemetry developer, just starting your journey, or simply curious about the work we do, you're more than welcome to participate!
- Albert Lockett, F5
- Drew Relmas, Microsoft
- Joshua MacDonald, Microsoft
- Laurent Quérel, F5
For more information about the maintainer role, see the community repository.
- Cijo Thomas, Microsoft
- Lalit Kumar Bhasin, Microsoft
- Lei Huang, Greptime
- Utkarsh Umesan Pillai, Microsoft
For more information about the approver role, see the community repository.
- Tom Tan, Microsoft
- Alex Boten, Approver
- Moh Osman, Approver
Here are some of our important documents. Additional work-in-progress design documentation for the OTAP Dataflow Engine is also available.
| Document | Description |
|---|---|
| OTAP Spec | Formal protocol specification |
| OTAP Basics | Introduction to the OTAP protocol |
| Data Model | Arrow schema mappings for OTLP entities |
| Phase 1 Overview | Wire protocol details and historical benchmarks |
| Phase 2 Design | End-to-end pipeline architecture |
| Engine Design | Engine architecture |
| Benchmarks | Current performance results |
| Validation Process | Encoding/decoding validation process |
| Dataflow Engine | Rust crate architecture and component reference |
