bpf: add gRPC/HTTP2 context propagation via sk_msg HPACK injection #1832
mmat11 wants to merge 3 commits into open-telemetry:main
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds end-to-end gRPC/HTTP2 context propagation support by injecting/parsing traceparent in HTTP/2 (HPACK) frames and introducing an integration test relay chain to validate cross-language propagation and multiplexed stream isolation.
Changes:
- Implement HTTP/2 (gRPC) `traceparent` injection in `sk_msg` plus HPACK parsing/adoption in kprobe HTTP/2/gRPC paths keyed by `{ports, stream_id}`.
- Add a multi-hop (Go→Python→Go→Node.js→Java→Go) docker-compose-based integration test suite covering chain propagation and multiplexed concurrency.
- Document the gRPC/HTTP2 propagation architecture and mark gRPC context propagation as supported in feature docs.
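To make the injection idea in the list above concrete, here is a minimal user-space sketch in Go of one plausible way to add a `traceparent` field to an HTTP/2 HEADERS frame: encode it as an HPACK literal header field without indexing (RFC 7541, section 6.2.2) and patch the 24-bit frame length. This is an illustration under stated assumptions, not the PR's BPF code: the function names `encodeLiteralHeader` and `appendHeaderField` are hypothetical, Huffman coding is not used, and frames that are padded or continue in CONTINUATION frames are rejected rather than handled.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	frameHeaderLen   = 9   // fixed HTTP/2 frame header size (RFC 9113, section 4.1)
	frameTypeHeaders = 0x1 // HEADERS frame type
	flagEndHeaders   = 0x4 // header block ends in this frame
	flagPadded       = 0x8 // payload carries trailing padding
)

// encodeLiteralHeader (hypothetical helper) encodes name/value as an HPACK
// "literal header field without indexing, new name", without Huffman coding.
// It only accepts strings shorter than 127 bytes so that each length fits in
// a single byte of the 7-bit prefix integer encoding.
func encodeLiteralHeader(name, value string) ([]byte, error) {
	if len(name) >= 127 || len(value) >= 127 {
		return nil, errors.New("name or value needs a multi-byte HPACK length")
	}
	out := make([]byte, 0, 3+len(name)+len(value))
	out = append(out, 0x00)            // 0000 0000: literal, not indexed, new name
	out = append(out, byte(len(name))) // H bit clear: raw octets
	out = append(out, name...)
	out = append(out, byte(len(value)))
	out = append(out, value...)
	return out, nil
}

// appendHeaderField (hypothetical helper) returns a copy of a single HEADERS
// frame with field appended to its header block and the length updated.
func appendHeaderField(frame, field []byte) ([]byte, error) {
	if len(frame) < frameHeaderLen || frame[3] != frameTypeHeaders {
		return nil, errors.New("not a HEADERS frame")
	}
	flags := frame[4]
	if flags&flagEndHeaders == 0 || flags&flagPadded != 0 {
		// Appending is only safe when the block ends here and has no padding.
		return nil, errors.New("unsupported HEADERS frame flags")
	}
	oldLen := int(frame[0])<<16 | int(frame[1])<<8 | int(frame[2])
	if len(frame) != frameHeaderLen+oldLen {
		return nil, errors.New("buffer does not hold exactly one frame")
	}
	newLen := oldLen + len(field)
	if newLen > 1<<24-1 {
		return nil, errors.New("frame length overflows 24 bits")
	}
	out := make([]byte, 0, frameHeaderLen+newLen)
	out = append(out, frame...)
	out = append(out, field...)
	out[0], out[1], out[2] = byte(newLen>>16), byte(newLen>>8), byte(newLen)
	return out, nil
}

func main() {
	field, err := encodeLiteralHeader("traceparent",
		"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		panic(err)
	}
	// A tiny HEADERS frame on stream 1 with three indexed fields.
	frame := []byte{0, 0, 3, frameTypeHeaders, flagEndHeaders, 0, 0, 0, 1, 0x82, 0x86, 0x84}
	out, err := appendHeaderField(frame, field)
	if err != nil {
		panic(err)
	}
	fmt.Printf("field=%d bytes, new payload=%d bytes\n",
		len(field), int(out[0])<<16|int(out[1])<<8|int(out[2]))
}
```

A non-indexed literal leaves the HPACK dynamic table untouched, which is why it is a natural candidate for an out-of-band injector; whether the PR uses exactly this representation is not stated here.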
Reviewed changes
Copilot reviewed 29 out of 31 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/test/integration/grpc_relay_test.go | New integration tests validating relay-chain propagation and multiplexed stream isolation via Jaeger queries |
| internal/test/integration/docker-compose-grpc-relay.yml | Adds a 7-hop cross-language relay chain plus OBI + Jaeger wiring for integration testing |
| internal/test/integration/configs/obi-config-grpc-relay.yml | Adds OBI discovery + OTLP exporter config for the relay test environment |
| internal/test/integration/components/jaeger/jaeger.go | Adds helper to filter spans by operation, service, and span.kind |
| internal/test/integration/components/grpc_relay/relay.proto | Defines the relay gRPC service used by the multi-language test components |
| internal/test/integration/components/grpc_relay/python/server.py | Python relay hop with health endpoint and downstream gRPC call |
| internal/test/integration/components/grpc_relay/python/requirements.txt | Pinned Python dependencies for Python relay container build |
| internal/test/integration/components/grpc_relay/python/Dockerfile | Builds the Python relay container and generates gRPC stubs |
| internal/test/integration/components/grpc_relay/nodejs/server.js | Node.js relay hop with persistent client connection to exercise HTTP/2 multiplexing |
| internal/test/integration/components/grpc_relay/nodejs/package.json | Node.js relay dependencies for gRPC/proto loading |
| internal/test/integration/components/grpc_relay/nodejs/Dockerfile | Builds the Node.js relay container |
| internal/test/integration/components/grpc_relay/java/src/main/proto/relay.proto | Java proto definition for the relay service |
| internal/test/integration/components/grpc_relay/java/src/main/java/relay/RelayServer.java | Java relay hop with shared Netty event loop and health endpoint |
| internal/test/integration/components/grpc_relay/java/pom.xml | Maven build for Java relay with protobuf/grpc plugins |
| internal/test/integration/components/grpc_relay/java/Dockerfile | Multi-stage build for Java relay container |
| internal/test/integration/components/grpc_relay/go/main.go | Go relay hop(s) and multiplex endpoint to generate concurrent streams on one connection |
| internal/test/integration/components/grpc_relay/go/go.sum | Go dependency locks for the Go relay component |
| internal/test/integration/components/grpc_relay/go/go.mod | Go module definition for the relay component |
| internal/test/integration/components/grpc_relay/go/Dockerfile | Builds the Go relay container |
| devdocs/grpc-context-propagation.md | New design doc for HPACK injection + TCP options propagation |
| devdocs/features.md | Updates feature matrix to indicate gRPC context propagation support |
| devdocs/README.md | Adds link to the new gRPC context propagation doc |
| bpf/tpinjector/tpinjector.c | Implements sk_msg HTTP/2 detection, HPACK injection, and trace adoption logic |
| bpf/tpinjector/maps/sk_h2_conn_flag.h | Adds SK_STORAGE marker to tag sockets as HTTP/2 |
| bpf/gotracer/maps/grpc.h | Adds conn_ptr → connection_info map to bind stream_id to correct TCP ports |
| bpf/gotracer/go_grpc.c | Writes per-stream outgoing trace context from Go uprobe to outgoing_trace_map |
| bpf/generictracer/protocol_http2.h | Adds HPACK traceparent parsing and adopts injected per-stream context |
| bpf/common/trace_lifecycle.h | Ensures egress_key_t.stream_id is initialized in non-H2 paths |
| bpf/common/h2_defs.h | Centralizes HTTP/2/HPACK constants and traceparent layout offsets |
| bpf/common/egress_key.h | Extends egress_key to include HTTP/2 stream_id for multiplex isolation |
Files not reviewed (1)
- internal/test/integration/components/grpc_relay/nodejs/package-lock.json: Language not supported
Comments suppressed due to low confidence (1)
internal/test/integration/grpc_relay_test.go:1
- If `compose.Up()` or any later `require.*` in this test fails, `compose.Close()` may not run, potentially leaking containers/resources in CI. Register cleanup immediately after creating/starting the suite (e.g., via `t.Cleanup(func(){ _ = compose.Close() })`, and optionally also call it after `Up()` succeeds) so teardown happens even on early failures.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1832      +/-   ##
==========================================
+ Coverage   69.36%   69.39%   +0.03%
==========================================
  Files         276      276
  Lines       32692    32692
==========================================
+ Hits        22677    22688      +11
+ Misses       8807     8804       -3
+ Partials     1208     1200       -8
grcevski left a comment
This looks great! LGTM! I think we don't need to add the suggestion I made in this PR; we can follow up with the additional check. I'm sure the CI will give you hell with the verifier :).
bpf_tail_call_static(msg, &extender_jump_table, k_tail_find_existing_tp);
// HTTP/2 detection: known H2 socket or "PRI " preface.
if ((inject_flags & k_inject_http_headers) && msg->size >= k_h2_frame_header_len) {
I think here we need to check whether it's the PRI preface or the connection is recorded in ongoing_http2_connections. PRI is only seen on the first exchange between the two sides, but gRPC clients tend to hold on to the same connection. When that happens we'll misclassify the request as TCP and ship it to user space, where OBI detects that it's HTTP2 and records the connection as HTTP2 in ongoing_http2_connections. This is how protocol_handler.h checks it. We must also check whether it was HTTP2 over SSL and do nothing in that case; SSL and HTTP2 can easily be mixed up.
http2_conn_info_data_t *h2g = bpf_map_lookup_elem(&ongoing_http2_connections, &args->pid_conn);
if (h2g && (http2_flag_ssl(h2g->flags) == args->ssl)) {
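The decision described in this comment can be modeled in user space roughly as below. It is a Go sketch of the logic only, not BPF code and not the project's API: `connKey`, `h2ConnInfo`, `tracker`, and `shouldInject` are hypothetical names, the map stands in for `ongoing_http2_connections`, and the check uses the full 24-byte client preface rather than just its first bytes.

```go
package main

import (
	"bytes"
	"fmt"
)

// h2Preface is the HTTP/2 client connection preface (RFC 9113, section 3.4).
var h2Preface = []byte("PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n")

// connKey is a hypothetical connection identifier.
type connKey struct {
	pid              uint32
	srcPort, dstPort uint16
}

// h2ConnInfo is a hypothetical record for a connection known to be HTTP/2.
type h2ConnInfo struct {
	ssl bool
}

// tracker models a table of known HTTP/2 connections, which user space may
// populate after classifying a long-lived connection.
type tracker struct {
	known map[connKey]h2ConnInfo
}

// shouldInject reports whether plaintext header injection is appropriate:
// the connection is already known as non-SSL HTTP/2, or the payload starts
// with the client preface. Known SSL connections are left alone.
func (t *tracker) shouldInject(k connKey, payload []byte) bool {
	if info, ok := t.known[k]; ok {
		return !info.ssl
	}
	if bytes.HasPrefix(payload, h2Preface) {
		t.known[k] = h2ConnInfo{ssl: false}
		return true
	}
	return false
}

func main() {
	t := &tracker{known: map[connKey]h2ConnInfo{}}
	reused := connKey{pid: 42, srcPort: 50000, dstPort: 50051}
	// User space has already classified this long-lived connection.
	t.known[reused] = h2ConnInfo{ssl: false}

	headersFrame := []byte{0, 0, 3, 0x1, 0x4, 0, 0, 0, 3, 0x82, 0x86, 0x84}
	fmt.Println("reused connection, no preface:", t.shouldInject(reused, headersFrame))
}
```

The point of the lookup is the reused-connection case: a frame on an established connection carries no preface, so the preface check alone would miss it.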
Signed-off-by: Mattia Meleleo <mattia.meleleo@coralogix.com>
Summary
This PR implements HTTP/2 (gRPC) context propagation.
Resolves #1095.
Validation