Add test for broken retries with netty backend#1417
withConnectionAttempts() does not work as expected with the Netty backend: if a request is sent before the server is started, the caller sees a failure immediately and no retries are attempted. With the Akka-Http backend, the same test works as expected, i.e. one of the retries will reach the server once it's up and the caller will see a success result.
Hi @cb-nezasa, Thank you for your contribution! We really value the time you've taken to put this together. Before we proceed with reviewing this pull request, please sign the Lightbend Contributors License Agreement.
One of my colleagues found a tentative fix / workaround to make my test case pass even for Netty - see below. The fix (or workaround) is inspired by this post: grpc/grpc-java#5724. The fix really tweaks a lot of internals on the Note:
It would need some digging - I think we have some retry logic in Akka gRPC, and there is retry logic in Netty itself. We should probably:
One thing that's important to keep in mind (and probably already under test) is that we should make sure not only to retry making the actual connection, but also possibly to re-trigger the discovery, in case we need to refresh the service information to learn about an endpoint that will actually work.
withConnectionAttempts() does not work as expected with the Netty backend, i.e. if a request is sent before the server is started, the caller sees a
failure immediately and no retries are attempted.
With the Akka-Http backend, the same test works as expected, i.e. one of
the retries will reach the server once it's up and the caller will see
a success result.
This PR does not introduce any changes to production code; it only adds a test case exposing the issue and the different behaviour of the Netty and Akka-Http backends.
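To make the expected behaviour concrete, here is a minimal, library-free sketch of the semantics the test relies on: a call that fails while the server is down is retried a bounded number of times with a growing pause, and the caller only sees a failure once all attempts are used up. This is not the Akka gRPC or Netty implementation; the class and the helpers serverUpAfter and callWithRetries are hypothetical names made up for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class ConnectionRetrySketch {

    /** Hypothetical stand-in for a server that refuses the first `failingAttempts` calls. */
    static Supplier<String> serverUpAfter(int failingAttempts, AtomicInteger attempts) {
        return () -> {
            if (attempts.incrementAndGet() <= failingAttempts) {
                throw new IllegalStateException("connection refused");
            }
            return "success";
        };
    }

    /** Calls `request` up to `maxAttempts` times, doubling the pause after each failure. */
    static String callWithRetries(Supplier<String> request, int maxAttempts, long initialBackoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure; the caller must not see it yet
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoff);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying when interrupted
                }
                backoff *= 2; // exponential backoff
            }
        }
        throw last; // all attempts used up: only now does the caller see the failure
    }

    public static void main(String[] args) {
        AtomicInteger attempts = new AtomicInteger();
        String result = callWithRetries(serverUpAfter(2, attempts), 5, 10);
        System.out.println(result + " after " + attempts.get() + " attempts");
    }
}
```

With maxAttempts set to 1 the same call fails on the first refusal, which corresponds to what the caller observes with the Netty backend in this test.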
Notes:
Unfortunately I could not find the root cause of this problem. In a different project I observed debug log messages from
akka.grpc.internal.ChannelUtils#monitorChannel which suggested that exponential backoff is working, but nothing actually gets retried and the caller has already received a failed Future anyway.