feat: Implement graceful shutdown with connection tracking#578
Open
dcoppa wants to merge 1 commit intoContentSquare:masterfrom
Open
feat: Implement graceful shutdown with connection tracking#578dcoppa wants to merge 1 commit intoContentSquare:masterfrom
dcoppa wants to merge 1 commit intoContentSquare:masterfrom
Conversation
Add graceful shutdown support to properly handle SIGTERM and SIGINT signals, allowing active connections to complete before termination. Key changes: - Track active HTTP connections via ConnState callback - Implement configurable graceful_shutdown_timeout (default: 25s) - Concurrently shutdown HTTP and HTTPS servers - Wait for in-flight requests to complete or timeout - Clean up proxy resources (caches, heartbeat goroutines) - Add test coverage for shutdown scenarios - Add debug logging for connection lifecycle events The 25s default timeout fits within Kubernetes' 30s grace period, ensuring pods can terminate cleanly without killing active queries.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
This PR implements graceful shutdown support for chproxy, allowing the proxy to properly handle SIGTERM and SIGINT signals and wait for active connections to complete before terminating.
Motivation: When chproxy is deployed in orchestrated environments like Kubernetes, abrupt termination can kill in-flight queries, leading to failed client requests and incomplete data operations. This implementation ensures clean shutdowns by waiting for active connections to complete (or timeout) before process termination.
Key changes:
signal.NotifyContextfor SIGTERM/SIGINThttp.Server.ConnStatecallback with atomic countergraceful_shutdown_timeout(default: 25s, fits within K8s 30s grace period)Example shutdown flow:
Pull request type
Please check the type of change your PR introduces:
Checklist
Does this introduce a breaking change?
Further comments
Implementation Highlights:
Connection Tracking: Uses
http.Server.ConnStatecallback. This correctly tracks TCP connections and handles HTTP keep-alive.Concurrent Server Shutdown: HTTP and HTTPS servers shut down in parallel using
sync.WaitGroupfor efficiency.Thread Safety:
configLock->lock) to prevent deadlocksreloadSignalError Handling: Errors collected via buffered channel and aggregated with
errors.JoinTesting Strategy: Uses subprocess pattern to test shutdown behavior in isolation
Files Changed:
main.go: Signal handling, connection tracking, graceful shutdown logicproxy.go: Resource cleanup (caches, heartbeat goroutines, transport)config/config.go: AddedGracefulShutdownTimeoutfieldmain_test.go: Two comprehensive test cases (normal shutdown + timeout)Testing: