feat: DA hints in p2p #2891
Claude finished @alpe's task in 3m 53s
Comprehensive Code Review: DA Hints in P2P
I've completed a thorough review of this PR focusing on code quality, architecture, security, performance, testing, and the migration strategy. Overall, this is a well-designed feature with good test coverage (70.17%), but there are several areas that need attention before merging.
🔴 Critical Issues
1. Error Handling - Silent Failures in DA Hint Updates
Location:
The code silently ignores errors when updating DA height hints in the p2p store:
if err := s.headerDAHintAppender.AppendDAHint(ctx, res.Height, hashes...); err != nil {
s.logger.Error().Err(err).Msg("failed to append da height hint in header p2p store")
// ignoring error here, since we don't want to block the block submission'
}
Problem: While the comment justifies not blocking submission, this creates a silent failure mode where:
Recommendation:
2. Concurrency - Race Condition in AsyncDARetriever
Location: The
func (r *AsyncDARetriever) RequestRetrieval(height uint64) {
r.mu.Lock()
defer r.mu.Unlock()
if _, exists := r.inFlight[height]; exists {
return
}
select {
case r.workCh <- height:
r.inFlight[height] = struct{}{} // ✅ Protected
// ...
default:
// ❌ Height not marked in-flight, can be re-requested
}
}
Problem: If the work channel is full (default case), the height is not marked as in-flight, allowing duplicate requests to queue up when capacity becomes available.
Recommendation:
select {
case r.workCh <- height:
r.inFlight[height] = struct{}{}
r.logger.Debug().Uint64("height", height).Msg("queued DA retrieval request")
default:
r.inFlight[height] = struct{}{} // Mark as "attempted" to prevent immediate retry
r.logger.Warn().Uint64("height", height).Msg("DA retrieval worker pool full, height marked for later retry")
}
3. Context Handling - Missing Context in Stop
Location:
func (r *AsyncDARetriever) Stop() {
if r.cancel != nil {
r.cancel()
}
r.wg.Wait() // ❌ No timeout, can hang indefinitely
r.logger.Info().Msg("AsyncDARetriever stopped")
}
Problem: If workers are blocked on network I/O during shutdown,
Recommendation:
func (r *AsyncDARetriever) Stop() error {
if r.cancel != nil {
r.cancel()
}
done := make(chan struct{})
go func() {
r.wg.Wait()
close(done)
}()
select {
case <-done:
r.logger.Info().Msg("AsyncDARetriever stopped cleanly")
return nil
case <-time.After(5 * time.Second):
r.logger.Warn().Msg("AsyncDARetriever stop timed out, some workers may still be running")
return fmt.Errorf("stop timed out")
}
}
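The in-flight bookkeeping discussed under issue 2 can be illustrated in isolation. The following is a minimal sketch, not the PR's implementation: `retriever`, `request`, and `done` are hypothetical names, and the sketch assumes a worker calls `done` after processing a height so that the marker is cleared. Whichever policy is chosen for a full queue, the marker has to be removed at some point, otherwise a height can never be requested again.

```go
package main

import (
	"fmt"
	"sync"
)

// retriever is a hypothetical, simplified stand-in for AsyncDARetriever.
type retriever struct {
	mu       sync.Mutex
	inFlight map[uint64]struct{}
	workCh   chan uint64
}

func newRetriever(queueSize int) *retriever {
	return &retriever{
		inFlight: make(map[uint64]struct{}),
		workCh:   make(chan uint64, queueSize),
	}
}

// request reports whether the height was queued. It returns false for a
// duplicate request and for a full queue.
func (r *retriever) request(height uint64) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.inFlight[height]; exists {
		return false
	}
	select {
	case r.workCh <- height:
		r.inFlight[height] = struct{}{}
		return true
	default:
		// Queue full: nothing is recorded, so the caller may retry later.
		return false
	}
}

// done clears the in-flight marker so the height can be requested again.
func (r *retriever) done(height uint64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.inFlight, height)
}

func main() {
	r := newRetriever(1)
	fmt.Println(r.request(10)) // queued
	fmt.Println(r.request(10)) // duplicate, rejected
	fmt.Println(r.request(11)) // queue full, rejected
	r.done(<-r.workCh)         // a worker picks up height 10 and finishes
	fmt.Println(r.request(11)) // queued now that capacity is free
}
```

Returning a boolean from `request` is a choice made for this sketch only; it makes the rejection visible to the caller instead of dropping the height silently.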
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2891 +/- ##
==========================================
+ Coverage 58.58% 58.95% +0.37%
==========================================
Files 110 112 +2
Lines 10396 10625 +229
==========================================
+ Hits 6090 6264 +174
- Misses 3662 3698 +36
- Partials 644 663 +19
cache.SetHeaderDAIncluded(headerHash.String(), res.Height, header.Height())
hashes[i] = headerHash
}
if err := s.headerDAHintAppender.AppendDAHint(ctx, res.Height, hashes...); err != nil {
This is where the DA height is passed to the sync service to update the p2p store
Msg("P2P event with DA height hint, triggering targeted DA retrieval")

// Trigger targeted DA retrieval in background via worker pool
s.asyncDARetriever.RequestRetrieval(daHeightHint)
This is where the "fetch from DA" is triggered for the current block event height
pkg/sync/da_hint_container.go
Outdated
type SignedHeaderWithDAHint = DAHeightHintContainer[*types.SignedHeader]
type DataWithDAHint = DAHeightHintContainer[*types.Data]

type DAHeightHintContainer[H header.Header[H]] struct {
This is a data container to persist the DA hint together with the block header or data.
types.SignedHeader and types.Data are used all over the place, so I did not modify them but introduced this type for the p2p store and transfer only.
It may make sense to make this a Proto type. WDYT?
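A rough sketch of what such a container could look like is given below. It is an illustration under assumptions, not the PR's code: `heightEntry` is a hypothetical, reduced stand-in for the `header.Header[H]` constraint, `hintContainer` and `fakeHeader` are made-up names, and treating a hint of 0 as "unknown" is a convention chosen for this sketch.

```go
package main

import "fmt"

// heightEntry is a hypothetical, reduced stand-in for the header.Header[H]
// constraint used in the PR.
type heightEntry interface {
	Height() uint64
}

// hintContainer pairs a p2p entry with an optional DA height hint.
// A hint of 0 is treated as "unknown" in this sketch.
type hintContainer[H heightEntry] struct {
	Entry        H
	DAHeightHint uint64
}

// SetDAHint records the DA height at which the entry was included.
func (c *hintContainer[H]) SetDAHint(daHeight uint64) { c.DAHeightHint = daHeight }

// DAHint returns the hint and whether one is set.
func (c *hintContainer[H]) DAHint() (uint64, bool) {
	return c.DAHeightHint, c.DAHeightHint != 0
}

// Height delegates to the wrapped entry so the container can be used
// wherever only the block height is needed.
func (c *hintContainer[H]) Height() uint64 { return c.Entry.Height() }

// fakeHeader is a toy entry type used only for this example.
type fakeHeader struct{ height uint64 }

func (h *fakeHeader) Height() uint64 { return h.height }

func main() {
	c := &hintContainer[*fakeHeader]{Entry: &fakeHeader{height: 7}}
	_, ok := c.DAHint()
	fmt.Println(c.Height(), ok) // block height 7, no hint yet
	c.SetDAHint(1234)
	hint, ok := c.DAHint()
	fmt.Println(hint, ok) // hint set to DA height 1234
}
```

Keeping the hint outside the wrapped type means the existing header and data types stay untouched, which matches the motivation given in the comment above.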
pkg/sync/sync_service.go
Outdated
return nil
}

func (s *SyncService[V]) AppendDAHint(ctx context.Context, daHeight uint64, hashes ...types.Hash) error {
Stores the DA height hints
If the DA hint is not in the proto, how do other nodes get knowledge of the hint? Also, how would an existing network handle using this feature? It's breaking, so is it safe to upgrade?
block/internal/syncing/syncer.go
Outdated
"github.com/evstack/ev-node/block/internal/cache"
"github.com/evstack/ev-node/block/internal/common"
"github.com/evstack/ev-node/block/internal/da"
coreda "github.com/evstack/ev-node/core/da"
nit: gci linter
julienrbrt
left a comment
Nice! It really makes sense.
I share the same concern as @tac0turtle however about the upgrade strategy given it is p2p breaking.
The sync_service wraps the header/data payload in a
It is a breaking change. Instead of signed header or data types, the p2p network exchanges DAHeightHintContainer. This would be incompatible. Also, the existing p2p stores would need migration to work.
Could we broadcast both until every network is updated? In the end, we can discard the previous one.
fyi: This PR is missing a migration strategy for the p2p store (and ideally network)
* main: refactor(sequencers): persist prepended batch (#2907) feat(evm): add force inclusion command (#2888) feat: DA client, remove interface part 1: copy subset of types needed for the client using blob rpc. (#2905) feat: forced inclusion (#2797) fix: fix and cleanup metrics (sequencers + block) (#2904) build(deps): Bump mdast-util-to-hast from 13.2.0 to 13.2.1 in /docs in the npm_and_yarn group across 1 directory (#2900) refactor(block): centralize timeout in client (#2903) build(deps): Bump the all-go group across 2 directories with 3 updates (#2898) chore: bump default timeout (#2902) fix: revert default db (#2897) refactor: remove obsolete // +build tag (#2899) fix:da visualiser namespace (#2895)
* main: chore: execute goimports to format the code (#2924) refactor(block)!: remove GetLastState from components (#2923) feat(syncing): add grace period for missing force txs inclusion (#2915) chore: minor improvement for docs (#2918) feat: DA Client remove interface part 2, add client for celestia blob api (#2909) chore: update rust deps (#2917) feat(sequencers/based): add based batch time (#2911) build(deps): Bump golangci/golangci-lint-action from 9.1.0 to 9.2.0 (#2914) refactor(sequencers): implement batch position persistance (#2908)
## Overview
Temporary fix until #2891. After #2891 the verification for p2p blocks will be done in the background.
ref: #2906
I have added 2 new types for the p2p store that are binary compatible with types.Data and SignedHeader. With this, we should be able to roll this out without breaking the in-flight p2p data and store.
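The reason a binary compatible sibling type can work is that protobuf decoders skip fields they do not know. The sketch below hand-rolls a tiny part of the protobuf wire format with the standard library to show the effect; it is not the PR's code. The field numbers (1 for the payload, 2 for the hint) and the function names are assumptions made for illustration, and the real messages have more fields.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Protobuf wire types used in this sketch.
const (
	wireVarint = 0
	wireBytes  = 2
)

// encodeWithHint writes field 1 (payload bytes) and, if non-zero,
// field 2 (DA height hint). Field numbers are assumptions.
func encodeWithHint(payload []byte, daHint uint64) []byte {
	var buf []byte
	buf = binary.AppendUvarint(buf, 1<<3|wireBytes)
	buf = binary.AppendUvarint(buf, uint64(len(payload)))
	buf = append(buf, payload...)
	if daHint != 0 {
		buf = binary.AppendUvarint(buf, 2<<3|wireVarint)
		buf = binary.AppendUvarint(buf, daHint)
	}
	return buf
}

// decode reads the payload. With readHint=false it behaves like a legacy
// reader: field 2 is unknown to it and is skipped.
func decode(buf []byte, readHint bool) ([]byte, uint64, error) {
	var payload []byte
	var hint uint64
	for len(buf) > 0 {
		tag, n := binary.Uvarint(buf)
		if n <= 0 {
			return nil, 0, errors.New("bad tag")
		}
		buf = buf[n:]
		field, wire := tag>>3, tag&7
		switch wire {
		case wireVarint:
			v, n := binary.Uvarint(buf)
			if n <= 0 {
				return nil, 0, errors.New("bad varint")
			}
			buf = buf[n:]
			if field == 2 && readHint {
				hint = v
			}
		case wireBytes:
			l, n := binary.Uvarint(buf)
			if n <= 0 || uint64(len(buf)-n) < l {
				return nil, 0, errors.New("bad length")
			}
			if field == 1 {
				payload = buf[n : n+int(l)]
			}
			buf = buf[n+int(l):]
		default:
			return nil, 0, errors.New("unsupported wire type")
		}
	}
	return payload, hint, nil
}

func main() {
	msg := encodeWithHint([]byte("header"), 42)
	p, h, _ := decode(msg, false) // legacy reader ignores the hint
	fmt.Println(string(p), h)
	p, h, _ = decode(msg, true) // upgraded reader sees the hint
	fmt.Println(string(p), h)
}
```

The same holds in the other direction: a message written without the hint field decodes in the upgraded reader with a hint of 0, which can then be treated as "no hint available".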
julienrbrt
left a comment
lgtm! I can see how useful the async retriever will be for force inclusion verification as well. We should have @auricom verify if p2p still works with Eden.
This is going to be really useful for force inclusion checks as well.
* main: build(deps): Bump actions/cache from 4 to 5 (#2934) build(deps): Bump actions/download-artifact from 6 to 7 (#2933) build(deps): Bump actions/upload-artifact from 5 to 6 (#2932) feat: DA Client remove interface part 3, replace types with new code (#2910) DA Client remove interface: Part 2.5, create e2e test to validate that a blob is posted in DA layer. (#2920)
(cherry picked from commit ad3e21b)
Introduce envelope for headers on DA to fail fast on unauthorized content. Similar approach as in #2891 with a binary compatible sibling type that carries the additional information.
* Add DAHeaderEnvelope type to wrap signed headers on DA
* Binary compatible to `SignedHeader` proto type
* Includes signature of the plain content
* DARetriever checks for valid signature early in the process
* Supports `SignedHeader` for legacy support until first signed envelope read
* main: chore: fix some minor issues in the comments (#2955) feat: make reaper poll duration configurable (#2951) chore!: move sequencers to pkg (#2931) feat: Ensure Header integrity on DA (#2948) feat(testda): add header support with GetHeaderByHeight method (#2946) chore: improve code comments clarity (#2947) chore(sequencers): optimize store check (#2945)
CI seems to be having some issues, can these be fixed? Also, was this tested on an existing network? If not, please do that before merging.
* main: fix: inconsistent state detection and rollback (#2983) chore: improve graceful shutdown restarts (#2985) feat(submitting): add posting strategies (#2973) chore: adding syncing tracing (#2981) feat(tracing): adding block production tracing (#2980) feat(tracing): Add Store, P2P and Config tracing (#2972) chore: fix upgrade test (#2979) build(deps): Bump github.com/ethereum/go-ethereum from 1.16.7 to 1.16.8 in /execution/evm/test in the go_modules group across 1 directory (#2974) feat(tracing): adding tracing to DA client (#2968) chore: create onboarding skill (#2971) test: add e2e tests for force inclusion (part 2) (#2970) feat(tracing): adding eth client tracing (#2960) test: add e2e tests for force inclusion (#2964) build(deps): Bump the all-go group across 4 directories with 10 updates (#2969) fix: Fail fast when executor ahead (#2966) feat(block): async epoch fetching (#2952) perf: tune badger defaults and add db bench (#2950) feat(tracing): add tracing to EngineClient (#2959) chore: inject W3C headers into engine client and eth client (#2958) feat: adding tracing for Executor and added initial configuration (#2957)
Overview
Resolves #2609
The basic idea is to store an additional DAHeightHint field within the p2p store.
As SignedHeader and Data are used in other places too, I added a DAHeightHintContainer type to wrap the tuple for the store only.
The DAHeightHint is added by the da_submitter and read in the syncer to fetch the missing DA header/data for the most recent block as required.
Please note: this is a breaking change to the p2p network and store
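The flow described in this overview can be sketched end to end as follows. This is a simplified illustration, not the PR's implementation: `hintStore` and `onP2PEvent` are hypothetical names, hashes are plain strings, and a single in-memory map stands in for the p2p store that in practice is replicated between nodes.

```go
package main

import "fmt"

// hintStore is a hypothetical in-memory stand-in for the p2p store.
type hintStore struct {
	hints map[string]uint64 // entry hash -> DA height hint
}

func newHintStore() *hintStore {
	return &hintStore{hints: make(map[string]uint64)}
}

// AppendDAHint records the DA height for every hash that was included at
// that height. This mirrors the submitter side of the flow.
func (s *hintStore) AppendDAHint(daHeight uint64, hashes ...string) {
	for _, h := range hashes {
		s.hints[h] = daHeight
	}
}

// onP2PEvent mirrors the syncer side: if the entry carries a hint, a
// targeted DA retrieval is requested for that height. It reports whether
// a retrieval was requested.
func onP2PEvent(s *hintStore, hash string, requestRetrieval func(uint64)) bool {
	hint, ok := s.hints[hash]
	if !ok || hint == 0 {
		return false // no hint: fall back to regular DA scanning
	}
	requestRetrieval(hint)
	return true
}

func main() {
	s := newHintStore()
	s.AppendDAHint(100, "header-a", "header-b") // submitter got a DA inclusion result

	request := func(daHeight uint64) {
		fmt.Println("requesting DA retrieval at height", daHeight)
	}
	fmt.Println(onP2PEvent(s, "header-a", request)) // hint found
	fmt.Println(onP2PEvent(s, "header-z", request)) // no hint known
}
```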