-
Notifications
You must be signed in to change notification settings - Fork 4
feat(extractors/solana): add block rewards #1664
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Protobuf scheme has a variant for the `RewardType` enumeration (Unspecified) that only exists in the protobuf scheme and is not defined by the Solana spec. It was not covered by our deserialization logic either which started being an issue once the block rewards table got added. This commit fixes that.
ae83949 to
c027b42
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review Summary
This PR adds support for extracting Solana block rewards, which is a valuable addition to the data extraction pipeline. The implementation follows the existing patterns in the codebase well.
Key Changes Reviewed
- New
block_rewardstable module with proper Arrow schema definition - Integration of block rewards into both Old Faithful (OF1) and RPC code paths
- Refactoring of
RewardandRewardTypetypes to be shared between block rewards and transaction rewards - Improved error handling with
anyhow::Resultreturn types where parsing can fail
Areas for Consideration
- Error context accuracy: Some
.context()messages mention "missing" rewards when the actual failure modes are about malformed data - Builder capacity hints: The
BlockRewardsRowsBuilderuses capacity of 1, which may cause unnecessary reallocations for blocks with many rewards - Node lookup ordering: The change from
.pop()to.last()inof1_client.rsassumes consistent iteration order from the node map - Error variant semantics: Using
MissingNodeerror when the node exists but is the wrong type could be confusing
What Looks Good
- Clean separation of block rewards into its own module
- Proper handling of both Proto and Bincode formats
- Good slot mismatch validation for data integrity
- Consistent use of the
ok_or_bail!macro for stream error handling - Test manifest updated appropriately
| pubkey: value.pubkey, | ||
| lamports: value.lamports, | ||
| post_balance: value.post_balance, | ||
| reward_type: value.reward_type.map(Into::into), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The From<solana_storage_proto::StoredExtendedReward> implementation here uses value.reward_type.map(Into::into), but value.reward_type is already Option<solana_transaction_status::RewardType>.
Looking at the impl below for From<rpc_client::RewardType>, it handles the standard Solana reward types (Fee, Rent, Staking, Voting), but there's no From<solana_transaction_status::RewardType> implementation defined.
This will cause a compile error unless there's an implicit conversion or blanket impl I'm missing. Could you verify this compiles correctly, or add the missing From impl for solana_transaction_status::RewardType?
| /// Creates a new [BlockRewardsRowsBuilder]. | ||
| pub(crate) fn new() -> Self { | ||
| Self { | ||
| special_block_num: UInt64Builder::with_capacity(1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor concern: The builder's initial capacity of 1 for each field seems too small for the typical use case where a block has multiple rewards. Consider accepting a capacity hint parameter or using a larger default:
pub(crate) fn with_capacity(num_rewards: usize) -> Self {
Self {
special_block_num: UInt64Builder::with_capacity(num_rewards),
slot: UInt64Builder::with_capacity(num_rewards),
// ...
}
}This would help avoid reallocations when appending rewards, similar to how TransactionRowsBuilder::with_capacity works.
| } | ||
|
|
||
| impl BlockRewards { | ||
| pub(crate) fn from_of1_rewards( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error context here says "missing of1 block rewards", but from_of1_rewards handles Option<DecodedBlockRewards> and returns Ok(Self { slot, rewards: vec![] }) when rewards are None.
So this function doesn't actually fail on missing rewards - it gracefully handles them. The .context() call at the call site in extractor.rs:369 is misleading. Consider either:
- Removing the context from the call site since the error wouldn't be about "missing" rewards
- Or documenting that errors here are about malformed rewards data, not missing rewards
| .map_err(Of1StreamError::NodeParse)?; | ||
|
|
||
| let block = match nodes.nodes.pop() { | ||
| let block = match nodes.nodes.last() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a potential edge case here: nodes.nodes.last() returns a reference to the last element in iteration order of the HashMap (or whatever map type Nodes uses internally). This assumes the block node is always the last one added during read_until_block.
Previously with .pop(), the code was explicitly removing and consuming the last element from a mutable nodes. Now with .last(), you're relying on iteration order being consistent.
If Nodes uses a HashMap, the iteration order is not guaranteed and could vary. Could you confirm that Nodes::nodes preserves insertion order (e.g., uses IndexMap or similar)?
| cid: block.rewards.to_string(), | ||
| }); | ||
| }; | ||
| if rewards.slot != block.slot { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The slot mismatch check is a good data integrity validation. However, the error message could be more actionable by including the CID or other context about which block/rewards node caused the mismatch:
#[error("reward slot mismatch for cid {cid}: expected {expected}, found {found}")]
RewardSlotMismatch { expected: u64, found: u64, cid: String },This would help with debugging if this error occurs in production.
| .map(|rewards| { | ||
| let car_parser::node::Node::Rewards(rewards) = rewards else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error variant for when the node exists but isn't a Rewards node reuses MissingNode, which is semantically incorrect - the node isn't missing, it's the wrong type. This could be confusing for debugging.
Consider using UnexpectedNode instead (similar to line 570-573), or adding a dedicated WrongNodeType error variant:
let Some(car_parser::node::Node::Rewards(rewards)) = nodes.nodes.get(&block.rewards) else {
return Err(Of1StreamError::UnexpectedNode {
kind: "non-rewards",
cid: block.rewards.to_string(),
});
};| let rewards = rpc_tx_meta.rewards.map(|rewards| { | ||
| rewards | ||
| .into_iter() | ||
| .map(|reward| Reward { | ||
| .map(|reward| tables::block_rewards::Reward { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In from_stored_meta, the rewards conversion manually constructs tables::block_rewards::Reward with reward_type: reward.reward_type.map(From::from).
However, rpc_client::Reward already has a From impl into tables::block_rewards::Reward (line 106-115 in block_rewards.rs). Consider using that for consistency:
let rewards = rpc_tx_meta.rewards.map(|rewards| {
rewards.into_iter().map(Into::into).collect()
});This matches the pattern used in from_rpc_meta (lines 421-423) and reduces code duplication.
| // Helper macro to simplify error handling and early returns in the stream. | ||
| macro_rules! ok_or_bail { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ok_or_bail! macro is a useful pattern for stream error handling. However, defining it inside the async_stream::stream! block means it's recreated for every stream instantiation.
Consider moving this to a module-level macro or extracting the error-and-return pattern into a helper. That said, macro hygiene within async_stream::stream! can be tricky, so the current approach is pragmatic if moving it causes issues.
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error context "missing of1 block rewards" is misleading. Looking at BlockRewards::from_of1_rewards, it accepts Option<DecodedBlockRewards> and returns an empty rewards list when None is passed.
The actual errors that could occur here are related to malformed reward data (e.g., invalid reward type values, unparseable commission strings). Consider a more accurate context:
let block_rewards = tables::block_rewards::BlockRewards::from_of1_rewards(slot, block_rewards)
.context("parsing of1 block rewards")?;| .map_err(RowConversionError::TableBuild)? | ||
| }; | ||
|
|
||
| let block_rewards_row = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For blocks with many rewards, you could pass a capacity hint to the builder:
let block_rewards_row = {
let mut builder = block_rewards::BlockRewardsRowsBuilder::with_capacity(block_rewards.rewards.len());
// ...
};This would require adding a with_capacity method to BlockRewardsRowsBuilder as mentioned in the other comment.
Closes #1629