[AURON #1891] Implement randn() function #1938
Open
robreeves wants to merge 16 commits into apache:master
Contributor
Pull request overview
This PR implements the randn() function to improve Spark function coverage in Auron. The function generates random values from a standard normal distribution with optional seed support.
Changes:
- Added Rust implementation of `spark_randn` function with seed handling
- Registered the new function in the Scala converter and Rust function registry
- Added `rand_distr` dependency for normal distribution sampling
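As a rough illustration of what "standard normal sampling with optional seed support" means, here is a minimal, std-only Rust sketch. It is not the code in this PR: the PR samples through `rand_distr`, whereas this sketch swaps in a SplitMix64 generator and the Box-Muller transform so it compiles without external crates. The names `SplitMix64`, `resolve_seed`, and `randn_batch` are hypothetical, and the output is not expected to match Spark's values for the same seed.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Small deterministic PRNG (SplitMix64). Hypothetical stand-in for
/// whatever generator the real implementation uses.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        SplitMix64 { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in (0, 1]; excluding 0 keeps ln() finite below.
    fn next_open01(&mut self) -> f64 {
        // Top 53 bits, shifted into [1, 2^53], then scaled by 2^-53.
        (((self.next_u64() >> 11) + 1) as f64) * (1.0 / (1u64 << 53) as f64)
    }
}

/// Assumption: when no seed is given, fall back to a clock-derived one.
fn resolve_seed(seed: Option<u64>) -> u64 {
    seed.unwrap_or_else(|| {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_nanos() as u64)
            .unwrap_or(0)
    })
}

/// Produce `len` samples from N(0, 1) using the Box-Muller transform.
/// The same seed always yields the same batch.
fn randn_batch(seed: u64, len: usize) -> Vec<f64> {
    let mut rng = SplitMix64::new(seed);
    (0..len)
        .map(|_| {
            let u1 = rng.next_open01();
            let u2 = rng.next_open01();
            (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
        })
        .collect()
}
```

Calling `randn_batch(resolve_seed(Some(42)), 4)` twice returns identical vectors, which is the reproducibility property the seeded tests in this PR rely on. Passing `None` to `resolve_seed` gives a different batch on each run.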
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| spark-extension/src/main/scala/org/apache/spark/sql/auron/NativeConverters.scala | Added case handler for Randn expression to route to native implementation |
| native-engine/datafusion-ext-functions/src/spark_randn.rs | New implementation of randn function with seed handling and unit tests |
| native-engine/datafusion-ext-functions/src/lib.rs | Registered Spark_Randn function in the extension function factory |
| native-engine/datafusion-ext-functions/Cargo.toml | Added rand and rand_distr dependencies |
| Cargo.toml | Added rand_distr workspace dependency |
| Cargo.lock | Updated lock file with rand_distr package metadata |
Resolve conflicts between randn and spark_partition_id features:
- Proto: spark_partition_id_expr at 20101, randn_expr at 20102
- Planner: include both expression handlers
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Contributor
@robreeves Nice work! LGTM.
Resolved conflicts by assigning separate IDs to randn and monotonically_increasing_id:
- MonotonicIncreasingIdExprNode: ID 20102
- RandnExprNode: ID 20103
Both expressions are now supported in the proto definitions and planner.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add test to AuronFunctionSuite to verify randn functionality with seeds. The test validates that Auron's native randn implementation produces the same reproducible results as Spark's baseline when using explicit seeds.
Test covers:
- randn with seed 42
- randn with seed 100
- Validates against Spark baseline using checkSparkAnswerAndOperator
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Author
I added a |
Which issue does this PR close?
Closes #1891
Rationale for this change
This improves function coverage in Auron by creating a native randn implementation.
What changes are included in this PR?
Adds a native randn implementation.
Are there any user-facing changes?
Yes, it adds the randn function.
How was this patch tested?
Added unit tests and manually tested in spark-shell.
Output: