[#1750] feat(remote merge): Support Spark. #2405
zhengchenyu wants to merge 5 commits into apache:master
Conversation
zuston left a comment
From my perspective, this feature currently can't be used in Spark SQL. Maybe RDD jobs could use it.
    import java.util
    import java.util.Map

    object RMSparkSQLTest {
Aha! How about renaming it to RemoteMergeSparkSQLTest? RM looks like Yarn ResourceManager.
For me, when I see RM, the first thing that comes to mind is Yarn ResourceManager. But I don't have any other good names.
Or I could rename RM to RMS or RS. In fact, merge sort is implemented on the server side, but there is no combine, so "remote sort" would also be fine.
If so, the previous code and documentation would need to be modified. Maybe a new PR is needed.
This test is based on draft PR apache/spark#50248.
cc @LuciferYang
This will break the code implementation of Spark. It would be better to insert a new logical plan that represents the distribution and partitioning after shuffling. Then you only need to implement some optimization rules.
Are you talking about changes to Spark? My initial idea was also to see whether I could add a new rule. Maybe for the map side I could add new rules. But for the reduce side, whether to add a new SortExec is determined by checking whether distribution and partitioning match, which is not easy to do by adding a new rule.
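For readers following this thread, here is a rough Scala sketch (not part of this PR; the predicate is purely hypothetical) of what a physical-plan rule that removes a redundant SortExec above a shuffle could look like. The hard part flagged above is that such a predicate has no reliable way to know, from the plan alone, that the remote merge already provides the required ordering.

    import org.apache.spark.sql.catalyst.expressions.SortOrder
    import org.apache.spark.sql.catalyst.rules.Rule
    import org.apache.spark.sql.execution.{SortExec, SparkPlan}
    import org.apache.spark.sql.execution.exchange.ShuffleExchangeExec

    // Hypothetical rule, for discussion only: drop a per-partition SortExec that sits
    // directly on top of a shuffle whose output the remote merge already returns in
    // the required key order.
    object RemoveSortAfterRemoteMerge extends Rule[SparkPlan] {

      override def apply(plan: SparkPlan): SparkPlan = plan.transformUp {
        // global = false means a per-partition sort, the kind EnsureRequirements adds.
        case SortExec(order, false, shuffle: ShuffleExchangeExec, _)
            if remoteMergeSatisfies(shuffle, order) =>
          shuffle
      }

      // Placeholder predicate: it would have to prove that the remote merge for this
      // shuffle produces exactly `order` within every partition. The plan itself does
      // not expose that information, which is the difficulty mentioned above.
      private def remoteMergeSatisfies(
          shuffle: ShuffleExchangeExec,
          order: Seq[SortOrder]): Boolean = false
    }

Injecting such a rule through SparkSessionExtensions would avoid touching Spark itself, but whether the needed information is reachable from there is exactly the open question in this thread.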
also cc @summaryzb
Does this one depend on SPARK-51398 being merged first?
Yes, Meta Cosco did similar things before; you can take a look. cc @c21 Excuse me, sorry to bother you. Is it possible to implement this feature without changing Spark's code, only by adding some rules? Could you give us some suggestions?
            : dependency.valueClassName())
        .setComparatorClass(
            dependency
                .keyOrdering()
If the keyOrdering is empty, could the remote merge be disabled? @zhengchenyu
The current implementation cannot enable remote merge based on whether keyOrdering is empty; it can only be enabled through configuration.
For Spark RDD jobs (not Spark SQL), there may be a situation where keyOrdering is empty but the combine class is not; in that case records are hash-partitioned and then combined in memory.
This can be further improved, especially for Spark SQL.
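To make the two RDD situations above concrete, here is a small Scala sketch (example only, not code from this PR): reduceByKey produces a ShuffleDependency with an aggregator but no keyOrdering, while sortByKey produces one with keyOrdering defined, which is the case remote merge targets.

    import org.apache.spark.{SparkConf, SparkContext}

    object KeyOrderingCases {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("key-ordering-cases").setMaster("local[2]"))
        val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

        // reduceByKey: the shuffle dependency carries an aggregator but keyOrdering is
        // empty, so records are hash-partitioned and combined in memory on the reducer.
        val combined = pairs.reduceByKey(_ + _)

        // sortByKey: the shuffle dependency defines keyOrdering, so the reducer needs
        // sorted input, which is what a server-side (remote) merge can provide.
        val sorted = pairs.sortByKey()

        println(combined.collect().toSeq)
        println(sorted.collect().toSeq)
        sc.stop()
      }
    }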
                    return encode(o);
                  }
                })
            .get())
If dependency.keyOrdering().isDefined() is false, an exception will be thrown here.
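A minimal sketch of the guard being suggested, written in Scala for illustration (the client code in the diff is Java, and configureComparator is a hypothetical hook): unwrap the Option explicitly instead of calling .get() on a possibly undefined keyOrdering.

    import org.apache.spark.ShuffleDependency

    object KeyOrderingGuard {
      // Configure the remote comparator only when a key ordering is actually defined;
      // otherwise skip remote merge instead of letting .get() throw.
      def maybeConfigureComparator[K, V, C](
          dep: ShuffleDependency[K, V, C],
          configureComparator: Ordering[K] => Unit): Unit =
        dep.keyOrdering match {
          case Some(ordering) => configureComparator(ordering)
          case None => () // no key ordering: fall back to the non-merged path
        }
    }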
What changes were proposed in this pull request?
Support the Spark framework for remote merge.
Why are the changes needed?
#1750
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Unit tests, integration tests, and a real job in a cluster.