
[CK] Add fwd conv group merging to v3 conv instances #3675

Closed
vpietila-amd wants to merge 6 commits into develop from vpietila/improved-fwd-merged-conv-group-instances

vpietila-amd (Contributor) commented Jan 29, 2026

Proposed changes

Added conv group merging to the (universal) V3 fwd conv pipeline. The new instance improves fwd conv performance when the number of input/output channels per group is low.

On MI300 (gfx942) we get:

| CK prof command | Baseline (TFLOPS) | V3 group merging (TFLOPS) |
| --- | --- | --- |
| `grouped_conv_fwd 1 1 1 0 1 0 1 2 32 32 4 4 3 3 200 200 1 1 1 1 1 1 1 1` | 3.86035 | 8.36796 |
| `grouped_conv_fwd 1 1 1 0 1 0 1 2 32 32 8 8 3 3 200 200 2 2 1 1 1 1 1 1` | 10.1867 | 13.4677 |
| `grouped_conv_fwd 1 1 1 0 1 0 1 2 32 32 8 8 3 3 100 100 1 2 1 1 1 1 1 1` | 11.7875 | 16.3657 |
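
To make the relative gain easier to read, the reported TFLOPS figures can be turned into speedup ratios. The snippet below is only an illustrative sketch and not part of the PR: the names `RESULTS`, `speedup`, and `SPEEDUPS` are hypothetical, and the numbers are copied verbatim from the measurements reported here.

```python
# Illustrative only: names below are hypothetical and not part of CK.
# Each entry maps a CK prof command to (baseline TFLOPS, V3 group merging TFLOPS),
# copied from the MI300 (gfx942) measurements reported in this PR.
RESULTS = {
    "grouped_conv_fwd 1 1 1 0 1 0 1 2 32 32 4 4 3 3 200 200 1 1 1 1 1 1 1 1": (3.86035, 8.36796),
    "grouped_conv_fwd 1 1 1 0 1 0 1 2 32 32 8 8 3 3 200 200 2 2 1 1 1 1 1 1": (10.1867, 13.4677),
    "grouped_conv_fwd 1 1 1 0 1 0 1 2 32 32 8 8 3 3 100 100 1 2 1 1 1 1 1 1": (11.7875, 16.3657),
}


def speedup(baseline_tflops: float, merged_tflops: float) -> float:
    """Return the throughput ratio of the group-merging instance over the baseline."""
    if baseline_tflops <= 0:
        raise ValueError("baseline throughput must be positive")
    return merged_tflops / baseline_tflops


# One ratio per command, in the same order as RESULTS.
SPEEDUPS = [speedup(base, merged) for base, merged in RESULTS.values()]

if __name__ == "__main__":
    for command, ratio in zip(RESULTS, SPEEDUPS):
        print(f"{ratio:.2f}x  {command}")
```

By this measure the first configuration improves by roughly 2.2x, and the other two by roughly 1.3x to 1.4x.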

ammallya (Contributor) commented Feb 3, 2026

Imported to ROCm/rocm-libraries

ammallya closed this Feb 3, 2026