Conversation

@YixingZhang007 YixingZhang007 commented Jan 12, 2026

This PR modifies the backend compiler options passed to clang-linker-wrapper and removes FPGA descriptions from OffloadDesign.md. Detailed explanations are below:

  1. PR #20691 ([NewOffloadModel] Pass link-time options through device-compiler and device-linker argument for ClangLinkerWrapper) changed link-time compiler options to be passed through --device-compiler instead of through --cpu-tool-arg and --gpu-tool-arg. This PR updates OffloadDesign.md to document the usage and format of --device-compiler.
  2. As described in #16929 (Remove FPGA features from DPC++) and PR #16864 ([SYCL][Driver][FPGA] Remove support for FPGA related options), support for FPGA features and their related options is being removed from DPC++. This PR removes the FPGA-related descriptions from OffloadDesign.md.

- resemble `--gpu-tool-arg=<arch> <arg>`. This corresponds to the existing
+ resemble `--device-compiler=sycl:spir64_gen-unknown-unknown==<arch> <arg>`. This corresponds to the existing
  option syntax of `-fsycl-targets=intel_gpu_arch` where `arch` can be a fixed
  set of targets.
Contributor Author

@YixingZhang007 YixingZhang007 Jan 13, 2026

I am not sure if this is still what we want to support, because currently, the backend compiler arguments for all architectures are passed together through a single --device-compiler= argument. For the example shown earlier in this file, if we have the following:

clang++ -fsycl -fsycl-targets=intel_gpu_skl,spir64_gen \
  -Xsycl-target-backend=spir64_gen "-device pvc -options -extraopt_pvc" \
  -Xsycl-target-backend=intel_gpu_skl "-options -extraopt_skl"

the clang-linker-wrapper command right now looks like:

clang-linker-wrapper ... \
  "--device-compiler=sycl:spir64_gen-unknown-unknown=-device pvc -options -extraopt_pvc -options -extraopt_skl" ...

clang-linker-wrapper then invokes ocloc with both -device pvc -options -extraopt_pvc and -options -extraopt_skl, and it does so for both PVC and SKL.

If we still want to keep the original proposed solution of separating the arguments for different architectures in clang-linker-wrapper, this will be something we need to implement next.
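
The merging described in this comment can be modeled with a minimal Python sketch. It is an illustration only, not the driver's code: the function name `merge_backend_args` is hypothetical, and the plain space-join is an assumption based on the clang-linker-wrapper command quoted above.

```python
# Hypothetical model of the behavior described above: backend arguments for
# every spir64_gen-based target are joined into ONE --device-compiler value,
# so the link between an option and its architecture is lost.

TRIPLE = "spir64_gen-unknown-unknown"

def merge_backend_args(backend_args):
    """backend_args: list of (target, args) pairs in command-line order."""
    merged = " ".join(args for _target, args in backend_args)
    return f"--device-compiler=sycl:{TRIPLE}={merged}"

option = merge_backend_args([
    ("spir64_gen", "-device pvc -options -extraopt_pvc"),
    ("intel_gpu_skl", "-options -extraopt_skl"),
])
print(option)
```

Once the values are joined, nothing in the resulting string says that -extraopt_skl was meant only for SKL, which is why both ocloc invocations receive both sets of options.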

Contributor

Interesting. Shouldn't we call ocloc with both pvc and skl specified as -device options?
What does the old offloading model do in this scenario?
@mdtoguchi, I believe the original design came from you. Could you please comment?

Contributor

I think we need to retain this capability to allow for passing along specific values for each potential arch target. Each individual target arch provided performs a separate ocloc call.

Contributor

But does it make sense to you that we are calling ocloc with these options: -device pvc -options -extraopt_pvc -options -extraopt_skl?
Shouldn't it be something like -device pvc -options -extraopt_pvc -device skl -options -extraopt_skl?
Or maybe two calls to ocloc?

Contributor

In other words, it looks like we are calling ocloc to compile for the pvc target, while the initial clang++ command line asks to compile for two targets: pvc and skl.

Contributor Author

@mdtoguchi For sure! I think we should be able to do that. I can create a PR for this change if we decide that we need to support passing different options for individual arch :)

Contributor

I wonder whether --device-compiler already has everything necessary to support passing unique options for each individual arch. I mean, can we do something like this:

clang++ -fsycl -fsycl-targets=intel_gpu_skl,spir64_gen -Xsycl-target-backend=spir64_gen "-device pvc -options -extraopt_pvc" -Xsycl-target-backend=intel_gpu_skl "-options -extraopt_skl" AOT/multiple-devices.cpp --offload-new-driver -v

which would be translated to something like this:

clang-linker-wrapper ... "--device-compiler=sycl:spir64_gen-unknown-unknown=-device pvc -options -extraopt_pvc" "--device-compiler=sycl:spir64_gen-unknown-unknown=-device skl -options -extraopt_skl"

and then it would end up in these ocloc commands:

ocloc ... -device skl -options -extraopt_skl ... 
ocloc ... -device pvc -options -extraopt_pvc ...
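
The translation proposed in this comment can be sketched in Python as follows. This is a hypothetical illustration, not driver code: `to_device_compiler` is an invented name, and the rule "for an `intel_gpu_<arch>` target, prepend `-device <arch>`" is an assumption read off the example commands above.

```python
# Hypothetical driver-side translation: emit one --device-compiler option per
# -Xsycl-target-backend value instead of merging them all into one.

TRIPLE = "spir64_gen-unknown-unknown"
PREFIX = "intel_gpu_"

def to_device_compiler(target, args):
    # Assumption: an intel_gpu_<arch> target implies "-device <arch>".
    if target.startswith(PREFIX):
        args = f"-device {target[len(PREFIX):]} {args}"
    return f"--device-compiler=sycl:{TRIPLE}={args}"

options = [
    to_device_compiler("spir64_gen", "-device pvc -options -extraopt_pvc"),
    to_device_compiler("intel_gpu_skl", "-options -extraopt_skl"),
]
for opt in options:
    print(opt)
```

Each emitted option then carries its own -device value, which is what a consumer would need in order to route the options to the matching ocloc call.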

Contributor Author

@YixingZhang007 YixingZhang007 Jan 15, 2026

I tried modifying the clang-linker-wrapper invocation to use two separate --device-compiler options, one for each architecture, as shown below (right now the arguments for both architectures are passed through a single --device-compiler option):

clang-linker-wrapper ... \
  "--device-compiler=sycl:spir64_gen-unknown-unknown=-device pvc -options -extraopt_pvc" \
  "--device-compiler=sycl:spir64_gen-unknown-unknown=-device skl -options -extraopt_skl"

The resulting ocloc commands are shown below.

ocloc ... -device skl -device_options pvc -device pvc -options -extraopt_pvc -device skl -options -extraopt_skl ...
ocloc ... -device pvc -device_options pvc -device pvc -options -extraopt_pvc -device skl -options -extraopt_skl ...

I think we may still need to implement filtering logic in clang-linker-wrapper so that each --device-compiler option is only applied to its corresponding architecture @YuriPlyakhin
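
One possible shape for such filtering logic is sketched below in Python. It is an assumption-laden illustration, not clang-linker-wrapper's implementation: the helper names are hypothetical, and treating a value without `-device` as applying to every architecture is a design choice made only for this sketch.

```python
import shlex

def device_of(value):
    """Return the arch named after -device in an option value, or None."""
    tokens = shlex.split(value)
    for i, tok in enumerate(tokens[:-1]):
        if tok == "-device":
            return tokens[i + 1]
    return None

def args_for_arch(values, arch):
    """Keep only the option values that target `arch`.

    Assumption: a value with no -device token applies to all architectures.
    """
    selected = []
    for value in values:
        dev = device_of(value)
        if dev is None or dev == arch:
            selected.extend(shlex.split(value))
    return selected

# The argument parts of the two --device-compiler options shown above.
values = [
    "-device pvc -options -extraopt_pvc",
    "-device skl -options -extraopt_skl",
]
print(args_for_arch(values, "skl"))
print(args_for_arch(values, "pvc"))
```

With this selection step, the SKL invocation would no longer see -extraopt_pvc and the PVC invocation would no longer see -extraopt_skl.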

Contributor

Yes, as we discussed in the meeting, we also need to run more experiments to better understand the behavior implemented in the old offloading model.

Contributor Author

@YixingZhang007 YixingZhang007 Jan 16, 2026

I have looked into the behavior of the old offloading model for multiple devices. The arguments passed to the ocloc command differ between the old and new offloading models.

For example, we run the following clang command with the old offloading model:

clang++ ... -fsycl-targets=intel_gpu_dg1,spir64_gen -Xsycl-target-backend=spir64_gen "-device pvc -options -extraopt_pvc" -Xsycl-target-backend=intel_gpu_dg1 "-options -extraopt_dg1" ...

The ocloc commands run for the old offloading model are:

ocloc ... -device dg1 -device_options pvc ... -options -extraopt_dg1 ...
ocloc ... -device_options pvc -device pvc ... -options -extraopt_pvc -options -extraopt_dg1 ...

@YuriPlyakhin @mdtoguchi I don't think the ocloc commands are correct for the old offloading model, because the backend option passed for dg1 is also passed to pvc (the options passed to ocloc for dg1 are correct, however).
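
The leak described here can be checked mechanically. The Python sketch below is a hypothetical helper, not part of any toolchain: it scans the tokens of the two ocloc commands quoted above (with the elided `...` parts left out) and reports any `-options` value that was not requested for that architecture.

```python
def options_values(ocloc_args):
    """Collect every value that directly follows an -options flag."""
    return [ocloc_args[i + 1]
            for i, tok in enumerate(ocloc_args[:-1]) if tok == "-options"]

# Tokens from the two old-model commands above; elided parts are omitted.
dg1_cmd = "-device dg1 -device_options pvc -options -extraopt_dg1".split()
pvc_cmd = ("-device_options pvc -device pvc "
           "-options -extraopt_pvc -options -extraopt_dg1").split()

# What the clang++ command line asked for, per architecture.
expected = {"dg1": ["-extraopt_dg1"], "pvc": ["-extraopt_pvc"]}

def leaked(arch, cmd):
    """Return -options values present in cmd but not requested for arch."""
    return [v for v in options_values(cmd) if v not in expected[arch]]

print(leaked("dg1", dg1_cmd))
print(leaked("pvc", pvc_cmd))
```

The dg1 command reports no leaked values, while the pvc command reports -extraopt_dg1, matching the observation in this comment.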

@YixingZhang007 YixingZhang007 marked this pull request as ready for review January 13, 2026 23:13
@YixingZhang007 YixingZhang007 requested a review from a team as a code owner January 13, 2026 23:13
  the `spir64_gen` architecture triple, the resulting extracted binary is linked,
  post-link processed and converted to SPIR-V before being passed to `ocloc` to
- generate the final device binary. Options passed via `--gpu-tool-arg=` will
+ generate the final device binary. Options passed via `--device-compiler=` will
Contributor

The --device-compiler usage here should be extended to include the spir64_gen target, as it is specific to options for ocloc.

Contributor Author

Thanks for the suggestion! I have updated this to --device-compiler=sycl:spir64_gen-unknown-unknown=<arg>.

- > --gpu-tool-arg="-device pvc -options extraopt_pvc"
- --gpu-tool-arg="-options -extraopt_skl"
+ > "--device-compiler=sycl:spir64_gen-unknown-unknown=-device pvc -options extraopt_pvc"
+ "--device-compiler=sycl:spir64_gen-unknown-unknown=-options -extraopt_skl"
Contributor

It looks like the syntax of the options passed is slightly different (quotes around the entire option as opposed to just the arg). Was the original usage of --gpu-tool-arg not correct here?

Contributor Author

Yes, I think the documentation of --gpu-tool-arg here differs from what is generated when I run clang++ ... -v. Without the recent --device-compiler changes, the generated command is clang-linker-wrapper ... "--gpu-tool-arg=-device pvc -options -extraopt_pvc -options -extraopt_skl" ..., in which the quotation marks wrap the whole --gpu-tool-arg option.

Contributor

Thanks - as long as clang-linker-wrapper parses the information correctly, how it is represented here is inconsequential.
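
The two quoting styles discussed in this thread can be compared with a small Python sketch. It assumes the command is tokenized with POSIX-style shell quoting rules, which Python's `shlex.split` approximates. It does not model how clang-linker-wrapper itself parses the resulting argument.

```python
import shlex

# Quotes around just the argument (the older documentation style).
documented = '--gpu-tool-arg="-device pvc -options extraopt_pvc"'
# Quotes around the whole option (the style seen in clang++ -v output).
generated = '"--gpu-tool-arg=-device pvc -options extraopt_pvc"'

print(shlex.split(documented))
print(shlex.split(generated))
```

Both spellings tokenize to the same single argument under these rules, so the difference is one of presentation only.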
