Add DeferRelocatableCompilationCompilationProvider #19831

Merged
merged 1 commit into from
Nov 26, 2024

Conversation

copybara-service[bot]

Add DeferRelocatableCompilationCompilationProvider

This adds a compilation provider that offers limited support
for parallel compilation even when the delegate compilation
provider cannot compile into a relocatable module.

Parallel compilation works by:

  1. Splitting the LLVM module into smaller modules at function boundaries
  2. Lowering each of the smaller modules to PTX in parallel in a thread pool
  3. Compiling the PTX into relocatable CUBIN modules in parallel
  4. Linking everything together
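
As a rough illustration of the four steps above, here is a minimal sketch. Everything in it is hypothetical: `Module`, `SplitAtFunctionBoundaries`, `LowerToPtx`, `CompileToRelocatableCubin`, `Link`, and `CompileInParallel` are made-up names, strings stand in for LLVM modules, PTX, and CUBIN blobs, and `std::async` stands in for the thread pool. It is not the code touched by this PR.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for an LLVM module: just a list of function names.
struct Module {
  std::vector<std::string> functions;
};

// Step 1: split into one small module per function.
std::vector<Module> SplitAtFunctionBoundaries(const Module& m) {
  std::vector<Module> parts;
  for (const std::string& f : m.functions) parts.push_back(Module{{f}});
  return parts;
}

// Step 2: "lower" a small module to a fake PTX string.
std::string LowerToPtx(const Module& part) {
  std::string ptx;
  for (const std::string& f : part.functions) ptx += ".entry " + f + ";";
  return ptx;
}

// Step 3: "compile" PTX into a fake relocatable CUBIN.
std::string CompileToRelocatableCubin(const std::string& ptx) {
  return "cubin(" + ptx + ")";
}

// Step 4: "link" by concatenating the relocatable pieces in order.
std::string Link(const std::vector<std::string>& cubins) {
  std::string binary;
  for (const std::string& c : cubins) binary += c;
  return binary;
}

std::string CompileInParallel(const Module& m) {
  assert(!m.functions.empty());
  std::vector<std::future<std::string>> tasks;
  for (const Module& part : SplitAtFunctionBoundaries(m)) {
    // Steps 2 and 3 run together on a separate thread for each part.
    tasks.push_back(std::async(std::launch::async, [part] {
      return CompileToRelocatableCubin(LowerToPtx(part));
    }));
  }
  std::vector<std::string> cubins;
  for (auto& task : tasks) cubins.push_back(task.get());
  return Link(cubins);
}
```

Results are collected in submission order, so the linked output does not depend on which task finishes first.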

Only ptxas and nvptxcompiler can compile into relocatable modules,
but neither of them is always available.

To still benefit from parallel LLVM lowering without writing an entirely new compilation pipeline, this compilation provider defers PTX compilation to the linking step.

PTX compilation will then not happen in parallel, but at least LLVM lowering will.
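
The deferral idea can be sketched as a thin wrapper around a delegate. This is a sketch under stated assumptions, not the provider's actual interface: `Delegate`, `DeferredModule`, and `DeferringProvider` are hypothetical names, and the description above does not say how the real provider hands PTX to its delegate. Here the "relocatable module" simply carries the PTX unchanged, and the delegate compiles everything once at link time.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical delegate: it can compile and link PTX in a single step,
// but it cannot produce relocatable modules on its own.
class Delegate {
 public:
  std::string CompileAndLink(const std::vector<std::string>& ptx_inputs) {
    ++compile_calls;
    std::string binary = "cubin[";
    for (const std::string& ptx : ptx_inputs) binary += ptx;
    return binary + "]";
  }
  int compile_calls = 0;  // Counts how often real compilation happened.
};

// Hypothetical "relocatable module" that just holds uncompiled PTX.
struct DeferredModule {
  std::string ptx;
};

class DeferringProvider {
 public:
  explicit DeferringProvider(Delegate* delegate) : delegate_(delegate) {
    assert(delegate_ != nullptr);
  }

  // No compilation happens here: the PTX is stored for later.
  DeferredModule CompileToRelocatableModule(std::string ptx) const {
    return DeferredModule{std::move(ptx)};
  }

  // The deferred PTX is unpacked and compiled by the delegate at link time.
  std::string CompileAndLink(const std::vector<DeferredModule>& modules) {
    std::vector<std::string> ptx_inputs;
    for (const DeferredModule& m : modules) ptx_inputs.push_back(m.ptx);
    return delegate_->CompileAndLink(ptx_inputs);
  }

 private:
  Delegate* delegate_;  // Not owned.
};
```

With this shape, callers can still run the per-module "compile" step from worker threads after parallel LLVM lowering, while the delegate is invoked only once, in the linking step.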

The implementation is not a new one. The same workaround is currently used in nvptxcompiler. This component will replace it.

@copybara-service copybara-service bot force-pushed the test_699071890 branch 5 times, most recently from 485a771 to 38e8c51 Compare November 26, 2024 11:35

PiperOrigin-RevId: 700285524
@copybara-service copybara-service bot merged commit dba3c67 into main Nov 26, 2024
@copybara-service copybara-service bot deleted the test_699071890 branch November 26, 2024 11:49