
Adds a flash attention layer. #26

Merged
markblee merged 1 commit into apple:main from markblee/triton on Aug 4, 2023
Conversation

markblee (Contributor) commented on Aug 4, 2023

  • Adds basic support for attention logit biases in the pallas attention op (see the reference sketch after this list).
  • Adds a benchmarking script for fwd+bwd.
  • Adds FlashAttention layer which is (mostly) a drop-in replacement for MultiheadAttention (with caveats, see docstrings).
  • Updates licenses/acknowledgements.
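
A minimal plain-JAX reference sketch of what the fused op is expected to compute: multi-head attention with an additive logit bias. This is illustrative only; it is not the pallas kernel or the FlashAttention layer API, and the names and shapes below are assumptions.

```python
import jax
import jax.numpy as jnp

def reference_attention(q, k, v, bias=None):
    """q, k, v: [batch, heads, seq, head_dim]; bias: [batch, heads, q_seq, k_seq]."""
    # Scaled dot-product logits.
    logits = jnp.einsum("bhqd,bhkd->bhqk", q, k) / jnp.sqrt(q.shape[-1])
    if bias is not None:
        # Attention logit biases (e.g. causal or padding masks encoded as large
        # negative values) are added before the softmax.
        logits = logits + bias
    probs = jax.nn.softmax(logits, axis=-1)
    return jnp.einsum("bhqk,bhkd->bhqd", probs, v)

# Hypothetical usage: a causal bias broadcast over batch and heads.
batch, heads, seq, dim = 2, 4, 8, 16
kq, kk, kv = jax.random.split(jax.random.PRNGKey(0), 3)
q = jax.random.normal(kq, (batch, heads, seq, dim))
k = jax.random.normal(kk, (batch, heads, seq, dim))
v = jax.random.normal(kv, (batch, heads, seq, dim))
causal_bias = jnp.where(jnp.tril(jnp.ones((seq, seq), dtype=bool)), 0.0, -1e9)[None, None, :, :]
out = reference_attention(q, k, v, bias=causal_bias)  # [batch, heads, seq, head_dim]
```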

@markblee markblee requested a review from ruomingp August 4, 2023 17:09
@markblee markblee merged commit 7e45e86 into apple:main Aug 4, 2023
@markblee markblee deleted the markblee/triton branch August 4, 2023 18:10