Pull requests: apple/axlearn

Add support to infer -1 in mesh shape
#113 by lyttonhao was merged Oct 12, 2023
Exclude arm images for booting VM.
#123 by markblee was merged Oct 14, 2023
Add query_scale and key_scale to MultiheadAttention.
#122 by apghml was merged Oct 20, 2023
upgrade transformers version
#121 by gyin94 was closed Oct 14, 2023
Fix typo in /docs/01-start.md
#120 by malivinayak was merged Oct 15, 2023
Bump jax/jaxlib to 0.4.18.
#119 by tgunter was merged Oct 14, 2023
[3/N] E2E comparison against LLAMA attention.
#118 by zbwglory was merged Oct 13, 2023
Add Nested type.
#117 by apghml was merged Oct 12, 2023
Make theta configurable in RoPE
#116 by alex8937 was merged Oct 13, 2023
Tighten cryptography/pyopenssl pins.
#112 by markblee was merged Oct 11, 2023
Fix RoPE dtype
#110 by alex8937 was merged Oct 11, 2023
Make repeat in RepeatedTransformerLayer configurable.
#109 by xianzhidu was merged Oct 11, 2023
Decouples bastion from GCP.
#108 by markblee was merged Oct 11, 2023
Workaround JAX PJIT bug that was breaking image summaries.
#107 by apghml was merged Oct 11, 2023
Splits output summaries of repeat layers
#106 by ruomingp was merged Oct 10, 2023
Inherit GCP pins in TPU extras.
#103 by markblee was merged Oct 7, 2023