Pull requests: apple/axlearn

Pull requests list

Support skipping warmup in cosine schedule.
#697 by xianzhidu was merged Sep 11, 2024
Generalized top-k gating for MoE.
#679 by xianzhidu was merged Aug 29, 2024
A few updates to MoE and test_utils
#654 by xianzhidu was merged Aug 16, 2024
Support naive mesh in multi-slice env.
#234 by xianzhidu was merged Dec 11, 2023
Support aux loss in causal_lm.Model
#198 by xianzhidu was merged Nov 26, 2023
Keep constant LR in cosine schedule
#192 by xianzhidu was merged Nov 22, 2023
Update set_double_shard_weights_config
#146 by xianzhidu was merged Oct 27, 2023
Make repeat in RepeatedTransformerLayer configurable.
#109 by xianzhidu was merged Oct 11, 2023
Make logits partition specs configurable in decoder.
#41 by xianzhidu was merged Aug 23, 2023
Fix some permlinks
#5 by xianzhidu was merged Jul 22, 2023
Quick fix
#3 by xianzhidu was merged Jul 19, 2023
Add l2 norms for the VQ-VAE quantizer.
#2 by xianzhidu was merged Jul 18, 2023