Pull requests: apple/axlearn
Similar to a previous change to repeat.py, changes pipeline.py to split stacked summary values.
#125
by ruomingp
was merged Oct 16, 2023
Updates chex version to >= 0.1.7 to be compatible with jax 0.4.18.
#124
by ruomingp
was merged Oct 15, 2023
Add query_scale and key_scale to MultiheadAttention.
#122
by apghml
was merged Oct 20, 2023
Replaces uses of utils.host_to_global_device_array with map_utils.host_local_array_to_global_array.
#115
by ruomingp
was closed Oct 13, 2023
Add fan_axes to ParameterSpec and to create_parameter_specs_recursively().
#101
by apghml
was merged Oct 6, 2023
Add an option to control whether rotary position embeddings are applied to the value
#111
by alex8937
was merged Oct 11, 2023
Make repeat in RepeatedTransformerLayer configurable.
#109
by xianzhidu
was merged Oct 11, 2023
Workaround JAX PJIT bug that was breaking image summaries.
#107
by apghml
was merged Oct 11, 2023
[1/N] Adding Rotary position embedding test against LLAMA implementation
#105
by zbwglory
was merged Oct 10, 2023
Adds SpmdTrainer.Config.mesh_rules and --mesh_selector to override trainer mesh configuration.
#104
by ruomingp
was merged Oct 8, 2023
Fix issue where flax structs were not recursed into by tree_paths
#102
by apghml
was merged Oct 6, 2023