
Commit

DeepSeek-V2-MoE transformer config adapt to mcore0.6.0 (alibaba#253)
Co-authored-by: one_game <[email protected]>
one-game and one_game authored Jun 10, 2024
1 parent b9d128d commit 942c562
Showing 1 changed file with 4 additions and 0 deletions.
megatron_patch/model/deepseek_v2/transformer_config.py (4 additions, 0 deletions)

@@ -26,3 +26,7 @@ class DeepSeekV2TransformerConfig(TransformerConfig):
     rotary_base: int = None

     rotary_scaling_factor: int = None
+
+    max_position_embeddings: int = None
+
+    moe_aux_loss_coeff: float = 0.0
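
For context, below is a minimal sketch of how the patched class reads after this commit. Only the four fields shown in the diff come from the commit itself; the import path, the @dataclass decorator, and the comments are assumptions based on how Megatron-Core's TransformerConfig is typically subclassed, not the actual contents of the file.

# Sketch of megatron_patch/model/deepseek_v2/transformer_config.py after
# this commit. Everything outside the four fields from the diff is an
# assumption based on common Megatron-Core patterns.
from dataclasses import dataclass

from megatron.core.transformer.transformer_config import TransformerConfig


@dataclass
class DeepSeekV2TransformerConfig(TransformerConfig):
    # Pre-existing fields (context lines in the diff). Note they are
    # annotated as int but defaulted to None, exactly as in the diff.
    rotary_base: int = None
    rotary_scaling_factor: int = None

    # Added in this commit so the config carries the attributes that
    # mcore 0.6.0 code paths look up: a maximum sequence length for
    # position embeddings, and the weight of the MoE auxiliary
    # (load-balancing) loss, defaulting to 0.0 (aux loss disabled).
    max_position_embeddings: int = None
    moe_aux_loss_coeff: float = 0.0

An instance would then be built like any TransformerConfig subclass, passing the usual required arguments (num_layers, hidden_size, and so on) plus these fields, e.g. setting moe_aux_loss_coeff to a small positive value to enable the load-balancing penalty.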