modded-nanogpt Public
Forked from KellerJordan/modded-nanogpt
Mort optimizer: momentum orthogonal optimizer
-
-
Vision_Delta_net Public
A vision backbone inspired by "Parallelizing Linear Transformers with the Delta Rule over Sequence Length": https://arxiv.org/pdf/2406.06484
-
-
demix_policy Public
Uses two heads, one to predict x_real and another to predict x_noise, with the mixup as input. A simplified version of TrigFlow.
-
-
-
consistency_flow_matching Public
Forked from YangLing0818/consistency_flow_matching
Official Implementation for "Consistency Flow Matching: Defining Straight Flows with Velocity Consistency"
Python, updated Jul 3, 2024
-
-
no-residual-network Public
Forked from hukkai/no-residual-network
Official implementation for our paper "Revisiting Exploding Gradient: A Ghost That Never Leaves"
Python, updated Mar 30, 2022
-
code_SGC_plus Public
Forked from anonymous123098/code_SGC_plus
Jupyter Notebook, updated Feb 11, 2022