Stars
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
[CIKM'23] Official code for our paper "Spatio-Temporal Adaptive Embedding Makes Vanilla Transformer SOTA for Traffic Forecasting".
Natural Language Processing Tutorial for Deep Learning Researchers
Code repository for "Continuous Improvement of Self-Driving Cars using Dynamic Confidence-Aware Reinforcement Learning"