The Abstraction and Reasoning Corpus

JavaScript 4,682 700 Updated Apr 4, 2025

Voxtral: Convert Mistral into an end-to-end SpeechLM. No information bottleneck, preserves prosody, learns interruptions from data. Unlike GPT-4o (closed) or Moshi (complex), it's open, simple, natural.

Python 39 5 Updated Mar 7, 2025

A PyTorch implementation of a Bigram Language Model using Transformer architecture for character-level text generation.

Python 1 Updated Jul 28, 2024

Trains small LMs. Designed for training on SimpleStories.

Python 12 4 Updated Sep 15, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,470 1,998 Updated Nov 1, 2025
Python 108 9 Updated Sep 13, 2025

Code for 'Emergent Analogical Reasoning in Large Language Models'

Python 51 9 Updated May 5, 2024

Tools for working with the Abstraction & Reasoning Corpus

Python 212 28 Updated Aug 22, 2025

Towards Human-Sounding Speech

Python 5,834 500 Updated Dec 5, 2025

Self-contained, minimalistic implementation of diffusion models with PyTorch.

Python 1,135 142 Updated Jun 28, 2022

Code for "Reasoning to Learn from Latent Thoughts"

Python 124 4 Updated Mar 28, 2025
Python 130 12 Updated Dec 23, 2024
Python 3 Updated Dec 9, 2025
Rust 1 Updated Mar 10, 2025
Python 3 Updated Feb 12, 2023

A proof of concept for Calendar integration within a Svelte/MeteorJS app

JavaScript 1 Updated Mar 3, 2023
JavaScript 2 Updated Mar 6, 2025
Roff 1 Updated Jan 31, 2024
1 Updated Mar 20, 2024
HTML 1 Updated Aug 23, 2025

Sparsify transformers with SAEs and transcoders

Python 676 90 Updated Dec 22, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 49,907 4,118 Updated Dec 23, 2025
Python 4 Updated Sep 25, 2023
Python 98 27 Updated Aug 29, 2024

PyTorch implementation of TimesFM model.

Python 11 1 Updated Jul 5, 2024

An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

Python 8,285 963 Updated Feb 25, 2022

A small language model in PyTorch implementing the Tiny Stories paper.

Python 5 Updated Jan 14, 2025

Large Concept Models: Language modeling in a sentence representation space

Python 2,313 204 Updated Jan 29, 2025

[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)

Python 679 90 Updated Feb 29, 2024