Links
Continuous batching from first principles
26 Nov 2025
Olmo 3 Technical Report
20 Nov 2025
Weights & Biases gets a new terminal UI
20 Nov 2025
sws - Minimal, predictable, footgun-free config library - lucasb-eyer
20 Nov 2025
RL Learning with LoRA: A Diverse Deep Dive | kalomaze's kalomazing blog
9 Nov 2025
OlmoEarth: A new state-of-the-art Earth observation foundation model family | Ai2
5 Nov 2025
The Smol Training Playbook: The Secrets to Building World-Class LLMs - a Hugging Face Space by HuggingFaceTB
4 Nov 2025
NeurIPS 2025 Papers
4 Nov 2025
the bug that taught me more about PyTorch than years of using it | Elana Simon
26 Oct 2025
Evaluating Long Context (Reasoning) Ability | wh
16 Oct 2025
State of Vision-Language-Action (VLA) Research at ICLR 2026 – Moritz Reuss
15 Oct 2025
State of AI Report 2025
9 Oct 2025
Maintain the unmaintainable - a Hugging Face Space by transformers-community
9 Oct 2025
LoRA Without Regret - Thinking Machines Lab
2 Oct 2025
How to Detect, Track, and Identify Basketball Players with Computer Vision
2 Oct 2025
Astronaut Photo Interactive Map
27 Sept 2025
Online versus Offline RL for LLMs
17 Sept 2025
What is a color space? | Making Software
17 Sept 2025
AI just Broke Trackmania's most Legendary Record
13 Sept 2025
Defeating Nondeterminism in LLM Inference - Thinking Machines Lab
11 Sept 2025
Attention Is All You Need | Why Self-Attention
6 Sept 2025
Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences
6 Sept 2025
Inside vLLM: Anatomy of a High-Throughput LLM Inference System - Aleksa Gordić
4 Sept 2025
FineVision: Open Data is All You Need - a Hugging Face Space by HuggingFaceM4
4 Sept 2025
How To Become A Mechanistic Interpretability Researcher
4 Sept 2025
Big O
27 Aug 2025
PrimeIntellect | Environments Hub
27 Aug 2025
Adventures in State Space
24 Aug 2025
Do LLMs Have Good Music Taste?
20 Aug 2025
How to Think About GPUs | How To Scale Your Model
19 Aug 2025
How Social Media Shortens Your Life - by Gurwinder
14 Aug 2025
The Circuits Research Landscape: Results and Perspectives - August 2025 | Neuronpedia
12 Aug 2025
How Attention Sinks Keep Language Models Stable
12 Aug 2025
avatarl: training language models from scratch with pure reinforcement learning
12 Aug 2025
blogs and resources 101 | by @himanshustwts
12 Aug 2025
How Does A Blind Model See The Earth? - by henry
12 Aug 2025
There Are No New Ideas in AI… Only New Datasets
22 Jul 2025
Efficient MultiModal Data Pipeline
22 Jul 2025
All AI Models Might Be The Same - by Jack Morris
20 Jul 2025
Word Embeddings
15 Jul 2025
The Era of Exploration | Yiding's blog
15 Jul 2025
Foundations of Computer Vision
15 Jul 2025
Reinforcement Learning of Large Language Models
15 Jul 2025
The Case for More Ambition - Jack Morris
11 Jun 2025
DJing and its potential Neurophysiological Implications
1 Jun 2025
Open-sourcing circuit-tracing tools \ Anthropic
1 Jun 2025
Dummy's Guide to Modern Samplers
24 May 2025
Why We Think | Lil'Log
24 May 2025
DumPy: NumPy except it's OK if you're dum
24 May 2025
On the speed of ViTs and CNNs
18 May 2025
How To Scale
18 May 2025
Multimodal Dataloaders go brrrrrrr - by Haoli Yin
18 May 2025
Vision Language Models (Better, faster, stronger)
18 May 2025
Neel Nanda - How I Think About My Research Process: Explore, Understand, Distill
27 Apr 2025
Is Gemini now better than Claude at Pokémon?
26 Apr 2025
My dream VLM
25 Apr 2025
torch.compile, the missing manual - Google Docs
25 Apr 2025
Dario Amodei - The Urgency of Interpretability
25 Apr 2025
Prof. Judy Fan: Cognitive Tools for Making the Invisible Visible
11 Apr 2025
The Colors Of Her Coat - by Scott Alexander
7 Apr 2025
attention is logarithmic, actually
24 Mar 2025
Factorio Learning Environment
13 Mar 2025
The Genius of DeepSeek’s 57X Efficiency Boost [MLA]
6 Mar 2025
Learning Pokémon With Reinforcement Learning | Pokémon RL
5 Mar 2025
Feather - lightweight, efficient, and locally hosted YouTube Music TUI built with Rust
4 Mar 2025
GRPO Judge Experiments: Findings & Empirical Observations | kalomaze's kalomazing blog
4 Mar 2025
Attention Is Off By One - Evan Miller
27 Feb 2025
darkspark
27 Feb 2025
geohints
27 Feb 2025
Removing Jeff Bezos From My Bed
21 Feb 2025
Being a High-Leverage Generalist - char.blog
20 Feb 2025
The Ultra-Scale Playbook - a Hugging Face Space by nanotron
20 Feb 2025
kudzueye/boreal-hl-v1 · Hugging Face
17 Feb 2025
What if Eye...?
17 Feb 2025
A calculator app? Anyone could make that.
17 Feb 2025
The Breakthrough Behind Modern AI Image Generators | Diffusion Models Part 1
14 Feb 2025
Everyone knows your location
8 Feb 2025
WikiTok
8 Feb 2025
All the Transformer Math You Need to Know | How To Scale Your Model
4 Feb 2025
I’m Lovin’ It: Exploiting McDonald’s APIs to hijack deliveries and order food for a penny
2 Feb 2025
A reinforcement learning guide
1 Feb 2025
Attribution-based parameter decomposition
1 Feb 2025
Mapping the Latent Space of Llama 3.3 70B - Goodfire Papers
1 Feb 2025
NCCL: ACCELERATED MULTI-GPU COLLECTIVE COMMUNICATIONS
23 Jan 2025
Learning CUDA by optimizing softmax: A worklog | Maharshi's blog
23 Jan 2025
Understanding LSTM Networks -- colah's blog
23 Jan 2025
Dino-V2 Large Microscope
23 Jan 2025
AI and Stress
11 Jan 2025
model merging
11 Jan 2025
The Best Tacit Knowledge Videos on Every Subject
11 Jan 2025
Long-Term Thinking, 2nd Order Consequences & Effect Horizons
11 Jan 2025
Weighted Skip Connections are Not Harmful for Deep Nets
11 Jan 2025
2024 letter | Zhengdong
1 Jan 2025
Things we learned about LLMs in 2024
1 Jan 2025
Building Machine Learning Systems for a Trillion Trillion Floating Point Operations
1 Jan 2025
Visualizing transformers and attention | Talk for TNG Big Tech Day '24
29 Dec 2024
The Octalysis Framework for Gamification & Behavioral Design
29 Dec 2024
You could have designed state of the art positional encoding
29 Dec 2024
GPU Glossary
29 Dec 2024
Can we control AI?
29 Dec 2024
Building effective agents \ Anthropic
29 Dec 2024