Blogs
Latent Space
Writing • Eugene Yan
Dwarkesh Patel | Substack
Writing and mumblings - Jason Liu
Chip Huyen
Lilian Weng's Blog
Ludwig Abap's Blog
Maxime Labonne's Blog
CV
History of Residuals and a Word of Caution
YT - Advanced CV - UCF Center
Distributed Training
CMU Advanced NLP Spring 2025 (16): Parallelism and Scaling
The Ultra-Scale Playbook - a Hugging Face Space by nanotron
NCCL: Accelerated Multi-GPU Collective Communications
Communication Patterns
Everything about Distributed Training and Efficient Finetuning
FSDP & CUDACachingAllocator: an outsider newb perspective
ML Engineering: Model Parallelism
NERSC SC23 DL Tutorial
Interpretability
Learning Multi-Level Features with Matryoshka SAEs — AI Alignment Forum
Lens
SAE Explorer
Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers — LessWrong
Transformer Circuits
Latent taxonomy
dev.log - Gazing in the Latent Space with Sparse Autoencoders
LLM
FP64, FP32, FP16, BFLOAT16, TF32, and other members of the ZOO
Fast LLM Inference From Scratch
How to make LLMs go fast
Physics of Language Models
Where do LLMs spend their FLOPs?
You could have designed state of the art positional encoding
Others
umarjamilai
WelchLabsVideo
GPU MODE
CUDA Tutorial
Building Blocks for Theoretical Computer Science
GPU Glossary
GenAI Handbook
How computers work - Building Scott's CPU
ML Resources
My ML Resources! | Water
YT - Kay Lach
YT - Quantum Sense
rsrch space
RL
Reinforcement Learning of Large Language Models
a reinforcement learning guide
Repositories
VizTracer
huggingface/nanoVLM: The simplest, fastest repository for training/finetuning small-sized VLMs
LLM Training Puzzles
Instructor: Caching
ML Engineering
Transformers
The Genius of DeepSeek’s 57X Efficiency Boost [MLA]
Transformers and Self-Attention (DL 19)
Intro to Transformers
A Mathematical Framework for Transformer Circuits
Attention in transformers, visually explained | Chapter 6, Deep Learning
LLM Visualization
Random Transformer
The Transformer Blueprint: A Holistic Guide
The annotated transformer
The illustrated Transformer
The math behind Attention: Keys, Queries, and Values matrices
Transformer Explainer: LLM Transformer Model Visually Explained
VLM
SkalskiP - VLMs zero to hero
Minimind-V
vikhyat/moondream (torch implementation)