🌱 Sanzhar B.

Neel Nanda's tutorial

Feb 01, 2026

Here, I will follow Neel Nanda’s tutorial, “How to Become a Mechanistic Interpretability Researcher”.

Keep in mind

Do not just read things. Mech interp is a fundamentally empirical science.

Stage 1: Learning the ropes

I already have a decent understanding of linear algebra and all the mathematical tools needed.

I’ll start by reading about Transformers: Chapter 12 of the Understanding Deep Learning textbook and Anthropic’s paper, A Mathematical Framework for Transformer Circuits.
Then I’ll code a simple Transformer like GPT-2 from scratch. He suggests using ARENA Chapter 1.1 for this.
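
As a warm-up for the from-scratch Transformer, here is a minimal sketch of a single causal self-attention head in NumPy. It is not the ARENA implementation or GPT-2's actual code: the function name `causal_self_attention`, the weight names (`W_Q`, `W_K`, `W_V`, `W_O`) and the tiny dimensions are my own assumptions, and it leaves out multiple heads, biases, layer norm, and the MLP.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, W_Q, W_K, W_V, W_O):
    """One attention head (hypothetical helper for illustration).

    x: (seq, d_model); W_Q, W_K, W_V: (d_model, d_head); W_O: (d_head, d_model).
    Returns the head's output (seq, d_model) and the attention pattern (seq, seq).
    """
    q, k, v = x @ W_Q, x @ W_K, x @ W_V
    d_head = W_Q.shape[1]
    scores = q @ k.T / np.sqrt(d_head)
    seq = x.shape[0]
    # True strictly above the diagonal: positions in the future of each query.
    future = np.triu(np.ones((seq, seq), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    pattern = softmax(scores, axis=-1)
    return pattern @ v @ W_O, pattern

# Toy sizes and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
seq, d_model, d_head = 5, 16, 4
x = rng.normal(size=(seq, d_model))
W_Q, W_K, W_V = (rng.normal(size=(d_model, d_head)) * 0.1 for _ in range(3))
W_O = rng.normal(size=(d_head, d_model)) * 0.1

out, pattern = causal_self_attention(x, W_Q, W_K, W_V, W_O)
print(out.shape)            # shape of the head's output
print(np.round(pattern, 2)) # lower-triangular attention pattern
```

The attention pattern is the object I expect to stare at most in interpretability work, so the sketch returns it alongside the output.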

Refer to Ferrando et al.
Code activation patching yourself
Train linear probes
Use SAEs (sparse autoencoders)
Look at max activating dataset examples
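
To make the activation patching item concrete, here is a rough sketch on a toy two-layer network rather than a real Transformer. Everything in it is an assumption made for illustration: the random weights `W1` and `W2`, the `forward` helper with its `patch` argument, and the "clean" and "corrupted" inputs. The idea is to run the model on the corrupted input, overwrite one hidden activation with its value from the clean run, and see how much the output moves toward the clean output.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))  # input -> hidden (toy weights)
W2 = rng.normal(size=(4,))    # hidden -> scalar output (toy weights)

def forward(x, patch=None):
    """Toy model. `patch` is an optional (neuron_index, value) pair that
    overwrites one hidden activation before the readout."""
    hidden = np.maximum(x @ W1, 0.0)  # ReLU hidden layer
    if patch is not None:
        idx, value = patch
        hidden = hidden.copy()
        hidden[idx] = value
    return hidden @ W2, hidden

clean_x = np.array([1.0, 0.5, -1.0])
corrupt_x = np.array([-1.0, 0.5, 1.0])

# 1. Cache activations from the clean run.
clean_out, clean_hidden = forward(clean_x)
# 2. Baseline: the corrupted run with no intervention.
corrupt_out, _ = forward(corrupt_x)

# 3. Patch each hidden neuron in turn and record how far the output moves.
effects = []
for i in range(W1.shape[1]):
    patched_out, _ = forward(corrupt_x, patch=(i, clean_hidden[i]))
    effects.append(patched_out - corrupt_out)
effects = np.array(effects)

for i, effect in enumerate(effects):
    print(f"neuron {i}: patching shifts the output by {effect:+.3f}")
```

In this toy the per-neuron effects add up exactly to the clean-minus-corrupted output gap, but only because the readout is linear in the hidden layer. In a real model the patched activation passes through later nonlinear layers, so the effects generally do not sum so neatly.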
