Attention from First Principles - 6
DeltaNet: Error-Driven Memory Updates for Linear Attention
Apr 17
•
shashank sane
Code Is Cheap Now. Here’s What Actually Matters.
Orchestration is the new bottleneck. Imagination might be the last one.
Apr 10
•
shashank sane
Attention from First Principles - 5
Gated Linear Attention
Apr 4
•
shashank sane
March 2026
To Invent Is to Choose
Why AI is the new subconscious mind for scientific discovery
Mar 26
•
shashank sane
Attention from First Principles - 4
Linear Attention and the Memory Wall
Mar 22
•
shashank sane
Understanding (RoPE) Rotary Position Embeddings - 2
RoPE in Attention Mechanism and DeepSeek's decoupled RoPE
Mar 3
•
shashank sane
February 2026
Understanding (RoPE) Rotary Position Embeddings - 1
From Llama to DeepSeek, How Rotation Helps Models Remember Order!!
Feb 22
•
shashank sane
Attention from First Principles - 3
Grouped Query Attention (GQA) and Multi-Head Latent Attention (MHLA)
Feb 5
•
shashank sane
January 2026
Attention from First Principles - 2
Multi-Head Attention and Causal Self-Attention
Jan 26
•
shashank sane
Attention from First Principles - 1
The foundations - self attention & scaled dot product attention.
Jan 22
•
shashank sane
The Lattice of Deep Learning - Coming soon!!
About - The Lattice of Deep Learning
Jan 4
•
shashank sane