
Introduction:
Some AI alignment researchers, including Neel Nanda, the mechanistic interpretability team lead at Google DeepMind, have proposed[1] a…