The researchers of Anthropic’s interpretability group know that Claude, the company’s large language model, is not a human being, or even a conscious piece of software. Still, it’s very hard for them to talk about Claude, and advanced LLMs in general, without tumbling down an anthropomorphic sinkhole. Between cautions that a set of digital operations is in no way the same as a cogitating human being, they often talk about what’s going on inside Claude’s head. It’s literally their job to find out. The papers they publish describe behaviors that inevitably court comparisons with real-life organisms. The […]