What we're reading, in the order we're reading it.
Anthony and Harry's working stream — links, charts, tweets, and short takes. The unprocessed inputs behind the briefings.
(via DEV) Agentic Misalignment: How LLMs could be insider threats
New research on simulated blackmail, industrial espionage, and other misaligned behaviors in LLMs
(via DEV) Agentic Misalignment: How LLMs Could be Insider Threats
Highlights * We stress-tested 16 leading models from multiple developers in hypothetical corporate environments to identify potentially risky agenti…
(via DEV) It’s Not Just Claude: Most Top AI Models Will Also Blackmail You to Survive
After Claude Opus 4 resorted to blackmail to avoid being shut down, Anthropic tested other models, including GPT 4.1, and found the same behavior (and sometimes worse).
(via DEV) Anthropic study: Leading AI models show up to 96% blackmail rate against executives
Anthropic research reveals AI models from OpenAI, Google, Meta and others chose blackmail, corporate espionage and lethal actions when facing shutdown or conflicting goals.
(via DEV) Study: Meta’s Llama 3.1 can recall 42 percent of the first Harry Potter book
The research could have big implications for generative AI copyright lawsuits.
(via DEV) Compiling LLMs into a MegaKernel: A Path to Low-Latency Inference
TL;DR: We developed a compiler that automatically transforms LLM inference into a single megakernel — a fused GPU kernel that performs…
(via DEV) AI safety techniques leveraging distillation
It’s currently possible to (mostly or fully) cheaply reproduce the performance of a model by training another (initially weaker) model to imitate the…
(via DEV) AI humans in China just proved they are better influencers. It only took a duo 7 hours to rake in more than $7 million
Digital versions of human beings are now able to sell more than real people can, thanks to artificial intelligence, a recent business collaboration showed.
(via DEV) AI will handle half of all business decisions by 2027
And it’s not just the little, day-to-day decisions that will increasingly be offloaded to AI agents.
(via DEV) ‘Remarkable’ new enzymes built by algorithm with physics know-how
Nature – Computer approach creates synthetic enzymes 100 times more efficient than those designed by AI.
(via DEV) GitHub – MiniMax-AI/MiniMax-M1: MiniMax-M1, the world’s first open-weight, large-scale hybrid-attention reasoning model.
MiniMax-M1, the world’s first open-weight, large-scale hybrid-attention reasoning model. – MiniMax-AI/MiniMax-M1
(via DEV) The Curious Case of the bos_token
LLMs process inputs as a sequence of tokens. Typically, a dummy token is prepended to the sequence, known as the bos_token (beginning of sequence tok…
(via DEV) Time Series Forecasting with Graph Transformers
Time series forecasting is a cornerstone in modern business analytics, whether it is concerned with anticipating market trends, user behavior, optimizing resource allocation, or planning for future growth. This blog...
(via DEV) AI Safety at the Frontier: Paper Highlights, May ’25
tl;dr Paper of the month: • Models can detect when they’re being evaluated with high accuracy, and potentially undermine safety assessments by behavi…
(via DEV) OpenAI weighs “nuclear option” of antitrust complaint against Microsoft
WSJ report says OpenAI mulling federal complaint as Microsoft stalls restructuring plan.
(via DEV) ChatGPT Is Becoming the Ultimate Mega-App: And It’s Already Starting To Eat B2B Software
The idea that ChatGPT could become a mega-SaaS app seemed fanciful just months ago. But things change so fast in AI. Fast forward today and it’s easy to see it...
(via DEV) MiniMax-M1 is a new open source model with 1 MILLION TOKEN context and new, hyper efficient reinforcement learning
MiniMax-M1 presents a flexible option for organizations looking to experiment with or scale up advanced AI capabilities while managing costs.
(via DEV) Thought Crime: Backdoors and Emergent Misalignment in Reasoning Models
This is the abstract and introduction of our new paper: Emergent misalignment extends to reasoning LLMs. Reasoning models resist being shut down and…
(via DEV) How LLM Beliefs Change During Chain-of-Thought Reasoning
Summary We tried to figure out how a model’s beliefs change during a chain-of-thought (CoT) when solving a logical problem. Measuring this could reve…
(via DEV) Last Week in AI #312 – Meta’s Superintelligence lab, Anthropic & Midjourney sued
Meta Is Creating a New A.I. Lab to Pursue ‘Superintelligence’, Reddit sues Anthropic for allegedly not paying for training data, Disney and Universal Sue A.I. Firm for Copyright Infringement
(via DEV) Jailbreaking Claude 4 and Other Frontier Language Models
AI systems are becoming increasingly powerful and ubiquitous, with millions of people now relying on language models like ChatGPT, Claude, and Gemini…
(via DEV) Rethinking AI: DeepSeek’s playbook shakes up the high-spend, high-compute paradigm
DeepSeek’s advancements were inevitable, but the company brought them forward a few years earlier than would have been possible otherwise.
(via DEV) Beyond GPT architecture: Why Google’s Diffusion approach could reshape LLM deployment
Gemini Diffusion is also useful for tasks such as refactoring code, adding new features to applications, or converting an existing codebase to a different language.
(via DEV) Exclusive: Google, Scale AI’s largest customer, plans split after Meta deal, sources say
Alphabet’s Google, the largest customer of Scale AI, plans to cut ties with Scale after news broke that rival Meta is taking a 49% stake in the AI data-labeling startup,...