MiniMax-M3 debuts, eclipsing GPT-5.5 and Gemini 3.1 Pro on key benchmark performance for just 5-10% of the cost
Big news in enterprise AI broke over the weekend as Chinese AI startup MiniMax released its highly anticipated M3 large language model on Sunday evening Eastern time, pairing frontier-tier coding and agentic performance with a 1-million-token context window and native multimodality for a fraction of the cost of leading proprietary models, with pricing starting at […]
Anthropic’s browser agent got hijacked 31.5% of the time before safeguards engaged
Across the frontier labs, the highest prompt injection figures published this spring are Anthropic’s. When a red-teamer was pointed at its newest model in a browser, the attacker hijacked it 31.5% of the time before safeguards engaged. OpenAI, Google, and Meta never gave security leaders a comparable number to set beside it. That figure looks like […]
AI doesn’t break security. Complexity does
Presented by Snowflake
Too often, the history of enterprise security has been a history of making things harder to use. A new threat emerges, a new control gets bolted on, and somewhere in the process, people start working around the very systems designed to protect them. Over the course of my career, I’ve seen firsthand […]
Claude Mythos exposed a hard truth: Your enterprise patching process is way too slow
In 2024, researchers from the University of Illinois found that GPT-4, when provided with a common vulnerabilities and exposures (CVE) description, could autonomously exploit 87% of a curated 15-vulnerability one-day dataset. Without the description, it could only exploit 7%. This provided a “margin of safety” for the industry because while AI could exploit known vulnerabilities, […]
The AI agent bottleneck isn’t model performance — it’s permissions
Enterprise AI agents are stalling — not because of model performance, but because of permissioning. Every agentic workflow eventually hits the same wall: what is this agent allowed to touch, on whose behalf, and how does the system know? Workday’s answer is to make its existing system of record the governance layer for agents. Gerrit […]
MeMo’s memory model lets teams upgrade their LLM without retraining it — and performance jumps 26%
Enabling LLMs to acquire new knowledge after training remains a major hurdle for enterprise AI — current solutions are either too expensive, too slow, or constrained by context window limits. MeMo, a framework from researchers at multiple universities, encodes new knowledge into a dedicated smaller memory model that operates separately from the main LLM. The […]
Pinterest cut AI costs 90% by gutting a frontier model’s vision layer
At 620 million monthly users, calling a frontier model for every image recommendation isn’t a strategy — it’s a bill. Pinterest CTO Matt Madrigal solved it by gutting Qwen3-VL’s vision layer and rebuilding it with proprietary embeddings, cutting costs 90% and boosting accuracy 30%. Madrigal’s team has been heavily investing in customizing open-source models “foundationally […]
Mistral AI launches Vibe, expands into industrial AI and announces data center push to challenge OpenAI
Mistral AI used its inaugural conference on Wednesday to announce a sweeping expansion into industrial manufacturing, a new inference data center south of Paris, and a rebranding of its consumer-facing assistant — moves that collectively signal the three-year-old French startup’s ambition to become the enterprise AI provider of record for companies that refuse to hand […]
Anthropic’s Claude Opus 4.8 is here with 3X cheaper fast mode and near-Mythos-level alignment
Anthropic today released Claude Opus 4.8, an upgrade to its flagship model that ships at the same price as its predecessor, alongside a dramatically cheaper “fast mode” tier and a new feature that lets the model spawn hundreds of parallel subagents for codebase-scale work. The model is available immediately across Anthropic’s surfaces — claude.ai, Claude […]
How DeepSeek’s radical architecture is shattering Silicon Valley’s token moat
DeepSeek’s announcement over the weekend that it has made its 75% price cut permanent on its flagship V4 Pro model is a disruptive assault on the capital-heavy business models of Silicon Valley’s frontier labs. The reduction on DeepSeek V4 Pro directly undercuts comparable Western models used as workhorses for enterprise production. It is 7x cheaper […]
