Why observable AI is the missing SRE layer enterprises need for reliable LLMs
As AI systems enter production, reliability and governance can’t depend on wishful thinking. Here’s how observability turns large language models (LLMs) into auditable, trustworthy enterprise systems. Why observability secures the future of enterprise AI: The enterprise race to deploy LLM systems mirrors the early days of cloud adoption. Executives love the promise; compliance demands accountability; […]
Anthropic says it solved the long-running AI agent problem with a new multi-session Claude SDK
Agent memory remains a problem that enterprises want to fix, as agents forget some instructions or conversations the longer they run. Anthropic believes it has solved this issue for its Claude Agent SDK, developing a two-fold solution that allows an agent to work across different context windows. “The core challenge of long-running agents is that […]
Beyond math and coding: New RL framework helps train LLM agents for complex, real-world tasks
Researchers at the University of Science and Technology of China have developed a new reinforcement learning (RL) framework that helps train large language models (LLMs) for complex agentic tasks beyond well-defined problems such as math and coding. Their framework, Agent-R1, is compatible with popular RL algorithms and shows considerable improvement on reasoning tasks that require […]
What to be thankful for in AI in 2025
Hello, dear readers. Happy belated Thanksgiving and Black Friday! This year has felt like living inside a permanent DevDay. Every week, some lab drops a new model, a new agent framework, or a new “this changes everything” demo. It’s overwhelming. But it’s also the first year I’ve felt like AI is finally diversifying — not […]
Prompt Security’s Itamar Golan on why generative AI security requires building a category, not a feature
VentureBeat recently sat down (virtually) with Itamar Golan, co-founder and CEO of Prompt Security, to chat through the GenAI security challenges organizations of all sizes face. We talked about shadow AI sprawl, the strategic decisions that led Golan to pursue building a market-leading platform versus competing on features, and a real-world incident that crystallized why […]
Alibaba’s AgentEvolver lifts model performance in tool use by ~30% using synthetic, auto-generated tasks
Researchers at Alibaba’s Tongyi Lab have developed a new framework for self-evolving agents that create their own training data by exploring their application environments. The framework, AgentEvolver, uses the knowledge and reasoning capabilities of large language models for autonomous learning, addressing the high costs and manual effort typically required to gather task-specific datasets. Experiments show […]
A weekend ‘vibe code’ hack by Andrej Karpathy quietly sketches the missing layer of enterprise AI orchestration
This weekend, Andrej Karpathy, the former director of AI at Tesla and a founding member of OpenAI, decided he wanted to read a book. But he did not want to read it alone. He wanted to read it accompanied by a committee of artificial intelligences, each offering its own perspective, critiquing the others, and eventually […]
Black Forest Labs launches Flux.2 AI image models to challenge Nano Banana Pro and Midjourney
It’s not just Google’s Gemini 3, Nano Banana Pro, and Anthropic’s Claude Opus 4.5 we have to be thankful for this year around the Thanksgiving holiday here in the U.S. No, today the German AI startup Black Forest Labs released FLUX.2, a new image generation and editing system complete with four different models designed to […]
OpenAI now lets enterprises choose where to host their data
OpenAI expanded its data residency regions for ChatGPT and its API, giving enterprise users the option to store and process their data closest to their business operations and better comply with local regulations. This expansion removes one of the biggest compliance blockers preventing global enterprises from deploying ChatGPT at scale. Data residency, often an overlooked piece […]
Microsoft’s Fara-7B is a computer-use AI agent that rivals GPT-4o and works directly on your PC
Microsoft has introduced Fara-7B, a new 7-billion-parameter model designed to act as a Computer Use Agent (CUA) capable of performing complex tasks directly on a user’s device. Fara-7B sets new state-of-the-art results for its size, providing a way to build AI agents that don’t rely on massive, cloud-dependent models and can run on compact […]
