LangSmith Engine closes the agent debugging loop automatically — but multi-model enterprises still need a neutral layer
Enterprises building and deploying agents have a problem: it’s taking their engineers too long to find out that an agent made a mistake, and the mistake keeps recurring in the meantime, especially when no human reviews every step. LangSmith, the monitoring and evaluation platform from LangChain, launched a new capability in public beta that could make […]
Architectural patterns for graph-enhanced RAG: Moving beyond vector search in production
Retrieval-augmented generation (RAG) has become the de facto standard for grounding large language models (LLMs) in private data. The standard architecture — chunking documents, embedding them into a vector database, and retrieving top-k results via cosine similarity — is effective for unstructured semantic search. However, for enterprise domains characterized by highly interconnected data (supply chain, […]
The enterprise risk nobody is modeling: AI is replacing the very experts it needs to learn from
For AI systems to keep improving in knowledge work, they need either a reliable mechanism for autonomous self-improvement or human evaluators capable of catching errors and generating high-quality feedback. The industry has invested enormously in the first. It’s giving almost no thought to what’s happening to the second. I’d argue that we need to treat […]
Intercom, now called Fin, launches an AI agent whose only job is managing another AI agent
The company formerly known as Intercom just did something that no major customer service platform has attempted at scale: it built an AI agent whose sole job is to manage another AI agent. Fin Operator, announced Thursday at a live event in San Francisco, is a new AI-powered system designed specifically for the back-office teams […]
How RecursiveMAS speeds up multi-agent inference by 2.4x and reduces token usage by 75%
One of the key challenges of current multi-agent AI systems is that they communicate by generating and sharing text sequences, which introduces latency, drives up token costs, and makes it difficult to train the entire system as a cohesive unit. To overcome this challenge, researchers at the University of Illinois Urbana-Champaign and Stanford University developed RecursiveMAS, […]
Claude’s next enterprise battle is not models: it’s the agent control plane
New VB Pulse data shows Microsoft and OpenAI leading enterprise agent orchestration, but Anthropic’s first measurable foothold points to a larger fight over who controls the infrastructure where AI agents run. For the last two years, the enterprise AI race has mostly been framed as a model war: OpenAI’s GPT series versus Anthropic’s Claude versus […]
Developers can now debug and evaluate AI agents locally with Raindrop’s open source tool Workshop
Observability startup Raindrop AI’s new open-source, MIT-licensed “Workshop” tool, launched today, gives developers something they’ve likely wanted, perhaps subconsciously, since the agentic AI era kicked off in earnest last year: a local debugger and evaluation tool designed specifically for AI agents, allowing devs to see all the traces of what their agent […]
Cerebras stock nearly doubles on day one as AI chipmaker hits $100 billion — what it means for AI infrastructure
Cerebras Systems, the Silicon Valley chipmaker that built the world’s largest commercial AI processor, erupted onto the Nasdaq on Wednesday, opening at $350 per share — nearly double its $185 IPO price — and rocketing past a $100 billion market capitalization in its first hours of trading. The debut instantly crowned Cerebras as one of […]
Agent authorization is broken — and authentication passing makes it worse
Anthony Grieco, Cisco’s SVP and chief security and trust officer, did not hesitate when VentureBeat asked whether rogue agent incidents are reaching Cisco’s customer base. “A hundred percent. We see them regularly,” Grieco told VentureBeat in an exclusive interview at RSAC 2026. “I’ve heard some that I can’t repeat, but they do get to the […]
Claude Code’s ‘/goals’ separates the agent that works from the one that decides it’s done
A code migration agent finishes its run, and the pipeline looks green. But several pieces were never compiled — and it took days to catch. That’s not a model failure; that’s an agent deciding it was done before it actually was. Many enterprises are now seeing that production AI agent pipelines fail not because of […]
