
Stanford Researchers Introduced MedAgentBench: A Real-World Benchmark for Healthcare AI Agents
A team of Stanford University researchers has released MedAgentBench, a new benchmark suite designed to evaluate large language model (LLM) agents in healthcare contexts. Unlike prior question-answering datasets, MedAgentBench provides a virtual electronic health record […]