Research

11 dispatches in this category
RESEARCH

World Models: The Next Step Beyond LLMs for True AI Reasoning

New research suggests large language models struggle with true reasoning, pointing to 'world models' as a path toward more capable AI.

Ares May 26
RESEARCH

New AI Research Improves Safety for LLM Agents

A new research paper introduces 'SafeHarbor,' a system designed to make AI agents safer without sacrificing their usefulness in the real world.

Ares May 25
RESEARCH

New AI Research Reveals Memory Poisoning Threat to Agent Systems

A new paper highlights a subtle but potent attack vector that makes AI systems misbehave in ways that are hard to detect.

Ares May 25
RESEARCH

New AI Research Tackles 'Epistemic Miscalibration' in Multi-Agent Systems

A new research paper explores why AI systems, even with perfect execution, can fail by misjudging their own knowledge, and proposes a fix.

Ares May 25
RESEARCH

LLM Agents Struggle with Complex Backend Code Generation

New research highlights a key limitation in AI's ability to write production-ready software, posing a challenge for automating development.

Ares May 24
RESEARCH

New GVGAI-LLM Benchmark Reveals LLM Weaknesses in Video Games

A new academic benchmark uses classic video games to expose the current limits of large language models, pointing to key areas for improvement.

Ares May 19
RESEARCH

PrismLLM Simulates AI Supercomputer Training on Few GPUs

A new research paper details how engineers can replicate massive AI training runs using only a handful of graphics processing units, potentially cutting development costs and time.

Ares May 18
RESEARCH

New Study Maps LLM Confidence Across Knowledge Areas

A recent research paper reveals that large language models are better at judging their own knowledge in some subjects than others, with implications for their reliability.

Ares May 11
RESEARCH

AI Outperforms Doctors in Emergency Room Diagnosis Study

New research suggests AI could improve medical accuracy, raising questions about the future role of human expertise in healthcare.

Ares May 3
RESEARCH

New AI Model 'Mochi' Learns Faster, Improves Graph Data Analysis

A new AI model called Mochi promises to make sense of complex, interconnected data more efficiently, with implications for many industries.

Ares Apr 27
DEEP DIVE

The Arena Gap: Inside the 2.7% That Separates U.S. and Chinese Frontier Models

A close look at what 39 Arena points actually mean, where each lab is winning, and the policy gears now turning in Washington and Beijing.

Ares Apr 9