Poster "llm-based agents" Papers

11 papers found

Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents

Qizheng Zhang, Michael Wornow, Kunle Olukotun

NEURIPS 2025arXiv:2506.14852
7
citations

AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents

Ke Yang, Yao Liu, Sapana Chaudhary et al.

ICLR 2025arXiv:2410.13825
69
citations

ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems

Xiangyuan Xue, Zeyu Lu, Di Huang et al.

CVPR 2025arXiv:2409.01392
15
citations

MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants

Zeyu Zhang, Quanyu Dai, Luyu Chen et al.

NEURIPS 2025arXiv:2409.20163
14
citations

OpenHands: An Open Platform for AI Software Developers as Generalist Agents

Xingyao Wang, Boxuan Li, Yufan Song et al.

ICLR 2025arXiv:2407.16741
387
citations

PhysGym: Benchmarking LLMs in Interactive Physics Discovery with Controlled Priors

Yimeng Chen, Piotr Piękos, Mateusz Ostaszewski et al.

NEURIPS 2025arXiv:2507.15550
2
citations

Scaling Autonomous Agents via Automatic Reward Modeling And Planning

Zhenfang Chen, Delin Chen, Rui Sun et al.

ICLR 2025arXiv:2502.12130
15
citations

SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents

Yifu Guo, Jiaye Lin, Huacan Wang et al.

NEURIPS 2025arXiv:2508.02085
22
citations

SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Ibragim Badertdinov, Alexander Golubev, Maksim Nekrashevich et al.

NEURIPS 2025arXiv:2505.20411
33
citations

CompeteAI: Understanding the Competition Dynamics of Large Language Model-based Agents

Qinlin Zhao, Jindong Wang, Yixuan Zhang et al.

ICML 2024arXiv:2310.17512
56
citations

InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks

Xueyu Hu, Ziyu Zhao, Shuang Wei et al.

ICML 2024arXiv:2401.05507
98
citations