"monte carlo tree search" Papers

38 papers found

AFlow: Automating Agentic Workflow Generation

Jiayi Zhang, Jinyu Xiang, Zhaoyang Yu et al.

ICLR 2025, arXiv:2410.10762
153 citations

AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise

Dhruv Agarwal, Bodhisattwa Prasad Majumder, Reece Adamson et al.

NEURIPS 2025 (oral), arXiv:2507.00310
3 citations

CARTS: Advancing Neural Theorem Proving with Diversified Tactic Calibration and Bias-Resistant Tree Search

Xiao-Wen Yang, Zhi Zhou, Haiming Wang et al.

ICLR 2025
4 citations

Certifying Language Model Robustness with Fuzzed Randomized Smoothing: An Efficient Defense Against Backdoor Attacks

Bowei He, Lihao Yin, Huiling Zhen et al.

ICLR 2025, arXiv:2502.06892
4 citations

CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models

Shengzhuang Chen, Yikai Liao, Xiaoxiao Sun et al.

ICLR 2025, arXiv:2503.04655
1 citation

DyFo: A Training-Free Dynamic Focus Visual Search for Enhancing LMMs in Fine-Grained Visual Understanding

Geng Li, Jinglin Xu, Yunzhen Zhao et al.

CVPR 2025 (highlight), arXiv:2504.14920
29 citations

ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning

Xiao Yu, Baolin Peng, Vineeth Vajipey et al.

ICLR 2025, arXiv:2410.02052
37 citations

Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs

Yifan Shen, Yuanzhe Liu, Jingyuan Zhu et al.

NEURIPS 2025, arXiv:2506.21656
5 citations

Grammar Reinforcement Learning: path and cycle counting in graphs with a Context-Free Grammar and Transformer approach

Jason Piquenot, Maxime Berar, Romain Raveaux et al.

ICLR 2025

Graph-based Symbolic Regression with Invariance and Constraint Encoding

Ziyu Xiang, Kenna Ashen, Xiaofeng Qian et al.

NEURIPS 2025

HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning

Zhi Jing, Siyuan Yang, Jicong Ao et al.

NEURIPS 2025, arXiv:2507.00833
6 citations

Implicit Search via Discrete Diffusion: A Study on Chess

Jiacheng Ye, Zhenyu Wu, Jiahui Gao et al.

ICLR 2025, arXiv:2502.19805
14 citations

Improving Monte Carlo Tree Search for Symbolic Regression

Zhengyao Huang, Daniel Huang, Tiannan Xiao et al.

NEURIPS 2025, arXiv:2509.15929

MALinZero: Efficient Low-Dimensional Search for Mastering Complex Multi-Agent Planning

Sizhe Tang, Jiayu Chen, Tian Lan

NEURIPS 2025, arXiv:2511.06142
4 citations

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solver

Zhenting Qi, Mingyuan MA, Jiahang Xu et al.

ICLR 2025, arXiv:2408.06195
129 citations

PlanU: Large Language Model Reasoning through Planning under Uncertainty

Ziwei Deng, Mian Deng, Chenjing Liang et al.

NEURIPS 2025, arXiv:2510.18442

RF-Agent: Automated Reward Function Design via Language Agent Tree Search

Ning Gao, Xiuhui Zhang, Xingyu Jiang et al.

NEURIPS 2025 (spotlight)

Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction

Baiting Luo, Ava Pettet, Aron Laszka et al.

ICLR 2025 (oral), arXiv:2502.21186
3 citations

SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents

Yifu Guo, Jiaye Lin, Huacan Wang et al.

NEURIPS 2025, arXiv:2508.02085
22 citations

SEEA-R1: Tree-Structured Reinforcement Fine-Tuning for Self-Evolving Embodied Agents

Wanxin Tian, Shijie Zhang, Kevin Zhang et al.

NEURIPS 2025, arXiv:2506.21669
6 citations

SIGMA: Refining Large Language Model Reasoning via Sibling-Guided Monte Carlo Augmentation

Yanwei Ren, Haotian Zhang, Fuxiang Wu et al.

NEURIPS 2025 (spotlight), arXiv:2506.06470

SolverLLM: Leveraging Test-Time Scaling for Optimization Problem via LLM-Guided Search

Dong Li, Xujiang Zhao, Linlin Yu et al.

NEURIPS 2025, arXiv:2510.16916
1 citation

Strength Estimation and Human-Like Strength Adjustment in Games

Chun Jung Chen, Chung-Chin Shih, Ti-Rong Wu

ICLR 2025, arXiv:2502.17109
1 citation

Threshold UCT: Cost-Constrained Monte Carlo Tree Search with Pareto Curves

Martin Kurečka, Václav Nevyhoštěný, Petr Novotný et al.

AAAI 2025 (paper), arXiv:2412.13962
1 citation

Uncertainty-Guided Exploration for Efficient AlphaZero Training

Scott Cheng, Meng-Yu Tsai, Ding-Yong Hong et al.

NEURIPS 2025

Understanding Methods for Scalable MCTS

Will Knipe

ICLR 2025

WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration

Yao Zhang, Zijian Ma, Yunpu Ma et al.

AAAI 2025 (paper), arXiv:2408.15978
83 citations

Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search

Yuichi Inoue, Kou Misaki, Yuki Imajuku et al.

NEURIPS 2025 (spotlight), arXiv:2503.04412
24 citations

A Bayesian Approach to Online Planning

Nir Greshler, David Ben Eli, Carmel Rabinovitz et al.

ICML 2024, arXiv:2406.02103
1 citation

DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)

Zongxin Yang, Guikun Chen, Xiaodi Li et al.

ICML 2024 (oral), arXiv:2401.08392
64 citations

Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning

Yizhe Huang, Anji Liu, Fanqi Kong et al.

ICML 2024, arXiv:2406.08002
5 citations

Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models

Andy Zhou, Kai Yan, Michal Shlapentokh-Rothman et al.

ICML 2024

Monte Carlo Tree Search in the Presence of Transition Uncertainty

Farnaz Kohankhaki, Kiarash Aghakasiri, Hongming Zhang et al.

AAAI 2024 (paper), arXiv:2312.11348
3 citations

Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing

Amutheezan Sivagnanam, Ava Pettet, Hunter Lee et al.

ICML 2024, arXiv:2405.13205
7 citations

Position: Rethinking Post-Hoc Search-Based Neural Approaches for Solving Large-Scale Traveling Salesman Problems

Yifan Xia, Xianliang Yang, Zichuan Liu et al.

ICML 2024, arXiv:2406.03503
22 citations

Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization

Liam Schramm, Abdeslam Boularias

ICML 2024, arXiv:2407.05511
1 citation

Sample-and-Bound for Non-convex Optimization

Yaoguang Zhai, Zhizhen Qin, Sicun Gao

AAAI 2024 (paper), arXiv:2401.04812
1 citation

Scalable Safe Policy Improvement for Factored Multi-Agent MDPs

Federico Bianchi, Edoardo Zorzi, Alberto Castellini et al.

ICML 2024