"large language models" Papers

986 papers found • Page 10 of 20

Omni-MATH: A Universal Olympiad Level Mathematic Benchmark for Large Language Models

Bofei Gao, Feifan Song, Zhe Yang et al.

ICLR 2025 • arXiv:2410.07985 • 149 citations

On Effects of Steering Latent Representation for Large Language Model Unlearning

Huu-Tien Dang, Tin Pham, Hoang Thanh-Tung et al.

AAAI 2025 • paper • arXiv:2408.06223

One Filters All: A Generalist Filter For State Estimation

Shiqi Liu, Wenhan Cao, Chang Liu et al.

NEURIPS 2025 • arXiv:2509.20051 • 2 citations

One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models

Yutao Zhu, Zhaoheng Huang, Zhicheng Dou et al.

AAAI 2025 • paper • arXiv:2405.19670 • 9 citations

On Large Language Model Continual Unlearning

Chongyang Gao, Lixu Wang, Kaize Ding et al.

ICLR 2025 • arXiv:2407.10223 • 30 citations

Online Mixture of Experts: No-Regret Learning for Optimal Collective Decision-Making

Larkin Liu, Jalal Etesami

NEURIPS 2025 • arXiv:2510.21788

Online Preference Alignment for Language Models via Count-based Exploration

Chenjia Bai, Yang Zhang, Shuang Qiu et al.

ICLR 2025 • arXiv:2501.12735 • 20 citations

On LLM Knowledge Distillation - A Comparison between Forward KL and Reverse KL

Yihan Cao, Yanbin Kang

ICLR 2025

On Speeding Up Language Model Evaluation

Jin Zhou, Christian Belardi, Ruihan Wu et al.

ICLR 2025 • arXiv:2407.06172 • 6 citations

On the Crucial Role of Initialization for Matrix Factorization

Bingcong Li, Liang Zhang, Aryan Mokhtari et al.

ICLR 2025 • arXiv:2410.18965 • 11 citations

On the Role of Attention Heads in Large Language Model Safety

Zhenhong Zhou, Haiyang Yu, Xinghua Zhang et al.

ICLR 2025 • arXiv:2410.13708 • 43 citations

On the self-verification limitations of large language models on reasoning and planning tasks

Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati

ICLR 2025 • arXiv:2402.08115 • 109 citations

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Javier Rando, Tony Wang, Stewart Slocum et al.

ICLR 2025 • arXiv:2307.15217 • 750 citations

OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?

Junjielong Xu, Qinan Zhang, Zhiqing Zhong et al.

ICLR 2025 • 21 citations

Open-Source vs Close-Source: The Context Utilization Challenge

Litu Ou

ICLR 2025

OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling

Zhicheng Yang, Yiwei Wang, Yinya Huang et al.

ICLR 2025 • arXiv:2407.09887 • 31 citations

Optimization Inspired Few-Shot Adaptation for Large Language Models

Boyan Gao, Xin Wang, Yibo Yang et al.

NEURIPS 2025 • spotlight • arXiv:2505.19107

OSDA Agent: Leveraging Large Language Models for De Novo Design of Organic Structure Directing Agents

Zhaolin Hu, Yixiao Zhou, Zhongan Wang et al.

ICLR 2025 • 6 citations

Overfill: Two-Stage Models for Efficient Language Model Decoding

Woojeong Kim, Junxiong Wang, Jing Nathan Yan et al.

COLM 2025 • paper • arXiv:2508.08446

PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms

Yilong Li, Jingyu Liu, Hao Zhang et al.

ICLR 2025 • arXiv:2410.05315 • 7 citations

PANORAMA: A Dataset and Benchmarks Capturing Decision Trails and Rationales in Patent Examination

Hyunseung Lim, Sooyohn Nam, Sungmin Na et al.

NEURIPS 2025 • arXiv:2510.24774

ParamΔ for Direct Mixing: Post-Train Large Language Model At Zero Cost

Sheng Cao, Mingrui Wu, Karthik Prasad et al.

ICLR 2025

Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization

Zhanfeng Mo, Long-Kai Huang, Sinno Jialin Pan

ICLR 2025 • 13 citations

ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation

Pengcheng Huang, Zhenghao Liu, Yukun Yan et al.

NEURIPS 2025 • arXiv:2502.15543 • 4 citations

Pareto Prompt Optimization

Guang Zhao, Byung-Jun Yoon, Gilchan Park et al.

ICLR 2025 • 1 citation

PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks

Matthew Chang, Gunjan Chhablani, Alexander Clegg et al.

ICLR 2025 • oral • arXiv:2411.00081 • 50 citations

Passing the Driving Knowledge Test

Maolin Wei, Wanzhou Liu, Eshed Ohn-Bar

ICCV 2025 • arXiv:2508.21824 • 2 citations

Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos

Weifeng Lin, Xinyu Wei, Ruichuan An et al.

NEURIPS 2025 • arXiv:2506.05302 • 32 citations

PermLLM: Learnable Channel Permutation for N:M Sparse Large Language Models

Lancheng Zou, Shuo Yin, Zehua Pei et al.

NEURIPS 2025

PersoNo: Personalised Notification Urgency Classifier in Mixed Reality

Jingyao Zheng, Haodi Weng, Xian Wang et al.

ISMAR 2025 • paper • arXiv:2508.19622 • 1 citation

Perturbation-Restrained Sequential Model Editing

Jun-Yu Ma, Hong Wang, Hao-Xiang Xu et al.

ICLR 2025 • arXiv:2405.16821 • 20 citations

PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models

Shi Qiu, Shaoyang Guo, Zhuo-Yang Song et al.

NEURIPS 2025 • arXiv:2504.16074 • 30 citations

PICASO: Permutation-Invariant Context Composition with State Space Models

Tian Yu Liu, Alessandro Achille, Matthew Trager et al.

ICLR 2025 • oral • arXiv:2502.17605

Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs

Itay Itzhak, Yonatan Belinkov, Gabriel Stanovsky

COLM 2025 • paper • arXiv:2507.07186 • 3 citations

PlanU: Large Language Model Reasoning through Planning under Uncertainty

Ziwei Deng, Mian Deng, Chenjing Liang et al.

NEURIPS 2025 • arXiv:2510.18442

Plug, Play, and Generalize: Length Extrapolation with Pointer-Augmented Neural Memory

Svetha Venkatesh, Kien Do, Hung Le et al.

ICLR 2025

PokerBench: Training Large Language Models to Become Professional Poker Players

Richard Zhuang, Akshat Gupta, Richard Yang et al.

AAAI 2025 • paper • arXiv:2501.08328 • 8 citations

PolarQuant: Leveraging Polar Transformation for Key Cache Quantization and Decoding Acceleration

Songhao Wu, Ang Lv, Xiao Feng et al.

NEURIPS 2025

Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models

Zhijian Zhuo, Ya Wang, Yutao Zeng et al.

ICLR 2025 • arXiv:2411.03884 • 6 citations

PortLLM: Personalizing Evolving Large Language Models with Training-Free and Portable Model Patches

Rana Muhammad Shahroz Khan, Pingzhi Li, Sukwon Yun et al.

ICLR 2025 • arXiv:2410.10870 • 3 citations

Predictable Scale (Part II) --- Farseer: A Refined Scaling Law in LLMs

Houyi Li, Wenzhen Zheng, Qiufeng Wang et al.

NEURIPS 2025 • spotlight

Preference-driven Knowledge Distillation for Few-shot Node Classification

Xing Wei, Chunchun Chen, Rui Fan et al.

NEURIPS 2025 • arXiv:2510.10116

Preference Optimization for Reasoning with Pseudo Feedback

Fangkai Jiao, Geyang Guo, Xingxing Zhang et al.

ICLR 2025 • arXiv:2411.16345 • 35 citations

Pretrained Hybrids with MAD Skills

Nicholas Roberts, Samuel Guo, Zhiqi Gao et al.

COLM 2025 • paper

Pre-trained Large Language Models Learn to Predict Hidden Markov Models In-context

Yijia Dai, Zhaolin Gao, Yahya Sattar et al.

NEURIPS 2025

PRIMT: Preference-based Reinforcement Learning with Multimodal Feedback and Trajectory Synthesis from Foundation Models

Ruiqi Wang, Dezhong Zhao, Ziqin Yuan et al.

NEURIPS 2025 • oral • arXiv:2509.15607

Private Training Large-scale Models with Efficient DP-SGD

Liangyu Wang, Junxiao Wang, Jie Ren et al.

NEURIPS 2025

ProAdvPrompter: A Two-Stage Journey to Effective Adversarial Prompting for LLMs

Hao Di, Tong He, Haishan Ye et al.

ICLR 2025 • 2 citations

Probabilistic Reasoning with LLMs for Privacy Risk Estimation

Jonathan Zheng, Alan Ritter, Sauvik Das et al.

NEURIPS 2025

Probabilistic Token Alignment for Large Language Model Fusion

Runjia Zeng, James Liang, Cheng Han et al.

NEURIPS 2025 • arXiv:2509.17276 • 2 citations