"instruction following" Papers

29 papers found

Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models

Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux et al.

ICLR 2025 · arXiv:2410.18252
43 citations

Checklists Are Better Than Reward Models For Aligning Language Models

Vijay Viswanathan, Yanchao Sun, Xiang Kong et al.

NeurIPS 2025 (Spotlight) · arXiv:2507.18624
32 citations

CoC-VLA: Delving into Adversarial Domain Transfer for Explainable Autonomous Driving via Chain-of-Causality Visual-Language-Action Model

Dapeng Zhang, Fei Shen, Rui Zhao et al.

NeurIPS 2025 (Oral) · arXiv:2511.19914

Fixing It in Post: A Comparative Study of LLM Post-Training Data Quality and Model Performance

Aladin Djuhera, Swanand Kadhe, Syed Zawad et al.

NeurIPS 2025 (Spotlight) · arXiv:2506.06522

Generalizing Verifiable Instruction Following

Valentina Pyatkin, Saumya Malik, Victoria Graf et al.

NeurIPS 2025 · arXiv:2507.02833
38 citations

HalLoc: Token-level Localization of Hallucinations for Vision Language Models

Eunkyu Park, Minyeong Kim, Gunhee Kim

CVPR 2025 · arXiv:2506.10286
3 citations

Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models

Yulei Qin, Gang Li, Zongyi Li et al.

NeurIPS 2025 · arXiv:2506.01413
5 citations

Is In-Context Learning Sufficient for Instruction Following in LLMs?

Hao Zhao, Maksym Andriushchenko, Francesco Croce et al.

ICLR 2025 · arXiv:2405.19874
22 citations

Language-Image Models with 3D Understanding

Jang Hyun Cho, Boris Ivanovic, Yulong Cao et al.

ICLR 2025 · arXiv:2405.03685
27 citations

Language Imbalance Driven Rewarding for Multilingual Self-improving

Wen Yang, Junhong Wu, Chen Wang et al.

ICLR 2025 · arXiv:2410.08964
23 citations

Language Models Can Predict Their Own Behavior

Dhananjay Ashok, Jonathan May

NeurIPS 2025 · arXiv:2502.13329
5 citations

Learning to Instruct for Visual Instruction Tuning

Zhihan Zhou, Feng Hong, Jiaan Luo et al.

NeurIPS 2025 · arXiv:2503.22215
3 citations

Lookahead Routing for Large Language Models

Canbin Huang, Tianyuan Shi, Yuhua Zhu et al.

NeurIPS 2025 · arXiv:2510.19506

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Zhangchen Xu, Fengqing Jiang, Luyao Niu et al.

ICLR 2025 · arXiv:2406.08464
276 citations

Param$\Delta$ for Direct Mixing: Post-Train Large Language Model At Zero Cost

Sheng Cao, Mingrui Wu, Karthik Prasad et al.

ICLR 2025

SMoLoRA: Exploring and Defying Dual Catastrophic Forgetting in Continual Visual Instruction Tuning

Ziqi Wang, Chang Che, Qi Wang et al.

ICCV 2025 · arXiv:2411.13949
4 citations

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Jiale Cheng, Xiao Liu, Cunxiang Wang et al.

ICLR 2025 · arXiv:2412.11605
13 citations

Sparta Alignment: Collectively Aligning Multiple Language Models through Combat

Yuru Jiang, Wenxuan Ding, Shangbin Feng et al.

NeurIPS 2025 · arXiv:2506.04721
4 citations

Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking

Benjamin Feuer, Micah Goldblum, Teresa Datta et al.

ICLR 2025 · arXiv:2409.15268
28 citations

Treasure Hunt: Real-time Targeting of the Long Tail using Training-Time Markers

Daniel Dsouza, Julia Kreutzer, Adrien Morisot et al.

NeurIPS 2025 · arXiv:2506.14702
1 citation

Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning

Minheng Ni, Yutao Fan, Lei Zhang et al.

ICLR 2025 · arXiv:2410.03321
20 citations

WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch

Zimu Lu, Yunqiao Yang, Houxing Ren et al.

NeurIPS 2025 (Oral) · arXiv:2505.03733
19 citations

Attention Prompting on Image for Large Vision-Language Models

Runpeng Yu, Weihao Yu, Xinchao Wang

ECCV 2024 · arXiv:2409.17143
28 citations

BAGEL: Bootstrapping Agents by Guiding Exploration with Language

Shikhar Murty, Christopher Manning, Peter Shaw et al.

ICML 2024 · arXiv:2403.08140
29 citations

ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models

Rohan Wadhawan, Hritik Bansal, Kai-Wei Chang et al.

ICML 2024 · arXiv:2401.13311
20 citations

Fool Your (Vision and) Language Model with Embarrassingly Simple Permutations

Yongshuo Zong, Tingyang Yu, Ruchika Chavhan et al.

ICML 2024 · arXiv:2310.01651
27 citations

Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation

JoonHo Lee, Jae Oh Woo, Juree Seok et al.

ICML 2024 · arXiv:2405.06424
3 citations

Towards Learning a Generalist Model for Embodied Navigation

Duo Zheng, Shijia Huang, Lin Zhao et al.

CVPR 2024 (Highlight) · arXiv:2312.02010
118 citations

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

Yunhao Ge, Xiaohui Zeng, Jacob Huffman et al.

CVPR 2024 · arXiv:2404.19752
35 citations