"robot manipulation" Papers
26 papers found
AdaManip: Adaptive Articulated Object Manipulation Environments and Policy Learning
Yuanfei Wang, Xiaojie Zhang, Ruihai Wu et al.
BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models
Peiyan Li, Yixiang Chen, Hongtao Wu et al.
GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning
Kelin Yu, Sheng Zhang, Harshit Soora et al.
IRASim: A Fine-Grained World Model for Robot Manipulation
Fangqi Zhu, Hongtao Wu, Song Guo et al.
Language Guided Skill Discovery
Seungeun Rho, Laura Smith, Tianyu Li et al.
Latent Action Pretraining from Videos
Seonghyeon Ye, Joel Jang, Byeongguk Jeon et al.
Moto: Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos
Yi Chen, Yuying Ge, Weiliang Tang et al.
On-Device Diffusion Transformer Policy for Efficient Robot Manipulation
Yiming Wu, Huan Wang, Zhenghao Chen et al.
PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?
Atharva Gundawar, Som Sagar, Ransalu Senanayake
Provable Ordering and Continuity in Vision-Language Pretraining for Generalizable Embodied Agents
Zhizhen Zhang, Lei Zhu, Zhen Fang et al.
Real-World Reinforcement Learning of Active Perception Behaviors
Edward Hu, Jie Wang, Xingfang Yuan et al.
ReGen: Generative Robot Simulation via Inverse Design
Peter (Phat) Nguyen, Johnson (Tsun-Hsuan) Wang, Zhang-Wei Hong et al.
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics
Chan Hee Song, Valts Blukis, Jonathan Tremblay et al.
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning
Chi-Pin Huang, Yueh-Hua Wu, Min-Hung Chen et al.
Token Bottleneck: One Token to Remember Dynamics
Taekyung Kim, Dongyoon Han, Byeongho Heo et al.
Tree-Guided Diffusion Planner
Hyeonseong Jeon, Cheolhong Min, Jaesik Park
VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
Yichao Shen, Fangyun Wei, Zhiying Du et al.
VLAS: Vision-Language-Action Model with Speech Instructions for Customized Robot Manipulation
Wei Zhao, Pengxiang Ding, Zhang Min et al.
Weakly-Supervised Learning of Dense Functional Correspondences
Stefan Stojanov, Linan Zhao, Yunzhi Zhang et al.
What Matters in Learning from Large-Scale Datasets for Robot Manipulation
Vaibhav Saxena, Matthew Bronars, Nadun Ranawaka Arachchige et al.
What's the Move? Hybrid Imitation Learning via Salient Points
Priya Sundaresan, Hengyuan Hu, Quan Vuong et al.
Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images
Chuanrui Zhang, Yonggen Ling, Minglei Lu et al.
LINGO-Space: Language-Conditioned Incremental Grounding for Space
Dohyun Kim, Nayoung Oh, Deokmin Hwang et al.
Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning
Jiachen Li, Qiaozi Gao, Michael Johnston et al.
Position: Scaling Simulation is Neither Necessary Nor Sufficient for In-the-Wild Robot Manipulation
Homanga Bharadhwaj
Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation
Yuanchen Ju, Kaizhe Hu, Guowei Zhang et al.