Jiaming Ji
9 papers, 896 total citations
papers (9)
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset (NeurIPS 2023; arXiv; 760 citations)
Constrained Update Projection Approach to Safe Policy Optimization (NeurIPS 2022; arXiv; 74 citations)
SafeDreamer: Safe Reinforcement Learning with World Models (ICLR 2024; arXiv; 37 citations)
Safe RLHF-V: Safe Reinforcement Learning from Multi-modal Human Feedback (NeurIPS 2025; arXiv; 9 citations)
SAE-V: Interpreting Multimodal Models for Enhanced Alignment (ICML 2025; arXiv; 8 citations)
Generative RLHF-V: Learning Principles from Multi-modal Human Preference (NeurIPS 2025; arXiv; 7 citations)
InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback (NeurIPS 2025; arXiv; 1 citation)
Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark (NeurIPS 2023; 0 citations)
VOCE: Variational Optimization with Conservative Estimation for Offline Safe Reinforcement Learning (NeurIPS 2023; 0 citations)