Michal Valko
Affiliations (1)
Building something new @ Stealth Startup & Inria & MVA - Ex: Llama @AIatMeta Gemini and BYOL @GoogleDeepMind
20 papers
9,033 total citations
papers (20)
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning (NEURIPS 2020, arXiv): 8,033 citations
Nash Learning from Human Feedback (ICML 2024, arXiv): 195 citations
Generalized Preference Optimization: A Unified Approach to Offline Alignment (ICML 2024, arXiv): 150 citations
Broaden Your Views for Self-Supervised Video Learning (ICCV 2021, arXiv): 139 citations
BYOL-Explore: Exploration by Bootstrapped Prediction (NEURIPS 2022, arXiv): 88 citations
Human Alignment of Large Language Models through Online Preference Optimisation (ICML 2024, arXiv): 88 citations
Decoding-time Realignment of Language Models (ICML 2024, arXiv): 59 citations
Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity (NEURIPS 2021, arXiv): 44 citations
Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits (NEURIPS 2020, arXiv): 41 citations
Planning in Markov Decision Processes with Gap-Dependent Sample Complexity (NEURIPS 2020, arXiv): 39 citations
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret (NEURIPS 2021, arXiv): 38 citations
Sampling from a k-DPP without looking at all items (NEURIPS 2020, arXiv): 29 citations
A Provably Efficient Sample Collection Strategy for Reinforcement Learning (NEURIPS 2021, arXiv): 19 citations
Learning in two-player zero-sum partially observable Markov games with perfect recall (NEURIPS 2021, arXiv): 18 citations
Improved Sample Complexity for Incremental Autonomous Exploration in MDPs (NEURIPS 2020, arXiv): 13 citations
Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees (NEURIPS 2022, arXiv): 12 citations
Unlocking the Power of Representations in Long-term Novelty-based Exploration (ICLR 2024, arXiv): 9 citations
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation (NEURIPS 2021, arXiv): 9 citations
Model-free Posterior Sampling via Learning Rate Randomization (NEURIPS 2023, arXiv): 5 citations
Demonstration-Regularized RL (ICLR 2024, arXiv): 5 citations