Karttikeya Mangalam
21 papers
6,719 total citations
Papers (21)
Multiscale Vision Transformers
ICCV 2021
arXiv
1,529
citations
Ego4D: Around the World in 3,000 Hours of Egocentric Video
CVPR 2022
arXiv
1,511
citations
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
CVPR 2022
arXiv
856
citations
It is not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction
ECCV 2020
arXiv
543
citations
EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding
NeurIPS 2023
arXiv
515
citations
From Goals, Waypoints & Paths to Long Term Human Trajectory Forecasting
ICCV 2021
arXiv
327
citations
Long-term Human Motion Prediction with Scene Context
ECCV 2020
arXiv
279
citations
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition
CVPR 2022
arXiv
247
citations
Sequential Modeling Enables Scalable Learning for Large Vision Models
CVPR 2024
arXiv
235
citations
Speculative Decoding with Big Little Decoder
NeurIPS 2023
arXiv
169
citations
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition
NeurIPS 2022
arXiv
132
citations
Object-Region Video Transformers
CVPR 2022
arXiv
99
citations
LOKI: Long Term and Key Intentions for Trajectory Prediction
ICCV 2021
arXiv
98
citations
Diffusion Models as Masked Autoencoders
ICCV 2023
arXiv
75
citations
Reversible Vision Transformers
CVPR 2022
arXiv
61
citations
Do Vision and Language Encoders Represent the World Similarly?
CVPR 2024
arXiv
35
citations
xT: Nested Tokenization for Larger Context in Large Images
ICML 2024
arXiv
8
citations
Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens
NeurIPS 2022
arXiv
0
citations
Latency Matters: Real-Time Action Forecasting Transformer
CVPR 2023
0
citations
Re2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization
CVPR 2023
0
citations
Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning
CVPR 2024
0
citations