Poster Papers Matching "computational efficiency"
158 papers found • Page 2 of 4
Hypergraph Vision Transformers: Images are More than Nodes, More than Edges
Joshua Fixelle
IM-LUT: Interpolation Mixing Look-Up Tables for Image Super-Resolution
Sejin Park, Sangmin Lee, Kyong Hwan Jin et al.
Importance-Based Token Merging for Efficient Image and Video Generation
Haoyu Wu, Jingyi Xu, Hieu Le et al.
Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing
Yudong Liu, Jingwei Sun, Yueqian Lin et al.
LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models
Yu Cheng, Fajie Yuan
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation
François Rozet, Ruben Ohana, Michael McCabe et al.
MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining
Yunze Liu, Li Yi
METEOR: Multi-Encoder Collaborative Token Pruning for Efficient Vision Language Models
Yuchen Liu, Yaoming Wang, Bowen Shi et al.
Mobile Video Diffusion
Haitam Ben Yahia, Denis Korzhenkov, Ioannis Lelekas et al.
Multi-Agent Collaboration via Evolving Orchestration
Yufan Dang, Chen Qian, Xueheng Luo et al.
Multilevel neural simulation-based inference
Yuga Hikida, Ayush Bharti, Niall Jeffrey et al.
Mutual Effort for Efficiency: A Similarity-based Token Pruning for Vision Transformers in Self-Supervised Learning
Sheng Li, Qitao Tan, Yue Dai et al.
Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models
Luca Eyring, Shyamgopal Karthik, Alexey Dosovitskiy et al.
One Head to Rule Them All: Amplifying LVLM Safety through a Single Critical Attention Head
Junhao Xia, Haotian Zhu, Shuchao Pang et al.
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.
PhySwin: An Efficient and Physically-Informed Foundation Model for Multispectral Earth Observation
Chong Tang, Joseph Powell, Dirk Koch et al.
Principles of Visual Tokens for Efficient Video Understanding
Xinyue Hao, Li, Shreyank Gowda et al.
Prior Knowledge Guided Neural Architecture Generation
Jingrong Xie, Han Ji, Yanan Sun
P-SpikeSSM: Harnessing Probabilistic Spiking State Space Models for Long-Range Dependency Tasks
Malyaban Bal, Abhronil Sengupta
Random Is All You Need: Random Noise Injection on Feature Statistics for Generalizable Deep Image Denoising
Zhengwei Yin, Hongjun Wang, Guixu Lin et al.
RAST: Reasoning Activation in LLMs via Small-model Transfer
Siru Ouyang, Xinyu Zhu, Zilin Xiao et al.
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu, Xiaosen Zheng, Niklas Muennighoff et al.
Robust Regression of General ReLUs with Queries
Ilias Diakonikolas, Daniel Kane, Mingchen Ma
SCOPE: Saliency-Coverage Oriented Token Pruning for Efficient Multimodal LLMs
Jinhong Deng, Wen Li, Joey Tianyi Zhou et al.
Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens
Qihang Fan, Huaibo Huang, Mingrui Chen et al.
Steering Large Language Models between Code Execution and Textual Reasoning
Yongchao Chen, Harsh Jhamtani, Srinagesh Sharma et al.
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget
Vikash Sehwag, Xianghao Kong, Jingtao Li et al.
Targeted Unlearning with Single Layer Unlearning Gradient
Zikui Cai, Yaoteng Tan, M. Salman Asif
Temporal Separation with Entropy Regularization for Knowledge Distillation in Spiking Neural Networks
Kairong Yu, Chengting Yu, Tianqing Zhang et al.
The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws
Tian Jin, Ahmed Imtiaz Humayun, Utku Evci et al.
The Omni-Expert: A Computationally Efficient Approach to Achieve a Mixture of Experts in a Single Expert Model
Sohini Saha, Mezisashe Ojuba, Leslie Collins et al.
Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction
Ziyang Wu, Tianjiao Ding, Yifu Lu et al.
Training-free and Adaptive Sparse Attention for Efficient Long Video Generation
Yifei Xia, Suhan Ling, Fangcheng Fu et al.
UGM2N: An Unsupervised and Generalizable Mesh Movement Network via M-Uniform Loss
Zhichao Wang, Xinhai Chen, Qinglin Wang et al.
URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration
Rui Xu, Yuzhen Niu, Yuezhou Li et al.
VA-MoE: Variables-Adaptive Mixture of Experts for Incremental Weather Forecasting
Hao Chen, Tao Han, Song Guo et al.
Variational Bayesian Pseudo-Coreset
Hyungi Lee, Seungyoo Lee, Juho Lee
VCM: Vision Concept Modeling with Adaptive Vision Token Compression via Instruction Fine-Tuning
Run Luo, Renke Shan, Longze Chen et al.
VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization
Sihan Yang, Runsen Xu, Chenhang Cui et al.
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models
Jinhui Yi, Syed Talal Wasim, Yanan Luo et al.
VORTA: Efficient Video Diffusion via Routing Sparse Attention
Wenhao Sun, Rong-Cheng Tu, Yifu Ding et al.
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning
Junwei Luo, Yingying Zhang, Xue Yang et al.
Why 1 + 1 < 1 in Visual Token Pruning: Beyond Naive Integration via Multi-Objective Balanced Covering
Yangfu Li, Hongjian Zhan, Tianyi Chen et al.
3D Small Object Detection with Dynamic Spatial Pruning
Xiuwei Xu, Zhihao Sun, Ziwei Wang et al.
Agent Attention: On the Integration of Softmax and Linear Attention
Dongchen Han, Tianzhu Ye, Yizeng Han et al.
Agglomerative Token Clustering
Joakim Bruslund Haurum, Sergio Escalera, Graham W. Taylor et al.
An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding
Wei Chen, Long Chen, Yu Wu
An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
Liang Chen, Haozhe Zhao, Tianyu Liu et al.
A Simple Baseline for Efficient Hand Mesh Reconstruction
Zhishan Zhou, Shihao Zhou, Zhi Lv et al.
Binarized Low-light Raw Video Enhancement
Gengchen Zhang, Yulun Zhang, Xin Yuan et al.