Poster "parameter efficiency" Papers
31 papers found
As large as it gets – Studying Infinitely Large Convolutions via Neural Implicit Frequency Filters
Margret Keuper, Julia Grabinski, Janis Keuper
Composing Linear Layers from Irreducibles
Travis Pence, Daisuke Yamada, Vikas Singh
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob, Lorenzo Sani, Meghdad Kurmanji et al.
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models
Yongqi Huang, Peng Ye, Chenyu Huang et al.
Discovering Important Experts for Mixture-of-Experts Models Pruning Through a Theoretical Perspective
Weizhong Huang, Yuxin Zhang, Xiawu Zheng et al.
Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient
Wenlong Wang, Ivana Dusparic, Yucheng Shi et al.
Efficient Source-Free Time-Series Adaptation via Parameter Subspace Disentanglement
Gaurav Patel, Christopher M. Sandino, Behrooz Mahasseni et al.
eQMARL: Entangled Quantum Multi-Agent Reinforcement Learning for Distributed Cooperation over Quantum Channels
Alexander DeRieux, Walid Saad
Hankel Singular Value Regularization for Highly Compressible State Space Models
Paul Schwerdtner, Jules Berman, Benjamin Peherstorfer
Kolmogorov-Arnold Transformer
Xingyi Yang, Xinchao Wang
Layerwise Recurrent Router for Mixture-of-Experts
Zihan Qiu, Zeyu Huang, Shuang Cheng et al.
Lightweight and Fast Real-time Image Enhancement via Decomposition of the Spatial-aware Lookup Tables
Wontae Kim, Keuntek Lee, Nam Ik Cho
LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
Fangxun Shu, Yue Liao, Lei Zhang et al.
(Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning
Margaret Li, Sneha Kudugunta, Luke Zettlemoyer
No Need to Talk: Asynchronous Mixture of Language Models
Anastasiia Filippova, Angelos Katharopoulos, David Grangier et al.
Optimal Control for Transformer Architectures: Enhancing Generalization, Robustness and Efficiency
Kelvin Kan, Xingjian Li, Benjamin Zhang et al.
PROL: Rehearsal Free Continual Learning in Streaming Data via Prompt Online Learning
Muhammad Anwar Ma'sum, Mahardhika Pratama, Savitha Ramasamy et al.
Sample- and Parameter-Efficient Auto-Regressive Image Models
Elad Amrani, Leonid Karlinsky, Alex M. Bronstein
SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning
Yichen Wu, Hongming Piao, Long-Kai Huang et al.
Sparsity Outperforms Low-Rank Projections in Few-Shot Adaptation
Nairouz Mrabah, Nicolas Richet, Ismail Ayed et al.
Surprising Effectiveness of pretraining Ternary Language Model at Scale
Ayush Kaushal, Tejas Vaidhya, Arnab Mondal et al.
TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation
Zonglin Lyu, Chen Chen
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Can Yaras, Peng Wang, Laura Balzano et al.
Data-free Neural Representation Compression with Riemannian Neural Dynamics
Zhengqi Pei, Anran Zhang, Shuhui Wang et al.
Flora: Low-Rank Adapters Are Secretly Gradient Compressors
Yongchang Hao, Yanshuai Cao, Lili Mou
Image-adaptive 3D Lookup Tables for Real-time Image Enhancement with Bilateral Grids
Wontae Kim, Nam Ik Cho
In value-based deep reinforcement learning, a pruned network is a good network
Johan Obando Ceron, Aaron Courville, Pablo Samuel Castro
KernelWarehouse: Rethinking the Design of Dynamic Convolution
Chao Li, Anbang Yao
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
Kai Zhang, Yi Luan, Hexiang Hu et al.
OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning
Xinyu Geng, Jiaming Wang, Jiawei Gong et al.
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
Jiwon Song, Kyungseok Oh, Taesu Kim et al.