"model merging" Papers
41 papers found
Accurate and Efficient Low-Rank Model Merging in Core Space
Aniello Panariello, Daniel Marczak, Simone Magistri et al.
Activation-Guided Consensus Merging for Large Language Models
Yuxuan Yao, Shuqi Liu, Zehua Liu et al.
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization
Yiyang Du, Xiaochen Wang, Chi Chen et al.
Agent Skill Acquisition for Large Language Models via CycleQD
So Kuroki, Taishi Nakamura, Takuya Akiba et al.
CodeMerge: Codebook-Guided Model Merging for Robust Test-Time Adaptation in Autonomous Driving
Huitong Yang, Zhuoxiao Chen, Fengyi Zhang et al.
Continual Model Merging without Data: Dual Projections for Balancing Stability and Plasticity
Enneng Yang, Anke Tang, Li Shen et al.
Curriculum Model Merging: Harmonizing Chemical LLMs for Enhanced Cross-Task Generalization
Baoyi He, Luotian Yuan, Ying Wei et al.
DuET: Dual Incremental Object Detection via Exemplar-Free Task Arithmetic
Munish Monga, Vishal Chudasama, Pankaj Wasnik et al.
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think
Zhenyi Lu, Xiaoye Qu et al.
FREE-Merging: Fourier Transform for Efficient Model Merging
Shenghe Zheng, Hongzhi Wang
HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
Yu Zhou, Xingyu Wu, Jibin Wu et al.
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
Lucas Bandarkar, Benjamin Muller, Pritish Yuvraj et al.
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Rui Dai, Sile Hu, Xu Shen et al.
Linear Mode Connectivity between Multiple Models modulo Permutation Symmetries
Akira Ito, Masanori Yamada, Atsutoshi Kumagai
Linear Mode Connectivity in Differentiable Tree Ensembles
Ryuichi Kanoh, Mahito Sugiyama
LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging
Ke Wang, Nikos Dimitriadis, Alessandro Favero et al.
Local Mixtures of Experts: Essentially Free Test-Time Training via Model Merging
Ryo Bertolissi, Jonas Hübotter, Ido Hakimi et al.
LoRACLR: Contrastive Adaptation for Customization of Diffusion Models
Enis Simsar, Thomas Hofmann, Federico Tombari et al.
MergeBench: A Benchmark for Merging Domain-Specialized LLMs
Yifei He, Siqi Zeng, Yuzheng Hu et al.
Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering
Ziyu Zhao, Tao Shen, Didi Zhu et al.
Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning
Yeoreum Lee, Jinwook Jung, Sungyong Baik
Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging
Jinluan Yang, Dingnan Jin, Anke Tang et al.
Model merging with SVD to tie the KnOTS
George Stoica, Pratik Ramesh, Boglarka Ecsedi et al.
Multimodal Lego: Model Merging and Fine-Tuning Across Topologies and Modalities in Biomedicine
Konstantin Hemker, Nikola Simidjievski, Mateja Jamnik
PLeaS - Merging Models with Permutations and Least Squares
Anshul Nasery, Jonathan Hayase, Pang Wei Koh et al.
Pretrained Hybrids with MAD Skills
Nicholas Roberts, Samuel Guo, Zhiqi Gao et al.
RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness
Fanhu Zeng, Haiyang Guo, Fei Zhu et al.
Task Vector Quantization for Memory-Efficient Model Merging
Youngeun Kim, Seunghwan Lee, Aecheon Jung et al.
Towards Minimizing Feature Drift in Model Merging: Layer-wise Task Vector Fusion for Adaptive Knowledge Integration
Wenju Sun, Qingyong Li, Wen Wang et al.
Train with Perturbation, Infer after Merging: A Two-Stage Framework for Continual Learning
Haomiao Qiu, Miao Zhang, Ziyue Qiao et al.
Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
Benjamin J Biggs, Arjun Seshadri, Yang Zou et al.
Equivariant Deep Weight Space Alignment
Aviv Navon, Aviv Shamsian, Ethan Fetaya et al.
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Le Yu, Bowen Yu, Haiyang Yu et al.
Localizing Task Information for Improved Model Merging and Compression
Ke Wang, Nikolaos Dimitriadis, Guillermo Ortiz-Jimenez et al.
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
Anke Tang, Li Shen, Yong Luo et al.
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
MohammadReza Davari, Eugene Belilovsky
On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm
Zhanpeng Zhou, Zijun Chen, Yilan Chen et al.
Representation Surgery for Multi-Task Model Merging
Enneng Yang, Li Shen, Zhenyi Wang et al.
Training-Free Model Merging for Multi-target Domain Adaptation
Wenyi Li, Huan-ang Gao, Mingju Gao et al.
Training-Free Pretrained Model Merging
Zhengqi Xu, Ke Yuan, Huiqiong Wang et al.
Variational Learning is Effective for Large Deep Networks
Yuesong Shen, Nico Daheim, Bai Cong et al.