"self-distillation" Papers

21 papers found

Beyond Autoregression: Fast LLMs via Self-Distillation Through Time

Justin Deschenaux, Caglar Gulcehre

ICLR 2025 · arXiv:2410.21035 · 29 citations

COIN: Confidence Score-Guided Distillation for Annotation-Free Cell Segmentation

Sanghyun Jo, Seo Lee, Seungwoo Lee et al.

ICCV 2025 · arXiv:2503.11439 · 2 citations

DebGCD: Debiased Learning with Distribution Guidance for Generalized Category Discovery

Yuanpei Liu, Kai Han

ICLR 2025 · arXiv:2504.04804 · 7 citations

FedLPA: Local Prior Alignment for Heterogeneous Federated Generalized Category Discovery

Geeho Kim, Jinu Lee, Bohyung Han

NEURIPS 2025

How to build a consistency model: Learning flow maps via self-distillation

Nicholas Boffi, Michael Albergo, Eric Vanden-Eijnden

NEURIPS 2025 · arXiv:2505.18825 · 31 citations

LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models

Haiwen Huang, Anpei Chen, Volodymyr Havrylov et al.

ICCV 2025 · arXiv:2504.14032 · 12 citations

Long-tailed Adversarial Training with Self-Distillation

Seungju Cho, Hongsin Lee, Changick Kim

ICLR 2025 · arXiv:2503.06461 · 1 citation

MAESTRO: Masked Encoding Set Transformer with Self-Distillation

Matthew Lee, Jaesik Kim, Matei Ionita et al.

ICLR 2025

Rethinking Self-Distillation: Label Averaging and Enhanced Soft Label Refinement with Partial Labels

Hyeonsu Jeong, Hye Won Chung

ICLR 2025 · arXiv:2402.10482 · 1 citation

SeerAttention: Self-distilled Attention Gating for Efficient Long-context Prefilling

Yizhao Gao, Zhichen Zeng, DaYou Du et al.

NEURIPS 2025

Shortcutting Pre-trained Flow Matching Diffusion Models is Almost Free Lunch

Xu Cai, Yang Wu, Qianli Chen et al.

NEURIPS 2025 · arXiv:2510.17858

Synergy Between the Strong and the Weak: Spiking Neural Networks are Inherently Self-Distillers

Yongqi Ding, Lin Zuo, Mengmeng Jing et al.

NEURIPS 2025 (oral) · arXiv:2510.07924

The Effect of Optimal Self-Distillation in Noisy Gaussian Mixture Model

Kaito Takanami, Takashi Takahashi, Ayaka Sakata

NEURIPS 2025 · arXiv:2501.16226 · 2 citations

Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction

Yunheng Li, Yuxuan Li, Quan-Sheng Zeng et al.

ICCV 2025 · arXiv:2412.06244 · 6 citations

Vision Transformers with Self-Distilled Registers

Zipeng Yan, Yinjie Chen, Chong Zhou et al.

NEURIPS 2025 (spotlight) · arXiv:2505.21501 · 4 citations

Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions

Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi

ECCV 2024 · arXiv:2407.16698 · 29 citations

Enhanced Sparsification via Stimulative Training

Shengji Tang, Weihao Lin, Hancheng Ye et al.

ECCV 2024 · arXiv:2403.06417 · 2 citations

EPSD: Early Pruning with Self-Distillation for Efficient Model Compression

Dong Chen, Ning Liu, Yichen Zhu et al.

AAAI 2024 · arXiv:2402.00084 · 9 citations

Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation

Pengfei Wang, Yuxi Wang, Shuai Li et al.

ECCV 2024 · arXiv:2407.13362 · 10 citations

Self-Distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach

Ziyin Zhang, Ning Lu, Minghui Liao et al.

AAAI 2024 · arXiv:2308.08806 · 20 citations

SILC: Improving Vision Language Pretraining with Self-Distillation

Muhammad Ferjad Naeem, Yongqin Xian, Xiaohua Zhai et al.

ECCV 2024 · arXiv:2310.13355 · 59 citations