Unsupervised Pretraining for Fact Verification by Language Model Distillation

5 citations

arXiv:2309.16540

#1627 of 2297 papers in ICLR 2024

Top Authors

Adrian Bazaga, Pietro Lio, Gos Micklem

Abstract

Fact verification aims to verify a claim using evidence from a trustworthy knowledge base. To address this challenge, algorithms must produce features for every claim that are both semantically meaningful and compact enough to find a semantic alignment with the source information. In contrast to previous work, which tackled the alignment problem by learning over annotated corpora of claims and their corresponding labels, we propose SFAVEL (Self-supervised Fact Verification via Language Model Distillation), a novel unsupervised pretraining framework that leverages pre-trained language models to distil self-supervised features into high-quality claim-fact alignments without the need for annotations. This is enabled by a novel contrastive loss function that encourages features to attain high-quality claim and evidence alignments whilst preserving the semantic relationships across the corpora. Notably, we present results that achieve a new state-of-the-art on FB15k-237 (+5.3% Hits@1) and FEVER (+8% accuracy) with linear evaluation.
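
To make the two technical ideas in the abstract concrete, the following is a minimal sketch of a generic InfoNCE-style contrastive loss over paired claim and fact embeddings, together with the Hits@k retrieval metric. It is not the SFAVEL loss, whose exact form the abstract does not give. The function names (`cosine`, `info_nce_loss`, `hits_at_k`), the temperature of 0.1, and the toy 3-dimensional embeddings are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(claims, facts, temperature=0.1):
    """Generic InfoNCE loss (illustrative, not the paper's loss).

    claims[i] is assumed to be paired with facts[i]; every other fact
    in the batch acts as a negative for that claim.
    """
    total = 0.0
    for i, claim in enumerate(claims):
        logits = [cosine(claim, fact) / temperature for fact in facts]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denominator - logits[i]
    return total / len(claims)


def hits_at_k(claims, facts, k=1):
    """Fraction of claims whose paired fact ranks in the top k by cosine."""
    hits = 0
    for i, claim in enumerate(claims):
        scores = [cosine(claim, fact) for fact in facts]
        ranked = sorted(range(len(facts)), key=lambda j: scores[j], reverse=True)
        hits += i in ranked[:k]
    return hits / len(claims)


# Toy embeddings (hypothetical): each claim lies close to its own fact.
facts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
claims = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]

print("loss:", info_nce_loss(claims, facts))
print("Hits@1:", hits_at_k(claims, facts, k=1))
```

In this sketch the loss falls as each claim embedding moves closer to its paired fact and further from the other facts in the batch, which is the alignment behaviour a contrastive objective rewards. Hits@1 then reports how often the paired fact is the top-ranked candidate.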
