PhyloLM: Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks

10 citations

arXiv:2404.04671

#1512 of 3827 papers in ICLR 2025

Top Authors

Nicolas Yax, Pierre-Yves Oudeyer, Stefano Palminteri

Abstract

This paper introduces PhyloLM, a method adapting phylogenetic algorithms to Large Language Models (LLMs) to explore whether and how they relate to each other and to predict their performance characteristics. Our method calculates a phylogenetic distance metric based on the similarity of LLMs' outputs. The resulting metric is then used to construct dendrograms, which satisfactorily capture known relationships across a set of 111 open-source and 45 closed models. Furthermore, our phylogenetic distance predicts performance in standard benchmarks, thus demonstrating its functional validity and paving the way for a time- and cost-effective estimation of LLM capabilities. To sum up, by translating population genetic concepts to machine learning, we propose and validate a tool to evaluate LLM development, relationships, and capabilities, even in the absence of transparent training information.
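
The abstract describes two steps: a distance computed from the similarity of model outputs, then a dendrogram built from those distances. The sketch below illustrates that pipeline on toy data. It is not the authors' implementation: the abstract does not give the formula, so the sketch assumes a Nei-style genetic distance (the negative log of a normalized overlap between token distributions), treating each prompt as a "gene" and each sampled next token as an "allele". The model names, prompts, sampled tokens, and the helper names `allele_frequencies`, `similarity_distance`, and `average_linkage` are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def allele_frequencies(completions):
    """Map each prompt ("gene") to the relative frequency of each sampled token ("allele")."""
    freqs = {}
    for prompt, tokens in completions.items():
        counts = Counter(tokens)
        total = sum(counts.values())
        freqs[prompt] = {tok: n / total for tok, n in counts.items()}
    return freqs


def similarity_distance(freq_a, freq_b):
    """Assumed Nei-style distance: -log of the normalized overlap of token distributions."""
    shared = freq_a.keys() & freq_b.keys()  # prompts answered by both models
    overlap = sum(p * freq_b[g].get(tok, 0.0)
                  for g in shared for tok, p in freq_a[g].items())
    norm_a = sum(p * p for g in shared for p in freq_a[g].values())
    norm_b = sum(p * p for g in shared for p in freq_b[g].values())
    if norm_a == 0 or norm_b == 0:
        raise ValueError("models share no prompts")
    sim = overlap / math.sqrt(norm_a * norm_b)
    # Clamp at 0 to absorb floating-point rounding when sim is marginally above 1.
    return math.inf if sim == 0 else max(0.0, -math.log(sim))


def average_linkage(names, dist):
    """Agglomerative clustering with average linkage.

    Returns the merges in order as (cluster_a, cluster_b, height);
    clusters are tuples of model names.
    """
    clusters = [(n,) for n in names]
    merges = []

    def between(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        return sum(dist[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        merges.append((c1, c2, between(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Toy data: four sampled next tokens per prompt for three hypothetical models.
models = {
    "base": {"The sky is": ["blue", "blue", "blue", "grey"], "2+2=": ["4", "4", "4", "4"]},
    "finetune": {"The sky is": ["blue", "blue", "grey", "grey"], "2+2=": ["4", "4", "4", "5"]},
    "other": {"The sky is": ["red", "red", "green", "blue"], "2+2=": ["5", "5", "four", "4"]},
}

freqs = {name: allele_frequencies(c) for name, c in models.items()}
dist = {frozenset((a, b)): similarity_distance(freqs[a], freqs[b])
        for a, b in combinations(models, 2)}
merges = average_linkage(list(models), dist)

for left, right, height in merges:
    print(left, right, round(height, 3))
```

On this toy data the two models with the most similar token distributions merge first, which is the behavior a dendrogram of related models should show. For more than a handful of models, a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` could replace the hand-written merge loop.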
