Poster Papers by Guangyu Song
2 papers found
MMTEB: Massive Multilingual Text Embedding Benchmark
Kenneth Enevoldsen, Isaac Chung, Imene Kerboua et al.
ICLR 2025, arXiv:2502.13595
80 citations
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text
Nikhil Kandpal, Brian Lester, Colin Raffel et al.
NeurIPS 2025, arXiv:2506.05209
11 citations