Paper by Vinko Sabolčec Papers
2 papers found
Conference
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs
Dongyang Fan, Vinko Sabolčec, Matin Ansaripour et al.
COLM 2025paper
4
citations
FineWeb2: One Pipeline to Scale Them All — Adapting Pre-Training Data Processing to Every Language
Guilherme Penedo, Hynek Kydlíček, Vinko Sabolčec et al.
COLM 2025paperarXiv:2506.20920
51
citations