dc.contributor.author
Le Mens, Gaël
dc.date.accessioned
2026-02-11T01:27:54Z
dc.date.available
2026-02-11T01:27:54Z
dc.date.issued
2025-07-09
dc.identifier
Le Mens, G. SORS: LLMs for measurement in social science. In: Severo Ochoa Research Seminars at BSC. «10th Severo Ochoa Research Seminar Lectures at BSC, Barcelona, 2024-25». Barcelona: Barcelona Supercomputing Center, 2025, p. 171-172.
dc.identifier
https://hdl.handle.net/2117/454484
dc.identifier.uri
http://hdl.handle.net/2117/454484
dc.description.abstract
The seminar talk is based on some of my recent work that used LLMs
for measurement of similarity in semantic spaces. I will report on using
fine-tuned BERT and pre-trained instruction-tuned LLMs (such as
GPT-4, Meta Llama 3, or Mixtral) for measuring the typicality of text
documents in concepts (tweets in political parties, books in literary
genres) and for positioning text documents in policy and ideological
spaces. I will also report on a systematic comparison of the performance
of the most recent LLMs for these tasks and will outline a strategy for
choosing among the available LLMs given the research objectives and
constraints that pertain to a specific research project. The talk is based
on the following recent papers and some ongoing work:
1. Positioning Political Texts with Large Language Models by
Asking and Averaging (with Aina Gallego): Using recent
LLMs (2023-2024) to position tweets, party manifestos, and political
speeches in multiple languages in ideological spaces. Includes a
comparison of the performance of various models, including
proprietary and open models.
2. Uncovering the Semantics of Concepts Using GPT-4 (with Balázs
Kovács, Michael Hannan, & Guillem Pros). PNAS, 2023. Using
GPT-4 for measuring the typicality of books in literary genres and
the typicality of tweets in political parties + comparison to other
methods based on BERT, text embeddings and word embeddings.
3. Using Machine Learning to Uncover the Semantics of Concepts:
How Well Do Typicality Measures Extracted from a BERT Text
Classifier Match Human Judgments of Genre Typicality? (with
Balázs Kovács, Michael Hannan, & Guillem Pros). Sociological
Science. March 2023. Fine-tuning BERT for measuring the
typicality of books in literary genres + comparisons with more
standard NLP approaches.
dc.format
application/pdf
dc.publisher
Barcelona Supercomputing Center
dc.rights
http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject
Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors
dc.subject
High performance computing
dc.subject
Càlcul intensiu (Informàtica)
dc.title
SORS: LLMs for measurement in social science
dc.type
Conference report