SORS: LLMs for measurement in social science

dc.contributor.author
Le Mens, Gaël
dc.date.accessioned
2026-02-11T01:27:54Z
dc.date.available
2026-02-11T01:27:54Z
dc.date.issued
2025-07-09
dc.identifier
Le Mens, G. SORS: LLMs for measurement in social science. A: Severo Ochoa Research Seminars at BSC. «10th Severo Ochoa Research Seminar Lectures at BSC, Barcelona, 2024-25». Barcelona: Barcelona Supercomputing Center, 2025, p. 171-172.
dc.identifier
https://hdl.handle.net/2117/454484
dc.identifier.uri
http://hdl.handle.net/2117/454484
dc.description.abstract
This seminar talk draws on my recent work using LLMs to measure similarity in semantic spaces. I will report on using fine-tuned BERT and pre-trained, instruction-tuned LLMs (such as GPT-4, Meta Llama 3, or Mixtral) to measure the typicality of text documents in concepts (tweets in political parties, books in literary genres) and to position text documents in policy and ideological spaces. I will also report on a systematic comparison of the performance of the most recent LLMs on these tasks and outline a strategy for choosing among the available LLMs given the research objectives and constraints of a specific research project. The talk is based on the following recent papers and some ongoing work: 1. Positioning Political Texts with Large Language Models by Asking and Averaging (with Aina Gallego). Uses recent LLMs (2023-2024) to position tweets, party manifestos, and political speeches in multiple languages in ideological spaces, and compares the performance of various models, both proprietary and open. 2. Uncovering the Semantics of Concepts Using GPT-4 (with Balázs Kovács, Michael Hannan, & Guillem Pros). PNAS, 2023. Uses GPT-4 to measure the typicality of books in literary genres and of tweets in political parties, with a comparison to other methods based on BERT, text embeddings, and word embeddings. 3. Using Machine Learning to Uncover the Semantics of Concepts: How Well Do Typicality Measures Extracted from a BERT Text Classifier Match Human Judgments of Genre Typicality? (with Balázs Kovács, Michael Hannan, & Guillem Pros). Sociological Science, March 2023. Fine-tunes BERT to measure the typicality of books in literary genres, with comparisons to more standard NLP approaches.
dc.format
2 p.
dc.format
application/pdf
dc.language
eng
dc.publisher
Barcelona Supercomputing Center
dc.rights
http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights
Open Access
dc.subject
Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors
dc.subject
High performance computing
dc.subject
Càlcul intensiu (Informàtica)
dc.title
SORS: LLMs for measurement in social science
dc.type
Conference report


This item appears in the following collection(s)

Congressos [11156]