Oscar Stuhler
I am a sociologist studying discourse with computational methods. Much of my work focuses on how to measure, analyze, and theorize textual representations of social structures. Broadly, I am interested in culture, political sociology, social networks, and natural language processing.
I am an Assistant Professor at Northwestern University’s Department of Sociology. I received my PhD in sociology at New York University.
You can access a version of my CV here and reach me at oms@northwestern.edu. Below, you can find out about some of my work.
STUDYING TEXTUAL REPRESENTATIONS OF SOCIAL STRUCTURES
Oscar Stuhler, Cat Dang Ton, and Étienne Ollion, published in Sociological Methods & Research (2025) [PDF] [code]
Abstract: Generative AI (GenAI) is quickly becoming a valuable tool for sociological research. Already, sociologists employ GenAI for tasks like classifying text and simulating human agents. We point to another major use case: the extraction of structured information from unstructured text. Information Extraction (IE) is an established branch of Natural Language Processing, but leveraging the affordances of this paradigm has thus far required familiarity with specialized models. GenAI changes this by allowing researchers to define their own IE tasks and execute them via targeted prompts. This article explores the potential of open-source large language models for IE by extracting and encoding biographical information (e.g., age, occupation, origin) from a corpus of newspaper obituaries. As we proceed, we discuss how sociologists can develop and evaluate prompt architectures for such tasks, turning codebooks into “promptbooks.” We also evaluate models of different sizes and prompting techniques. Our analysis showcases the potential of GenAI as a flexible and accessible tool for IE, while also underscoring risks like non-random error patterns that can bias downstream analyses.

The Cultural Construction of Personal Relationships
Published in Social Networks (2025) [PDF]
Abstract: Network analysis aspires to be “anticategorical,” yet its basic units—relationships—are usually readily categorized entities with labels like “friendship,” “love,” or “patronage.” In this way, a nontrivial cultural typification underlies the very building blocks of most network analyses. Despite work showing that a specific “type of tie” often stands in for quite heterogeneous empirical phenomena, this typification is seldom challenged in research practice. This article expands on recent efforts to more adequately theorize ties by further developing and arguing for the concept of relationship frames—cultural models that stabilize relational expectations. I suggest that such frames are rooted in regularities in the duality of dyad and content. Building on this idea, I develop a formal notion of frame ambiguity—the extent to which the actions and symbols designating a relationship index a variety of frames rather than just one. Putting these ideas to analytical use, I inductively identify relationship frames from the content of 1.2 million relationships between characters in fiction writing. I conclude with an exploratory investigation of some of the conditions under which ties in fiction writing display variation in frame ambiguity.

The Gender Agency Gap in Fiction Writing (1850 to 2010)
Published in PNAS (2024) [PDF] [code] [web interface for exploring the data]
Abstract: Works of fiction play a crucial role in the production of cultural stereotypes. Concerning gender, a widely held presumption is that many such works ascribe agency to men and passivity to women. However, large-scale diachronic analyses of this notion have been lacking. This paper provides an assessment of agency attributions in 87,531 fiction works written between 1850 and 2010. It introduces a syntax-based approach for extracting networks of character interactions. Agency is then formalized as a dyadic property: Does a character primarily serve as an agent acting upon the other character or as recipient acted upon by the other character? Findings indicate that female characters are more likely to be passive in cross-gender relationships than their male counterparts. This difference, the gender agency gap, has declined since the 19th century but persists into the 21st. Male authors are especially likely to attribute less agency to female characters. Moreover, certain kinds of actions, especially physical and villainous ones, have more pronounced gender disparities.

Who Does What to Whom? Making Text Parsers Work for Sociological Inquiry
Published in Sociological Methods & Research (2022) [PDF] [code] [package]
Abstract: Over the past decade, sociologists have become increasingly interested in the formal study of semantic relations within text. Most contemporary studies focus either on mapping concept co-occurrences or on measuring semantic associations via word embeddings. Although conducive to many research goals, these approaches share an important limitation: they abstract away what one can call the event structure of texts, that is, the narrative action that takes place in them. I aim to overcome this limitation by introducing a new framework for extracting semantically rich relations from text that involves three components. First, a semantic grammar structured around textual entities that distinguishes six motif classes: actions of an entity, treatments of an entity, agents acting upon an entity, patients acted upon by an entity, characterizations of an entity, and possessions of an entity; second, a comprehensive set of mapping rules, which make it possible to recover motifs from predictions of dependency parsers; third, an R package that allows researchers to extract motifs from their own texts. The framework is demonstrated in empirical analyses on gendered interaction in novels and constructions of collective identity by U.S. presidential candidates.

What’s in a Category? A New Approach to Discourse Role Analysis
Published in Poetics (2021) [PDF]
Abstract: Building on the work of John Mohr, I propose a new, broadly applicable approach to Discourse Role Analysis (DRA). Whereas the goal of behavioral role analysis is to identify the different kinds of actors that exist in interaction, the goal of DRA is to identify the different kinds of identities that exist in discourse. To do this, I suggest thinking of discourse roles as latent conceptions of identities composed of treatments, actions, and characteristics that are frequently concurrently associated with identities in stories. I propose a method to infer discourse roles from unstructured text data that draws on novel techniques from Natural Language Processing. This framework is leveraged to shed light on German news coverage of refugees (2010-2020), which employs a set of distinct discourse roles such as refugee as claimant of welfare benefits, refugee in distress at sea, and refugee as a criminal. I then assess how different refugee identity categories are situated within this discourse role structure. I pay particular attention to Geflüchtete, a category that emerged only recently in German discourse. Whereas initial use of Geflüchtete was motivated by a language critique that aimed at replacing the general term for refugees (Flüchtlinge), DRA indicates a process of categorical differentiation in which the category increasingly serves to distinguish different kinds of refugees.

STUDYING OTHER THINGS
Generative AI in Sociological Research: State of the Discipline
AJ Alvero, Dustin Stoltz, Oscar Stuhler, Marshall Taylor, forthcoming in Sociological Science (2025) [PDF] [code]
Abstract: Generative artificial intelligence (GenAI) has garnered considerable attention for its potential utility in research and scholarship, even among those who typically do not rely on computational tools. Early commentators, however, have also articulated concerns about how GenAI usage comes with enormous environmental costs, serious social risks, and a tendency to produce low-quality content. In the midst of both excitement and skepticism, it is crucial to take stock of how GenAI is actually being used. Our study focuses on sociological research as our site, and here we present findings from a survey of 433 authors of articles published in 50 sociology journals in the last five years. The survey provides an overview of the state of the discipline with regard to the use of GenAI by providing answers to fundamental questions: how (much) do scholars use the technology for their research; what are their reasons for using it; and how concerned, trustful, and optimistic are they about the technology? Of the approximately one third of respondents who self-report using GenAI at least weekly, the primary uses are for writing assistance and comparatively less so in planning, data collection, or data analysis. In both use and attitudes, there are surprisingly few differences between self-identified computational and non-computational researchers. Generally, respondents are very concerned about the social and environmental consequences of GenAI. Trust in GenAI outputs is low, regardless of expertise or frequency of use. While optimism that GenAI will improve is high, scholars are divided on whether GenAI will have a positive impact on the field.

Bottom Up? Top Down? Determinants of Issue-Attention in State Politics
Andreu Casas, Oscar Stuhler, Julia Payson, Joshua A. Tucker, Richard Bonneau, Jonathan Nagler, published in the Journal of Politics (2025) [PDF] [code]
Abstract: Who shapes the issue-attention cycle of state legislators? Although state governments make critical policy decisions, data and methodological constraints have limited researchers’ ability to study state-level agenda setting. For this paper, we collect more than 122 million Twitter messages sent by state and national actors in 2018 and 2021. We then employ supervised machine learning and time series techniques to study how the issue-attention of state lawmakers evolves vis-à-vis various local- and national-level actors. Our findings suggest that state legislators operate at the confluence of national and local influences. In line with arguments highlighting the nationalization of state politics, we find that state legislators are consistently responsive to policy debates among members of Congress. However, despite growing nationalization concerns, we also find strong evidence of issue responsiveness by legislators to members of the public in their states and moderate responsiveness to regional media sources.

Bart Bonikowski, Yuchen Luo, and Oscar Stuhler (equal authorship), published in Sociological Methods & Research (2022) [PDF] [code]
Abstract: Radical-right parties and candidates combine three discursive elements in their electoral appeals: anti-elite populism, exclusionary and declinist nationalism, and illiberal authoritarianism. Recent studies have explored whether these frames have diffused from radical-right to centrist parties in the latter’s effort to compete for the former’s voters. This paper investigates the obverse process: the radical right’s (specifically, Donald Trump’s) reliance on discursive elements that had long been present in mainstream institutional politics. To do so, we identify instances of populism, nationalism (i.e., exclusionary and inclusive definitions of national symbolic boundaries and displays of low and high national pride), and authoritarianism in the speeches of Democratic and Republican presidential nominees between 1952 and 2016. These frames are subtle, infrequent, and polysemic, which makes their quantitative measurement difficult. We overcome this by leveraging the affordances of cutting-edge neural language models; in particular, we combine a variant of bidirectional encoder representations from transformers (RoBERTa) with active learning. As we demonstrate, this approach is considerably more powerful than other methods commonly used by social scientists to measure discursive frames. Our results suggest that what set Donald Trump’s campaign apart from those of mainstream presidential candidates was not its invention of a new form of politics, but its combination of negative evaluations of elites, low national pride, and authoritarianism—all of which had long been present among both parties—with an explicit evocation of exclusionary nationalism, which had previously been used only in coded form. Radical-right discourse therefore appears to be less a break with the past and more an amplification and creative rearrangement of existing political-cultural tropes.

Reclaiming the Past to Transcend the Present: Nostalgic Appeals in U.S. Presidential Elections
Bart Bonikowski and Oscar Stuhler, published in Sociological Forum (2022) [PDF]
Abstract: Nostalgic appeals to an idealized past are commonly associated with radical-right discourse. They bolster candidates’ critiques of the status quo and promises of a better future, all while mobilizing perceptions of collective status threat among supporters. In this paper, we ask whether nostalgia is a radical-right innovation or whether it has precedents in mainstream politics. We make use of recent advances in natural language processing—specifically transformer-based deep learning models—to identify nostalgic claims in U.S. presidential campaign speeches from 1952 to 2020. We then examine what form nostalgia takes, when it has been most salient, what aspects of the nation it has been used to glorify, and how it relates to populist and nationalist appeals. Our findings suggest that nostalgic rhetoric usually takes the form of brief and multivocal statements with a consistent lexical signature. It is frequently used by challenger candidates from both parties to generate a heightened sense of crisis and to morally indict incumbent opponents, particularly during times of widespread cultural contention. In so doing, nostalgia helps substantiate candidates’ populist claims and expressions of low national pride. Given that these patterns are found throughout our time series, this points to important continuities between the discourse of mainstream and radical-right actors in U.S. politics. Where their respective messaging diverges, however, is in the use of nostalgia to frame exclusionary nationalist and authoritarian appeals, a practice limited to the radical-right (in our data, Donald Trump). Our findings suggest that radical-right actors did not invent their rhetorical strategies de novo, but rather, have adopted frames already widespread in mainstream politics, adapting and creatively recombining them for their own ends.

Jan Fuhse, Oscar Stuhler, Jan Riebling, and John Levi Martin, published in Poetics (2020) [PDF]
Abstract: Social relations between actors and symbolic relations between concepts or ideas are interwoven in discourse. We conceptually distinguish three approaches that construct relations between symbols with different connections to social structures. These three approaches are illustrated empirically with automated text analyses of the parliamentary proceedings of the Weimar Republic in Germany (1919-1933). First, cultural relations between symbols, as reconstructed from co-occurrences of terms in large text corpora, are supposedly widely shared in a social context. In this sense, we analyze a set of key terms in Weimar political discourse around the central term “Volk” (“people”). These fall into five word communities, each of them representing a different way of conceiving politics. Secondly, symbolic practices are related to actors positioning themselves through them in socio-symbolic constellations. We reconstruct such a constellation from the usage of key terms of Weimar parliamentary discourse by the eight major political parties in their speeches, with different parties signaling their ideological positions through these terms. Thirdly, the use of symbols in interaction characterizes social relationships between actors. In this vein, the ties between the Weimar parties show distinct patterns of hostility or support in their interjections and reactions to each other’s speeches. The second and the third analyses reveal a two-dimensional patterning of the Weimar political landscape, with the traditional Left-Right dimension complemented by an opposition of forces supporting or rejecting the republic. Also, the similarities in word usage by parties correspond fairly well to the support or hostility in their interjections and reactions.





