Impact of LLM Alignment on Impression Formation in Social Interactions
Abstract
Impression formation plays a crucial role in shaping social life, influencing behaviors, attitudes, and interactions across different contexts. Affect Control Theory (ACT) offers a well-established, empirically grounded model of how people form impressions and evaluate social interactions. We investigate whether Large Language Models (LLMs) exhibit patterns of impression formation that align with ACT's predictions. As a case study, we focus on gendered social interactions, that is, how an LLM perceives gender in a prototypical social interaction. We compare several preference-tuned derivatives of the LLaMA-3 model family (including LLaMA-Instruct, Tulu-3, and DeepSeek-R1-Distill) with GPT-4 as a baseline, examining the extent to which alignment or preference tuning influences the models' tendencies in forming gender impressions. We find that LLMs form impressions quite differently from what ACT predicts. Notably, LLMs are insensitive to situational context: the impression of an interaction is overwhelmingly driven by the identity of the actor, regardless of the actor's actions or the recipient of those actions. This stands in contrast to ACT's interaction-based reasoning, which accounts for the interplay of identities, behaviors, and recipients. We further find that preference tuning often amplifies or skews certain impressions in unpredicted ways. Our corpus offers a benchmark for assessing LLMs' social intelligence; we encourage further research using ACT-like frameworks to explore how tuning influences impression formation across diverse social dimensions.