PEER2025 Program
The 4th Workshop on Processing and Evaluating Event Representations (PEER2025) will be held at the University of Rochester on April 18, 2025. Talks and breaks will take place in the Humanities Center (Conference Room D) on the second floor of Rush Rhees Library (map). If you plan to attend in person, we ask that you please register here.
-
09:00–09:45
Breakfast
-
09:45–10:00
Opening Remarks
-
10:00–11:00
Keynote: Binding and quantifying representations of states, processes, and events
Alexis Wellwood (USC)
Semanticists have long made use of a distinction between 'event' and (something like) 'process' in their theorizing. In this talk, I argue for an understanding of this distinction on which it corresponds to a distinction of conceptual class. My case study considers people's evaluations of comparative sentences with 'more' against dynamic displays that make multiple competing dimensions for comparison available. Specifically, I look at canonical and non-canonical adjectival comparatives (cp.: S was more gleeb than B was, S was gleeb more than B was) in comparison with verbal controls (S gleebed more than B did). I have previously argued that subtle features of syntax—rather than lexicon—play a major role in people's selections of measures for comparatives. But why and how does syntax play that role? I argue that words like 'more' are sensitive to features of conceptual class, i.e. to class-level definitions of proprietary measures. A very interesting possibility flowing from this view is that morphemes postulated to perform semantic functions like atomization and pluralization in fact cue binding relations between representations in distinct classes.
-
11:00–11:30
Coffee Break
-
11:30–11:50
Visual Perception and 4-place Event Representations in Preschoolers and Infants
Ekaterina A. Khlystova (UCLA), Alexander Williams (UMD), Jeffrey Lidz (UMD), Laurel Perkins (UCLA)
Verb learning relies upon mapping between linguistic representations of sentences and conceptual representations of events. In adults, recognition of event relations occurs automatically and rapidly in visual perception, indicating that the human perceptual system is tuned to extract event relations from brief visual exposure. This rapid extraction is facilitated by the visual working memory system (VWM), which has a limit of 3 items in infants and 3-4 items in adults. In this study, we examined events of trading, which may be represented as having a number of participants (4) that taxes the capacities of VWM early in development. In our first experiment, we find that preschoolers' VWM systems can readily yield an event concept with 4 participants for a trading scene. In our second experiment (ongoing), we ask whether 10.5-month-old infants' VWM systems can do so as well. We discuss the implications of our findings for the acquisition of verbs like 'trade' and for theories of verb learning more generally. Understanding the perceptual support for acquiring verbs that describe events with high-adicity representations may have implications for why, cross-linguistically, so few 4-place concepts are lexicalized by a single morpheme.
-
11:50–12:10
Experimentally extracting implicit instruments
Ashlyn Winship (Cornell), Zander Lynch (Cornell), John R Starr (Cornell), Yifan Wu (Cornell), Lucas Y. Li (Cornell), Marten van Schijndel (Cornell)
A prominent theory of cognition (Altmann & Ekves 2019) holds that event representations arise from the component entities involved in the event. This would mean that instrument representations necessary to an event should be generated in people's minds even when those instruments are not explicitly mentioned. For example, when presented with the event "The chef chopped an onion," participants should complete the subsequent sentence "Then, using a different …" with the instrument "knife." We use the covered box paradigm to study how people model implicit instruments. Our results show that people model implicit instruments in event representations, even though the instruments are not overtly stated in the description of the event.
-
12:10–13:10
Lunch
-
13:10–14:10
Keynote: Events, indexes, and hippocampal memory
Ellen Lau (UMD)
Some theorists have held that all clausal meanings are event descriptions, containing an event variable to which predicates are applied. One classic concern about this strong position is that it runs counter to commonsense intuitions about what events are. In this talk, I'll discuss a different motivation for doubting this position: the hypothesis that the structure of natural language sentences was partly shaped by the need to interface with the rapid and powerful memory indexing system provided by the hippocampus. The hippocampal system originally evolved in early animals to index locations to which predicates could be attached, for the purpose of goal-directed navigation (returning to previously-encountered locations and updating knowledge about them as needed). I suggest that this ancestral function resulted in a modern hippocampal memory system whose underlying blueprint is still organized around first-order predications on locations and entities, and that the structure of natural language sentences is responsive to this. While this doesn't preclude that the human hippocampus affords additional, more complex data structures than those of its ancestors, nor that some natural language sentence meanings are predications on events, these considerations make it seem less likely that all natural language sentences have such meanings as a matter of course.
-
14:10–14:40
Coffee Break
-
14:40–15:10
Lightning Talks 1
Interactions of Aspect and Negation in Russian and Polish: Evidence from the Visual World Paradigm
Clara McMahon (CUNY), Valeriia Modina (CUNY)
In Russian, an interaction between aspect and negation is reported in the theoretical semantics and pragmatics literature (Zinova & Filip, 2014a,b). When an imperfective verb is negated, the entire event (both subparts) is negated; however, when the perfective aspect is under negation in Russian, only a subpart of the event (the result) is denied, via pragmatic implicature. This study aims to investigate this hypothesized interaction using the Visual World Paradigm (VWP). We examine aspect (perfective vs. imperfective) and negation (affirmative vs. negated sentences) in Russian and Polish. This study additionally introduces a novel psycholinguistic task: a visual variant of a Likert scale. The visual Likert scale will display images corresponding to Ramchand's (2008) syntactic decomposition of an event. Using a visual depiction of the full event structure will allow us to collect more fine-grained information in real time about the interpretations of grammatical aspect, as well as about the processing of events more generally.
Beyond Projecting: Modifiers' Influence on Proto-role Properties
Zander Lynch (Cornell), Lucas Y. Li (Cornell), Marten van Schijndel (Cornell)
One of the foundational theories of the syntax-semantics interface is that theta roles are projected from the verb (Jackendoff 1987). However, the granularity of these roles (Dowty 1991) and their origin within the sentence (Husband 2023) remain open questions. We follow the likelihood-rating paradigm for proto-roles of Kako (2006) and Reisinger et al. (2015) and use phonotactically realistic nonsense words (Berko 1958) to address aspects of these questions. Our results show not only that proto-role properties are influenced by elements beyond verbs, like nouns and their modifiers, but also that syntactic positions themselves modulate specific proto-role properties.
Multidimensionality and individual differences in the perception of creativity in improvised narratives
Qingzhi Ruby Zeng (Rochester), Derek Bryant Lilienthal (SJSU), Coraline Rinn Iordan (Rochester), Aaron Steven White (Rochester), Elise A. Piazza (Rochester)
Humans judge creativity differently, but the structural and perceptual mechanisms underlying these differences remain unclear. By prioritizing high inter-rater reliability, many previous studies have undervalued the importance of individual differences, including expertise. Furthermore, in the domain of storytelling, creativity is often studied separately from other narrative dimensions (e.g., cohesiveness). Moreover, while there are quantitative linguistic features that are predictive of creativity ratings of stories, it is unclear how those features predict distinct aspects of story quality. Here, we addressed these limitations by examining how expertise influences ratings of creativity and related dimensions of stories, as well as the link between diverse linguistic features and these ratings. Novices (N=309) and creative writing/storytelling experts (N=15) rated transcripts of improvised spoken stories (N=108) on five story quality dimensions. Both groups had similar average ratings and showed two distinct latent rating dimensions (innovativeness and cohesiveness), which were predicted by distinct sets of linguistic features. Notably, however, experts perceived the five dimensions as more differentiated, and their ratings were less driven by linguistic features than novices' were. These findings indicate that experts' creativity perceptions are more nuanced and less reliant on fine-grained narrative structures, highlighting the importance of capturing individual differences and multidimensionality in creativity research.
Probing Telicity in Transformer Models
Qi Han (Cornell), Marten van Schijndel (Cornell)
Telicity (whether a described event or state has an explicit endpoint) is a crucial part of sentence understanding and reasoning. To investigate whether transformer models can grasp the concept of telicity, we conducted a prompt-priming experiment on RoBERTa, GPT-2, and Llama 2. We also fine-tuned RoBERTa on a masked token prediction task for telicity. The data consist of minimal sentence pairs that differ in their telic or atelic readings by one token (e.g., I am going to run for/in 2 hours). Fine-tuning RoBERTa suggests that it is sensitive to telicity. However, models' performances varied in the prompt-priming experiment, revealing that telicity is difficult to prime stably and may not be actively used as a feature by incremental models.
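As a rough illustration of the minimal-pair probing idea described in this abstract (a sketch assuming a stock roberta-base checkpoint and the Hugging Face fill-mask pipeline, not the authors' code or data), one could compare a masked language model's scores for the two telicity-marking prepositions directly:

```python
# Illustrative sketch only: compare a masked LM's preference for "for" vs. "in"
# in a telicity minimal pair like "I am going to run for/in 2 hours."
from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base")

sentence = "I am going to run <mask> 2 hours."
predictions = fill(sentence, targets=[" for", " in"])

# A higher score for "for" is consistent with an atelic (durative) completion;
# a higher score for "in" points to a telic (bounded) one for this frame.
for pred in predictions:
    print(pred["token_str"].strip(), round(pred["score"], 4))
```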
-
15:15–15:45
Lightning Talks 2
Focus reveals how people (variably) update event representations to novel material
John R Starr (Cornell), Marten van Schijndel (Cornell)
Decomposing Semantic Decomposition Reveals Origin of Thematic Roles in LLMs
Lucas Y. Li (Cornell), Marten van Schijndel (Cornell)
In event semantics, there are conflicting theories of whether thematic roles are projected from predicates onto their arguments or arise from the arguments themselves (thematic separation). Psycholinguistic research has only found evidence for thematic separation in pre-verbal NPs, which can be explained by incremental processing. In this work, we probe the event representations of LLMs on a large dataset for Semantic Proto-Role Labeling augmented to contain diverse syntactic structures. To quantify the degree to which the models use arguments and predicates during thematic interpretation, we propose a novel Transformer interpretation method that decomposes classification logits into independent contributions from input tokens. Our results provide evidence for thematic separation in both incremental and bidirectional LLMs, with arguments contributing at least as much thematic information as predicates, especially when identifying features of agentiveness.
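To give a loose sense of what it means to attribute a classifier's logit to individual input tokens, here is a simple occlusion-style baseline; this is a generic sketch using a stand-in roberta-base classification head, not the decomposition method proposed in this talk:

```python
# Rough occlusion-style sketch (a generic stand-in, not this talk's method):
# attribute a classification logit to input tokens by masking each token in
# turn and measuring the resulting drop in the logit.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "roberta-base"  # hypothetical stand-in for a trained proto-role classifier
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
model.eval()

sentence = "The chef chopped the onion."
enc = tok(sentence, return_tensors="pt")

with torch.no_grad():
    base_logit = model(**enc).logits[0, 1].item()  # logit for one (arbitrary) class

contributions = []
for i in range(1, enc["input_ids"].shape[1] - 1):  # skip <s> and </s>
    masked = enc["input_ids"].clone()
    masked[0, i] = tok.mask_token_id
    with torch.no_grad():
        logit = model(input_ids=masked,
                      attention_mask=enc["attention_mask"]).logits[0, 1].item()
    token = tok.decode(int(enc["input_ids"][0, i])).strip()
    contributions.append((token, base_logit - logit))

# Tokens whose occlusion lowers the logit most are credited with the largest
# contribution to the prediction.
print(contributions)
```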
Processing Events with Grunts
Richard Brutti (Brandeis)
Representations for extra-linguistic modalities are becoming more common in work on group dynamics, but less attention has been paid to the impact of real-world events on dialogue interpretation. Events contribute significantly to the multimodal context for speech and, from the perspective of the Common Ground, can aid in the automatic understanding of dialogues. Additionally, 'grunts', or non-lexical utterances, play a key role in conversations. An annotation scheme for the classification of conversational grunts is proposed, designed to classify the pragmatic functions of grunts while taking phonetic and prosodic form into account. The approach is inspired by work on laughter, extending it by identifying the 'gruntable' as the event or utterance that precedes a grunt. By representing both speech and events as communicative acts in the sequence of task-based interactions, the situational content resulting from each move can be tracked. These events can modify the Common Ground in ways that speech and gesture cannot, by adding information about objects in the environment. A shared-task video dataset is annotated with the new schema. The characteristics of conversational grunts can give insight into the state of the task sequence and into how participants process information gained from observed events.
-
15:45–16:15
Coffee Break
-
16:15–16:35
Event Structure Shapes How Humans Summarize Naturalistic Narratives
Claire Sun (Rochester), Coraline Rinn Iordan (Rochester)
People routinely summarize their experiences, a process that involves retrieving episodic memories, ordering them by relevance, and organizing them into a cohesive, compressed narrative. This process is especially critical for naturalistic experiences, such as recounting a movie or describing daily events. Although this process is ubiquitous, the behavioral and neural mechanisms underlying summarization remain poorly understood. Here, we propose that summarization relies not only on selecting the most relevant or salient individual concepts of a narrative to be included in a summary, but also on identifying key broader components of a story to guide how the summary of a narrative is structured. According to Event Segmentation Theory, humans automatically segment continuous experiences into discrete temporal units called events. Events are pivotal for encoding naturalistic experiences into episodic memory: neural activity in the hippocampus peaks at event boundaries, and the magnitude of this activity correlates with subsequent memory reactivation for those events and with the amount of information from those events successfully remembered. Building on this framework, we hypothesize that the event structure of naturalistic narratives shapes how people summarize them, consistent with the essential role that events play in memory encoding and recall.
-
16:35–16:55
Understanding Events in Multimodal Data Through Question Answering
Alexander Martin (JHU), Kate Sanders (JHU), Reno Kriz (JHU), David Etter (Etter Solutions), Jimena Guallar-Blasco (JHU), Hannah Recknor (JHU), Cameron Carpenter (JHU), Jingyang Lin (Rochester), William Gantt Walden (JHU), Benjamin Van Durme (Microsoft, JHU)
Understanding and interpreting events is fundamental to human cognition, enabling us to engage with the world and make sense of complex scenarios. Vision-capable AI systems must similarly develop robust event-centric reasoning to support human information needs in multimodal environments. We explore understanding events in videos through template-based question answering. We will discuss methods for extracting information from videos and their aligned text, highlighting the current challenges and potential applications of these tools for advancing multimodal event understanding and representations of eventuality structure in the real world.
-
16:55–17:15
Cross-Document Event-Keyed Summarization
William Gantt Walden (JHU), Pavlo Kuchmiichuk (Rochester), Alexander Martin (JHU), Chihsheng Jin (Rochester), Angela Cao (Rochester), Claire Sun (Rochester), Curisia Allen (Rochester), Aaron Steven White (Rochester)
Providing useful insights about events to readers requires not only extracting information from text sources relevant to their information needs, but also presenting this information in a readable form. Doing this effectively in turn requires the ability both to synthesize information about an event across multiple sources and to support such synthesis for diverse types of events and participants. To this end, we extend recent work on event-keyed summarization (EKS; Gantt et al., 2024)—the task of producing a summary of a specific event, given a (single) document and an extracted event representation—to the cross-document setting (CDEKS). Drawing on the FAMuS dataset for cross-document argument extraction (CDAE; Vashishtha et al., 2024), we release a new, expert-curated dataset for CDEKS, dubbed SEAMUS (Summaries of Events Across Multiple Sources). We present a range of experiments with both smaller fine-tuned summarization models and zero- and few-shot prompted large language models (LLMs). We also present a set of ablations and a detailed human evaluation study, showing SEAMUS to be a useful benchmark for CDEKS.
-
17:15–17:30
Closing Remarks
-
18:00–22:00
Dinner at Strangebird (authors and organizers only)