Interruption Handling for Robots

Robots should not just detect that you cut in. They should infer why.

Interruptions are not all the same. Some signal agreement, some offer help, some ask for clarification, and some challenge the floor entirely. This project built a conversational framework that classifies interruption intent in real time and adapts the robot's response so the interaction stays socially coherent.

RSS 2025 | Conversational Robots | LLM Classification | Open Source
Taxonomy and handling patterns for interruptions in human-human and human-robot interaction.

The handling patterns separate interruptions by intent and map them to floor-holding, brief acknowledgement, contextual response, or yield behaviors.

Intent-aware interruption handling for conversational robots

RSS 2025: Shiye Cao, Jiwon Moon, Amama Mahmood, Victor Nikhil Antony, Ziang Xiao, Anqi Liu, Chien-Ming Huang

93.69%: Interruptions successfully handled without breaking the conversation
88.78%: Interruption intent classified correctly in real time
246: Human-human interruptions used to derive the handling strategies
4: Interruption types with distinct robot behaviors

Always ignoring interruptions feels rigid. Always yielding feels fragile.

Existing systems usually choose one of two bad defaults. They either plow ahead as if the user never spoke, or they stop immediately every time speech overlaps. Human conversation works differently. People infer intent from timing, wording, and context. This project brings that same logic into robotic turn-taking.

Ignore-all baseline: 27.03%

Treating all interruptions as noise breaks down fast when the user is trying to redirect, clarify, or help.

Yield-all baseline: 84.68%

Always stopping is better, but it still mishandles supportive overlaps that should be acknowledged without giving up the floor.

Intent-aware system: 93.69%

The strongest performance comes from treating interruption handling as a social inference problem, not a raw speech-detection problem.

The interaction problem becomes manageable once the robot knows what kind of interruption it is hearing.

The taxonomy came from analyzing human conversation across discussions, interviews, and briefings. Instead of treating overlap as one event, the system distinguishes between supportive and disruptive moves in the interaction.

Examples of different interruption types and robot responses in conversation.

The same robot turn can be met with agreement, clarification, assistance, or disruption. Each requires a different response strategy.

Supportive overlap

Cooperative agreement

The user is signaling understanding or agreement while the robot is speaking. The right move is acknowledgement, not surrender.

Supportive overlap

Cooperative assistance

The user is helping the robot finish a point or supplying a needed word. The robot briefly listens, accepts the assist, and resumes.

Contextual pivot

Cooperative clarification

The user needs an explanation before the conversation can continue. The robot answers in context, then returns to the original thread.

Floor challenge

Disruptive interruption

The user is changing topic, disagreeing, or explicitly taking the floor. Here, the robot should yield or briefly wrap up before yielding.
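
The four cards above amount to a lookup from inferred intent to handling strategy. The following is a minimal sketch of that lookup, not the project's actual code: the names `Intent`, `Strategy`, `strategy_for`, and `robot_keeps_floor` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Intent(Enum):
    """The four interruption types described in the cards above."""
    COOPERATIVE_AGREEMENT = "cooperative agreement"
    COOPERATIVE_ASSISTANCE = "cooperative assistance"
    COOPERATIVE_CLARIFICATION = "cooperative clarification"
    DISRUPTIVE = "disruptive interruption"


class Strategy(Enum):
    """Hypothetical labels for the matching robot behaviors."""
    ACKNOWLEDGE_AND_CONTINUE = "acknowledge briefly and keep the floor"
    ACCEPT_ASSIST_AND_RESUME = "listen briefly, accept the assist, resume"
    ANSWER_AND_RESUME = "answer in context, then return to the thread"
    YIELD = "yield, optionally after a brief wrap-up"


# One handling strategy per interruption type, following the card text.
STRATEGY_BY_INTENT = {
    Intent.COOPERATIVE_AGREEMENT: Strategy.ACKNOWLEDGE_AND_CONTINUE,
    Intent.COOPERATIVE_ASSISTANCE: Strategy.ACCEPT_ASSIST_AND_RESUME,
    Intent.COOPERATIVE_CLARIFICATION: Strategy.ANSWER_AND_RESUME,
    Intent.DISRUPTIVE: Strategy.YIELD,
}


def strategy_for(intent: Intent) -> Strategy:
    """Look up the handling strategy for an inferred intent."""
    return STRATEGY_BY_INTENT[intent]


def robot_keeps_floor(intent: Intent) -> bool:
    """True when the robot is expected to finish its original turn."""
    return strategy_for(intent) is not Strategy.YIELD
```

Only the disruptive type ends the robot's turn in this sketch. The three cooperative types all return to the original thread, which is what separates this approach from a yield-all default.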

The system moves from overlap detection to intent classification to a matched handling strategy.

The architecture is intentionally narrow and legible. It watches for user-initiated overlap, classifies the interruption with conversational context, and then selects the response policy that best fits the inferred intent.

Interruption handling architecture showing detection, intention classification, and strategy selection.

Three modules coordinate the interaction: user-initiated interruption detection, LLM-based intent classification, and handling logic tuned to each interruption type.

Detection: Overlap, not turn exchange

The system monitors simultaneous speech and filters out normal floor exchange so only genuine user-initiated interruptions enter the pipeline.

Classification: LLM in context

Conversational history and timing are passed into the classifier so the system can distinguish help, clarification, agreement, and disruption.

Handling: Behavior policy

The robot then decides whether to hold the floor, acknowledge and continue, answer and resume, or yield gracefully.
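
The three modules can be sketched as one small pipeline. This is an illustrative outline under several assumptions, not the released implementation. The `SpeechEvent` fields, the prompt wording, the policy names, and the yield fallback for unrecognized labels are all hypothetical. The LLM is passed in as a plain callable, and `keyword_stub` stands in for it so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

INTENTS = ("agreement", "assistance", "clarification", "disruption")

# Hypothetical policy names, one per interruption type.
POLICY = {
    "agreement": "acknowledge_and_continue",
    "assistance": "accept_assist_and_resume",
    "clarification": "answer_and_resume",
    "disruption": "yield_floor",
}


@dataclass
class SpeechEvent:
    user_text: str
    robot_is_speaking: bool         # user speech overlaps the robot's turn
    robot_words_remaining: int      # 0 means the robot was finishing anyway
    seconds_into_robot_turn: float  # timing cue passed to the classifier


def is_user_initiated_interruption(event: SpeechEvent,
                                   min_words_remaining: int = 1) -> bool:
    """Detection: keep overlap that cuts into an unfinished robot turn.

    Overlap at the very end of a turn is treated as normal floor exchange.
    The threshold is an assumed heuristic.
    """
    return (event.robot_is_speaking
            and event.robot_words_remaining >= min_words_remaining)


def build_prompt(history: List[str], robot_utterance: str,
                 event: SpeechEvent) -> str:
    """Classification input: conversational history plus timing."""
    lines = [
        "Classify the user's interruption as one of: " + ", ".join(INTENTS) + ".",
        "Conversation so far:",
        *history,
        f"Robot was saying: {robot_utterance}",
        f"User cut in after {event.seconds_into_robot_turn:.1f}s "
        f"with: {event.user_text}",
        "Answer with the label only.",
    ]
    return "\n".join(lines)


def handle_speech(event: SpeechEvent, history: List[str], robot_utterance: str,
                  classify: Callable[[str], str]) -> Optional[str]:
    """Run detection, classification, and handling; None means no interruption."""
    if not is_user_initiated_interruption(event):
        return None
    label = classify(build_prompt(history, robot_utterance, event)).strip().lower()
    # Assumed fallback: yield when the classifier returns an unknown label.
    return POLICY.get(label, "yield_floor")


def keyword_stub(prompt: str) -> str:
    """Toy stand-in for the LLM call; real classification needs a model."""
    user_text = prompt.splitlines()[-2].split("with: ", 1)[1].lower()
    if "?" in user_text:
        return "clarification"
    if user_text.startswith(("yeah", "right", "exactly")):
        return "agreement"
    return "disruption"
```

For example, `handle_speech(SpeechEvent("Wait, what does that mean?", True, 12, 2.0), [], "The plan has three steps", keyword_stub)` returns `"answer_and_resume"`. Falling back to yielding is a design choice in this sketch, motivated by the yield-all baseline being the safer of the two naive defaults.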

The strongest demo is not one interruption. It is what the system does when several overlap patterns occur in sequence.

Real conversation gets messy. This example shows the system handling compounded interruptions while keeping the interaction coherent and preserving conversational momentum.

Example interaction showing multiple compounded interruptions handled in sequence.

A single turn can contain agreement, clarification, and floor exchange. The system adapts its response each time instead of relying on a single fallback behavior.
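
Handling several interruptions within one robot turn mostly comes down to remembering where to resume. The sketch below replays a robot turn split into chunks, with interruptions arriving during specific chunks. The function name `replay`, the chunking, and the log format are assumptions made for illustration only.

```python
from typing import Dict, List


def replay(robot_chunks: List[str], interruptions: Dict[int, str]) -> List[str]:
    """Replay one robot turn with interruptions arriving mid-turn.

    `interruptions` maps a chunk index to the intent label inferred for an
    interruption that arrives during that chunk.
    """
    log: List[str] = []
    for index, chunk in enumerate(robot_chunks):
        log.append(f"robot: {chunk}")
        intent = interruptions.get(index)
        if intent is None:
            continue
        if intent == "disruption":
            # A floor challenge ends the turn; remaining chunks are dropped.
            log.append("robot: yields the floor")
            break
        # Cooperative interruptions are handled, then the turn continues
        # from the next chunk rather than restarting.
        log.append(f"robot: handles {intent}, then resumes")
    return log
```

With `replay(["point one", "point two", "point three"], {0: "agreement", 1: "clarification"})`, the robot handles both interruptions and still delivers all three chunks. Replacing the second label with `"disruption"` ends the turn after the second chunk.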

The main result is not just higher handling accuracy. It is better conversation quality when interruptions are resolved appropriately.

The evaluation makes the interaction stakes visible. Mishandled interruptions are not just technical misses; they reduce how included and satisfied people feel in the conversation.

104 / 111

Handled successfully

The intent-aware framework substantially outperformed both naive baselines in resolving user interruptions without conversational breakdown.
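
The headline rate follows directly from the count above: 104 of 111 is 93.69%. The baselines are reported on this page only as percentages. If they were scored on the same 111 interruptions, which is an assumption here, the implied counts can be back-calculated as a sanity check. The function names below are hypothetical.

```python
TOTAL_INTERRUPTIONS = 111
HANDLED_BY_INTENT_AWARE = 104


def success_rate(successes: int, total: int) -> float:
    """Percentage of interruptions handled, rounded to two decimals."""
    return round(100 * successes / total, 2)


def implied_count(percent: float, total: int) -> int:
    """Back-calculate a whole-number count from a reported percentage."""
    return round(percent / 100 * total)


intent_aware = success_rate(HANDLED_BY_INTENT_AWARE, TOTAL_INTERRUPTIONS)

# Assumes both baselines share the 111-interruption denominator.
ignore_all_count = implied_count(27.03, TOTAL_INTERRUPTIONS)
yield_all_count = implied_count(84.68, TOTAL_INTERRUPTIONS)
```

Under that assumption the implied counts are 30 and 94, and both round back to the reported 27.03% and 84.68%.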

ASR

Main failure source

The classifier itself held up well. Most failures came from speech recognition errors, pointing to multimodal sensing as the next leverage point.

rho = -0.43

User experience consequence

Failed interruption handling correlated with lower perceived inclusion and lower discussion satisfaction, suggesting that turn-taking quality shapes how people experience the conversation.

"The moment I speak, it wants to just listen. My speech should take precedence over anything it says."

Participant — on expectations for conversational robots

"Luna, we don't have time," and the robot stopped, acknowledged, and moved on. That felt right.

Participant — on a disruptive interruption handled well

Robots that speak naturally also need to be interruptible naturally.

This project argues for a higher bar in conversational robotics. Interruption handling should not be a simple stop-or-ignore feature. It is a social behavior layer that helps robots stay readable, collaborative, and appropriately responsive once real conversation begins.