Can a ‘Video-Based Conversational Chatbot’ Survey Really Work? A Pilot Study in New York City
Assessing the Feasibility of a Video-Based Conversational Chatbot Survey for Measuring Perceived Cycling Safety: A Pilot Study in New York City
Key Points Summarized by HCI Today
- This study examines whether a conversational AI chatbot survey built around bicycle-lane videos from New York City can effectively measure perceptions of cycling safety.
- The research team showed participants road videos and had them converse with an AI chatbot about whether each situation felt safe and why.
- Sixteen participants completed evaluations of all nine road videos, and ratings for usability and chatbot friendliness were generally high.
- The analysis found that trees and greenery increased the sense of safety, while construction, cars, and parking increased the sense of danger.
- Because this method can capture both how people actually feel and the reasons behind those feelings, it holds strong potential as a new tool for bicycle-path design.
This summary was generated by an AI editor based on HCI expert perspectives.
Why Read This from an HCI Perspective
This article is especially meaningful for HCI/UX practitioners and researchers because it treats AI not as a mere automated response tool but as an interface that elicits people's lived experiences. In particular, the attempt to combine video with a conversational chatbot, turning memory-reliant surveys into something closer to an in-the-moment experience, is an intriguing direction that addresses limitations of existing measurement approaches. The attention to both user experience and data quality also makes the work directly relevant to practice.
CIT's Commentary
The core value of this study lies in interaction design rather than model performance. The flow of watching a video, making an immediate judgment, and then explaining why serves as a strong mechanism for externalizing sensations users already carry in their heads. At the same time, it exposes a clear trade-off: as repeated responses accumulate, fatigue builds and drop-off grows. For tools like this, then, what matters is not simply asking more questions but deciding when to stop, what to ask, and where to leave room for intervention.

In mobility contexts where safety is critical, it matters especially how transparently the system's state is presented and how readily users can trust, and correct, the AI's judgments. Moreover, using LLMs not only as response generators but as part of UX measurement instruments is a practical move that expands the research methodology itself. In Korea, where users are accustomed to short, fast interactions in environments such as Naver, Kakao, and public surveys, this kind of design may require even more concise and clearly defined feedback loops.
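To make that stopping-rule point concrete, here is a minimal sketch in Python of how such a chatbot survey might cap follow-up probes per video and back off when answers start to look fatigued. This is not the paper's implementation; the controller name, the thresholds, and the word-count fatigue heuristic are all illustrative assumptions.

```python
# Hypothetical sketch of a follow-up controller for a video-based chatbot
# survey. Thresholds, names, and the fatigue heuristic are illustrative
# assumptions, not the study's actual design.

from dataclasses import dataclass, field

MAX_FOLLOWUPS = 2          # hard cap on "why" probes per video clip
MIN_ANSWER_WORDS = 4       # answers shorter than this hint at fatigue

@dataclass
class ClipSession:
    clip_id: str
    answers: list[str] = field(default_factory=list)

    def record(self, answer: str) -> None:
        self.answers.append(answer)

    def should_probe_again(self) -> bool:
        """Ask another follow-up only if the cap is not reached and the
        latest answer does not look like a fatigued one-liner."""
        followups_used = max(0, len(self.answers) - 1)
        if followups_used >= MAX_FOLLOWUPS:
            return False
        last = self.answers[-1] if self.answers else ""
        return len(last.split()) >= MIN_ANSWER_WORDS


if __name__ == "__main__":
    session = ClipSession(clip_id="nyc_bike_lane_03")
    session.record("It felt unsafe because parked cars blocked the lane.")
    print(session.should_probe_again())  # True: rich answer, cap not hit
    session.record("idk")                # terse answer suggests fatigue
    print(session.should_probe_again())  # False: stop probing this clip
```

In practice the fatigue signal could also draw on response latency or explicit skip actions; the design point is that a controller, not a fixed question bank, decides when enough narrative data has been collected.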
Questions to Consider While Reading
- Q. In a video-based chatbot survey, how should the number of questions and the way context is presented be adjusted to reduce user fatigue while still collecting enough narrative data?
- Q. When the AI presents its explanations first in safety-related judgments, how can we reduce the bias that pulls users toward those explanations?
- Q. If this approach were applied to Korean mobility services or local-government complaint collection, which interaction elements would need to be localized first?
This commentary was generated by an AI editor based on HCI expert perspectives.
Please refer to the original for accurate details.