Talk2AI: A Longitudinal Dataset of Human--AI Persuasive Conversations
HCI Today summarized the key points
- Talk2AI is a large-scale collection of conversational data examining how persuasion and changes in thinking unfold as humans and AI talk to each other.
- The dataset records four weeks of conversations between 770 Italian adults and one of GPT-4o, Claude, DeepSeek, or Mistral.
- Each week, participants held 10 conversations on topics such as climate change, math anxiety, and health misinformation, and also answered surveys.
- The research team organized the data so that conversation content is linked with participant information such as age, personality, and trust in the AI.
- The dataset supports analysis of how AI changes people's opinions and beliefs over time.
This summary was generated by an AI editor based on HCI expert perspectives.
Why Read This from an HCI Perspective
This article is important for HCI researchers and UX practitioners because it treats LLMs not just as text generators, but as conversational interfaces that can change people's thoughts and attitudes. In particular, because it captures not a single interaction but exchanges over multiple weeks, it allows us to examine issues such as trust formation, persuasion, and points of intervention over time. It also prompts product teams to think about what actually changes people's minds when adding AI to a product.
CIT's Commentary
What's interesting is that this study focuses less on how to build an AI that produces 'good answers' and more on how a conversation shapes a person's experience. Even with the same model, persuasion effects can vary depending on the topic, an individual's traits, and repeated interaction patterns—suggesting that simply improving model performance in real services does not automatically guarantee safe or desirable interactions. In particular, interface mechanisms such as prompting questions, encouraging rebuttals, and limiting message length can significantly change user behavior, so product teams need to evaluate whether these mechanisms help guide users or instead intensify manipulation. The study's measurement of how trust and conviction evolve across repeated conversations also offers hints for designing more rigorous LLM-based UX evaluations. For example, even if you use an LLM to assist with analyzing survey responses or conversation patterns, you still need to separately validate the tool's own biases and the consistency of its measurements.
Questions to Consider While Reading
- Q. In repeated conversations, how can we separate and verify whether increased persuasion effectiveness comes from the model's response quality or from the interface's prompting and guidance methods?
- Q. What interaction signals could distinguish the points where users perceive AI persuasion as 'help' from those where they perceive it as 'manipulation'?
- Q. If we applied this dataset's approach to Korean services from Naver, Kakao, and startups, for which topics and user groups would the results differ the most?
This commentary was generated by an AI editor based on HCI expert perspectives.
Please refer to the original for accurate details.