How to Express More Fine-grained Data with Spatial Audio
Accessible Fine-grained Data Representation via Spatial Audio
Key Points Summarized by HCI Today
- This study investigates sound-based representations that help people with visual impairments and low vision understand numerical data more easily.
- Previous work mapped the size of a number to sound pitch, but this approach struggles to convey exact values, and even signs, accurately.
- The research team created SpaudioData, a spatial audio method that represents numbers using the direction of sound between left and right.
- In an experiment with 26 participants, the method outperformed pitch for identifying signs and exact values, as well as for understanding trends.
- However, for comparing two values, pitch held a slight advantage, so combining both methods may work better in the future.
This summary was generated by an AI editor based on HCI expert perspectives.
Why Read This from an HCI Perspective
This article reframes data visualization from a ‘problem of seeing’ to a ‘problem of how to listen and interact.’ In sonification for BLV (blind and low-vision) users in particular, what matters is not just attaching sounds but determining which sound channels map best to which kinds of information. In HCI practice and research, the key question is how quickly and accurately accessibility features are understood in real use, and this piece puts those criteria to the test experimentally.
CIT's Commentary
What’s interesting is that this study chose spatial audio over pitch not out of simple technical preference, but from the question of which representation helps users notice numbers accurately with less effort. That said, the trade-offs become clear once you translate the results into a product. Spatial audio is strong for encoding signs and exact values, but it compares values less well than pitch does. In other words, rather than trying to solve everything with a single channel, it is more realistic to split the information across representations based on what the user wants to know right now. In this context, double-coding that combines pitch and direction seems quite convincing. Also, a method that works well for BLV users does not automatically provide the same experience for all users, so a real product must also consider headphone support, individual differences in HRTFs (head-related transfer functions), and the learning burden.
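The double-coding idea above can be made concrete with a small sketch. This is not the paper's actual mapping; the value range, frequency range, and function name are assumptions chosen for illustration. It encodes a value's sign and magnitude into stereo pan (direction) while simultaneously mapping magnitude to pitch, so the two channels carry redundant but complementary information.

```python
# A minimal sketch (assumed parameters, not the paper's actual mapping) of
# double-coding a data value into two audio dimensions:
#   - sign and magnitude -> stereo pan (negative = left, positive = right)
#   - magnitude          -> pitch (linear map onto a comfortable range)

def double_code(value, v_max=100.0, f_min=220.0, f_max=880.0):
    """Return (pan, frequency_hz) for a value in [-v_max, v_max].

    pan runs from -1.0 (full left) to +1.0 (full right); zero stays centered.
    frequency rises linearly with |value| from f_min to f_max.
    """
    v = max(-v_max, min(v_max, value))   # clamp to the encodable range
    pan = v / v_max                      # direction carries the sign
    freq = f_min + (abs(v) / v_max) * (f_max - f_min)
    return pan, freq

# Example: -50 pans halfway left, while its pitch reflects magnitude only.
print(double_code(-50))    # (-0.5, 550.0)
print(double_code(100))    # (1.0, 880.0)
```

One design choice worth noting: because pitch here encodes magnitude only, two values of opposite sign sound identical in pitch and are disambiguated purely by direction, which mirrors the paper's finding that direction handles signs well while pitch supports magnitude comparison.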
Questions to Consider While Reading
- Q. When applying double-coding that uses both spatial audio and pitch, what interaction rules could prevent users from confusing the two channels?
- Q. In real product environments, experience may vary with headphone type and individual HRTF differences; what approaches can compensate for this with minimal additional burden?
- Q. When placing different data tasks, such as signs, exact values, and trends, within a single accessibility interface, how should you decide which information is played automatically and which is only played when the user requests it?
This commentary was generated by an AI editor based on HCI expert perspectives.
Please refer to the original for accurate details.