ROLE
Product Designer
TEAM
Jae Sung Park
Uyen Hoang
TIMELINE
3 day design sprint
TOOLS
Figma, Blender, After Effects
OVERVIEW
Deaf people can’t perceive emotion over the phone without video
Conversation is not the words we say—it’s the way we say them. A pause. A laugh. A shift in tone. A raised brow. These small cues carry big emotional weight. But what happens when those cues are stripped away?
APPROACH
Nuance: calls with emotional context
A mobile calling app that pairs animation with live captioning.
OUTCOMES
Introduced an expressive 3D character to give emotional cues
AI analyzes the caller’s tone and other subtle cues to animate the 3D character, communicating emotion.
ASL video-to-text-to-speech
ASL is translated to text, then to voice for the caller on the other end.
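The ASL relay above can be sketched as a two-stage pipeline. This is a minimal illustration, not the shipped implementation: `recognize` and `speak` are hypothetical stand-ins for a sign-language recognition model and a text-to-speech engine.

```python
from typing import Callable, Iterable, Tuple

def asl_relay(
    frames: Iterable[bytes],
    recognize: Callable[[Iterable[bytes]], str],
    speak: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Translate ASL video frames to caption text, then to audio
    for the hearing caller. Both models are injected as callables;
    real implementations would replace these stand-ins."""
    text = recognize(frames)   # sign-language recognition (hypothetical)
    audio = speak(text)        # text-to-speech (hypothetical)
    return text, audio
```

Keeping the two models injectable means the caption text can be shown to the deaf user at the same moment the synthesized voice reaches the hearing caller.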
INITIAL FINDINGS
Phone calls are a mess
People don't know how to communicate with the hard-of-hearing.
Deaf users often resort to messaging even when captions are available.
“Can we do this over email?”
It’s not hard to find a shared problem
Numerous deaf influencers complain about phone calls and describe captions as a bare necessity.
Captions miss the mark
Current calling apps fall back on text, still losing the emotional clarity of a conversation.
Phone calls turn into bubbles
Captions don’t provide enough context in conversations.
Call vs text
The feeling of a real conversation is lost
“I know what they’re saying, but I can’t feel how they’re saying it.”
How do we bring the emotional cues of a real conversation into the calling experience?
solution
Combining visual emotional cues with captioning.
facial expressions
Without auditory cues, deaf people rely heavily on facial expressions.
Animations can show facial expressions without video.
Even though these were drawn in two seconds, each face’s emotion is already readable.
AI may evolve to be emotionally intelligent.
design for the future
AI may not detect emotions accurately today, but emotion-recognition models are improving, so we designed for where the technology is headed.
3D character expressions
I modeled, rigged, and animated an avatar that changed expressions based on the tone, pacing, and other subtle nuances.
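The tone-to-expression logic can be illustrated with a small sketch. This is not the production model: the prosody features, thresholds, emotion labels, and expression presets are all hypothetical, standing in for a learned classifier driving the avatar rig.

```python
from dataclasses import dataclass

@dataclass
class Prosody:
    pitch_variance: float  # 0..1, normalized variability of pitch
    speech_rate: float     # 0..1, normalized speaking speed
    pause_ratio: float     # 0..1, fraction of the window that is silence

# Expression presets the avatar rig could blend toward (hypothetical values).
EXPRESSIONS = {
    "excited":  {"brow_raise": 0.8, "smile": 0.9},
    "calm":     {"brow_raise": 0.2, "smile": 0.4},
    "hesitant": {"brow_raise": 0.5, "smile": 0.1},
}

def classify_emotion(p: Prosody) -> str:
    """Crude rule-based stand-in for a learned emotion model."""
    if p.pause_ratio > 0.5:
        return "hesitant"   # long pauses read as uncertainty
    if p.pitch_variance > 0.6 and p.speech_rate > 0.6:
        return "excited"    # lively pitch plus fast speech
    return "calm"

def expression_for(p: Prosody) -> dict:
    """Pick the preset the rig should animate toward."""
    return EXPRESSIONS[classify_emotion(p)]
```

The point of the sketch is the separation: whatever replaces `classify_emotion`, the rig only ever consumes a small set of named expression presets.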
Just facial expressions aren’t enough; we need more clarity.
layering visual cues
How can we make the most of sight, the most engaged sense?
The more senses an experience engages, the more real it feels. By the same logic, layering more information onto the most active sense, sight, helps users pick up context clues more efficiently.
We can layer multiple visual emotional cues.
Multiple visual cues = clarity
The more cues, the more natural the conversation will feel.
Colorful expressions.
Color is often directly linked to emotion, so we leveraged color as a visual cue for emotion, making facial expressions easier to interpret.
Mood setting: We give users control over what colors mean to them.
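The mood setting can be modeled as a default emotion-to-color map with per-user overrides. A minimal sketch, assuming hypothetical emotion names and colors:

```python
from typing import Dict, Optional

# Default emotion → color map (hypothetical hex values).
DEFAULT_COLORS: Dict[str, str] = {
    "excited":  "#FFB000",
    "calm":     "#4A90D9",
    "hesitant": "#9B9B9B",
}

def color_for(emotion: str, user_overrides: Optional[Dict[str, str]] = None) -> str:
    """A user's chosen color for an emotion wins over the default;
    unknown emotions fall back to neutral white."""
    overrides = user_overrides or {}
    return overrides.get(emotion, DEFAULT_COLORS.get(emotion, "#FFFFFF"))
```

Storing only the overrides keeps each user's mood setting small while letting the defaults evolve independently.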
interface snapshots
Review recent + missed calls
Not just plain text transcription, but emotional documentation as well.
Home screen: Everything necessary consolidated.
We decided that the navbar wasn’t necessary given the speedy product experience. People just want to make a call, not dig through pages.
Reflection
What I learned
Layered cues/signifiers bring clarity.
Mixing different cues builds an immersive experience, e.g., a design for blind users could layer haptic and auditory cues.
No need to design everything to communicate value.
We put most of our effort into divergent problem-thinking rather than throwaway features.
We take communication for granted.
Accessibility is a necessary human right, not another box that has to be ticked off.