ROLE
Product Designer
TEAM
Jae Sung Park
Uyen Hoang
TIMELINE
3 day design sprint
TOOLS
Figma, Blender, After Effects
OVERVIEW
Deaf people can’t perceive emotion over the phone without video
Conversation is not the words we say—it’s the way we say them. A pause. A laugh. A shift in tone. A raised brow. These small cues carry big emotional weight. But what happens when those cues are stripped away?
APPROACH
Nuance: calls with emotional context
A mobile calling app that pairs animation with live captioning.
OUTCOMES
Introduced an expressive 3D character to give emotional cues
AI analyzes the caller’s tone and other subtle cues to animate the 3D character, communicating emotion.
ASL video-to-text-to-speech
ASL is translated to text, then to voice for the caller on the other end.
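The ASL relay above can be sketched as a two-stage pipeline. This is a minimal illustration, not the shipped implementation: `recognize` and `speak` are hypothetical stand-ins for a sign-language recognition model and a text-to-speech engine.

```python
from typing import Callable, Iterable, Tuple

def asl_relay(
    frames: Iterable[bytes],
    recognize: Callable[[Iterable[bytes]], str],
    speak: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Translate ASL video frames to caption text, then to audio
    for the hearing caller. Both models are injected as callables;
    real implementations would replace these stand-ins."""
    text = recognize(frames)   # sign-language recognition (hypothetical)
    audio = speak(text)        # text-to-speech (hypothetical)
    return text, audio
```

Keeping the two models injectable means the caption text can be shown to the deaf user at the same moment the synthesized voice reaches the hearing caller.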
INITIAL FINDINGS
Phone calls are a mess
People don't know how to communicate with the hard-of-hearing.
Deaf users often resort to messaging even when captions are available.
“Can we do this over email?”
It’s not hard to find a shared problem
Numerous deaf influencers complain about phone calls and describe captions as a bare necessity.
Captions miss the mark
Current calling apps fall back on text, still losing the emotional clarity of a conversation.
Phone calls turn into bubbles
Captions don’t provide enough context in conversations.
Call vs text
The feeling of a real conversation is lost
“I know what they’re saying, but I can’t feel how they’re saying it.”
How do we bring the emotional cues of a real conversation into the calling experience?
solution
Combining visual emotional cues with captioning.
facial expressions
Without auditory cues, deaf people rely heavily on facial expressions.
Animations can show facial expressions without video.
Even though these were drawn in two seconds, each face’s emotion is already readable.
AI may evolve to be emotionally intelligent.
design for the future
AI may not detect emotions accurately today, but emotion-recognition models are improving, so we designed for where the technology is headed.
3D character expressions
I modeled, rigged, and animated an avatar that changed expressions based on the tone, pacing, and other subtle nuances.
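The tone-to-expression logic can be illustrated with a small sketch. This is not the production model: the prosody features, thresholds, emotion labels, and expression presets are all hypothetical, standing in for a learned classifier driving the avatar rig.

```python
from dataclasses import dataclass

@dataclass
class Prosody:
    pitch_variance: float  # 0..1, normalized variability of pitch
    speech_rate: float     # 0..1, normalized speaking speed
    pause_ratio: float     # 0..1, fraction of the window that is silence

# Expression presets the avatar rig could blend toward (hypothetical values).
EXPRESSIONS = {
    "excited":  {"brow_raise": 0.8, "smile": 0.9},
    "calm":     {"brow_raise": 0.2, "smile": 0.4},
    "hesitant": {"brow_raise": 0.5, "smile": 0.1},
}

def classify_emotion(p: Prosody) -> str:
    """Crude rule-based stand-in for a learned emotion model."""
    if p.pause_ratio > 0.5:
        return "hesitant"   # long pauses read as uncertainty
    if p.pitch_variance > 0.6 and p.speech_rate > 0.6:
        return "excited"    # lively pitch plus fast speech
    return "calm"

def expression_for(p: Prosody) -> dict:
    """Pick the preset the rig should animate toward."""
    return EXPRESSIONS[classify_emotion(p)]
```

The point of the sketch is the separation: whatever replaces `classify_emotion`, the rig only ever consumes a small set of named expression presets.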
Just facial expressions aren’t enough; we need more clarity.
layering visual cues
How can we make the most of sight, the most engaged sense?
The more senses an experience engages, the more real it feels. By the same logic, layering more information onto the most active sense, sight, helps users pick up context clues more efficiently.
We can layer multiple visual emotional cues.
Multiple visual cues = clarity
The more cues, the more natural the conversation will feel.
Colorful expressions.
Color is often directly linked to emotion, so we leveraged color as a visual cue for emotion, making facial expressions easier to interpret.
Mood setting: We give users control over what colors mean to them.
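The mood setting can be modeled as a default emotion-to-color map with per-user overrides. A minimal sketch, assuming hypothetical emotion names and colors:

```python
from typing import Dict, Optional

# Default emotion → color map (hypothetical hex values).
DEFAULT_COLORS: Dict[str, str] = {
    "excited":  "#FFB000",
    "calm":     "#4A90D9",
    "hesitant": "#9B9B9B",
}

def color_for(emotion: str, user_overrides: Optional[Dict[str, str]] = None) -> str:
    """A user's chosen color for an emotion wins over the default;
    unknown emotions fall back to neutral white."""
    overrides = user_overrides or {}
    return overrides.get(emotion, DEFAULT_COLORS.get(emotion, "#FFFFFF"))
```

Storing only the overrides keeps each user's mood setting small while letting the defaults evolve independently.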
interface snapshots
Review recent + missed calls
Not just plain text transcription, but emotional documentation as well.
Home screen: Everything necessary consolidated.
We decided that the navbar wasn’t necessary given the speedy product experience. People just want to make a call, not dig through pages.
Reflection
What I learned
Layered cues/signifiers bring clarity.
Mixing different cues builds an immersive experience, e.g., a design for blind users could layer haptic and auditory cues.
No need to design everything to communicate value.
We put most of our effort into divergent problem-thinking rather than throwaway features.
We take communication for granted.
Accessibility is a necessary human right, not another box that has to be ticked off.