Case Study

Echo Look (Product and VUI Design)

I was hired to help a fledgling product team transform their initial concept lab pitch into a production-worthy multimodal design for an entirely new hardware product category at Amazon.

Problem statement

How might we create a completely new experience that helps customers manage their wardrobes, combining natural user interfaces with traditional user interfaces and Amazon shopping technology? At the time I was hired, a concept lab team had identified a product opportunity combining fashion with natural user interfaces, and had received funding for a small pod of full-time employees to work on the concept as a secretive, “tented” project team.

I became employee number 3 on the Echo Look project, joining product manager Maggie McDowell and Lab126 product leader Robert Zehner – brought on by our VP and champion Eva Manolis. The team had received approval to take a prototype and turn it into a full-fledged product proposal. However, the initial proposal was not commercially viable, and our team adapted rapidly while maintaining our connection to the customer problems being solved.

Personal contributions and responsibilities

  • Design and delivery of a photo booth system used for computer vision data collection
  • Foundational work: mind maps, bodystorming, research, participation in outside ethnography
  • Solo production storyboarding at scale (all key product scenarios)
  • Hands-on prototyping and collaboration with UI developers
  • Participation in technical reviews and strategic platform planning as UX representative
  • Design and facilitation of multiple usability studies
  • Early app information architecture and wireframing
  • Relationship management and requirements gathering with Alexa stakeholders
  • Defining the official hardware requirements for speech functionality
  • Delivering all voice user interface designs

Team and timeline

My work on the project began in September 2014, during our initial ethnographic research. I was on the team for the first year full-time and the second year as their voice UI specialist; the project ran about 3 years.

  • As mentioned, the team started out as a small, 3-person concept pod, and I was the sole UX designer.
  • At the end of our concept phase, we were greenlit for pre-production and the team grew to several dozen across two locations: a hardware and development team located in Sunnyvale and a small design and production team in Seattle.
  • At the end of the pre-production phase, we were greenlit for production and the team grew to 80 people across the two locations, with about 8 in Seattle and the remainder in Sunnyvale.
  • About halfway through production, the project was forked to add a major new feature, and an additional development team was added.

DEVELOPMENT PEERS: Throughout the process, I collaborated heavily with my peers on the Sunnyvale team, traveling near-weekly to their offices. I worked most closely with our front-end prototypers, project managers, and development manager, with less frequent collaboration with natural language, computer vision, and hardware engineering specialists.

DESIGN PEERS: Our Seattle design team started out as just me; another UX designer joined about 3 months in, and a Principal designer joined about 6 months in to focus on the mobile experience. I was also solely responsible for UX research (aside from the initial ethnographic work run by an outside firm) until a Senior UX Researcher joined about 6 months into the process.

Stakeholders

As a first-to-market hardware product, we had many stakeholders. CEO Jeff Bezos was a primary stakeholder and made all final decisions about greenlights in each phase (concept, pre-production, production). My storyboards, design support in PR-FAQs, and prototype design were critical in supporting our approvals, particularly in those early stages.

Additional stakeholders included Lab126 (Sunnyvale) leadership, the Vice President of Consumer Engagement, and the Vice President of Amazon Fashion. Once the determination was made to become an Alexa device, Alexa leadership also became stakeholders and partners in our work, and I was the key liaison with many of those individuals.

Concept phase

Initial pitch

Heading into a CEO level product review, we realized that our original idea was unlikely to obtain approval for a variety of reasons, and we brainstormed alternatives that would still solve the problems we identified in ethnography. One such strategy was the “Look”, a camera that plugged into the wall and attached to an existing mirror or wall, relying on an app for visuals. I decided to explore the idea with a pair of storyboards to show how the experience might translate to the home – both good and bad – on Day 0 and Day 60.

The Day 60 storyboards were revised several times and eventually shown to Jeff Bezos for approval, alongside the initial PR-FAQ. As a result, the project received enough funding to advance to concept review, the next hardware checkpoint.

Setting vision for a growing team

Once we received approval and funding, we were on the clock to move our initial concept for the Look from a single storyboard to a fleshed-out concept that covered all presumed core customer use cases. Furthermore, we had to do so in a way that allowed the entire hardware team to review and participate.

With the Product Manager and other key product stakeholders, we brainstormed a core series of use cases, based heavily on earlier ethnographic studies and our Amazon “working backwards” PR-FAQ documentation.

The result of the brainstorm was a series of 15 storyboards that incorporated key use cases in an end-to-end, customer focused narrative. These storyboards were reviewed by the entire product team, generating key discussions, new JIRA tasks, prototyping priorities, and the basis for our Concept Review documentation. I wrote and illustrated all storyboards – the goal was to work as quickly as possible to get ideas out for discussion.

Design and prototyping

The core Look concept team pushed forward on an ambitious prototype that explored multiple dimensions of natural user interactivity. I designed multiple gestural interactivity systems in addition to voice interactivity models, and a complete set of visuals for the system. We had about 3 months from initial go to our final demo, which was extremely aggressive given the level of technology and computer vision involved.

I worked hand-in-hand with our development team in Sunnyvale, delivering all designs personally and directly editing code when necessary. I spent many days on site to remain close to the dev team and the specific unit that would be used in the demo, which also allowed me to learn more about the hardware explorations running in parallel. I also ran usability studies on the prototype to gain insight into the desirability and practicality of our proposed interactions.

It was at this time I also began to partner with Amazon’s sound design team, the same team that created the Echo’s sound design. My experience working on video games taught me to make these connections early. We worked out an early sound cue list and potential timelines, and identified audio samples I could use in prototypes.

The final prototype allowed customers to interact with a computer vision system (a Kinect and webcam simulating the capabilities of an embedded camera with infrared background removal), live grammar-based voice interactivity, and a few different visual UI models. The prototype demo received a greenlight from the CEO for the project to move into pre-production.

Pre-Production phase

Once we moved out of concepting, more team members joined and work began in earnest. I focused on information architecture, refined storyboarding, user research, and natural user interface designs while our new designers focused on visual design work: wireframing and high-fidelity explorations. But another major theme that emerged during this time was our relationship with the Alexa product line.

Enter Alexa

A month into our concepting work, Alexa was announced. Prior to that announcement we’d been planning on including simple grammar-based voice commands in our product. Suddenly the inclusion of natural language voice was an option – and as a matter of fact, it seemed likely that an attempt to ship with ONLY simple voice would seem off-brand by 2018. The problem was that in 2014, the Alexa team was so new that they were still figuring out their own world; they weren’t ready for other internal teams to come knocking on their door.

Usability evaluation of natural language for the product

We finished the concept phase with a prototype that incorporated gesture and voice, but gesture was cut. Voice, on the other hand, was still intriguing.

  • CHALLENGE: I was asked to run usability studies to assess the desirability of the speech feature, because inclusion of microphone arrays to support natural language could run upwards of several million dollars.
    • 15 participants across 2 locations (Seattle and San Jose)
    • All employees, selected from computer vision selfie photo booth volunteers against screening criteria
    • Run using mock hardware (Kinect on 20/20 frame, cell phone, Arduino with webcam for preview)
  • My role:
    • Recruit all participants
    • Design and solo-facilitate the study and interviews (with some assistance on note-taking)
    • Build the prototype control code: an Arduino plus web server for audio cue, lighting, and voice UI triggers

Because of the high potential cost, I designed the study to keep conditions realistically difficult for both voice and the phone. I presented the phone app as the effortless default, and built a Wizard of Oz prototype web server that I could run at a delay, making it seem as if the device were processing slowly or misrecognizing. I did, however, ask participants to simulate setting the phone down, as they would have to when changing clothing.
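
For illustration, here is a minimal sketch of what a Wizard of Oz control server along those lines might look like; the Flask endpoints, serial commands, port name, and delay handling are hypothetical stand-ins, not the actual prototype code.

  # Hypothetical Wizard of Oz control sketch (illustrative, not the original prototype).
  # A facilitator hits simple HTTP endpoints to trigger audio cues, lighting states,
  # and voice UI responses, with a controllable delay to simulate slow "processing".
  import threading

  import serial                      # pyserial, for talking to the Arduino
  from flask import Flask, request

  app = Flask(__name__)
  arduino = serial.Serial("/dev/ttyACM0", 9600, timeout=1)  # assumed serial port

  def run_later(action, delay_s):
      """Run an action after a facilitator-controlled delay, off the request thread."""
      threading.Timer(delay_s, action).start()

  @app.route("/cue/<name>")
  def play_cue(name):
      # The delay is set per request so the facilitator can simulate
      # misrecognition or slow processing on demand.
      delay_s = float(request.args.get("delay", 1.5))
      run_later(lambda: arduino.write(f"CUE:{name}\n".encode()), delay_s)
      return f"queued cue {name} after {delay_s}s\n"

  @app.route("/light/<state>")
  def set_light(state):
      # Immediate lighting change, e.g. "listening" or "idle".
      arduino.write(f"LIGHT:{state}\n".encode())
      return f"light set to {state}\n"

  if __name__ == "__main__":
      app.run(host="0.0.0.0", port=8000)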

The results from the study came back very clear: “customers” (the carefully vetted internal employees we were allowed to test with for tented projects) did not want to purchase this device without voice because the phone was viewed as cumbersome and it showed up in all the pictures. These findings, along with others, were reported back to leadership and led to the edict to seek inclusion on the Alexa platform.

Outside the design box: Partnering at the hardware level

Because of my strong ability to collaborate with stakeholders and solve difficult problems, and my unique position as a voice expert, I was asked to take on the challenge of pursuing our placement on the Alexa platform. This meant proactively seeking out conversations with dozens of product managers, developers, and directors for different feature areas on Alexa: at the time, I needed to get sign-off from each feature, like media and weather, individually. Each feature had different potential hardware/firmware/software requirements.

My partnerships with these folks also led to my role as the author of the speech technology portions of the actual BRD, the Amazon hardware/business requirements specification document used to communicate with the manufacturing plant. This was entirely out of my wheelhouse and entirely nonstandard for designers. However, I was explicitly asked to take it on alongside the hardware experts from Sunnyvale, and it provided me a unique opportunity to ensure that the voice UI requirements for natural language on the device would be met at the bare metal level. It also gave me a much, much deeper understanding of natural language and smart speaker technology in general.

Production phase: Voice UI delivery

As the project moved into production, I transitioned to the core Alexa voice user interface (VUI) team. From there, I remained on as the Echo Look Voice UI designer, and participated in and facilitated Alpha testing of the Echo Look as the devices came off the production line. As with any voice UI delivery, I was responsible for delivery of all call flows, sample utterances, and prompts, as well as partnership with development on any dynamic behavior or business logic. These were extremely early days in the world of natural language user interfaces and Alexa was a highly dynamic space, so the linguistic environment was exciting but very challenging to design within.

Example utterances included “Take a photo”, “Take a video”, and “What should I wear?”. Each intent had to account for conditions like cloud-processing outages, a missing phone connection, camera thermal limits, multi-user scenarios, and beyond.
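
To give a flavor of those deliverables, here is a small, hypothetical sketch of how sample utterances and condition-specific prompts could be organized; the intent names, prompt wording, and state checks are illustrative only, not the shipped Echo Look interaction model.

  # Hypothetical VUI spec fragment: sample utterances mapped to intents, plus
  # prompts for device conditions that block the request. Illustrative only.
  INTENTS = {
      "TakePhotoIntent": ["take a photo", "take a picture", "snap a photo"],
      "TakeVideoIntent": ["take a video", "record a video"],
      "StyleCheckIntent": ["what should i wear", "which outfit looks better"],
  }

  CONDITION_PROMPTS = {
      "cloud_unreachable": "Sorry, I'm having trouble reaching the network right now.",
      "phone_disconnected": "I can't find your phone. Make sure the companion app is connected.",
      "camera_too_hot": "The camera needs a moment to cool down. Please try again shortly.",
  }

  def resolve(utterance: str, device_state: dict) -> str:
      """Return the matched intent name, or the prompt for a blocking condition."""
      for condition, prompt in CONDITION_PROMPTS.items():
          if device_state.get(condition):
              return prompt
      text = utterance.lower().strip()
      for intent, samples in INTENTS.items():
          if text in samples:
              return intent
      return "Sorry, I didn't catch that."

  print(resolve("Take a photo", {"phone_disconnected": False}))  # -> TakePhotoIntent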

Fun fact: As part of my voice UI deliveries, I wrote and “taught” Alexa the definitions for various levels of dress like business casual and semi-formal using the inbuilt Q&A framework. You can still get those prompts as of this writing from any device.

Reception & Reflection

The Echo Look product went into public beta in April 2017, and was released to the general public in May 2018. The Echo Look service was shut down in July 2020. I was not at Amazon when the product launched, so I’m not privy to inside knowledge about any reasoning behind the decision to sunset the service. However, I can posit what happened based on what I saw from the rollout of the product.

Our original target customers wanted a device to help them manage their wardrobes, which were unwieldy and spanned multiple rooms and sometimes storage facilities. They loved the art of choosing clothing and didn’t want to delegate that task.

However, our highest-priority stakeholder felt so strongly that the device should tell people what to wear that millions of dollars were spent on expensive “style check” AI, which became the focus of both the marketing and the reviews for the product. Anecdotally, I’ve had many people tell me they didn’t realize the device could be used as a wardrobe management tool and would have purchased it had they known.

My personal theory is that the over-focus on the Style Check AI alienated the core customers our market research indicated had a strong interest in the product, because the implication that a computer could tell them what to wear was insulting and off-putting.

That said, I wonder what we could have done with storytelling to make that key stakeholder’s perspective more visible within the existing scenarios and avoid this all-or-nothing approach. There is always something to learn.

Talks & Podcasts

I had the opportunity to speak with O’Reilly Media about working on the Echo Look:

From Blank Page to World Stage

After the release of the Echo Look, I transformed my insights from working on the product into a talk that’s been featured as my keynote at Design Matters in Copenhagen and as a mainstage talk at Interaction 18 in Lyon, France.
