My second published Alexa skill was born from a request from a dear friend. We were planning a joint family trip to Disney World, and my goddaughter wanted to know how long until the big day. Wouldn’t it be nice to have a skill to make it easier for her mom to answer that question?
The design process for this skill was unusual. I initially designed it as the core example around which I built the sample deliverables for my now world-renowned voice design workshop, Giving Voice to Your Voice Designs. At the time, however, I had to concentrate on developing the workshop and could not spare the time to code the skill itself.
Design brief
At its core, My Countdown is simple: it allows customers to save a date for several key life events (DateSetIntent). Once each date is saved, the customer can query the number of remaining days at any time (DateGetIntent). They can use a separate intent to delete any date that is no longer valid (DateClearIntent).
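The arithmetic behind DateGetIntent can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the helper names `daysUntil` and `describeCountdown` are hypothetical, and it assumes the saved date is stored as a `YYYY-MM-DD` string (the format Alexa's built-in date slot produces for a specific day).

```javascript
// Hypothetical helpers; the skill's real implementation is not shown in this post.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days from `todayIso` until `targetIso` (both "YYYY-MM-DD").
function daysUntil(targetIso, todayIso) {
  // Parse both as UTC midnight so daylight-saving shifts cannot skew the count.
  const target = Date.parse(`${targetIso}T00:00:00Z`);
  const today = Date.parse(`${todayIso}T00:00:00Z`);
  if (Number.isNaN(target) || Number.isNaN(today)) {
    throw new Error("Expected dates in YYYY-MM-DD format");
  }
  return Math.round((target - today) / MS_PER_DAY);
}

// Turn the day count into a spoken response for one event.
function describeCountdown(eventName, targetIso, todayIso) {
  const days = daysUntil(targetIso, todayIso);
  if (days < 0) return `Your ${eventName} date has already passed.`;
  if (days === 0) return `Your ${eventName} is today!`;
  const verb = days === 1 ? "is" : "are";
  const unit = days === 1 ? "day" : "days";
  return `There ${verb} ${days} ${unit} until your ${eventName}.`;
}
```

Passing "today" in as a parameter, rather than reading the clock inside the function, keeps the calculation easy to test.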
One of the most important design decisions here was to limit the custom slot type, DATETYPE, to a short list of specific event names. While a common request is support for arbitrary names like “Mom’s birthday”, I have learned from my experience on other voice user interface (VUI) projects that arbitrary input would be problematic in three key ways:
- The larger the set of potential names, the worse recognition performance will be. The problem is magnified if the potential entries are not acoustically distinct.
- We cannot control the spelling of arbitrary input. If we plan on supporting display cards at any point, arbitrary voice input may lead to confusion or even distress (like a misspelled name of a loved one).
- Since our database system is string-based, arbitrary input might lead to orphaned entries based on variability (for example, one day recognizing “Cheryl’s birthday” and another day recognizing “Sheryl’s birthday” and finding no matches).
Rather than set customers up for failure by allowing these arbitrary inputs, I made the very intentional choice to scope input to a small, fixed list of event dates whose names have been scrubbed to ensure acoustic uniqueness. These event types are unlikely to cause false positives.
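As a rough sketch of what a fixed list makes possible, the handler can normalize the recognized slot value and reject anything outside the known set before it reaches the database. The event names and the `resolveEventType` helper below are hypothetical. This post does not list the skill's actual DATETYPE values.

```javascript
// Hypothetical event names; the real DATETYPE values are not listed in this post.
const EVENT_TYPES = new Set(["vacation", "birthday", "anniversary", "graduation"]);

// Return the canonical event name, or null if the value is not in the fixed list.
function resolveEventType(slotValue) {
  if (typeof slotValue !== "string") return null;
  const normalized = slotValue.trim().toLowerCase();
  return EVENT_TYPES.has(normalized) ? normalized : null;
}
```

Because every stored key comes from the same small set of canonical strings, a lookup cannot miss on a spelling variant such as “Cheryl” versus “Sheryl”.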
Though this, like Trainer Tips, was a solo development effort, I did create a full set of design deliverables as part of the design process. I share these deliverables with students in my “Giving Voice to Your Voice Designs” workshop.
Implementation
A year later, I finally had the opportunity to build out the skill. It required a persistent cloud database, so I learned how to integrate DynamoDB with the Lambda code (in Node.js) I was developing. Interestingly, the platform had changed in some significant ways since my first experience developing a skill (Trainer Tips), which required some adaptation.
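For readers curious what the DynamoDB piece can look like, here is a minimal sketch under several assumptions. The table name, the item shape, and the composite key of `userId` plus `eventType` are all hypothetical, since this post does not describe the real schema. The `docClient` argument is expected to behave like the `DynamoDB.DocumentClient` from the AWS SDK for JavaScript (v2), whose `put` and `get` calls return a request object with a `.promise()` method.

```javascript
// Hypothetical table name and item shape; the real schema is not described here.
const TABLE_NAME = "MyCountdownDates";

// Save (or overwrite) one event date for one user.
async function saveDate(docClient, userId, eventType, isoDate) {
  await docClient
    .put({
      TableName: TABLE_NAME,
      Item: { userId, eventType, eventDate: isoDate },
    })
    .promise();
}

// Load one event date; resolves to null when nothing is saved for that event.
async function loadDate(docClient, userId, eventType) {
  const result = await docClient
    .get({
      TableName: TABLE_NAME,
      Key: { userId, eventType },
    })
    .promise();
  return result.Item ? result.Item.eventDate : null;
}
```

Passing the client in as a parameter means the same functions can run against a stub in tests and against a real `DocumentClient` inside Lambda.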
During development, I decided to expand the scope of my skill beyond what we cover in my class. I focused on one-shot intents in class, but I’ve received suggestions for several additional event types, which made batch intents more useful. A customer with six saved events is much more likely to want to hear the full set. Thus I added “Get all dates” and “Clear all dates” intents to the original three one-shot intents.
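A “Get all dates” response could be assembled along these lines. This is only a sketch: the `summarizeAllDates` helper, the input shape (an object mapping event names to `YYYY-MM-DD` strings), and the response wording are all assumptions rather than the skill's actual behavior.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// `savedDates` maps event name -> "YYYY-MM-DD"; this shape is an assumption.
function summarizeAllDates(savedDates, todayIso) {
  const today = Date.parse(`${todayIso}T00:00:00Z`);
  const upcoming = Object.entries(savedDates)
    .map(([name, iso]) => ({
      name,
      days: Math.round((Date.parse(`${iso}T00:00:00Z`) - today) / DAY_MS),
    }))
    .filter((entry) => entry.days >= 0) // skip dates that have already passed
    .sort((a, b) => a.days - b.days); // soonest event first

  if (upcoming.length === 0) return "You don't have any upcoming dates saved.";
  const parts = upcoming.map(
    (e) => `${e.days} ${e.days === 1 ? "day" : "days"} until your ${e.name}`
  );
  return `You have ${parts.join(", ")}.`;
}
```

Reading the soonest event first keeps the most relevant answer at the start of a response that gets longer with every saved date.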
Due to time constraints as a solo developer, I also needed to make some compromises. My original designs included a fairly forgiving slot-filling model for incomplete utterances. However, the newly introduced dialog model isn’t yet represented in Alexa sample code. I postponed implementation of the model (which would have required an entire state management system) in hopes I can use the new beta Skill Builder to implement my slot-filling more efficiently in the near future.
Next Steps
The My Countdown skill went live on January 11, 2018. I am currently in contact with Amazon about an invocation phrase issue: in some cases, the words “count down” are recognized as the single word “countdown”, which causes one-shot invocation of a critical intent to fail. I am actively troubleshooting this problem and may change the invocation phrase to resolve it.
If the initial response warrants further development, I will begin exploring a v2 with a more forgiving slot-filling dialog system, and may also explore multimodal display cards for the Echo Show and Echo Spot.