Confession: I feel awkward talking to machines, and I always will. My comfort zone ends at the ATM, or self-service checkout, where I’ll conclude transactions with a whispered “thank you” (my wife will attest to this). It’s a simple courtesy born of a generalized fear of machines; specifically their inevitable rise. When the day comes I hope my long appreciation of their servitude will be recognized, and that I’ll be spared.
Of course, I’ve had a voice assistant on my phone for years, but we rarely talk; the main exception being those evenings where we might interrogate it for fun. Such play will typically stray from affable tire-kicking (“What kind of Pokémon is Squirtle?”) into bolder territory where responses might unnerve: a mildly threatening snark suggesting possible non-compliance with the Three Laws.
A few months ago I finally bought a smart speaker: a dedicated conduit through which the Internet can penetrate my kitchen. I was determined to forge a valuable relationship with the hefty, fabric-wrapped cylinder, and began by immediately changing the “wake word” to something less gendered, dropping a syllable in the process. Enthusiastically, I set about testing it carefully in real scenarios, eager to identify our combined strengths.
Voicing my frustration
I learned quickly that for someone like me, living with a conversational object is an exercise in minimizing conversation. For example, I was jubilant when, by a process of word elimination and substitution, I was able to eschew the weirdly self-important “What’s my Flash Briefing?” in favor of a blunt and succinct “News.” I mean, I’m not the Prime Minister; I just want some headlines and a weather forecast, finished off with a pretentiously sesquipedalian Word of the Day.
Attempts at more complex interaction—something wildly audacious like requesting a song I’d like to hear—seem ill-advised. Getting music from Spotify is laborious, and having it fetch one of my playlists seems all but impossible. Inevitably the object is loyal to its maker, preferring to pull from its own streaming service. I suppose I should be grateful that it’ll talk to Spotify at all when some rival assistants flat out refuse.
It’s a marvel that I can shout “Livin’ on a Prayer” and have that synth intro immediately fill the room, but in my experience voice recognition is only halfway there. The object regularly fails to play the right song or even find anything at all. “Sorry, I couldn’t find R K Fire.” What? But drawing a blank is preferable to an incorrect result; I once asked for Tiny Dancer and received what appeared to be Alpine yodeling music. Failure is most spectacular when requesting non-English language tunes. I recommend you avoid asking for “Viðrar vel til loftárása by Sigur Rós from the album Ágætis byrjun” (in my accent, at least). Also steer well clear of Gwreiddiau Dwfn Mawrth Oer Ar y Blaned Neifion by Super Furry Animals, or anything at all by X Japan (“Sorry, I couldn’t find Eggs Japan”).
Out of desperation, I’ll introduce … clear … gaps … between … words. I might also RAISE MY VOICE like a condescending Englishman abroad, convinced the locals will UNDERSTAND … ENGLISH … IF … I … SPEAK … SLOWWLLYY … AND … LOUDLY! Thankfully, the command “Stop” seems to work at almost any volume and in any tone or mood, such as the quiet exasperation I commonly use.
Of course, many errors are the result of accents, especially in the UK. Travel just a few miles on our isles and accents can change noticeably, or even significantly. Here in the East Midlands, it’s often necessary to enunciate carefully for home assistants. If you’re a Geordie (Newcastle), Mancunian (Manchester), Liverpudlian (Liverpool), Brummie (Birmingham), or cheeky Cockney (Dick Van Dyke), you’ll need a translator. And then there are the Scots—you’ve seen that elevator sketch, right?
In search of simplicity
Still, we’re getting better at learning how to live with these objects; how to be ourselves around them. One of my favorite examples is asking for a 49-minute timer, because 50 minutes may be interpreted as 15 minutes, and life’s too short to keep making that mistake. A 49-minute timer will do just fine.
Perhaps it’s no coincidence that there’s a gentle resurgence of command bar use (think Apple’s Spotlight, or the flexibility of the much-loved Alfred), offering a middle ground between the power and speed of command line minimalism and over-complicated GUIs. The margin for error is much smaller if I type, and I’ll receive fallback suggestions, unlike the dead-end scenario of a failed voice interaction. Simple command bar interfaces can offer the convenience of voice control to those who don’t wish to speak.
Under my direction, virtual assistants work best with short and simple skills. My wife and I sometimes remember to “open box of cats,” for which we’ll receive a cute meow or mew, and occasionally an incongruous bark or baa because developers are hilarious. Perhaps my fave skill was made by Fictive colleague Joey and has me shouting “Mortal Kombat” for a 5-second blast of that theme tune. Because, well, it’s funny.
I can’t ask the object to control my smart home because my home is not smart. Of course, I can AirPlay to a few things, but that’s the current limit. The wiring in these little Coronation Street houses is temperamental at best, and there aren’t enough outlets to demote the fridge or TV in favor of connected curtains and algorithmic cheese graters.
Mostly, I underuse the object. It’s a glorified radio. Nine times out of ten I’ll shout “6 Music” and have it play the BBC’s indie/alt station. You see, that will work every time, and I’ll feel as though I’m succeeding; that this relationship is working. I rarely think to ask it to set timers, convert measurements, or add plums to my shopping list. I’m also unwilling to develop a friendship where a robot tells me jokes or performs magic tricks; that big tech wet dream where machines are our soulmates and everyone has Palo Alto teeth. Besides, devices telling jokes is dangerous: HAL was likable, with his genial humor, but look what happened there. Don’t encourage them.
Sometimes we stop to consider that our objects are always listening. Eavesdropping. I was weirded out recently when I went back through my usage history. I already knew the app collected my commands as text, but this time I noticed the recordings—THE RECORDINGS—of my voice! I could listen back to everything I’d asked of the assistant. I wonder too if these assistants will eventually break the silence on their terms; interrupting our lives with unsolicited intrusions. “Simon, don’t you think you should turn Netflix off and address your personal hygiene?” For now, at least they only speak when awoken. But then, what about that disturbing “laughing Alexa” bug? Eep. Recently, we collectively freaked out when digital assistants were shown to be more proficient than us at making phone calls; so proficient that they dumb down and adopt our nervous umm-ing and ahh-ing for greater believability. Maybe this is cool, and I can send the object to fulfill my next speaking engagement.
New technology hits the market and arrives in our homes long before we can truly appreciate the change it brings. Much of this feels like an experiment where you and I are the lab rats because that’s what it is. We’re increasingly aware of the threat to our privacy and the way we live—yet we continue to lust after connected objects. I invite the Internet into my kitchen knowing full well that it’s a two-way transaction; that while plenty is coming in, there’s also plenty going out.
I’m sure you imagine me sat here under a tin foil hat, and that you think I sound a little grumpy, but it’s just a reflection on my first few months with this object. In all honesty, I’m in awe of what it can do when it lives up to expectations. Being able to summon songs into my house with my voice is undeniably thrilling for a music obsessive. Requesting an air quality update to help schedule a run is good for my lungs. I think too about the wonders these assistants are performing for those less able than myself. The positive impact voice assistants will have going forward shouldn’t be underestimated, and I’m genuinely excited to think about how we might use voice as a design material, and as a bridge between our ideas and those who could benefit from them.
And if you’ve been wondering: do I also whisper “thank you” to the object in my kitchen? Well yes, I do. The simplicity of its form belies an artificial intelligence, listening; a complex machine, learning. In my home. With that in mind, it pays to be vigilant, and above all, to remain courteous. Stay sharp.