Alexa’s latest failure is not an isolated incident, but a symptom of a larger issue

Sourojit Ghosh
3 min read · Dec 29, 2021
Image by Rahul Chakraborty on Unsplash

If you’ve been following technology news over the past week, you might have noticed Amazon Alexa’s latest fiasco, in which the smart assistant recommended the ‘penny challenge’ to a 10-year-old child. When the child asked for a ‘challenge to do’, Alexa told her to “plug in a phone charger about halfway into a wall outlet, then touch a penny to the exposed prongs”. The child’s mother was on hand to stop her from carrying this out and risking a dangerous electric shock, and later described the incident on Twitter. The tweet was widely circulated and picked up by news agencies, leading to a statement from Amazon announcing updates to Alexa to prevent it from recommending such dangerous activities in the future.

However, close inspection suggests that this issue is larger than Amazon applying a quick fix to prevent such recommendations in the future. The incident is evidence of loose content moderation, which lets dangerous content seep through filters and get picked up in Alexa’s searches.

To understand this further, let us take a quick look at how Alexa answers our questions. When a user asks Alexa, ‘what is a dragon?’, it is roughly equivalent to them conducting an Internet search. Since Amazon competes with Google, Alexa uses Microsoft’s Bing search engine instead. It performs the search and reads out the top result, starting with the name of the source. For questions it does not know how to answer, it also crowdsources responses through its Alexa Answers platform. These responses are vetted by “a combination of automated systems, community members, and Alexa Answers moderators”, and only after being approved do they go ‘live’.
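
To make that flow concrete, here is a minimal sketch, in Python, of a “search, take the top result, read it aloud with the source” pipeline. It assumes the Bing Web Search v7 REST API; the function names and the crowdsourcing fallback are hypothetical and only illustrate the general shape described above, not Amazon’s actual implementation.

```python
# Hypothetical sketch of a "search and read the top result" flow.
# Nothing here is Amazon's code; it only mirrors the behavior described above.
from urllib.parse import urlparse

import requests


def answer_question(query: str, bing_api_key: str) -> str:
    """Search the web via Bing and return a spoken-style answer, leading with the source."""
    response = requests.get(
        "https://api.bing.microsoft.com/v7.0/search",
        headers={"Ocp-Apim-Subscription-Key": bing_api_key},
        params={"q": query, "count": 1},
    )
    response.raise_for_status()
    results = response.json().get("webPages", {}).get("value", [])

    if not results:
        # No indexed answer: this is where a crowdsourced pool such as
        # Alexa Answers could be consulted instead.
        return "Sorry, I don't know that one."

    top = results[0]
    source = urlparse(top["url"]).hostname  # e.g. "en.wikipedia.org"
    return f"According to {source}: {top['snippet']}"
```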

The trouble with searches and crowdsourced information is that if a large enough group of people search for or participate in something on the Internet, that surge in activity pushes the content up the relevance rankings and brings it to the attention of content-crawling algorithms. The algorithms have no mechanism for deciding whether the content should be picked up; they simply act on the rapid user buy-in, as the toy example below illustrates. Take the penny challenge, for example. It rose as a TikTok trend in 2020, with hundreds of users attempting the challenge and collectively receiving thousands of views. It makes sense that search algorithms observing this surge of engagement would pick up such content.
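
As a toy illustration (and nothing more: this is not any platform’s actual ranking code, and the numbers are made up), consider a trending score driven purely by engagement growth. A sudden surge puts the dangerous challenge at the top, and nothing in the scoring asks whether it is safe to recommend.

```python
# Toy illustration of a purely engagement-driven trending score.
# The items and view counts are invented for demonstration only.
from dataclasses import dataclass


@dataclass
class ContentItem:
    title: str
    views_last_24h: int
    views_prior_24h: int


def trending_score(item: ContentItem) -> float:
    """Score growth in engagement; the bigger the surge, the higher the rank."""
    return item.views_last_24h / max(item.views_prior_24h, 1)


items = [
    ContentItem("how to bake sourdough bread", 12_000, 11_500),
    ContentItem("penny challenge", 50_000, 2_000),  # sudden viral surge
    ContentItem("study with me livestream", 8_000, 7_900),
]

# Nothing in the ranking asks "should this be recommended?" --
# the surge alone puts the dangerous challenge on top.
for item in sorted(items, key=trending_score, reverse=True):
    print(f"{trending_score(item):6.2f}  {item.title}")
```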

It would then fall on human content moderators to act as a line of defense, recognizing the dangers of recommending such content. But this is the sad reality of the content moderation industry: it takes an incredible psychological toll on people who view disturbing and often traumatizing content every day for a living. It is quite possible for some content to slip through the cracks, and I cannot blame the moderators for it.

To Alexa’s credit, it did do a better job the last time a massively dangerous challenge went viral, repeatedly urging users not to attempt the Tide Pod challenge. But its failure to filter out the penny challenge shows that there is still work to be done. Amazon’s response resulted in some ‘updates’, which users report simply make Alexa refuse to answer ‘tell me a challenge to do’, the original question at fault. This cannot be the whole solution, since plenty of other phrasings could surface equally dangerous content (see the sketch below). The incident points to a larger problem of content moderation and indexing, one that will likely require broad changes to how moderation is currently done and how algorithms select the answers they recommend.
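
To see why, here is a toy sketch assuming, hypothetically, that the fix amounts to blocking the exact reported phrasing (Amazon has not described its update in detail). An exact-match blocklist stops the reported question but lets trivial rephrasings straight through:

```python
# Toy illustration of why blocking one exact phrasing is a weak fix.
# This is not Amazon's actual update; the blocklist is hypothetical.

BLOCKED_QUESTIONS = {"tell me a challenge to do"}


def is_blocked(utterance: str) -> bool:
    """Return True only if the utterance exactly matches a blocked phrasing."""
    return utterance.lower().strip() in BLOCKED_QUESTIONS


print(is_blocked("Tell me a challenge to do"))       # True  -- the reported prompt
print(is_blocked("give me a fun challenge to try"))  # False -- slips through
print(is_blocked("what's a cool dare for kids"))     # False -- slips through
```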


Sourojit Ghosh

PhD Candidate, Human-Centered Design and Engineering, University of Washington