Tuesday, August 4, 2015

Turn on YouTube’s Speech-to-Text Closed Captions ... They’re Hilarious!

The other day I discovered that YouTube has introduced automated, possibly on-the-fly closed captioning for a lot of its videos.

To see if the video you’re watching has this option, simply click on the Settings gear icon in the lower right-hand corner of the video.  If you see the Subtitles/CC option, select English (auto-generated) ...


... and you’re ready for instant autohilarity. (Aside: when you click English (auto-generated) you'll also find you can select a number of foreign languages — from Afrikaans to Zulu — into which Google will translate its English captions.)

I made the discovery while watching this video posted by Sean Hodgins:


I don’t mean to make fun of Sean; it’s obviously not his fault YouTube has attached some nutty closed captions to his video.  But unfortunately for the guy, Google’s speech-to-text algorithms seem to have a tough time figuring out what the hell he’s saying.

“Murni” Like Swelling (Don’t You?)

In fact, Sean has just sat down and commented, “Man, it’s hot already,” quickly adding: “I’m already, like, sweating,” when YouTube swings into action, dutifully transcribing Sean’s words as ...

Fun fact to know and tell: according to Google Translate, murni is the Indonesian word for pure.

A bit later Sean explains how much easier it is to edit Vine videos than it used to be, before external content could be spliced in.  “Now everyone just edits them on their computer,” he points out, “which is super easy, and then puts them on Vine.”  But YouTube, which back at 0:13 in the video has already turned “Vine” into “buying” and the words “the Vine app” into “%uh by a nap,” now comes up with an even more exotic interpretation ...


Sean Gets His “Butterworth”

In other spots YouTube manages to botch only a single word — but even then its version turns out to be magnificently inept.  After demonstrating how to build an Arduino-based controller for shooting Beme app videos without having to hold the phone against the body, as the Beme app typically requires, Sean concludes, “Yeah, see?  That’s how it works.  Super simple.”


OK, now it’s your turn.  Take a look at this intriguing transcription and see if you can guess what Sean is really saying:


That’s a toughy, isn’t it?  Indeed, out of context Sean’s real words may sound just as enigmatic as YouTube’s rendition: “This booster actually,” he says before arriving at this caption, “sucks power when it’s not doing anything, and this is just some button.”

Talk the Talk, Then “Do” the Walk

What’s more, because Nature (and Google) abhors a vacuum, YouTube even captions Sean’s words when he’s not talking at all.  Late in the video, with the music swelling and Sean marching outside to greet the day, YouTube’s closed caption nicely sums up his will and determination ...


Of course, my guess is that Google has deliberately introduced this “feature” in beta (or, if its dev team is being honest with itself, in alpha). As they did with Google Maps in 2005, the Googleplexers figured they’d publish the tool early and gradually, inexorably work out the kinks.

Until that day arrives, though, turn it on, tune it in, and watch those priceless captions drop out.