When kids can talk to Elmo & the Cookie Monster about reading

If your 5-year-old could talk to Sesame Street’s Elmo or the Cookie Monster the way he talks to Grandma on Skype, could he become a better reader? That’s what a new partnership between Sesame Workshop and ToyTalk is betting on.[1] And this partnership could open up new inroads in the use of speech recognition systems to advance literacy for all our children.

Here’s the idea: “conversational technology” could be used to develop literacy, especially at the preschool level. Sesame Workshop[2] and ToyTalk[3] announced plans a few weeks ago to sign a research partnership agreement to explore how to use conversational technology to teach preschool literacy. The result could be Elmo and the Cookie Monster talking with your child in a two-way conversation, much as your child talks to Grandma on Skype.

How did we get to this possibility and why is the Sesame Workshop/ToyTalk partnership in the literacy news?

The first part of the answer lies in “good news.” Speech recognition systems have truly come of age and could be a valuable, cost-effective tool to advance literacy development.

The second part of the answer lies in “bad news.” Half of our nation’s fourth graders are unable to read at a basic level (National Assessment of Educational Progress report on reading for 2011), and only one in three U.S. students is able to read and understand grade-level material, a pattern that unfortunately holds across all school grades.

So researchers are continuously on the hunt for ways to advance child literacy. Folks such as Marilyn Jager Adams, visiting professor in the Cognitive and Linguistic Sciences Department at Brown University, and longtime participant in NAEP’s committees in reading, believe that speech recognition technology can and should be used to advance early childhood literacy.[4] The approach would include developing speech-recognition-based reading software for our schools. This is not a far-fetched idea.

Automatic speech recognition is more than 20 years old and commonplace in many industry sectors. Medical and law professionals use voice recognition to dictate notes and transcribe information. Newer uses include military applications, navigation systems, automotive speech recognition, ‘smart’ homes designed with voice-command devices, and gamers interacting via voice commands with video games. Automatic speech recognition is also used for telephone call routing and directory assistance, for captioning live TV so it can be followed in noisy places, and for letting people talk to their computers and mobile phones (to issue commands, or to have voice mail transcribed and sent as written copies to email).

Speech recognition systems are not, however, widely in use in education. And where they are playing a role, the focus seems to be on assisting children with disabilities, not all children. The National Center for Technology Innovation, for example, identifies a range of populations that may benefit from speech recognition technologies, including people with: 1) learning disabilities, including dyslexia and dysgraphia; 2) repetitive strain injuries, such as carpal tunnel syndrome; 3) poor or limited motor skills; 4) vision impairments; 5) physical disabilities; and 6) limited English language.[5] No doubt, some of the 50% of the nation’s fourth graders who read poorly fall within these named populations. Many do not.

The Center identifies numerous benefits for these populations from speech recognition technologies: improved access to the computer, increases in writing production, improvements in writing mechanics, increased independence, decreased anxiety around writing, and improvements in core reading and writing abilities. Specifically for the latter, the Center explains how speech recognition tools assist students with learning disabilities in reading and writing: “In allowing students to see the words on screen as they dictate, students can gain insight into important elements of phonemic awareness, such as sound-symbol correspondence. As students speak and see their words appear on the screen, the speech-to-text tool directly demonstrates the relationship between how a word looks and sounds.” This, the Center notes, can be especially helpful for students with learning disabilities and effective in remediating reading and spelling deficits.

The Center calls out another benefit of speech recognition technologies, found in the “error correction process.” Because no speech recognition product is completely accurate, “it requires users to check the accuracy of each word uttered as sentences are being dictated. When an error is made, the child must then find the correct word among a list of similar words and choose it. This process necessitates that the user examine the word list closely, compare words that look or sound alike, and make decisions about the best word for the specific situation. This can give kids with learning disabilities a boost in reading and spelling as they learn to discriminate between similar words.”
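For readers curious how a dictation tool might assemble that list of similar words, here is a minimal sketch in Python. It is only an illustration, not how any particular product works: the tiny `VOCABULARY` list and the `similar_word_choices` function are made up for this example, and it compares words by spelling alone (using the standard `difflib` module), whereas a real system would also weigh how words sound.

```python
from difflib import get_close_matches

# Hypothetical mini-vocabulary; a real product would draw on a far larger lexicon.
VOCABULARY = ["there", "their", "three", "where", "here", "then", "these", "tree"]


def similar_word_choices(recognized_word, vocabulary=VOCABULARY, limit=4):
    """Return up to `limit` vocabulary words spelled most like the recognized word."""
    # get_close_matches ranks candidates by a spelling-similarity ratio (0 to 1)
    # and drops anything scoring below the cutoff.
    return get_close_matches(recognized_word.lower(), vocabulary, n=limit, cutoff=0.6)


# The child sees these choices and picks the word that was meant.
choices = similar_word_choices("thier")
print(choices)
```

The child’s job of scanning the short list and choosing among look-alike words is exactly the discrimination practice the Center describes.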

Adams’ research takes a broader perspective, calling for speech recognition technologies to help people learn to read and read to learn. If computers were given “ears,” they could listen to students as they read and offer help or prompt further thought at just the right moments, all the while making records of students’ progress and difficulties. This technology could provide tailor-made interactive support and guidance. This is what becoming a good reader depends on.
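To make the “ears” idea a little more concrete, here is a rough sketch of the record-keeping step. It assumes a speech recognizer has already turned the child’s reading into text; the `compare_reading` function and the sample sentences are invented for illustration and are not drawn from Adams’ paper. The sketch lines up what was heard against the passage and notes which words were misread or skipped.

```python
from difflib import SequenceMatcher


def compare_reading(expected_text, heard_text):
    """Align the heard words against the passage and list words needing help."""
    # Simple whitespace split; punctuation handling is left out of this sketch.
    expected = expected_text.lower().split()
    heard = heard_text.lower().split()
    trouble = []
    matcher = SequenceMatcher(a=expected, b=heard, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" marks misread words; "delete" marks words the reader skipped.
        if tag in ("replace", "delete"):
            trouble.extend(expected[i1:i2])
    return {
        "words_total": len(expected),
        "words_correct": len(expected) - len(trouble),
        "needs_help": trouble,
    }


report = compare_reading("the cat sat on the mat", "the cat sit on the mat")
print(report)
```

A record like this, saved after each session, is the kind of running log of progress and difficulties that could tell the software when to step in with help.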

Adams is not arguing that speech recognition technology approaches would take the place of other literacy approaches. She notes other significant reading efforts. The federal “Reading First” initiative has a focus on making sure all children leave the primary grades having securely learned and understood the basics of the alphabet. And the “Common Core State Standards” initiative focuses on ensuring that students have guidance and practice with increasingly sophisticated and informative reading texts as they move through the grade levels. Understanding the infrastructure of the alphabet and using reading skills to comprehend texts at an increasingly sophisticated level are both critical components of children’s literacy development throughout their schooling.

The speech recognition “silver bullet,” if it can be called that, points to the challenges Adams sees students facing during the “intermediate reading period.” This is when students are “first gaining the ability to read with fluency and ongoing comprehension. It is with this intermediate challenge that most of our students fall by the wayside.”

During the intermediate reading period, speech recognition technology could help students read and understand texts on their own by supplying the support, instruction, and practice they need to work through these tasks.

This is the type of research informing the new partnership between Sesame Workshop and ToyTalk that is in the literacy news. ToyTalk is a company well down the road in exploring children’s speech recognition systems. ToyTalk has developed apps like The Winston Show, where children can talk with animated characters (parents give their permission via email). ToyTalk’s system collects children’s speech patterns to feed into a continually updated database. The more children talk to the animated characters, the better the developers at ToyTalk get at understanding what children are saying (accuracy is important in developing effective speech recognition systems for children).

Sesame Workshop is targeting preschoolers, who typically do not speak as clearly as older children and who pause more often when searching for words. Sesame Workshop’s testing has discovered that children who see a two-dimensional Elmo on a screen (tablet or TV) assume it’s a game with prompts. But when they see a “live-action” character like Elmo, they treat it more like a Skype call with Grandma.

Sesame Workshop has been testing mobile apps that use ToyTalk’s proprietary PullString technology, which combines speech recognition meant to understand children’s speech patterns, artificial intelligence, and prewritten scripts that respond to what a child has said. The first products from this partnership are expected out next year. Next will come products that would more formally teach children to read. This could include technology that can tell a child whether they’re pronouncing a word correctly, that asks them to come up with a word that rhymes with “dog,” or that asks them to discuss their feelings, all through two-way conversations with characters like Elmo and the Cookie Monster.

What would it have been like to have my son, at 5, be able to talk on a regular basis to the Cookie Monster to practice his reading? I think he would have waited eagerly by the phone for the Cookie Monster’s call to the literacy conversation.


[1] “Sesame Workshop Tackles Literacy With Technology,” Elizabeth Jensen, Oct. 19, 2014 (NY Times). http://www.nytimes.com/2014/10/20/business/media/sesame-workshop-to-tackle-preschool-literacy-with-technology.html?_r=0

[2] Sesame Workshop is the nonprofit educational organization behind Sesame Street and other programs (e.g., radio, books, videos, interactive media/technology efforts, and collaborations with a research and innovation lab, the Joan Ganz Cooney Center). Sesame Workshop’s mission is to use the educational power of media to help children everywhere reach their highest potential. http://www.sesameworkshop.org/about-us/workshop-at-a-glance/

[3] ToyTalk is an award-winning family entertainment company, focused on children’s speech recognition, that creates conversational characters. The Winston Show is an iPad app where kids and characters have real conversations in a new dimension of make-believe. SpeakaZoo is a zoo app where kids talk with the animals. It’s made for the iPad, iPod Touch, and iPhone. SpeakaLegend is a talk & touch speech recognition app (e.g., befriend legendary creatures on a quest to find the Unicorn, or through SpeakOrTreat visit the scariest neighborhood in town to fill up a candy bag). https://www.toytalk.com/about/

[4] “Technology for Developing Children’s Language and Literacy: Bringing Speech Recognition to the Classroom” by Marilyn Jager Adams, September 21, 2011. See: http://www.joanganzcooneycenter.org/publication/technology-for-developing-childrens-language-and-literacy-bringing-speech-recognition-to-the-classroom/

[5] “Speech Recognition for Learning,” National Center for Technology Innovation, at brainline kids: http://www.brainline.org/content/2010/12/speech-recognition-for-learning_pageall.html