Monday, February 18, 2013

The new Web Speech API => a Flash Cards Application!

In the last couple of months I've been helping my daughter memorize her multiplication and division tables. To help her get faster we've used traditional flash cards, which work pretty well. One day I thought to myself "a computer could easily listen to a student's voice and show appropriate cards... it could even keep score." I looked around to see if there were any JavaScript libraries that worked with the microphone to provide speech recognition, but didn't find anything worthwhile. The best option I found was to use Emscripten to port a C library. I put it on the back burner and worked on some other things for a while.

Then I saw a post about Google Chrome version 25, which implements the new Web Speech API. HTML5Rocks did an article covering some of its capabilities. In short, the API allows you to add speech recognition to your apps! Just what I needed!

So off to work on the app...  The API is pretty simple.  These are the steps to use it:
  1. Create an instance of the object, via new webkitSpeechRecognition()
  2. Set continuous to true to keep listening continuously, or to false to stop after giving a result.
  3. Set interimResults to true to receive interim results along with the final one, or to false to receive only the final result.
  4. Provide callbacks via onstart, onresult, onerror, and onend.
I hooked it all up and built the application around it.  I think it turned out pretty well.  I use localStorage to track high scores and store custom lists.
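
As a rough sketch of those four steps (not the app's actual code), the setup might look something like this. The names startListening and onAnswer are made up for illustration, and the constructor is passed in as an argument so the same function can be given webkitSpeechRecognition in Chrome or a stand-in object anywhere else.

```javascript
// Hypothetical helper: wires up a recognizer following the four steps above.
// `Recognition` is the constructor to use (webkitSpeechRecognition in Chrome 25).
// `onAnswer` is an assumed callback that receives each final transcript.
function startListening(Recognition, onAnswer) {
  // Step 1: create an instance of the recognizer.
  const recognition = new Recognition();

  // Step 2: keep listening after each result instead of stopping.
  recognition.continuous = true;

  // Step 3: report only final results, not in-progress guesses.
  recognition.interimResults = false;

  // Step 4: provide the callbacks.
  recognition.onstart = function () { console.log('listening'); };
  recognition.onerror = function (event) { console.log('error: ' + event.error); };
  recognition.onend = function () { console.log('stopped'); };
  recognition.onresult = function (event) {
    // event.results holds all results so far; resultIndex is the first new one.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      if (result.isFinal) {
        // Each result lists alternatives; index 0 is the first alternative.
        onAnswer(result[0].transcript.trim());
      }
    }
  };

  recognition.start();
  return recognition;
}

// In a browser that supports the API:
// startListening(window.webkitSpeechRecognition, function (text) { console.log(text); });
```

A flash card app would then compare the text handed to onAnswer against the expected answer for the card on screen.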

For the main page I went Vanilla JS (well almost... I used the Handlebars VM for templates). So the page has a pretty light footprint (35k).  And 15k of that is Google Analytics, so the actual application code (including the Handlebars VM)  is only 20k.  My experience is that Vanilla JS is a little harder than typical jQuery, Backbone, etc... but it is definitely light and fast.  

Anyway... here's the link to the app:  https://iambrandonn.github.io/FlashCards/

Have fun!

6 comments:

  1. This is a really cool project. I'm learning to program and I love seeing behind the scenes posts like this.

    Just FYI, I found this through http://sideshowhq.com/d/13-03-2013. Thanks for the blog post!

  2. Hey mate this is awesome. I saw you on chrome experiments. Great work man.

    I was wondering if you could give me some advice.

    I'm creating a simple web based drawing app that uses speech recognition.

    I have created a simple page here: http://michaelashton.com.au/x/index.html and the project is on github here: https://github.com/a5hton/speechdraw

    It has a 16x16 pixel grid. I would like to be able to draw on this grid by using simple words. For example if you say "right", the pixel to the right will be colored black. If you say "down" the pixel below the last one will be colored black.

    You can say up, down, left or right and the corresponding pixels will be colored.

    Saying "erase" will switch to erase mode, colouring the pixels back to their original color.

    Saying "lift" will lift the pen off the page.

    Saying "draw" will enable the draw mode.

    Could you please help me work out how to make this happen. Please see the simple page at http://michaelashton.com.au/x/index.htm to get an understanding.

    Thank you!

    Cheers,

    Michael

    Replies
    1. Sorry for the slow response. You already have the majority of the work done. Each time a result is received, you just need to check whether the text matches one of your words ("down", "lift", etc.) and take the appropriate action at that time.

  3. Brandon, your work is fantastic.

    As an educator and developer of open source software involving interactive online audio, I'd like to speak with you by phone, at your convenience, regarding the possibility of collaborating on a book for educators to create their own online educational applications. My email is in this document: http://ulm.edu/~beutner/vita/CV-Beutner.doc

    Here are some working examples of what I do:
    http://ulm.edu/~beutner/index.html#Interactive_Audio

    As a request, please contact me by email with a phone number; I'll call you by phone.

  4. Thanks for this! I've used it as a base for a project to help my daughter learn addition. One thing, if you use https://iambrandonn.github.io/FlashCards/ then it only asks for permission once!

    Replies
    1. Ah, yes. If I remember right, GitHub wasn't supporting https when I first published it, so it's good to know that it does now. Thanks!

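
For anyone with the same question as Michael's drawing app in the comments, the advice in that reply (check each recognized phrase against "up", "down", "lift", and so on, then act on it) might be sketched like this. Everything here is hypothetical: the pen object, the handleCommand name, and the paint callback are invented for illustration and are not taken from the speechdraw project.

```javascript
// Assumed grid size, matching the 16x16 grid described in the comment.
const GRID_SIZE = 16;

// Hypothetical pen state: a position plus the current mode.
function createPen() {
  return { x: 0, y: 0, mode: 'draw' }; // mode is 'draw', 'erase', or 'lift'
}

// Spoken direction words mapped to [dx, dy] steps on the grid.
const MOVES = { up: [0, -1], down: [0, 1], left: [-1, 0], right: [1, 0] };
const MODES = ['draw', 'erase', 'lift'];

// Applies one recognized phrase to the pen.
// `paint(x, y, mode)` is an assumed callback that colors or clears a cell.
function handleCommand(pen, transcript, paint) {
  const word = transcript.trim().toLowerCase();

  // Mode words only change the pen state; nothing is painted yet.
  if (MODES.indexOf(word) !== -1) {
    pen.mode = word;
    return;
  }

  // Anything that is not a known direction is ignored.
  if (!Object.prototype.hasOwnProperty.call(MOVES, word)) {
    return;
  }

  // Move one cell, staying inside the grid at the edges.
  const step = MOVES[word];
  pen.x = Math.min(GRID_SIZE - 1, Math.max(0, pen.x + step[0]));
  pen.y = Math.min(GRID_SIZE - 1, Math.max(0, pen.y + step[1]));

  // A lifted pen moves without touching the page.
  if (pen.mode !== 'lift') {
    paint(pen.x, pen.y, pen.mode);
  }
}
```

In the page itself, handleCommand would be called from the recognizer's onresult callback with each final transcript, and paint would set the color of the matching grid cell.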