J-Say Starts to Bridge Speech and Text Gap for Visually Impaired Computer Control

Tuesday, February 10, 2015

An image of a smart phone with Braille across the screen.

There are certain pieces of software that are just fascinating to me. A computer that speaks through text-to-speech applications provides worlds of access; a computer that responds to voice commands through speech recognition gives a greater number of users equal access to technology. But what about a computer that does both? And what if that computer could also allow for Braille output?

I give you J-Say, an application that allows computers running the text-to-speech program Job Access With Speech (JAWS) for Windows and the speech recognition software Dragon NaturallySpeaking to run simultaneously. It's something of a software bridge: it lets a JAWS user type and control a computer with his or her voice, while getting immediate auditory feedback. As with running JAWS alone, this combination also allows a user to receive feedback through a Braille display.

Once you've dictated the text, JAWS immediately reads it back. A user can correct any errors as they go. Users can also open files, URLs, or run applications by simply saying "run X", where X refers to whatever they'd like to open at that moment. It is also possible to dictate into a recorder, transfer the file to one's PC, and have J-Say convert the spoken audio to text.

On a webpage in Internet Explorer, a user can bring up a list of links or headings -- as I would from my keyboard using JAWS -- and navigate between different page elements by voice. Typing into a search box is also no problem with J-Say.

For further background, I contacted J-Say's developer Brian Hartgen. He began development of J-Say in 2003 after a 2001 audio magazine piece that suggested JAWS and Dragon together offered an opportunity to create a better experience for blind Dragon users. Typical J-Say clientele now ranges from blind users looking for greater productivity to blind users who don't have hands or arms.

I was curious as to what would happen without J-Say: what made it so essential to these programs working well together? For one thing, as Mr. Hartgen explained, there would be no audio feedback of dictated text. For another, a user couldn't emulate existing JAWS commands such as bringing up link lists or other elements on the web, and couldn't create direct shortcuts to folders. J-Say also has its own feature set independent of JAWS and Dragon, which includes a calendar, an audio player, and a radio.

I was interested to find out if I could use J-Say to test website compatibility with Dragon NaturallySpeaking, in the same way a sighted Dragon user might. Dragon is an application that is used more than text-to-speech applications but doesn't get the same attention. Unfortunately I was disappointed: J-Say does not use the standard Dragon commands on the web, so my experience with J-Say would not be comparable to that of someone using Dragon alone. J-Say users are encouraged to use J-Say's own command library, as in many cases its commands provide blind users with additional spoken or Braille-based confirmation that they have been correctly executed.

I also wanted some firsthand feedback from frequent J-Say users. Terry Bray, Sue Martin, and Pranav Lal graciously answered all of my questions.

Mr. Bray and Mr. Lal began using J-Say in 2004 and 2005, respectively. Both had hand injuries that made typing for long stretches difficult, and Mr. Bray also has a learning disability. Ms. Martin started using J-Say at its inception through her work at the United States Department of Veterans Affairs assisting veterans with complex injuries. All three were JAWS users who later came to need speech-recognition software, rather than speech-recognition users who later came to need JAWS because of vision loss.

None of the three uses J-Say exclusively, as they are still able to use a keyboard to some degree. Much of my earlier description of J-Say's functionality came from their detailed accounts.

All three J-Say users I was in touch with told me that J-Say was a faster and more enjoyable way to interact with their computers. Though designed primarily with the Microsoft products Internet Explorer, Outlook, and Word in mind (and to capitalize on the three primary uses of a computer), its functionality can be extended to most applications. Mr. Lal said he has been able to use it with Firefox, Excel, and PowerPoint. Ms. Martin said she particularly enjoyed the ability to multitask, operating her computer with a wireless headset while making dinner.

Though a fan of J-Say in many respects, Mr. Lal pointed out some drawbacks:

Editing of text can be slow;
Poorly marked-up webpages are faster and easier to navigate with a keyboard;
As a programmer in Visual Basic, .NET, and Python who also writes J-Say scripts, Mr. Lal cannot program using J-Say;
Few programmers use Dragon, let alone Dragon and JAWS together, so little work has been done to make the complex environments of speech recognition and programming compatible;
Proofreading is a must to catch the occasional recognition errors that appear in general usage of J-Say; and
As a Linux user, Mr. Lal would appreciate a J-Say concept extended to this platform.

The only other concern was dictating with ambient noise, but Ms. Martin said that Dragon was pretty good at filtering it out in most cases.

My final question concerned regional accents. Mr. Hartgen, J-Say's developer, is from England. Mr. Bray is from Toronto, Ms. Martin hails from the southern United States, and Mr. Lal is based in India. How would J-Say handle all their different speech patterns? It turns out that this is Dragon's domain, and speech profiles can be set up for each user. Setting up a profile includes a question about a person's age: Ms. Martin admitted lying to her software when asked this...

Recognition of speech patterns presents an accessibility challenge too. In her work with veterans, Ms. Martin encountered many with unique speech patterns due to injury or stroke. Through continued use of J-Say and its correction facilities, the software got better at understanding its users. As with anything, the more practice one gets, the better the technology and its user become at working together.

You can hear J-Say in action. All told, I am fascinated by this product. As the only available package to bridge speech output and input, it truly seems revolutionary.