Alan Gahtan's Canadian Legal Resources

Voice Dictation Software in the Law Office: Getting on Speaking Terms with your PC

The Litigator: Journal of the Ontario Trial Lawyers' Association

By Alan Gahtan - November 1998

Introduction

Not long ago, to those who might even have pondered it, the notion that one could dictate into a computer, at a natural speaking pace, and see the formatted words magically appear on-screen, was fanciful. As with so many other cutting-edge innovations, particularly in the field of office automation, continuous-speech voice-recognition has now entered the technological mainstream and is gaining acceptance among a growing segment of businesses and professions, including the legal profession. As will be seen below, lawyers are among the few that have been singled out by some voice recognition software companies for special product consideration.

The latest voice-recognition software generally consists of a suite of integrated applications, the core of which allows the user to speak into a PC-connected microphone at a natural conversational pace (up to as many as 160 words per minute), and view the words on the PC’s display as they are instantly transcribed into letters, memos or billing records, in popular document formats such as Word 97 and WordPerfect 8. Document formatting is typically achieved through the use of simple voice commands, such as "bold" and "new paragraph".

Unless you are reasonably computer literate, to the extent of having a basic understanding of your operating system and of the functions of typical word processing, spreadsheet and accounting applications, the use of voice-recognition technology may seem more bewildering than beneficial. For most, however, the learning curve is not particularly steep, and the primary consideration will be hardware-related, i.e., whether your PC has the effective processing power, and disk and memory capacity, to run these resource-intensive, speech-enabled applications.

There are currently a variety of voice recognition programs, each claiming distinctive features and advantages. These programs include: Dragon Systems’ Dragon NaturallySpeaking 3.0, IBM’s ViaVoice 98, Philips Electronics' FreeSpeech 98, and Lernout & Hauspie’s Voice Xpress. Several of these programs are available in flavours that are variously described as "home", "office", "executive", "standard", "professional", etc.

In order to illustrate the nature of these products, it is sufficient to examine one of the more popular programs, Dragon NaturallySpeaking, which is of particular interest to our profession because of its availability in a "Legal Suite" version. There are several different versions of NaturallySpeaking to choose from. The least expensive is the "Standard Edition", which can be purchased as a stand-alone product or bundled with WordPerfect Legal Suite 8.0. However, the bundled version provides less flexibility in that it can only be used within WordPerfect. A mid-tier "Preferred Edition" provides improved accuracy and additional business features, including text-to-speech and Dragon NaturallyMobile for transcribing recorded speech. Lawyers on the go will appreciate NaturallyMobile, which supports dictation using mobile recording devices such as hand-held recorders. The premier version is the "Professional Edition", which offers greater customization, advanced macro support and other power-user benefits. The "Legal Suite" version combines the "Professional Edition" with a language model optimized for legal terminology, including Latin words and phrases, law and reporter abbreviations, and a good smattering of legalese.

Getting NaturallySpeaking up and running is straightforward. The installation wizard analyses the amount of available RAM and recommends the appropriate recognition technology to take advantage of any extra memory for improved accuracy and performance. It should take less than an hour to load the software from CD-ROM and complete the training module. The system is trained to a particular user's voice by listening to the user read a passage from 3001: The Final Odyssey, Dave Barry in Cyberspace or Dogbert's Top Secret Management Handbook. The spoken words are processed via the PC’s soundcard. The program then analyses the voice’s sound frequencies (lower-frequency vowel sounds and higher-frequency consonant sounds) and runs a complex series of pattern-matching algorithms. While the set-up can be accomplished in less than an hour, the best results will not likely be seen until the user has spent several weeks to a month of regular interaction with the program, in effect training the program.

NaturallySpeaking comes with an active vocabulary of 30,000 to 64,000 commonly used words and the ability to retrieve additional words from a 230,000+ total vocabulary. Users can easily place new words, frequently-used legal specialty words and phrases, and names, in the active vocabulary. Language usage data programmed into the product is designed to maximize word recognition accuracy.

The program utilizes a number of techniques to achieve its impressive level of accuracy, which, depending upon the review one reads, has been rated at anywhere from less than 90 percent to upward of 98 percent. Vocabulary Builder is a tool that allows NaturallySpeaking to adapt to a particular user's type of work, and thereby improve recognition accuracy. This tool analyses the user's existing documents and builds a custom language model based on the text. It not only adds words commonly used by a particular user; it also teaches the system about that user's writing style.
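
For readers curious about what "analysing existing documents" might involve, the following is a minimal Python sketch of the general idea only. It is not Dragon's Vocabulary Builder; the function name find_new_words, the min_count threshold and the sample data are hypothetical and chosen purely for illustration. It simply counts the words in a set of documents and reports those that recur but are missing from the active vocabulary.

```python
import re
from collections import Counter


def find_new_words(documents, active_vocabulary, min_count=2):
    """Return recurring words that are missing from the active vocabulary.

    Hypothetical helper for illustration; not part of any Dragon product.
    """
    counts = Counter()
    for text in documents:
        # Lowercase the text and keep simple alphabetic tokens.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return sorted(
        word
        for word, n in counts.items()
        if n >= min_count and word not in active_vocabulary
    )


# Illustrative data only: a tiny "active vocabulary" and two sample sentences.
active_vocabulary = {
    "the", "court", "held", "that", "was", "a",
    "of", "claim", "is", "barred", "by",
}
documents = [
    "The court held that the estoppel claim was barred.",
    "A claim of estoppel is barred by laches.",
]

print(find_new_words(documents, active_vocabulary))  # ['estoppel']
```

A real tool would presumably also record how often, and in what context, each word is used, which is how it could learn something of the user's writing style; this sketch stops at finding candidate words.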

The latest versions of NaturallySpeaking incorporate Dragon's BestMatch technology to provide even higher levels of recognition accuracy. This technology represents a collection of several advances, including the addition of "trigrams" to the language modelling. This means that NaturallySpeaking will analyse the words appearing before and after the word it is attempting to recognize, and will suggest the best possible match based on the context in which the words are likely being used. All of this occurs instantaneously and behind the scenes.
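
The role of context can be illustrated with a toy Python sketch. It assumes a deliberately simplified model in which the program has counted three-word sequences (trigrams) in sample text and must choose among several similar-sounding candidates for the middle word. The function names, the sample sentences and the scoring rule are all hypothetical; nothing here reflects how BestMatch is actually implemented.

```python
from collections import Counter


def count_trigrams(sentences):
    """Count every run of three consecutive words in the sample sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        counts.update(zip(words, words[1:], words[2:]))
    return counts


def pick_candidate(previous_word, candidates, next_word, trigram_counts):
    """Choose the candidate seen most often between the two neighbouring words.

    On a tie, max() returns the first candidate in the list.
    """
    return max(
        candidates,
        key=lambda word: trigram_counts[(previous_word, word, next_word)],
    )


# Illustrative sample text only.
corpus = [
    "the parties waive their right to appeal",
    "you have the right to counsel",
    "please write to the registrar",
    "turn right at the courthouse",
]
counts = count_trigrams(corpus)

# "write", "right" and "rite" sound alike; the neighbouring words decide.
print(pick_candidate("the", ["write", "right", "rite"], "to", counts))  # right
```

In this sketch the sequence "the right to" appears in the sample text while "the write to" and "the rite to" do not, so "right" is chosen. A production recognizer would combine such context scores with the acoustic evidence rather than rely on counts alone.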

One noteworthy feature is the delayed error-correction function. One can dictate freely, ignoring any errors observed in the process, and then return to revise the document as necessary without worrying about adversely affecting recognition performance. Correcting errors or revising text is facilitated by an editing feature called Select-and-Say, available when NaturallySpeaking is used with Word 97 or WordPerfect 8.0. To select a particular word or phrase, one need only say "select" followed by the word or phrase. The selected text can then be replaced by dictating the new word or phrase. It can also be deleted or formatted by verbalizing the appropriate command.
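
The select-then-replace behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea: the function name select_and_say is borrowed from the feature name, and the rule that the most recent (last) occurrence of the phrase is the one selected is an assumption made for the example, not a documented behaviour of the product.

```python
def select_and_say(text, selection, replacement):
    """Replace the last occurrence of the selected phrase with new dictation.

    Assumption for illustration: the most recent match is the one selected.
    Matching ignores upper and lower case.
    """
    if not selection:
        return text  # nothing was selected
    start = text.lower().rfind(selection.lower())
    if start == -1:
        return text  # phrase not found; leave the document unchanged
    end = start + len(selection)
    return text[:start] + replacement + text[end:]


draft = "The plaintiff seeks damages. The plaintive filed a motion."

# Equivalent to saying "select plaintive" and then dictating "plaintiff".
print(select_and_say(draft, "plaintive", "plaintiff"))
```

Deleting the selected text corresponds to passing an empty replacement.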

NaturallySpeaking requires a relatively powerful system to operate effectively. Although the software may run on a PC equipped with a Pentium (or equivalent) 133-166 MHz processor and 32-48 MB of RAM, the best voice-recognition accuracy, and feature maximization, will be obtained only with higher-end PCs (266-400 MHz) and abundant on-board memory (64-96 MB). This is particularly the case if you plan to use the BestMatch language model that is optimized for legal terminology.

Using a high quality sound card (or notebook with built-in 16-bit audio), and a microphone with active noise-cancellation, is also important. If you are planning to buy a new PC for use with NaturallySpeaking, visit the vendor’s Web site beforehand for a list of certified PCs.

Voice recognition programs offer a level of automation that brings together such classic law office tasks as file dictation, transcription, editing and document generation. The technology, while potentially time-saving and cost-effective when used proficiently, is unlikely to gain user approval unless a financial investment in hardware (or upgrades), and a time investment in set-up and training, are made.

Lawyers who are already fairly self-sufficient when it comes to practice automation (e.g. those who regularly operate desktop or laptop systems doing online research, file memos, e-mail, desktop faxing, time management and fee billing) may be thinking about – even excited about – taking practice automation to the "next level" by venturing into the voice-recognition fray. If you don’t yet feel that your practice style, computer savvy, or pocketbook are ready for the challenge, you’re not alone. But that may soon change once peer experience with this technology is more widely shared, the technology itself yields a higher level of precision, and the affordability of high-end processors is no longer an issue.



© 2005 Alan M. Gahtan. All Rights Reserved