Speech Design: When and How
At the beginning of the 21st century, the major stumbling block to
the widespread acceptance of using speech in the user interface is
no longer slow speed or poor accuracy. Convenience and
usability are the issues that must now be addressed before
speech will penetrate larger markets.
The most successful users of speech recognition are currently
those who must use it, such as people with physical
disabilities, or who are otherwise highly motivated to get it
working. These users are willing to spend the extra time and
work through convenience and usability issues because of their
special needs. This is too much of a commitment for the average
user, who just wants to sit down and start using a product to
complete a task. If you are interested in marketing your solution
to a wide range of customers, convenience and usability must be
addressed at the beginning of your development.
Conducting a basic usability study is the best way to gain an
understanding of convenience and usability for your targeted
customers. A basic usability study should include:
- Gathering feedback from potential customers and studying their
work habits. This can provide ideas for new or simplified
interfaces that utilize speech technologies.
- Involving users in the development of your solution from the
very beginning, which is crucial to your success. Keep your focus on
your objective of user acceptance and satisfaction.
- Organizing a representative focus group of target users and
analyzing their current approach to job tasks. Ask them what
they feel is missing or inefficient in their workflow and how
they feel it could be fixed. Be specific and ask how they
think speech could be employed to fill this need, and where
they feel speech might be unnecessary.
Once the working environment and needs
of your end-users are understood, the next step is to consider all
existing technologies. In addition to speech, this should include
anything that the customers already use or could use in their
work. For example, if your customers work in a mobile environment,
you could consider the use of PDAs and wireless connections. Then
determine if the technology available to you will accommodate
their needs. Many factors can influence this decision. For
instance, what is the physical work environment? Noisy
environments such as stock exchanges and airports may preclude the
use of speech as a reliable interface to an application. Critical
control applications, such as in jet planes, might not tolerate
even a rare misrecognition.
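Where occasional misrecognition can be tolerated only with safeguards, one common approach is to act on a recognized phrase only when the recognizer reports high confidence, and to ask for confirmation or reject the input otherwise. The following Python sketch illustrates that idea. The `RecognitionResult` type, the `decide_action` function, and both threshold values are hypothetical names and numbers chosen for illustration; they are not taken from any particular speech engine, and suitable thresholds would have to come from testing with your own users.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values should come from testing with target users.
ACCEPT_THRESHOLD = 0.90
CONFIRM_THRESHOLD = 0.60


@dataclass
class RecognitionResult:
    """Illustrative stand-in for whatever a speech engine returns."""
    text: str
    confidence: float  # assumed to lie between 0.0 and 1.0


def decide_action(result: RecognitionResult, critical: bool = False) -> str:
    """Return 'accept', 'confirm', or 'reject' for a recognized phrase."""
    if result.confidence < CONFIRM_THRESHOLD:
        # Too uncertain to act on; ask the user to repeat or use another input.
        return "reject"
    if critical:
        # Critical commands are never executed without explicit confirmation.
        return "confirm"
    if result.confidence >= ACCEPT_THRESHOLD:
        return "accept"
    return "confirm"
```

Even with such a gate, a confirmation step costs the user time, so for truly critical controls it may still be better to leave the task to another input method.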
In addition to environmental factors and critical needs, you need
to consider social factors. For example, the users themselves may
be very resistant to the adoption of a new method of working. If
their resistance is high, it may be very difficult to implement
speech as a new way to accomplish tasks. You may need to make
speech a secondary or alternative interface to be used by a
subset of the users.
If you have determined that speech would be an appropriate
interface for your customers, you need to begin architecting your
speech design. Speech designs can be categorized into three major
groups along a continuum from Speech as an Add-on to Designed for
Speech to Speech Required.
- Speech as an Add-on
If you have an existing application, this type of solution
would represent the least amount of work because you don't
need to make any coding changes. The GUI of your application
remains unchanged and speech features are usually provided by
means of a third-party add-on. Your application remains
totally unaware of the presence of speech. For example, you
could install a commercially available speech application and
dictate into your application's free-text fields without
making any changes to your application.
- Designed for Speech
This type of speech-enabled application requires minor changes
to the GUI and may be a solution for upgrading an existing
GUI. The target application is usually aware of the speech
components that are directly integrated into the program.
Speech features are presented via an interface in which they
supplement existing features. This is the preferred mode for a
speech design, as it offers the most flexibility through
allowing multi-modal input. It can also shorten the learning
curve for new users and increase productivity in most
situations. For example, a kiosk that provided graphical as
well as audio instructions would allow users to read the
instructions as well as hear them. This would greatly increase
the chances of users knowing what to do or say.
- Speech Required
In this category, the entire application is designed with
speech as the primary user interface. This usually entails a
complete re-write of your interface code and is the most work.
Telephones and mobile devices are prime targets for this mode
of speech user interface because there is no other easy-to-use
input mechanism available to the user. The application may or
may not have a GUI, but many features will be accessible only
by speech. Adding voice prompts and responses can allow for a
program that is both hands-free and eyes-free. An example of
this type of application would be in an operating room or lab
where users are wearing gloves and need to be looking at the
patient or specimen and not at a computer screen.
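To make the "Designed for Speech" category more concrete, the sketch below shows one way an application might let speech supplement an existing GUI: both input modes are routed to the same command handlers, so either can be used for the same task. This is a minimal illustration under stated assumptions, not a definitive design. The `CommandRegistry` class, its method names, and the example phrases are all hypothetical, and the spoken phrase is assumed to arrive as plain text from whatever speech engine is in use.

```python
from typing import Callable, Dict, List


class CommandRegistry:
    """Maps command names to handlers so every input mode shares one code path."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], str]] = {}
        self._phrases: Dict[str, str] = {}

    def register(self, name: str, handler: Callable[[], str],
                 phrases: List[str]) -> None:
        """Register a handler plus the spoken phrases that should trigger it."""
        self._handlers[name] = handler
        for phrase in phrases:
            self._phrases[phrase.lower()] = name

    def run(self, name: str) -> str:
        """Entry point for GUI input, such as a button or menu item."""
        return self._handlers[name]()

    def run_spoken(self, phrase: str) -> str:
        """Entry point for speech input; unknown phrases are reported, not guessed."""
        name = self._phrases.get(phrase.strip().lower())
        if name is None:
            return "unrecognized"
        return self.run(name)


# Hypothetical usage: the same "save" command reached by click or by voice.
registry = CommandRegistry()
registry.register("save", lambda: "saved", ["save", "save document"])
from_gui = registry.run("save")
from_speech = registry.run_spoken("Save Document")
```

Because the GUI path keeps working on its own, users who cannot or prefer not to speak lose nothing, which is the flexibility the multi-modal approach is meant to provide.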
After the scope of your project has
been defined by studying usability factors and you have determined
a place for it along the continuum of speech user interface
categories, you are ready to start working on your design. A good
rule of thumb is that any task that is not enhanced by a speech
design should be left to other methods. Keyboards and mice, when
they are available, are very robust input devices. Speech inputs,
no matter how well they are implemented, are less robust because
of their dependence on a number of more error-prone hardware
components and software services. However, a speech user interface
that has taken into account the needs of the end-user and is well
designed may offer benefits that cannot be achieved in any other
way. Speech user interfaces offer developers a new set of tools to
enhance the productivity of people who use their applications.
Embedded Digital System Co., Ltd. CANADA
嵌入数码系统公司
Email: embedigital@yahoo.com
Copyright © 2002 All Rights Reserved