INTRODUCTION:
The main purpose of the XSEC project is to develop a user authentication and
identification system that works on the biological features of the user and
delivers precision high enough for a range of security purposes. User
authentication is a binary decision: the system decides whether a supplied
feature belongs to the claimed person or whether an impostor is trying to log
in. Because this is a security system, accepting an impostor is intolerable. On
the other hand, frequent rejection of genuine users is disruptive. The system
should therefore clearly differentiate biological characteristics among
different people while compensating for variation within the same person. It is
designed to be robust against channel transformations and acceptable noise
levels, so that it can work in practical applications with an error rate close
to zero.
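The trade-off between accepting impostors and rejecting genuine users is
commonly expressed as two error rates, the false acceptance rate (FAR) and the
false rejection rate (FRR). The following is a minimal sketch, assuming the
system produces one similarity score per login attempt and accepts when the
score reaches a threshold. The function name `error_rates` and all scores are
hypothetical illustrations, not XSEC measurements.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a verifier that accepts when score >= threshold."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


# Hypothetical similarity scores, made up purely for illustration.
genuine = [0.91, 0.85, 0.78, 0.95, 0.66]
impostor = [0.20, 0.35, 0.70, 0.15, 0.40]

for t in (0.5, 0.75, 0.9):
    far, frr = error_rates(genuine, impostor, t)
    print(f"threshold={t:.2f}  FAR={far:.2f}  FRR={frr:.2f}")
```

Raising the threshold in this toy data removes the false acceptance but starts
rejecting genuine attempts, which is the tension the paragraph above describes.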
Biometric methods of identification are currently being used to replace the
less secure
ID/Password method of user authentication, that is, verifying that people are
who they say they are. Using biometric identifiers for personal authentication
reduces or eliminates reliance on tokens we must carry with us, or the arcane
strings of letters and numbers we are forced to memorize. Tokens, such as smart
cards, magnetic stripe cards, and physical keys can be lost, stolen, or
duplicated. Human memory is notoriously unreliable; according to recent
estimates, at least 40% of all help desk calls are password or PIN-related.
Losses attributed to fraud, identity theft, and cyber vandalism due to password
reliance run well into the billions. Although passwords have traditionally been
used for personal authentication, they have nothing to do with a person's
actual identity!
The best biometric method for a particular solution depends on the type of
users, desired accuracy, desired transaction speed, cost parameters and any
cultural issues or sensitivities. Technology available today includes
fingerprint characterization, hand geometry (palm print) characterization,
voice pattern characterization, iris pattern characterization, retinal pattern
characterization and facial feature characterization, among others.
XSEC has chosen facial feature characterization and voice feature
characterization to identify people.
XSEC ARCHITECTURE:
XSEC authenticates persons based on their voice and face characteristics. The
architecture is divided into two modules that work independently on speech and
face characterization respectively. The block diagram shown below gives a rough
idea of the functionality of the system.

The face characterization module takes input from a digital camera at
Point-1, while the voice characterization module takes an input speech sample
from a microphone at Point-2.
SYSTEMS OVERVIEW:
Voice Characterization module
Speaker recognition systems are generally grouped into two categories: speaker
verification and speaker identification. Speaker verification determines from a
voice sample whether a person is who he or she claims to be. Speaker
identification, on the other hand, determines which one of a group of known
voices best matches the input voice sample.
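The difference between the two categories can be sketched in code:
verification is a one-to-one comparison against a claimed identity, while
identification is a one-to-many search. This is only an illustration, assuming
each voice is already reduced to a small feature vector and compared by cosine
similarity. The names `verify` and `identify`, the threshold and the vectors
are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def verify(sample, claimed_id, enrolled, threshold=0.9):
    """Verification: compare the sample with the claimed identity only."""
    return cosine_similarity(sample, enrolled[claimed_id]) >= threshold


def identify(sample, enrolled):
    """Identification: return the enrolled identity that matches best."""
    return max(enrolled, key=lambda uid: cosine_similarity(sample, enrolled[uid]))


# Hypothetical enrolled feature vectors and one incoming sample.
enrolled = {"alice": [1.0, 0.2, 0.1], "bob": [0.1, 1.0, 0.3]}
sample = [0.9, 0.25, 0.1]

print(verify(sample, "alice", enrolled))
print(verify(sample, "bob", enrolled))
print(identify(sample, enrolled))
```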
Another classification of voice recognition systems is based on text
dependence. Text-dependent systems are those for which the speech used to train
and test the system is constrained to the same word or phrase. For
text-independent systems, on the other hand, training and testing speech are
completely unconstrained.
Roughly speaking, the information content of a spoken utterance is a
combination of speaker characteristics, the spoken phrase, emotions, and
additional factors such as noise and channel transformations. For speaker
recognition, the concern is how to separate the speaker-dependent
characteristics from all the others. Speaker verification can therefore be
treated as a three-stage problem.
1. The first stage is to extract speaker-dependent features from spoken
utterances. Different feature sets have been proposed to reflect the speaker’s
identity; the best known are listed below.
· Mel-frequency cepstrum coefficients
· Linear prediction cepstrum coefficients
· Log-cepstrum coefficients
2. The second stage is to build a statistical model that successfully reflects
the characteristics of the chosen feature set. The following statistical models
are widely used and have proved successful.
· Vector Quantization
· Gaussian Mixture Models
· Hidden Markov Models
· Neural Network Architectures
3. The third stage is decision making, where the input voice is compared to the
claimed speaker model and a decision is made about the identity of the speaker.
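The three stages can be sketched as a toy pipeline. Every stage here uses a
deliberately simple stand-in, and all names are hypothetical: per-frame log
energy and zero-crossing rate replace cepstral features, a single centroid
replaces the statistical model, and a mean-distance test serves as the
decision. Synthetic tones stand in for real speech.

```python
import math


def frame_features(signal, frame_len=80):
    """Stage 1 (stand-in for cepstral features): log energy and zero-crossing rate."""
    features = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        features.append((math.log(energy + 1e-12), crossings / (frame_len - 1)))
    return features


def train_model(features):
    """Stage 2 (stand-in for VQ/GMM/HMM): the centroid of the training features."""
    n = len(features)
    return tuple(sum(f[i] for f in features) / n for i in range(2))


def decide(features, model, threshold):
    """Stage 3: accept if the mean distance to the claimed model is small enough."""
    mean_dist = sum(math.dist(f, model) for f in features) / len(features)
    return mean_dist <= threshold


def tone(freq_hz, n=800, rate=8000):
    """Synthetic test signal; a real system would read microphone samples."""
    return [math.sin(2 * math.pi * freq_hz * t / rate) for t in range(n)]


model = train_model(frame_features(tone(200)))
print(decide(frame_features(tone(200)), model, threshold=0.05))   # same source
print(decide(frame_features(tone(1500)), model, threshold=0.05))  # different source
```

The point of the sketch is the shape of the pipeline: each stage can be
swapped for one of the feature sets or models listed above without changing
the other two.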
XSEC uses Mel-frequency cepstrum coefficients to extract people’s voice
characteristics, together with techniques such as RASTA filtering and cepstral
mean normalization to compensate for the performance degradation caused by
handset mismatches.
XSEC also implements its own technique for handset identification and for
normalizing handset differences. For statistical modeling, XSEC uses a hybrid
architecture consisting of an HMM (Hidden Markov Model) and an ANN (artificial
neural network). This hybrid is intended to give higher accuracy than competing
products on the market.
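Cepstral mean normalization can be illustrated in a few lines. A stationary
channel such as a particular handset acts approximately as a constant additive
offset in the cepstral domain, so subtracting the per-coefficient mean over an
utterance removes it. The sketch below assumes the cepstral vectors are already
computed. The numbers are hypothetical, and it does not represent RASTA
filtering or XSEC's own handset normalization.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean taken over all frames of one utterance."""
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(f[i] for f in frames) / n_frames for i in range(n_coeffs)]
    return [[f[i] - means[i] for i in range(n_coeffs)] for f in frames]


# Hypothetical cepstral vectors for one utterance (3 frames, 2 coefficients).
clean = [[1.0, 4.0], [2.0, 6.0], [3.0, 8.0]]

# A stationary channel adds the same offset to every frame.
channel_offset = [0.5, -2.0]
distorted = [[c + o for c, o in zip(f, channel_offset)] for f in clean]

print(cepstral_mean_normalize(clean))
print(cepstral_mean_normalize(distorted))
```

Both calls print the same normalized frames, showing that the constant channel
offset has been cancelled.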
Face Characterization module
Along with voice identification, XSEC implements a face authentication system.
Face recognition can be divided into two parts: modeling of the person’s image,
followed by the recognition phase. Broadly speaking, the face recognition
problem can be solved by using HMMs (Hidden Markov Models) or Eigenfaces. XPEG
uses its own technique, which is a compromise between the two methods stated
above and delivers higher recognition efficiency.
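As background for the Eigenfaces approach mentioned above, here is a minimal
sketch of its core idea: subtract the mean face, find the direction of greatest
variation among the training images, and compare faces by their projection onto
it. It is not the proprietary technique described in this document. It keeps
only the leading component (found by power iteration), and the 4-pixel "images"
and names are hypothetical.

```python
import math


def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def top_eigenface(images, iterations=50):
    """Return (mean face, leading principal component) using power iteration."""
    mean = mean_vector(images)
    centered = [[p - m for p, m in zip(img, mean)] for img in images]
    vec = [1.0] + [0.0] * (len(mean) - 1)  # arbitrary non-zero starting vector
    for _ in range(iterations):
        # Multiply by the scatter matrix without forming it: sum_k x_k (x_k . v)
        new = [0.0] * len(vec)
        for x in centered:
            weight = sum(a * b for a, b in zip(x, vec))
            for i, a in enumerate(x):
                new[i] += weight * a
        norm = math.sqrt(sum(a * a for a in new))
        vec = [a / norm for a in new]
    return mean, vec


def project(image, mean, eigenface):
    """Weight of an image along the eigenface after removing the mean face."""
    return sum((p - m) * e for p, m, e in zip(image, mean, eigenface))


# Hypothetical 4-pixel training images, one per enrolled person.
faces = {
    "alice": [7.0, 7.0, 3.0, 3.0],
    "bob": [3.0, 3.0, 7.0, 7.0],
    "carol": [6.0, 6.0, 4.0, 4.0],
    "dave": [4.0, 4.0, 6.0, 6.0],
}
mean, eigenface = top_eigenface(list(faces.values()))
weights = {name: project(img, mean, eigenface) for name, img in faces.items()}

probe = [6.5, 7.0, 3.0, 3.5]  # a new, slightly different image
probe_weight = project(probe, mean, eigenface)
best = min(weights, key=lambda name: abs(weights[name] - probe_weight))
print(best)
```

A practical system would keep many components and work on full-size images;
the single component here only keeps the arithmetic easy to follow.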
BACKBONE OF XSEC ENGINE:
Faceprint & Voiceprint database:
The user is required to give a voice sample and a few training face samples
when accessing the system for the first time, i.e. in the training session. The
supplied faces must be in a prescribed orientation, and the spoken utterance is
a predefined text of acceptable length that successfully reflects the vocal
properties of the language and the user. Once the speech samples are taken from
the user, they are processed and a property matrix that reflects the user’s
voice is saved in the database. This property matrix is called the voiceprint.
Similarly, the system computes a property matrix corresponding to the person’s
face and stores it in the database; this matrix is called the faceprint. The
data kept in the database is therefore not the speech sample or picture but the
voiceprint and faceprint of the user.
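The enrollment flow described above might be modeled as follows. This is a
sketch under stated assumptions: the class names, the `column_means`
placeholder extractor and the sample numbers are hypothetical, and the real
property matrices would come from the voice and face modules. What it shows is
that only derived templates reach the store, never the raw samples.

```python
from dataclasses import dataclass


def column_means(rows):
    """Placeholder feature extractor: one row holding the mean of each column."""
    n = len(rows)
    return [[sum(r[i] for r in rows) / n for i in range(len(rows[0]))]]


@dataclass(frozen=True)
class UserTemplate:
    user_id: str
    voiceprint: list  # property matrix derived from the speech samples
    faceprint: list   # property matrix derived from the face images


class TemplateDatabase:
    def __init__(self):
        self._templates = {}

    def enroll(self, user_id, speech_frames, face_images):
        # Only the derived matrices are stored; the raw inputs are discarded.
        template = UserTemplate(
            user_id, column_means(speech_frames), column_means(face_images)
        )
        self._templates[user_id] = template
        return template

    def lookup(self, user_id):
        return self._templates[user_id]


db = TemplateDatabase()
db.enroll(
    "user-001",
    speech_frames=[[1.0, 2.0], [3.0, 4.0]],
    face_images=[[10.0, 20.0], [30.0, 40.0]],
)
print(db.lookup("user-001"))
```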
Login phrase:
For the system to be practical, the required login phrase should be as short
as possible. XSEC works with a login input of one picture and 3-4 randomly
generated words that are flashed to the user.
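Generating such a prompt could look like the sketch below. The vocabulary and
the function name `make_login_phrase` are hypothetical; the only behavior taken
from the text is that 3-4 words are chosen at random for each login. A
cryptographically strong random source is used here as an assumption, on the
reasoning that an unpredictable prompt is harder to answer with a recording.

```python
import secrets

# Hypothetical prompt vocabulary; a real one would be much larger.
WORDS = ["river", "seven", "window", "orange", "matrix", "silver", "garden", "pilot"]


def make_login_phrase(vocabulary=WORDS):
    """Pick 3 or 4 distinct words at random to show to the user."""
    rng = secrets.SystemRandom()
    count = rng.choice((3, 4))
    return rng.sample(vocabulary, count)


phrase = make_login_phrase()
print(" ".join(phrase))
```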
Decision-making:
XSEC first tries to determine the user’s identity by face recognition methods.
Once the identity is determined, XSEC uses voice authentication methods to
verify it. When the user has supplied the spoken utterance to be used for
authentication and the claimed speaker information is available to the system,
the feature matrix is extracted from the speech data and compared to the
voiceprint of the claimed speaker. If the similarity score exceeds a predefined
threshold value, the user is accepted.
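The two-step decision can be sketched as a cascade: face recognition proposes
an identity, then voice verification accepts or rejects it against a
threshold. This is an illustration only. The similarity measure, the threshold
value, the flat feature vectors (in place of property matrices) and all names
are assumptions rather than XSEC's actual scoring.

```python
import math


def similarity(a, b):
    """Map Euclidean distance to a score in (0, 1]; 1.0 means identical."""
    return 1.0 / (1.0 + math.dist(a, b))


def authenticate(face_features, voice_features, database, threshold=0.8):
    # Step 1: face recognition proposes the most similar enrolled identity.
    claimed = max(
        database,
        key=lambda uid: similarity(face_features, database[uid]["faceprint"]),
    )
    # Step 2: voice verification checks that identity against the threshold.
    score = similarity(voice_features, database[claimed]["voiceprint"])
    return claimed, score > threshold


# Hypothetical enrolled templates.
database = {
    "alice": {"faceprint": [1.0, 0.0], "voiceprint": [0.2, 0.9]},
    "bob": {"faceprint": [0.0, 1.0], "voiceprint": [0.8, 0.1]},
}

print(authenticate([0.9, 0.1], [0.25, 0.85], database))  # face and voice agree
print(authenticate([0.9, 0.1], [0.8, 0.1], database))    # voice does not match
```

In the second call the face matches one user but the voice belongs to another,
so the cascade rejects the attempt even though an identity was proposed.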
BIOMETRIC FEATURE SECURITY:
If a criminal steals or guesses a password, it is very easy to have it
changed. There is a fear, however, that if a criminal gets hold of a biometric
template, the damage is irreparable: there is no way to change that part of
your body. XSEC-2.1 works in a network environment and keeps all user-specific
data on a secured server computer. The only way to introduce a new user
template is through the XSEC-2.1 client, which limits the opportunity for
template manipulation.
DRIVERS FOR XSEC PROJECT
There are four main benefits of using biological verification:
1. Improve Security:
The key objective of using voice and face biometrics is to improve the security
of sensitive information and reduce fraud. A password is something people know,
whereas a voiceprint and a faceprint are something they have.
2. Reduce Costs:
Using an automated authentication process reduces salary expenses,
toll-costs and call hold times and allows live agents to focus on
revenue-generating calls.
3. Improve Service:
Speaker verification combined with speech recognition
(rather than touchtone) allows callers to interact naturally with the
application, so callers can actually enjoy the automated experience. The
combination of speech recognition and speaker verification makes it possible to
simultaneously identify AND authenticate callers in one step, allowing them to
simply speak an Account ID or their name to access the account and be authenticated.
4. Save Time: Whereas standard authentication questions and associated hold
times may take an average of 80 seconds, speaker verification happens
instantaneously, especially when the voiceprint and faceprint are associated
with the Account ID, removing the time needed for a separate password.
MARKET OPPORTUNITY
As enterprises and organizations increasingly turn to biometrics to ensure
their customers’ safety, the market opportunities for voice/face authentication
span many sectors.
Market | Application | Drivers
Financial Services | Access to Banking, Brokerage, | Reduce Financial Risk
Telecom | Call Center Applications; Unified Messaging; Auto Attendant | Reduce Fraud; Protect Personal Information; Competitive Advantage
Retail | Order Entry; Personalized Service | Reduce Fraud; Increase Revenue
Enterprise and IT | Access to Intranet, Extranet and Corporate Applications | Increase Security; Reduce Cost
Travel | Frequent Customer Services | Convenience; Personalization
Internet | Authenticate Users for Internet Banking and e-Commerce | Reduce Financial Risk
Hospitals, Insurance | Access to Patient Information; Authorize Drug Prescription; Authorize Insurance Payment | Protect Personal Privacy; Reduce Fraud
Government/Military | Access to Sensitive Information; Parolee Tracking | Increase Security; Reduce Cost