Physics of Sound & Vibration

          An Inter-Disciplinary Resource Website to Effects on Human Electrodynamic Physiology

 

                                     www.uncg.edu/~t_hunter/sound.html


Patent No. 6006188  Speech signal processing for determining psychological or physiological characteristics using a knowledge base (Bogdashevsky, et al., Dec 21, 1999)

Abstract

A speech-based system for assessing the psychological, physiological, or other characteristics of a test subject is described. The system includes a knowledge base that stores one or more speech models, where each speech model corresponds to a characteristic of a group of reference subjects. Signal processing circuitry, which may be implemented in hardware, software and/or firmware, compares the test speech parameters of a test subject with the speech models. In one embodiment, each speech model is represented by a statistical time-ordered series of frequency representations of the speech of the reference subjects. The speech model is independent of a priori knowledge of style parameters associated with the voice or speech. The system includes speech parameterization circuitry for generating the test parameters in response to the test subject's speech. This circuitry includes speech acquisition circuitry, which may be located remotely from the knowledge base. The system further includes output circuitry for outputting at least one indicator of a characteristic in response to the comparison performed by the signal processing circuitry. The characteristic may be time-varying, in which case the output circuitry outputs the characteristic in a time-varying manner. The output circuitry also may output a ranking of each output characteristic. In one embodiment, one or more characteristics may indicate the degree of sincerity of the test subject, where the degree of sincerity may vary with time. The system may also be employed to determine the effectiveness of treatment for a psychological or physiological disorder by comparing psychological or physiological characteristics, respectively, before and after treatment.

Notes:

SUMMARY OF THE INVENTION

The present invention provides a speech-based system for assessing psychological, physiological or other characteristics of a test subject. The system includes a knowledge base that stores one or more speech models, where each speech model corresponds to a characteristic of a group of reference subjects. Signal processing circuitry, which may be implemented in hardware, software and/or firmware, compares the test speech parameters of a test subject with the speech models. In one embodiment, each speech model is represented by a statistical time-ordered series of frequency representations of the speech of the reference subjects. The speech model is independent of a priori knowledge of style parameters associated with the voice or speech. The system includes speech parameterization circuitry for generating the test parameters in response to the test subject's speech. The speech parameterization circuitry includes speech acquisition circuitry, which may be located remotely from the knowledge base. The system further includes output circuitry for outputting at least one indicator of a characteristic in response to the comparison performed by the signal processing circuitry. The characteristic may be time-varying, in which case the output circuitry outputs the characteristic in a time-varying manner. The output circuitry also may output a ranking of each output characteristic. In one embodiment, one or more characteristics may indicate the degree of sincerity of the test subject, where the degree of sincerity may vary with time. The system may also be employed to determine the effectiveness of treatment for a psychological or physiological disorder by comparing psychological or physiological characteristics, respectively, before and after treatment.
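The comparison and ranking steps described above might be sketched as follows. This is only an illustrative sketch, not the patented method: the patent does not specify the distance measure, so a frame-by-frame Euclidean distance is assumed here, and the names `sequence_distance`, `rank_characteristics`, `group_a`, and `group_b` are hypothetical. Each speech model is assumed to be a time-ordered list of mean feature vectors of the same length as the test subject's sequence.

```python
import math


def sequence_distance(test_frames, model_frames):
    """Sum of Euclidean distances between time-aligned frames (assumed metric)."""
    if len(test_frames) != len(model_frames):
        raise ValueError("sequences must have the same number of frames")
    return sum(math.dist(t, m) for t, m in zip(test_frames, model_frames))


def rank_characteristics(test_frames, knowledge_base):
    """Return (characteristic, distance) pairs, closest speech model first."""
    scored = [
        (name, sequence_distance(test_frames, model))
        for name, model in knowledge_base.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical knowledge base: one time-ordered model per reference group.
knowledge_base = {
    "group_a": [[0.0, 0.0], [1.0, 1.0]],
    "group_b": [[3.0, 4.0], [1.0, 1.0]],
}

# Hypothetical test parameters for one subject (two frames, two coefficients).
test_parameters = [[0.0, 0.0], [1.0, 2.0]]

ranking = rank_characteristics(test_parameters, knowledge_base)
for name, distance in ranking:
    print(name, distance)
```

For a time-varying characteristic, the same ranking could be recomputed for each successive phrase of the subject's speech and output as a series.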

--------------------------------

The present invention has additional applications in any field where psychological or physiological testing is currently used. Moreover, because the present invention can perform these assessments in a relatively short period of time, based on a short speech sample, it can reduce the expense and effort to conduct such tests. Further, the invention allows these assessments to be employed in applications for which conventional testing would be subject to unacceptable time and money constraints. Such applications include, without limitation, rapid airline passenger security screening, rapid psychological screening in a managed health care environment, and monitoring of compliance and motivation of substance abusers under treatment.

An important aspect of the present invention is that it can be easily trained to associate speech parameters with psychological or physiological characteristics regardless of the (non-speech based) assessment employed to quantify those characteristics. The system operator need only administer the assessment, e.g., Myers-Briggs, to a statistically significant group of reference subjects, and record speech samples from each homogeneous group determined by the assessment. Determination of the number of subjects necessary to achieve statistical significance is known in the art, and is described in L. M. Crocker and J. Algina, Introduction to Classical and Modern Test Theory, New York: Holt, Rinehart and Winston, 1986, which is incorporated by reference herein. Based upon this empirical data, the speech-based system of the invention then creates a knowledge base representing the desired assessment in the "speech domain." In this manner, the system is easily trainable to administer any test using a rapid characterization of a test subject's speech.
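The training procedure described above, grouping reference subjects by assessment outcome and deriving one speech model per group, might look roughly like the sketch below. It assumes that each reference sample has already been reduced to a time-ordered list of equal-length feature vectors and that a simple per-frame average serves as the statistical model; the patent text quoted here does not state which statistic is used. The function name `build_knowledge_base` and the group labels are hypothetical.

```python
from collections import defaultdict


def build_knowledge_base(reference_samples):
    """Average the feature sequences of each homogeneous reference group.

    reference_samples: iterable of (group_label, frames), where frames is a
    time-ordered list of equal-length feature vectors for one subject.
    """
    grouped = defaultdict(list)
    for label, frames in reference_samples:
        grouped[label].append(frames)

    knowledge_base = {}
    for label, sequences in grouped.items():
        count = len(sequences)
        # zip(*sequences) walks the subjects frame by frame; the inner zip
        # walks the coefficients of that frame across all subjects.
        knowledge_base[label] = [
            [sum(values) / count for values in zip(*frames_at_t)]
            for frames_at_t in zip(*sequences)
        ]
    return knowledge_base


# Hypothetical reference data: labels come from the non-speech assessment.
samples = [
    ("group_a", [[1.0, 2.0], [3.0, 4.0]]),
    ("group_a", [[3.0, 4.0], [5.0, 6.0]]),
    ("group_b", [[0.0, 0.0], [1.0, 1.0]]),
]

kb = build_knowledge_base(samples)
print(kb)
```

Because the labels are supplied entirely by the external assessment, the same routine would apply unchanged to any test that sorts reference subjects into homogeneous groups.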

Further, the invention does not relate to a particular psychological or physiological theory about what specific speech characteristics distinguish one homogeneous group from another. Moreover, it does not require any a priori knowledge of speech, although it may be adapted to take such information into account. Rather, as described above, it is based upon an empirical analysis of speech using a broad speech model. In one embodiment, speech is characterized with an LPC model based upon a time-ordered series of frequency characteristics, e.g., eight cepstral vectors per phrase. This time/frequency representation provides a description of speech that is much broader than (and independent of a priori knowledge of) the specific dimensions of speech or speech style elements employed by the prior art. This LPC model also accounts for the relative phase of different frequencies, unlike most, if not all, of the known prior art. This broad model is then empirically correlated with a psychological or physiological assessment. This relatively full, yet still compact, characterization permits the system a great deal of flexibility in the types of assessments that may be carried out.
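As an illustration of the time/frequency representation mentioned above, the following sketch derives eight cepstral vectors from one phrase using a textbook LPC analysis (autocorrelation method, Levinson-Durbin recursion, and the standard LPC-to-cepstrum recursion). It is a sketch under stated assumptions rather than the patent's implementation: the LPC order of 10, the split of the phrase into eight equal segments, the omission of windowing and pre-emphasis, and all function names are assumptions made for the example. The input is synthetic noise standing in for digitized speech.

```python
import random


def autocorrelation(frame, max_lag):
    """Autocorrelation of the frame for lags 0..max_lag."""
    n = len(frame)
    return [
        sum(frame[t] * frame[t - lag] for t in range(lag, n))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Return [1, a1, ..., ap] for A(z) = 1 + a1 z^-1 + ... + ap z^-p."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def lpc_to_cepstrum(a):
    """Cepstral coefficients c1..cp of the all-pole model 1/A(z)."""
    order = len(a) - 1
    c = [0.0] * (order + 1)
    for n in range(1, order + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(1, n))
        c[n] = -a[n] - acc / n
    return c[1:]


def phrase_features(samples, order=10, segments=8):
    """Split a phrase into equal segments; one cepstral vector per segment."""
    seg_len = len(samples) // segments
    features = []
    for s in range(segments):
        frame = samples[s * seg_len:(s + 1) * seg_len]
        r = autocorrelation(frame, order)
        if r[0] == 0.0:  # silent segment: no energy to model
            features.append([0.0] * order)
            continue
        features.append(lpc_to_cepstrum(levinson_durbin(r, order)))
    return features


# Synthetic stand-in for a digitized phrase (seeded for repeatability).
rng = random.Random(0)
phrase = [rng.gauss(0.0, 1.0) for _ in range(1600)]
features = phrase_features(phrase)
print(len(features), len(features[0]))
```

The resulting list of eight vectors is the kind of time-ordered sequence that could be averaged across reference subjects or compared against stored speech models.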

The invention is also not location dependent. That is, the test subject does not need to be proctored by a test administrator located within the same room. Rather, the speech acquisition circuitry may be located remotely from the signal processing circuitry that performs the comparison with the knowledge base. For example, the test subject's speech may be digitized by the subject's home computer and transmitted by modem (e.g., over the Internet) to a central location that provides remote physiological or psychological assessment services. The results are displayed on the home computer. This adaptation is easily implemented using existing technology.

Those skilled in the art will recognize that the present invention may be employed to associate speech parameters with not only psychological and physiological conditions, but any other condition present in an individual. This can be achieved as long as the correlation between a subject's condition and the subject's speech parameters can be verified as significant through testing independent of the present invention.

Note that all patents and other references cited herein are incorporated by reference herein in their entirety.

Although the invention has been described in conjunction with particular embodiments, it will be appreciated that various modifications and alterations may be made by those skilled in the art without departing from the spirit and scope of the invention. For example, as mentioned above, a wide variety of well-known speech comparison techniques may be adapted for implementation in the present invention. The invention is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims.