We evaluate the security of human voice password databases from an information theoretical point of view. More specifically, we provide a theoretical estimation on the amount of entropy in human voice when processed using the conventional GMM-UBM technologies and the MFCCs as the acoustic features. The theoretical estimation gives rise to a methodology for analyzing the security level in a corpus of human voice. That is, given a database containing speech signals, we provide a method for estimating the relative entropy (Kullback-Leibler divergence) of the database thereby establishing the security level of the speaker verification system. To demonstrate this, we analyze the YOHO database, a corpus of voice samples collected from 138 speakers and show that the amount of entropy extracted is less than 14-bits. We also present a practical attack that succeeds in impersonating the voice of any speaker within the corpus with a 98% success probability with as little as 9 trials. The attack will still succeed with a rate of 62.50% if 4 attempts are permitted. Further, based on the same attack rationale, we mount an attack on the ALIZE speaker verification system. We show through experimentation that the attacker can impersonate any user in the database of 69 people with about 25% success rate with only 5 trials. The success rate can achieve more than 50% by increasing the allowed authentication attempts to 20. Finally, when the practical attack is cast in terms of an entropy metric, we find that the theoretical entropy estimate almost perfectly predicts the success rate of the practical attack, giving further credence to the theoretical model and the associated entropy estimation technique.
Worcester Polytechnic Institute
Electrical & Computer Engineering
All authors have granted to WPI a nonexclusive royalty-free license to distribute copies of the work. Copyright is held by the author or authors, with all rights reserved, unless otherwise noted. If you have any questions, please contact email@example.com.
Yang, C. (2014). Security in Voice Authentication. Retrieved from https://digitalcommons.wpi.edu/etd-dissertations/79
Gaussian Mixture Model, Entropy, Voice, Mel Frequency Cepstral Coefficients