A study of computation speed-ups of the GMM-UBM speaker recognition system
September 5, 1999
The Gaussian Mixture Model Universal Background Model (GMM-UBM) speaker recognition system has demonstrated very high performance in several NIST evaluations. Such evaluations, however, measure only classification accuracy. In many applications, system effectiveness must be judged by both accuracy and execution speed. We present several techniques for reducing computation. Using data from the Switchboard telephone speech corpus, we show that significant speed-ups can be obtained while sacrificing surprisingly little accuracy. We expect that these techniques, which lower the model order and process fewer speech frames, will apply equally well to other recognition systems.
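As an illustration of where the two kinds of savings come from, the sketch below scores an utterance with an average log-likelihood ratio between a speaker model and a background model. It is a minimal sketch under stated assumptions, not the system evaluated here: the diagonal-covariance form, the function names, the `decimation` parameter, and the (weights, means, variances) model layout are all hypothetical choices made for the example. In this loop the work grows with the number of frames kept times the number of mixture components, so skipping frames or using fewer components reduces it proportionally.

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM.

    Assumed layout (hypothetical): weights[k], means[k][d], variances[k][d].
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        log_terms.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def llr_score(frames, speaker, ubm, decimation=1):
    """Average log-likelihood ratio over every `decimation`-th frame.

    Returns the score and the number of frames actually evaluated.
    """
    kept = frames[::decimation]
    if not kept:
        raise ValueError("no frames to score")
    total = sum(
        gmm_log_likelihood(f, *speaker) - gmm_log_likelihood(f, *ubm)
        for f in kept
    )
    return total / len(kept), len(kept)


# Toy one-dimensional, single-component models (illustrative values only).
speaker = ([1.0], [[1.0]], [[1.0]])
ubm = ([1.0], [[0.0]], [[1.0]])
frames = [[1.0]] * 10

full_score, full_count = llr_score(frames, speaker, ubm)
fast_score, fast_count = llr_score(frames, speaker, ubm, decimation=5)
```

With `decimation=5` the toy example evaluates one fifth of the frames. How much accuracy such a reduction costs on real speech is an empirical question that the sketch does not address.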