Domain mismatch compensation for speaker recognition using a library of whiteners
November 1, 2015
The development of the i-vector framework for generating low-dimensional representations of speech utterances has led to considerable improvements in speaker recognition performance. Although these gains have been demonstrated in the periodic National Institute of Standards and Technology (NIST) evaluations, the problem of domain mismatch, where the system development data and the application data are collected from different sources, remains challenging. The impact of domain mismatch was a focus of the Johns Hopkins University (JHU) 2013 speaker recognition workshop, where a domain adaptation challenge (DAC13) corpus was created to address this problem. This paper proposes an approach to domain mismatch compensation for applications where in-domain development data is assumed to be unavailable. The method is based on a generalization of the data whitening used in association with i-vector length normalization, and it utilizes a library of whitening transforms trained at system development time using strictly out-of-domain data. The approach is evaluated on the 2013 domain adaptation challenge task and is shown to compare favorably to in-domain conventional whitening and to nuisance attribute projection (NAP) inter-dataset variability compensation.
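To make the idea of a whitener library concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It assumes that each whitener is a mean and an inverse square-root covariance estimated from one out-of-domain subset, and that the whitener applied to an i-vector is the one whose Gaussian model scores that i-vector highest. The abstract does not state the selection rule, so that choice, along with the hypothetical names `train_whitener`, `gaussian_score`, and `whiten_and_length_normalize`, is an assumption made for illustration.

```python
import numpy as np


def train_whitener(X, eps=1e-6):
    """Estimate a whitener (mean, transform, log-determinant) from i-vectors X of shape (n, d)."""
    mu = X.mean(axis=0)
    # Small diagonal loading keeps the covariance invertible (assumed value).
    cov = np.cov(X, rowvar=False) + eps * np.eye(X.shape[1])
    evals, evecs = np.linalg.eigh(cov)
    # Symmetric inverse square root of the covariance.
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    _, logdet = np.linalg.slogdet(cov)
    return {"mu": mu, "W": W, "logdet": logdet}


def gaussian_score(x, whitener):
    """Gaussian log-likelihood of x under the whitener's model, up to a constant shared by all whiteners."""
    z = whitener["W"] @ (x - whitener["mu"])
    return -0.5 * (z @ z + whitener["logdet"])


def whiten_and_length_normalize(x, library):
    """Pick the best-scoring whitener (assumed selection rule), whiten x, and scale it to unit length."""
    best = max(library, key=lambda w: gaussian_score(x, w))
    z = best["W"] @ (x - best["mu"])
    return z / np.linalg.norm(z)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic "out-of-domain" subsets standing in for real i-vector collections.
    subset_a = rng.normal(loc=0.0, scale=1.0, size=(500, 4))
    subset_b = rng.normal(loc=10.0, scale=2.0, size=(500, 4))
    library = [train_whitener(subset_a), train_whitener(subset_b)]
    y = whiten_and_length_normalize(subset_b[0], library)
    print(np.linalg.norm(y))
```

Conventional whitening corresponds to a library with a single entry. In this sketch, no in-domain data is needed at any point, since every entry is trained on out-of-domain subsets and the choice among them is made per i-vector at test time.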