Graph-embedding for speaker recognition
September 30, 2010
Conference Paper
Published in:
INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, 26-30 September 2010, pp. 2742-2745.
Summary
Popular methods for speaker classification perform speaker comparison in a high-dimensional space; however, recent work has shown that most of the speaker variability is captured by a low-dimensional subspace of that space. In this paper we examine whether additional structure, in the form of nonlinear manifolds, exists within the high-dimensional space. We use graph embedding as a proxy for the manifold and show the use of the embedding in data visualization and exploration. ISOMAP is used to explore the existence and dimension of the space. We also examine whether the manifold assumption can help in two classification tasks: data mining and the standard NIST speaker recognition evaluations (SRE). Our results show that the data lies on a manifold and that exploiting this structure can yield significant improvements on the data-mining task. The improvement in preliminary experiments on all trials of the NIST SRE Eval-06 core task is smaller but still significant.
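
To make the idea of a graph serving as a proxy for a manifold more concrete, the following is a minimal sketch of the graph-distance step that ISOMAP relies on: build a nearest-neighbor graph over the data and approximate distances along the manifold by shortest paths through that graph. It is an illustration only, not the paper's implementation. The toy data (points on a unit semicircle rather than speaker vectors), the function names `knn_graph` and `geodesic_distances`, and the choice of `k=2` are all assumptions made for this example, and the final step of ISOMAP (classical multidimensional scaling on the geodesic distances) is omitted.

```python
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph (hypothetical helper).

    Returns a dense matrix of edge lengths; math.inf marks a missing edge.
    """
    n = len(points)
    graph = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        graph[i][i] = 0.0
        # Sort all other points by Euclidean distance and keep the k closest.
        neighbors = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in neighbors[:k]:
            graph[i][j] = d
            graph[j][i] = d  # symmetrize: an edge in either direction counts
    return graph


def geodesic_distances(graph):
    """All-pairs shortest paths via Floyd-Warshall (hypothetical helper)."""
    n = len(graph)
    dist = [row[:] for row in graph]
    for m in range(n):
        for i in range(n):
            through = dist[i][m]
            if through == math.inf:
                continue
            for j in range(n):
                alt = through + dist[m][j]
                if alt < dist[i][j]:
                    dist[i][j] = alt
    return dist


# Toy data (an assumption for illustration): 50 points on a unit semicircle,
# i.e. a one-dimensional manifold embedded in two dimensions.
n = 50
points = [
    (math.cos(math.pi * t / (n - 1)), math.sin(math.pi * t / (n - 1)))
    for t in range(n)
]

geo = geodesic_distances(knn_graph(points, k=2))
euclid_end = math.dist(points[0], points[-1])  # straight-line chord
geo_end = geo[0][n - 1]                        # path along the graph

print("Euclidean distance between endpoints:", euclid_end)
print("Graph (geodesic) distance between endpoints:", geo_end)
```

In this toy setting the graph distance between the two endpoints follows the curve and so approaches the arc length, while the Euclidean distance cuts straight across. That gap is the kind of structure a manifold-based comparison would be able to exploit and a purely linear one would miss.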