Multi-lingual deep neural networks for language recognition
December 13, 2016
Conference Paper
Published in: SLT 2016, IEEE Spoken Language Technology Workshop, 13-16 December 2016.
    Summary
Multi-lingual feature extraction using bottleneck layers in deep neural networks (BN-DNNs) has proven effective for low-resource speech recognition and, more recently, for language recognition. In this work, we investigate how the multi-lingual BN-DNN architecture and training configuration affect language recognition performance on the NIST 2011 and 2015 language recognition evaluations (LRE11 and LRE15). The best-performing multi-lingual BN-DNN configuration yields relative performance gains of 50% on LRE11 and 40% on LRE15 compared to a standard MFCC/SDC baseline system, and 17% on LRE11 and 7% on LRE15 relative to a single-language BN-DNN system. Detailed performance analysis using data from all 24 Babel languages, Fisher Spanish, and Switchboard English shows the impact of language selection and the amount of training data on overall BN-DNN performance.