Two new experimental protocols for measuring speech transcript readability for timed question-answering tasks
This paper reports results from two recent psycholinguistic experiments that measure the readability of four types of speech transcripts for the DARPA EARS Program. The two key questions in these experiments are (1) how much speech transcript cleanup aids readability and (2) how much the type of cleanup matters. To measure readability, we employ two variants of the four-part figure of merit defined at the RT02 workshop and described in our Eurospeech 2003 paper [4], namely: accuracy of answers to comprehension questions, reaction time for passage reading, reaction time for question answering, and a subjective rating of passage difficulty. The first protocol employs a question-answering task under time pressure. The second employs a self-paced line-by-line reading paradigm. Both protocols yield similar results: all three types of cleanup in the experiment improve readability by 5-10%, but the self-paced reading protocol needs far fewer subjects to reach statistical significance.