Blending speech output and visual text in the multimodal interface

Publication type: Journal article
Year of publication: 2008
Authors: Dowell, J., & Shmueli, Y.
Journal: Human Factors

Objective: Simultaneous reading and listening with a redundant display of visual text with speech output was investigated to determine how variations in verbal working memory capacity and content complexity affected comprehension. Background: Previous work has found some evidence of a benefit for displays that blend speech and visual text; content complexity and verbal working memory capacity are likely to significantly determine this benefit. Method: In the experiment reported here, a multimodal display of e-mail messages was compared with speech output alone and with a purely visual display. Comprehension of the messages was examined in relation to verbal working memory capacity and the complexity of the messages. Thirty-two users participated in the study, which used a repeated measures design. Results: The data show that the multimodal interface did not affect comprehension relative to a purely visual interface, even when the content was more complex, although it did improve the comprehension of complex information relative to a purely auditory interface. Lower-capacity participants were neither especially advantaged nor disadvantaged by the multimodal interface. Participants expressed a marked preference for the multimodal display of the more complex sentences. Conclusion: The experiment suggests that a redundant multimodal display will neither assist nor disrupt understanding when compared with a purely visual display, but it will assist understanding of complex content when compared with speech output alone. Application: Redundant displays of visual text and speech have potential application in multitask situations, in multimedia presentations, and for devices with small screens.

DOI: 10.1518/001872008X354165