Gus Cooney.

The CANDOR corpus: Insights from a large multimodal dataset of naturalistic conversation

Authors

Reece, A., Cooney, G., Bull, P.,
Chung, C., Dawson, B., Fitzpatrick, C.,
Glazer, T., Knox, D., Liebscher, A., & Marin, S.

Description

Reece, A.,* Cooney, G.,* Bull, P., Chung, C., Dawson, B., Fitzpatrick, C., Glazer, T., Knox, D., Liebscher, A., & Marin, S. (2023). The CANDOR corpus: Insights from a large multimodal dataset of naturalistic conversation. Science Advances.

https://doi.org/10.1126/sciadv.adf3197

People spend a substantial portion of their lives engaged in conversation, and yet, our scientific understanding of conversation is still in its infancy. Here, we introduce a large, novel, and multimodal corpus of 1656 conversations recorded in spoken English. This 7+ million word, 850-hour corpus totals more than 1 terabyte of audio, video, and transcripts, with moment-to-moment measures of vocal, facial, and semantic expression, together with an extensive survey of speakers’ post-conversation reflections. By taking advantage of the considerable scope of the corpus, we explore many examples of how this large-scale public dataset may catalyze future research, particularly across disciplinary boundaries, as scholars from a variety of fields appear increasingly interested in the study of conversation.

Interests & hobbies

Embarking on adventures through skiing, immersing myself in diverse cultures through rugs and textiles, and finding serenity in the art of surfing – these are the passions that shape my life.