MultiCAT: Multimodal Communication Annotations for Teams

Welcome to the web UI for the MultiCAT dataset! Here you can explore the dataset interactively and find links to papers and systems that use it.

Abstract

Successful teamwork requires team members to understand each other and communicate effectively, managing multiple linguistic and paralinguistic tasks at once. Because of the potential for interrelatedness of these tasks, it is important to have the ability to make multiple types of predictions on the same dataset. Here, we introduce Multimodal Communication Annotations for Teams (MultiCAT), a speech- and text-based dataset consisting of audio recordings and automated and hand-corrected transcriptions. MultiCAT builds upon data from teams working collaboratively to save victims in a simulated search and rescue mission, and consists of annotations and benchmark results for the following tasks: (1) dialog act classification, (2) adjacency pair detection, (3) sentiment and emotion recognition, (4) closed-loop communication detection, and (5) vocal (phonetic) entrainment detection. We also present exploratory analyses of the relationship between our annotations and team outcomes. We posit that additional work on these tasks and their intersection will further improve understanding of team communication and its relation to team performance.

Paper and code

  • The PDF of our NAACL 2025 Findings paper and code for the benchmark analyses (and this web UI) can be found here: https://github.com/adarshp/MultiCAT.
  • When exploring the different tables, you will see metadata describing the columns displayed at the top of the page. This metadata is loaded from the file datasette_interface/metadata.yml in the GitHub repository.

Exploring the data

The dataset is served via a single SQLite database.
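For example, once you have downloaded the database (see below), you can inspect it with Python's built-in sqlite3 module. This is a minimal sketch; the filename multicat.db is an assumption and should match whatever name you save the download under.

    import sqlite3

    # Connect to the downloaded database file (the filename is an
    # assumption; use whatever name you saved the download under).
    conn = sqlite3.connect("multicat.db")

    # List all tables in the database.
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (name,) in tables:
        print(name)

    conn.close()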

An entity relationship diagram representing the database schema is available here: ERD diagram.

Feel free to try different SQL queries, use the programmatic APIs provided by Datasette, or simply download the whole SQLite database.
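As an illustration of the second option, the sketch below runs a read-only SQL query over Datasette's JSON API using only the Python standard library. The base URL is a placeholder for wherever this web UI is hosted; the sql and _shape=array query parameters are standard Datasette API features, and multicat is the database name shown below.

    import json
    import urllib.parse
    import urllib.request

    # Placeholder base URL -- replace with the address of this web UI.
    BASE_URL = "https://example.org"

    # Datasette accepts arbitrary read-only SQL via the ?sql= parameter;
    # _shape=array returns the rows as a JSON array of objects.
    sql = "SELECT name FROM sqlite_master WHERE type = 'table'"
    query = urllib.parse.urlencode({"sql": sql, "_shape": "array"})
    url = f"{BASE_URL}/multicat.json?{query}"

    with urllib.request.urlopen(url) as response:
        rows = json.load(response)

    for row in rows:
        print(row["name"])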

multicat

The multicat database contains 14,098 rows across 8 tables, including utterance, entrainment, participant, trial, and team.
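To see which columns a given table has (and how they line up with the column metadata mentioned above), you can ask SQLite directly. A minimal sketch, again assuming a local copy named multicat.db:

    import sqlite3

    conn = sqlite3.connect("multicat.db")  # filename is an assumption

    # PRAGMA table_info returns one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    for cid, name, col_type, notnull, default, pk in conn.execute(
        "PRAGMA table_info(utterance)"
    ):
        print(f"{name}: {col_type}")

    conn.close()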

Citation

If you use this dataset, please cite our NAACL 2025 Findings paper that introduces the dataset.

We will update the citation metadata below with that from the official ACL Anthology entry once it is published.

BibTeX Format

@inproceedings{pyarelal-etal-2025-multicat,
    title = "{M}ulti{CAT}: Multimodal Communication Annotations for Teams",
    author = "Pyarelal, Adarsh  and
      Culnan, John M  and
      Qamar, Ayesha  and
      Krishnaswamy, Meghavarshini  and
      Wang, Yuwei  and
      Jeong, Cheonkam  and
      Chen, Chen  and
      Miah, Md Messal Monem  and
      Hormozi, Shahriar  and
      Tong, Jonathan  and
      Huang, Ruihong",
    editor = "Chiruzzo, Luis  and
      Ritter, Alan  and
      Wang, Lu",
    booktitle = "Findings of the Association for Computational Linguistics: NAACL 2025",
    month = apr,
    year = "2025",
    address = "Albuquerque, New Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-naacl.61/",
    pages = "1077--1111",
    ISBN = "979-8-89176-195-7"
}

APA Format

Pyarelal, A., Culnan, J. M., Qamar, A., Krishnaswamy, M., Wang, Y., Jeong, C., Chen, C., Miah, M. M. M., Hormozi, S., Tong, J., & Huang, R. (2025). MultiCAT: Multimodal Communication Annotations for Teams. In L. Chiruzzo, A. Ritter, & L. Wang (Eds.), Findings of the Association for Computational Linguistics: NAACL 2025 (pp. 1077–1111). Association for Computational Linguistics. https://aclanthology.org/2025.findings-naacl.61/

Funding Acknowledgment

  • The creation of this dataset was funded by the Army Research Office under Grant Number W911NF-20-1-0002, awarded through the Defense Advanced Research Projects Agency (DARPA).
  • Continued maintenance (documentation updates, responses to questions from dataset users, etc.) is supported by Army Research Office (ARO) Award Number W911NF-24-2-0034.