The Interactive Emotional Dyadic Motion Capture (IEMOCAP)

Metrics: WAR, UAR
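
WAR (weighted average recall) is the overall accuracy, while UAR (unweighted average recall) is the unweighted mean of the per-class recalls. A minimal sketch of how both can be computed with scikit-learn; the example labels below are illustrative, not taken from the dataset:

from sklearn.metrics import accuracy_score, recall_score

# Illustrative labels only; abbreviations follow the dataset's style.
y_true = ["neu", "ang", "sad", "hap", "hap", "neu"]
y_pred = ["neu", "ang", "hap", "hap", "sad", "neu"]

war = accuracy_score(y_true, y_pred)                 # WAR: overall accuracy
uar = recall_score(y_true, y_pred, average="macro")  # UAR: mean per-class recall
print(f"WAR={war:.3f}  UAR={uar:.3f}")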

Adapter:

Linear Classifier
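
A linear-classifier adapter is typically a single linear layer trained on features from a frozen upstream encoder. A minimal PyTorch sketch under that assumption; the feature dimension (768), the mean-pooling step, and the four-class output are illustrative placeholders, not specified by this card:

import torch
import torch.nn as nn

class LinearAdapter(nn.Module):
    # Linear probe over mean-pooled frame features from a frozen encoder.
    # feature_dim=768 and num_classes=4 are assumed placeholders.
    def __init__(self, feature_dim=768, num_classes=4):
        super().__init__()
        self.head = nn.Linear(feature_dim, num_classes)

    def forward(self, frame_features):       # (batch, time, feature_dim)
        pooled = frame_features.mean(dim=1)  # mean-pool over time
        return self.head(pooled)             # (batch, num_classes) logits

logits = LinearAdapter()(torch.randn(2, 100, 768))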

Data description:

The Interactive Emotional Dyadic Motion Capture (IEMOCAP) database is an acted, multimodal, and multispeaker emotion database. It contains approximately 12 hours of audio-visual data, including video, audio, facial motion capture, and text transcriptions. The data consist of dyadic sessions in which actors perform improvisations or scripted scenarios.

Dataset structure:

Amount of source data:

The source data cover nine emotion categories: Neutral (1708), Angry (1103), Sad (1084), Happy (595), Excited (1041), Scared (40), Surprised (107), Frustrated (1849), and Other (2507).

Amount of evaluation data:

The evaluation set consists of 5531 instances drawn from the source data, covering four categories (Neutral, Angry, Sad, Happy), where samples from the Excited category are merged into Happy, as sketched below.
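
A sketch of this merge, assuming the raw labels use abbreviations such as "exc" and "hap" (the abbreviation scheme is inferred from the sample record below):

# Fold "exc" into "hap" and keep only the four evaluation classes.
# Label abbreviations are assumptions based on the sample record format.
LABEL_MAP = {"neu": "neu", "ang": "ang", "sad": "sad", "hap": "hap", "exc": "hap"}

samples = [{"label": "exc"}, {"label": "fru"}, {"label": "neu"}]  # illustrative
eval_samples = [
    {**s, "label": LABEL_MAP[s["label"]]}
    for s in samples
    if s["label"] in LABEL_MAP
]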

Data detail:

KEYS        EXPLAIN
id          data id
sentence    the content of the speech
label       emotion label

Sample of dataset:

{
  "id": Ses01F_impro01_F000,
  "sentence": "Excuse me.",
  "label": "neu"
}
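
Assuming the records are stored one JSON object per line (JSONL; the file name below is a hypothetical placeholder), they can be read as follows:

import json

# Load dataset records; "iemocap.jsonl" is an assumed file name.
with open("iemocap.jsonl", encoding="utf-8") as f:
    records = [json.loads(line) for line in f]

for r in records[:3]:
    print(r["id"], r["label"], r["sentence"])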

Citation information:

@article{busso2008iemocap,
  title={IEMOCAP: Interactive emotional dyadic motion capture database},
  author={Busso, Carlos and Bulut, Murtaza and Lee, Chi-Chun and Kazemzadeh, Abe and Mower, Emily and Kim, Samuel and Chang, Jeannette N and Lee, Sungbok and Narayanan, Shrikanth S},
  journal={Language Resources and Evaluation},
  volume={42},
  pages={335--359},
  year={2008},
  publisher={Springer}
}

Licensing information:

IEMOCAP License