SF Evaluation Data

SNIPS

#Accuracy, Robustness

Adaptation Method:

CTC Decoder: Features output by the upstream model are passed through two LSTM layers, followed by a fully connected linear classifier. The input dimension equals the feature vector dimension, and the output dimension equals the number of slot types.
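
The decoder described above can be sketched in PyTorch as follows. This is only an illustration under stated assumptions: the class name `SlotDecoder`, the hidden size, the unidirectional LSTM, and the example dimensions (768 features, 40 slot types) are placeholders that the description does not specify. The description also does not say whether the slot-type count includes a CTC blank symbol, so the sketch takes the output size as given.

```python
import torch
from torch import nn


class SlotDecoder(nn.Module):
    """Two LSTM layers followed by a fully connected classifier (hypothetical name)."""

    def __init__(self, feature_dim: int, num_slot_types: int, hidden_dim: int = 256):
        super().__init__()
        # hidden_dim and the unidirectional LSTM are assumptions, not from the text.
        self.lstm = nn.LSTM(feature_dim, hidden_dim, num_layers=2, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_slot_types)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, frames, feature_dim), as produced by the upstream model
        hidden, _ = self.lstm(features)
        # One score per slot type for every frame: (batch, frames, num_slot_types)
        return self.classifier(hidden)


# Placeholder sizes chosen only to exercise the module.
decoder = SlotDecoder(feature_dim=768, num_slot_types=40)
logits = decoder(torch.randn(2, 50, 768))
print(logits.shape)
```

The last dimension of `logits` matches the number of slot types, and the first LSTM layer's input size matches the upstream feature dimension, as the description requires.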

Data Description:

The SNIPS Natural Language Understanding benchmark is a dataset containing over 16,000 crowdsourced queries in English, distributed across 7 user intents of varying complexity: SearchCreativeWork (e.g., find me a robot TV show), GetWeather (e.g., is it windy in Boston, Massachusetts right now?), BookRestaurant (e.g., I want to book a highly rated restaurant in Paris for tomorrow night), PlayMusic (e.g., play the last track by Beyoncé on Spotify), AddToPlaylist (e.g., add Diamonds to my road trip playlist), RateBook (e.g., give six stars to Of Mice and Men), SearchScreeningEvent (e.g., check the screening time for Wonder Woman in Paris).

Dataset structure:

Amount of source data:

Training set: 13,084 items, Validation set: 700 items, Test set: 700 items.

Amount of evaluation data:

The evaluation data is the public test set of 700 items.

Data detail:

KEY	DESCRIPTION
id	Path to the sample's MP3 file
text	Text corresponding to the MP3 file
label	Slot type for each token

Sample of source dataset:

{
    "id":"Aditi-snips-test-0",
    "text":"BOS I'D LIKE TO HAVE THIS TRACK ONTO MY CLASSICAL RELAXATIONS PLAYLIST EOS",
    "label":"O O O O O O music_item O playlist_owner playlist playlist O AddToPlaylist"
}
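
The relationship between `text` and `label` in the sample can be checked with a short script. This is a sketch, assuming both fields are whitespace-separated with one label per token. The helper names `align` and `slot_spans` are hypothetical. Treating `BOS`/`EOS` as sentence markers, with the intent name carried on `EOS`, is inferred from this single sample and is not stated elsewhere in the description.

```python
import json

# Record copied from the sample, with the comma between fields that JSON requires.
record = json.loads("""
{
    "id": "Aditi-snips-test-0",
    "text": "BOS I'D LIKE TO HAVE THIS TRACK ONTO MY CLASSICAL RELAXATIONS PLAYLIST EOS",
    "label": "O O O O O O music_item O playlist_owner playlist playlist O AddToPlaylist"
}
""")


def align(record):
    """Pair each token with its label; fail loudly if the counts differ."""
    tokens = record["text"].split()
    labels = record["label"].split()
    if len(tokens) != len(labels):
        raise ValueError(f"{len(tokens)} tokens but {len(labels)} labels")
    return list(zip(tokens, labels))


def slot_spans(pairs):
    """Group consecutive tokens that share a non-O label into (slot, phrase) spans.

    Assumption: BOS and EOS are sentence markers and are skipped here.
    """
    spans = []
    prev = "O"
    for token, label in pairs:
        if token in ("BOS", "EOS") or label == "O":
            prev = "O"
            continue
        if label == prev:
            spans[-1][1].append(token)  # extend the current span
        else:
            spans.append((label, [token]))  # start a new span
        prev = label
    return [(label, " ".join(words)) for label, words in spans]


pairs = align(record)
print(slot_spans(pairs))
# [('music_item', 'TRACK'), ('playlist_owner', 'MY'), ('playlist', 'CLASSICAL RELAXATIONS')]
print(pairs[-1])
# ('EOS', 'AddToPlaylist')
```

Because the labels carry no begin/inside prefixes, two adjacent entities of the same slot type would be merged into one span by this grouping.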

Citation information:

@article{coucke2018snips,
  title={Snips voice platform: an embedded spoken language understanding system for private-by-design voice interfaces},
  author={Coucke, Alice and Saade, Alaa and Ball, Adrien and Bluche, Th{\'e}odore and Caulier, Alexandre and Leroy, David and Doumouro, Cl{\'e}ment and Gisselbrecht, Thibault and Caltagirone, Francesco and Lavril, Thibaut and others},
  journal={arXiv preprint arXiv:1805.10190},
  year={2018}
}

Licensing information:

Creative Commons Zero v1.0 Universal
