LID Evaluation Data
Common Language
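#Accuracy
Adaptation Method
Linear classifier: the features output by the upstream model are reduced by global average pooling and then fed into a linear classifier. The classifier consists of two Linear layers and one ReLU activation layer. Its input dimension equals the feature vector dimension, and its output dimension equals the number of classes.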
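The forward pass described above can be sketched without any framework, purely to show the shapes involved. This is a minimal illustration, not the benchmark's implementation: the placement of ReLU between the two Linear layers, the hidden size, the feature dimension, and the random initialisation are all assumptions, and every function name here is hypothetical. A real system would use a deep learning framework and trained weights.

```python
import random

def mean_pool(frames):
    """Global average pooling over the time axis: (T, D) -> (D,)."""
    n_frames = len(frames)
    return [sum(column) / n_frames for column in zip(*frames)]

def linear(x, weight, bias):
    """Affine map y = W x + b, with weight given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

def relu(x):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in x]

def init_linear(in_dim, out_dim, rng):
    """Small random weights and zero biases (illustrative initialisation only)."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weight, [0.0] * out_dim

def classify(frames, layer1, layer2):
    """Pool the frames, then apply Linear -> ReLU -> Linear; one logit per class."""
    pooled = mean_pool(frames)
    hidden = relu(linear(pooled, *layer1))
    return linear(hidden, *layer2)

FEATURE_DIM = 8    # assumed; must equal the upstream feature dimension
HIDDEN_DIM = 16    # assumed; not specified in this document
NUM_CLASSES = 10   # the ten evaluation languages

rng = random.Random(0)
layer1 = init_linear(FEATURE_DIM, HIDDEN_DIM, rng)
layer2 = init_linear(HIDDEN_DIM, NUM_CLASSES, rng)

# 50 frames of stand-in upstream features.
frames = [[rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)] for _ in range(50)]
logits = classify(frames, layer1, layer2)
predicted = max(range(NUM_CLASSES), key=lambda i: logits[i])
print(len(logits), predicted)
```

Pooling before the classifier means the input size does not depend on the utterance length, so recordings of any duration map to the same fixed-size vector.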
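Data Description
This dataset is extracted from Common Voice and is intended for training language identification systems. It covers 45 languages with a total of 45.1 hours of recordings (about 1 hour per language). The dataset has already been split into training, validation, and test sets. Ten of these languages are selected to build the language identification system.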
Dataset Composition and Specification
Source Data Size
Training set 30 h, validation set 7.5 h, test set 7.5 h
Evaluation Data Size
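The evaluation data uses ten languages from the source dataset: "Chinese_China", "Dutch", "English", "Greek", "Italian", "Mangolian", "Russian", "Spanish", "Swedish", "Welsh"
In total it contains a 6.7 h training set, a 1.7 h validation set, and a 1.7 h test set.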
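Data Fields
The training, validation, and test data are recorded in corresponding JSON files. Each JSON file contains two fields, "labels" and "meta_data". "labels" records the label classes and their integer encoding. "meta_data" records, for each data sample, its relative path, its label, and the split it belongs to (training, validation, or test); in the sample, the split is stored under the "speaker" key.
Dataset Sample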
{
  "labels": {
    "Chinese_China": 0,
    "Dutch": 1,
    "English": 2,
    "Greek": 3,
    "Italian": 4,
    "Mangolian": 5,
    "Russian": 6,
    "Spanish": 7,
    "Swedish": 8,
    "Welsh": 9
  },
  "meta_data": [
    {
      "path": "Chinese_China/train/chch_trn_sp_75/common_voice_zh-CN_20785511.wav",
      "label": "Chinese_China",
      "speaker": "train"
    },
    {
      "path": "English/train/eng_trn_sp_503/common_voice_en_21530721.wav",
      "label": "English",
      "speaker": "train"
    },
    {
      "path": "Italian/train/itln_trn_sp_555/common_voice_it_20265825.wav",
      "label": "Italian",
      "speaker": "train"
    },
    {
      "path": "Spanish/train/spa_trn_sp_227/common_voice_es_19635818.wav",
      "label": "Spanish",
      "speaker": "train"
    },
    {
      "path": "Welsh/train/wls_trn_sp_546/common_voice_cy_19197031.wav",
      "label": "Welsh",
      "speaker": "train"
    }
  ]
}
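A file in this shape could be read roughly as follows. This is a sketch under assumptions: the manifest is embedded as a two-entry excerpt so the example is self-contained, the helper name `group_by_split` is hypothetical, and it relies on the observation that in the sample the split name appears under the "speaker" key.

```python
import json
from collections import defaultdict

# Two-entry excerpt in the same shape as the dataset sample.
MANIFEST_TEXT = """
{
  "labels": {"Chinese_China": 0, "English": 2},
  "meta_data": [
    {"path": "Chinese_China/train/chch_trn_sp_75/common_voice_zh-CN_20785511.wav",
     "label": "Chinese_China", "speaker": "train"},
    {"path": "English/train/eng_trn_sp_503/common_voice_en_21530721.wav",
     "label": "English", "speaker": "train"}
  ]
}
"""

def group_by_split(manifest):
    """Return {split: [(relative_path, class_index), ...]} (hypothetical helper)."""
    label_to_index = manifest["labels"]
    grouped = defaultdict(list)
    for item in manifest["meta_data"]:
        # In the sample, the "speaker" key carries the split name.
        split = item["speaker"]
        grouped[split].append((item["path"], label_to_index[item["label"]]))
    return dict(grouped)

manifest = json.loads(MANIFEST_TEXT)
splits = group_by_split(manifest)
for path, index in splits["train"]:
    print(index, path)
```

Looking up the class index through "labels" rather than hard-coding it keeps the encoding consistent with whatever the JSON file declares.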
Evaluation Metric
Accuracy
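Accuracy here is taken to mean the fraction of test utterances whose predicted language matches the reference label. A minimal sketch, with a hypothetical function name and made-up class indices used only for illustration:

```python
def accuracy(predictions, references):
    """Fraction of positions where the predicted class equals the reference class."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

# Illustrative class indices only, not real results.
predicted = [0, 2, 4, 7, 9]
reference = [0, 2, 5, 7, 9]
print(f"{accuracy(predicted, reference):.2%}")  # prints 80.00%
```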
Citation
@dataset{ganesh_sinisetty_2021_5036977,
author = {Ganesh Sinisetty and
Pavlo Ruban and
Oleksandr Dymov and
Mirco Ravanelli},
title = {CommonLanguage},
month = jun,
year = 2021,
publisher = {Zenodo},
version = {0.1},
doi = {10.5281/zenodo.5036977},
url = {https://doi.org/10.5281/zenodo.5036977}
}
License
CC BY 4.0