Evaluation Data (Small Sample Image Classification)
ImageNet-1K
#Evaluation Metric-Acc@1, Acc@5
Data Description:
ImageNet-1K is the subset of ImageNet used for ILSVRC2012. It consists of 1,000 classes, with approximately 1.2 million images in the original training set and 50,000 images in the original validation set. For this evaluation, 15 images per class are taken from the original training set to form the auxiliary training set, another 15 images per class form the validation set, and the original validation set serves as the test set.
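As a concrete illustration of this split construction, here is a minimal sketch assuming a folder-per-class layout of the original training set (the directory layout and the `build_few_shot_split` helper are assumptions for illustration, not part of the released data):

```python
import os
import random

def build_few_shot_split(train_root, per_split=15, seed=0):
    """Sample `per_split` images per class for the auxiliary training set
    and another `per_split` per class for the validation set.

    Assumes a folder-per-class layout, e.g. train_root/n01440764/xxx.JPEG.
    Illustrative sketch only, not the official split-generation code.
    """
    rng = random.Random(seed)
    aux_train, val = [], []
    for cls in sorted(os.listdir(train_root)):
        cls_dir = os.path.join(train_root, cls)
        files = sorted(os.listdir(cls_dir))
        rng.shuffle(files)
        # First 15 files form the auxiliary training set, the next 15 the validation set.
        aux_train += [(os.path.join(cls_dir, f), cls) for f in files[:per_split]]
        val += [(os.path.join(cls_dir, f), cls) for f in files[per_split:2 * per_split]]
    return aux_train, val
```

For the datasets below whose classes have fewer than 30 training images, the second slice simply yields fewer than 15 validation examples per class, which would explain the "approximately 15" wording and the slightly smaller validation counts (e.g., Stanford Cars, CUB-200-2011).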
Dataset structure:
Amount of source data:
The dataset is divided into auxiliary training set (15,000), validation set (15,000), and test set (50,000).
Amount of test data:
All 50,000 test examples from the original test set.
Data detail:
KEYS | EXPLAIN |
---|---|
images | RGB images for input |
labels | Class labels for each image |
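Each example is assumed to be exposed to model code as a simple record carrying these two keys; the dict representation below is an assumption about the loader, not something mandated by the dataset release:

```python
import numpy as np

# Hypothetical record shape: {"images": RGB array, "labels": class name}
record = {
    "images": np.zeros((224, 224, 3), dtype=np.uint8),  # placeholder RGB image
    "labels": "n01440764",                               # WordNet synset ID for ImageNet-1K
}

image, label = record["images"], record["labels"]
print(image.shape, label)  # (224, 224, 3) n01440764
```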
Sample of source dataset:
RGB image:
label:
n01440764
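Acc@1 and Acc@5 are the standard top-1 and top-5 accuracies over the test set. The sketch below shows one way to compute them from model scores; it assumes PyTorch, and the `topk_accuracy` helper is named here only for illustration:

```python
import torch

def topk_accuracy(logits, labels, ks=(1, 5)):
    """Return Acc@k for each k in `ks`.

    logits: float tensor of shape [N, num_classes]
    labels: integer tensor of shape [N]
    """
    max_k = max(ks)
    # Indices of the top-k highest-scoring classes per example: [N, max_k]
    topk = logits.topk(max_k, dim=1).indices
    correct = topk.eq(labels.unsqueeze(1))  # [N, max_k] boolean matches
    return {f"Acc@{k}": correct[:, :k].any(dim=1).float().mean().item() for k in ks}

# Toy example: random scores for 4 test images over 1000 classes
scores = torch.randn(4, 1000)
targets = torch.tensor([3, 17, 256, 999])
print(topk_accuracy(scores, targets))
```

The same metric code applies unchanged to every dataset in this section; only the number of classes differs.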
Citation:
@inproceedings{deng2009imagenet,
title={Imagenet: A large-scale hierarchical image database},
author={Deng, Jia and Dong, Wei and Socher, Richard and Li, Li-Jia and Li, Kai and Fei-Fei, Li},
booktitle={Computer Vision and Pattern Recognition},
pages={248--255},
year={2009},
organization={IEEE}
}
Places365-Standard
#Evaluation Metric-Acc@1, Acc@5
Data Description:
The Places365-Standard dataset is a scene recognition dataset with roughly 1.8 million training images from 365 scene categories and 36,500 validation images (100 per category). For this evaluation, 15 images per class are taken from the original training set to form the auxiliary training set, another 15 images per class form the validation set, and the original validation set serves as the test set.
Dataset structure:
Amount of source data:
The dataset is divided into auxiliary training set (5,475), validation set (5,475), and test set (36,500).
Amount of test data:
All 36,500 test examples from the original test set.
Data detail:
KEYS | EXPLAIN |
---|---|
images | RGB images for input |
labels | Class labels for each image |
Sample of source dataset:
RGB image:
label:
airfield
Citation:
@article{zhou2017places,
title={Places: A 10 million image database for scene recognition},
author={Zhou, Bolei and Lapedriza, Agata and Khosla, Aditya and Oliva, Aude and Torralba, Antonio},
journal={IEEE transactions on pattern analysis and machine intelligence},
volume={40},
number={6},
pages={1452--1464},
year={2017},
publisher={IEEE}
}
Dataset Copyright and Usage Information:
Creative Commons License (Attribution CC BY)
Stanford Cars
#Evaluation Metric-Acc@1, Acc@5
Data Description:
The Stanford Cars dataset contains 16,185 images of 196 car classes, split into 8,144 training images and 8,041 test images, with each class divided roughly 50-50. Classes are defined at the level of make, model, and year (e.g., Acura Integra Type R 2001). For this evaluation, 15 images per class are taken from the original training set to form the auxiliary training set, approximately 15 images per class form the validation set, and the original test set serves as the test set.
Dataset structure:
Amount of source data:
The dataset is divided into auxiliary training set (2,940), validation set (2,931), and test set (8,041).
Amount of test data:
All 8,041 test examples from the original test set.
Data detail:
KEYS | EXPLAIN |
---|---|
images | RGB images for input |
labels | Class labels for each image |
Sample of source dataset:
RGB image:
label:
Acura Integra Type R 2001
Citation:
@inproceedings{krause20133d,
title={3d object representations for fine-grained categorization},
author={Krause, Jonathan and Stark, Michael and Deng, Jia and Fei-Fei, Li},
booktitle={Computer Vision Workshops},
pages={554--561},
year={2013}
}
CUB-200-2011
#Evaluation Metric-Acc@1, Acc@5
Data Description:
CUB-200-2011 is the most widely used dataset for fine-grained visual classification. It contains 11,788 images of 200 bird subcategories, with 5,994 training images and 5,794 test images. For this evaluation, 15 images per class are taken from the original training set to form the auxiliary training set, approximately 15 images per class form the validation set, and the original test set serves as the test set.
Dataset structure:
Amount of source data:
The dataset is divided into auxiliary training set (3,000), validation set (2,994), and test set (5,794).
Amount of test data:
All 5,794 test examples from the original test set.
Data detail:
KEYS | EXPLAIN |
---|---|
images | RGB images for input |
labels | Class labels for each image |
Sample of source dataset:
RGB image:
label:
Black_Footed_Albatross
Citation:
@article{wah2011caltech,
title={The caltech-ucsd birds-200-2011 dataset},
author={Wah, Catherine and Branson, Steve and Welinder, Peter and Perona, Pietro and Belongie, Serge},
year={2011},
publisher={California Institute of Technology}
}
FGVC-Aircraft
#Evaluation Metric-Acc@1, Acc@5
Data Description:
The FGVC-Aircraft dataset contains 10,000 aircraft images covering 100 model variants, with 100 images per variant. Two-thirds of the data are used for training and the remaining one-third for testing. For this evaluation, 15 images per class are taken from the original training set to form the auxiliary training set, another 15 images per class form the validation set, and the original test set serves as the test set.
Dataset structure:
Amount of source data:
The dataset is divided into auxiliary training set (1,500), validation set (1,500), and test set (3,333).
Amount of test data:
All 3,333 test examples from the original test set.
Data detail:
KEYS | EXPLAIN |
---|---|
images | RGB images for input |
labels | Class labels for each image |
Sample of source dataset:
RGB image:
label:
Boeing
Citation:
@article{maji2013fine,
title={Fine-grained visual classification of aircraft},
author={Maji, Subhransu and Rahtu, Esa and Kannala, Juho and Blaschko, Matthew and Vedaldi, Andrea},
journal={arXiv preprint arXiv:1306.5151},
year={2013}
}
Food-101
#Evaluation Metric-Acc@1, Acc@5
Data Description:
The Food-101 dataset consists of 101 food categories, each with 750 training images and 250 test images, for a total of 101,000 images. The test labels have been manually cleaned, while the training images may contain some noise. For this evaluation, 15 images per class are taken from the original training set to form the auxiliary training set, another 15 images per class form the validation set, and the original test set serves as the test set.
Dataset structure:
Amount of source data:
The dataset is divided into auxiliary training set (1,515), validation set (1,515), and test set (25,250).
Amount of test data:
All 25,250 test examples from the original test set.
Data detail:
KEYS | EXPLAIN |
---|---|
images | RGB images for input |
labels | Class labels for each image |
Sample of source dataset:
RGB image:
label:
apple-pie
Citation:
@inproceedings{bossard2014food,
title={Food-101--mining discriminative components with random forests},
author={Bossard, Lukas and Guillaumin, Matthieu and Van Gool, Luc},
booktitle={European Conference on Computer Vision},
pages={446--461},
year={2014},
organization={Springer}
}
DTD
#Evaluation Metric-Acc@1, Acc@5
Data Description:
DTD (Describable Textures Dataset) is a texture image dataset containing 5,640 images grouped into 47 human-perceived texture categories. Each category has 120 images, with two-thirds used for training and the remaining one-third for testing. For this evaluation, 15 images per class are taken from the original training set to form the auxiliary training set, another 15 images per class form the validation set, and the original test set serves as the test set.
Dataset structure:
Amount of source data:
The dataset is divided into auxiliary training set (705), validation set (705), and test set (1,880).
Amount of test data:
All 1,880 test examples from the original test set.
Data detail:
KEYS | EXPLAIN |
---|---|
images | RGB images for input |
labels | Class labels for each image |
Sample of source dataset:
RGB image:
label:
banded
Citation:
@inproceedings{cimpoi2014describing,
title={Describing textures in the wild},
author={Cimpoi, Mircea and Maji, Subhransu and Kokkinos, Iasonas and Mohamed, Sammy and Vedaldi, Andrea},
booktitle={Computer Vision and Pattern Recognition},
pages={3606--3613},
year={2014}
}