FlagEval

ColorBench

Data Description

ColorBench is a multimodal benchmark dataset for evaluating how well Vision-Language Models (VLMs) understand color, covering color perception, reasoning, and robustness. It was introduced in the paper "ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness."

The dataset consists of 5,814 image-question pairs covering diverse application scenarios and real-world challenges for VLM evaluation. It is organized into 3 major categories (perception, reasoning, and robustness) and 11 fine-grained tasks, each designed to assess a specific color-related comprehension ability.

The 11 fine-grained tasks are:

Color Counting: 102 questions
Color Proportion: 81 questions
Color Recognition: 76 questions
Color Robustness: 4,858 questions
Color Comparison: 101 questions
Object Recognition: 77 questions
Color Mimicry: 70 questions
Color Extraction: 96 questions
Color Blindness Test: 157 questions
Color Illusion: 93 questions
Object Counting: 103 questions
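
As a quick consistency check, the per-task counts above can be summed and compared with the stated totals. The sketch below is only illustrative: the dictionary name `TASK_COUNTS` is an assumption of this example, not part of the dataset.

```python
# Per-task question counts, copied from the list above.
# TASK_COUNTS is a hypothetical name used only for this sketch.
TASK_COUNTS = {
    "Color Counting": 102,
    "Color Proportion": 81,
    "Color Recognition": 76,
    "Color Robustness": 4858,
    "Color Comparison": 101,
    "Object Recognition": 77,
    "Color Mimicry": 70,
    "Color Extraction": 96,
    "Color Blindness Test": 157,
    "Color Illusion": 93,
    "Object Counting": 103,
}

# Total number of image-question pairs across all 11 tasks.
total = sum(TASK_COUNTS.values())

# Number of questions left once the Color Robustness task is set aside.
without_robustness = total - TASK_COUNTS["Color Robustness"]

print(total)               # 5814
print(without_robustness)  # 956
```

The second figure matches the size of the evaluation data reported under Dataset Structure.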

Dataset Structure

Amount of source data:

Test set: 5,814 items

Amount of evaluation data:
The evaluation set consists of the 956 items from the original test set that cover color perception and reasoning abilities. This count equals the test set minus the Color Robustness questions (5,814 - 4,858 = 956).
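
A minimal sketch of how such a subset could be selected from loaded records, assuming the subset is defined by excluding the Color Robustness task (an inference from the counts, not something this card states explicitly). The function name `select_evaluation_items` and the toy records are hypothetical.

```python
def select_evaluation_items(records, excluded_tasks=("Color Robustness",)):
    """Keep only records whose `task` field is not in `excluded_tasks`.

    Assumption: the evaluation subset is obtained by dropping the
    Color Robustness task; this is inferred, not confirmed by the card.
    """
    return [record for record in records if record["task"] not in excluded_tasks]


# Toy records carrying only the fields needed here (hypothetical data).
records = [
    {"idx": 0, "task": "Object Recognition"},
    {"idx": 1, "task": "Color Robustness"},
    {"idx": 2, "task": "Color Illusion"},
]

subset = select_evaluation_items(records)
print([record["idx"] for record in subset])
```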

Data Detail

KEY	EXPLANATION
`idx`	Data ID
`id`	ID within the category
`type`	Evaluation category
`task`	Task name
`filename`	Image filename
`image`	Image (PIL format)
`prompt`	Prompt
`question`	Question
`choices`	Options
`answer`	Answer
`image_url`	Original image URL

Sample of Source Dataset

json

{  
  'idx': 0,  
  'id': 1,  
  'type': 'Perception',  
  'task': 'Object Recognition',  
  'filename': 'ObjectRecognition/1.jpg',  
  'image': <PIL.PngImagePlugin.PngImageFile image mode=RGBA size=1682x1072 at 0x7F6762778C10>,  
  'prompt': 'which state is not light pink in this image? Select from the following choices. (A) ID (B) OK (C) TX (D) MO',  
  'question': 'Which state is not light pink in this image?',  
  'choices': ['ID', 'OK', 'TX', 'MO'],  
  'answer': '(B)',  
  'image_url': 'https://www.shutterstock.com/image-vector/grunge-watercolor-map-usa-united-states-2169490363'  
}
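
The sample suggests that `answer` holds a parenthesized option letter and that `prompt` is the question followed by the lettered choices. The sketch below shows one way to work with these fields under that assumption; the helper names `build_prompt` and `answer_text` are hypothetical and not part of the dataset or any library.

```python
import re
import string


def build_prompt(question, choices):
    """Rebuild a prompt in the format seen in the sample record."""
    options = " ".join(
        f"({letter}) {choice}"
        for letter, choice in zip(string.ascii_uppercase, choices)
    )
    return f"{question} Select from the following choices. {options}"


def answer_text(answer, choices):
    """Map an answer such as '(B)' to the text of the matching choice."""
    match = re.fullmatch(r"\(([A-Z])\)", answer.strip())
    if match is None:
        raise ValueError(f"unexpected answer format: {answer!r}")
    index = ord(match.group(1)) - ord("A")
    if index >= len(choices):
        raise ValueError(f"answer {answer!r} is out of range for {len(choices)} choices")
    return choices[index]


choices = ["ID", "OK", "TX", "MO"]
print(answer_text("(B)", choices))
print(build_prompt("Which state is not light pink in this image?", choices))
```

Note that the stored `prompt` in the sample begins with a lowercase "which", so a rebuilt prompt may differ in casing from the stored one; when reproducing results, the stored `prompt` field is the safer choice.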

Citation Information

bibtex

@article{liang2025colorbench,  
  title={{ColorBench}: Can {VLMs} See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness},
  author={Liang, Yijun and Li, Ming and Fan, Chenrui and Li, Ziyue and Nguyen, Dang and Cobbina, Kwesi and Bhardwaj, Shweta and Chen, Jiuhai and Liu, Fuxiao and Zhou, Tianyi},  
  journal={arXiv preprint arXiv:2504.10514},  
  year={2025}  
}

Licensing Information

Apache License 2.0
