Evaluation Metrics
1. Image Quality and Diversity
Image quality refers to the visual fidelity of the generated images, such as their clarity and whether their content is plausible. Image diversity refers to how much the generated images differ from one another: a model should not produce identical or near-identical images.
There is no direct objective measure of image quality or diversity, so the metrics listed here rely on the predictions or features of pre-trained image backbone networks.
1.1 Inception Score (IS)
Inception Score evaluates the visual quality of generated images. It equates the clarity of an image with how confidently the objects in it can be classified ("whether the generated image can be clearly classified into a certain category"), using a pre-trained classifier (the Inception network, hence the name). The more confident the classification, the higher the image quality is taken to be.
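As a minimal sketch of how this idea is usually turned into a number: the score is commonly written as IS = exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is the class distribution predicted for one generated image and p(y) is the average of those distributions over all generated images. The function name `inception_score` and the `eps` parameter are illustrative choices, not part of any library, and the sketch assumes the class probabilities have already been obtained from a pre-trained classifier.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """Illustrative helper (hypothetical name).

    probs: (N, C) array; row i holds the predicted class probabilities
    p(y|x_i) for generated image i. Assumed to be precomputed softmax
    outputs of a pre-trained classifier.
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal class distribution p(y), averaged over all generated images.
    marginal = probs.mean(axis=0, keepdims=True)
    # KL(p(y|x_i) || p(y)) for every image; eps guards against log(0).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    # Exponentiate the mean KL divergence.
    return float(np.exp(kl.mean()))
```

Under this definition, the score is close to 1 when every image receives the same uncertain prediction, and approaches the number of classes C when each image is classified confidently and the predicted classes are evenly spread.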
1.2 Frechet Inception Distance (FID)
Inception Score considers only the generated images and ignores the images in the training set. It also captures diversity only coarsely, through the spread of predicted classes, and is insensitive to diversity within a class. Frechet Inception Distance measures both the quality and the diversity of generated images by computing the distance between the feature distribution of the generated images and that of real images. The closer the two distributions are (that is, the lower the distance), the better the result.
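The distance described above can be sketched as follows, under the usual assumption that each feature set is summarized by a Gaussian with mean mu and covariance Sigma, giving FID = ||mu_r - mu_g||^2 + Tr(Sigma_r + Sigma_g - 2 (Sigma_r Sigma_g)^(1/2)). The function name `frechet_distance` is illustrative, and the inputs are assumed to be feature vectors already extracted from a pre-trained network. To stay NumPy-only, this sketch computes the trace of the matrix square root from eigenvalues rather than forming the square root itself.

```python
import numpy as np

def frechet_distance(feats_real, feats_gen):
    """Illustrative helper (hypothetical name).

    feats_real, feats_gen: (N, D) arrays of precomputed feature vectors
    for real and generated images (assumed inputs).
    """
    feats_real = np.asarray(feats_real, dtype=np.float64)
    feats_gen = np.asarray(feats_gen, dtype=np.float64)
    # Gaussian statistics of each feature set.
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_g = np.atleast_2d(np.cov(feats_gen, rowvar=False))
    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g; clip tiny negative values caused by
    # floating-point error before taking the square root.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)
```

With identical feature sets the distance is zero up to numerical error, and shifting one set by a constant vector increases the distance by the squared length of that shift, since only the mean term changes.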