Evaluation Metrics

1. Quality

1.1 Perceptual Evaluation of Speech Quality (PESQ)

Perceptual Evaluation of Speech Quality (PESQ) is an objective metric that predicts the quality a listener would subjectively perceive in a speech signal. It is used to evaluate how well a speech enhancement model improves clarity and naturalness and suppresses noise. PESQ models characteristics of the human auditory system and compares the enhanced speech with the reference speech to predict a subjective opinion score for the enhanced speech. The score ranges from -0.5 to 4.5; a higher value indicates better speech quality and therefore better performance of the speech enhancement model.
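As a rough illustration of how such scores are usually reported, the sketch below averages per-utterance PESQ scores and checks that each one lies in the range given above. It assumes the scores have already been produced by some PESQ implementation; the function name `summarize_pesq` and the sample values are hypothetical and serve only as an example.

```python
from statistics import mean

# Score range stated in the text above.
PESQ_MIN, PESQ_MAX = -0.5, 4.5


def summarize_pesq(scores):
    """Return the mean PESQ over utterances after checking the valid range."""
    if not scores:
        raise ValueError("no scores given")
    for s in scores:
        if not PESQ_MIN <= s <= PESQ_MAX:
            raise ValueError(f"PESQ score {s} is outside [{PESQ_MIN}, {PESQ_MAX}]")
    return mean(scores)


# Hypothetical per-utterance scores, for illustration only.
noisy_scores = [1.5, 2.0, 2.5]
enhanced_scores = [2.5, 3.0, 3.5]

# A positive gain means the enhanced speech scored higher on average.
gain = summarize_pesq(enhanced_scores) - summarize_pesq(noisy_scores)
print(f"mean PESQ improvement: {gain:.2f}")
```

Reporting the gain over the unprocessed noisy input, rather than the enhanced score alone, makes it easier to see how much of the score is due to the model.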

1.2 Short Time Objective Intelligibility (STOI)

Short Time Objective Intelligibility (STOI) is an objective metric that focuses on the clarity and recognizability of speech. It estimates intelligibility by comparing the short-time spectral envelopes of the enhanced speech and the reference speech over time. STOI is reported here on a scale from 0 to 100 (the underlying score is commonly given between 0 and 1); the closer the value is to 100, the better the speech intelligibility and the better the performance of the speech enhancement model.
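The idea of comparing short-time envelopes can be illustrated with a heavily simplified sketch. This is not the STOI algorithm: it skips the frequency-band decomposition and normalization steps and only correlates frame-level energy envelopes of the two signals. All names (`frame_envelope`, `envelope_similarity`) and parameter values are assumptions made for this example.

```python
import math
import random


def frame_envelope(signal, frame_len):
    """RMS energy per non-overlapping frame: a crude short-time envelope."""
    n_frames = len(signal) // frame_len
    return [
        math.sqrt(sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len)
        for i in range(n_frames)
    ]


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def envelope_similarity(reference, enhanced, frame_len=64, seg_frames=8):
    """Mean correlation over sliding envelope segments, scaled so identical signals give 100."""
    ref_env = frame_envelope(reference, frame_len)
    enh_env = frame_envelope(enhanced, frame_len)
    n = min(len(ref_env), len(enh_env))
    scores = [
        correlation(ref_env[i:i + seg_frames], enh_env[i:i + seg_frames])
        for i in range(n - seg_frames + 1)
    ]
    if not scores:
        raise ValueError("signals are too short for the chosen segment length")
    return 100.0 * sum(scores) / len(scores)


# Synthetic amplitude-modulated tone standing in for the reference speech.
reference = [
    (1.0 + 0.5 * math.sin(2 * math.pi * i / 500)) * math.sin(2 * math.pi * i / 20)
    for i in range(4000)
]
# Degraded copy: the same signal with additive Gaussian noise (fixed seed).
rng = random.Random(0)
noisy = [x + rng.gauss(0.0, 1.0) for x in reference]

print("identical signals:", round(envelope_similarity(reference, reference), 2))
print("noisy copy:", round(envelope_similarity(reference, noisy), 2))
```

Because the score is an average of correlations, a signal whose envelope tracks the reference closely scores near 100, while added noise that distorts the envelope pulls the score down.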