Introduction to Evaluation Tasks and Evaluation Data

Natural Language Processing (NLP)

The natural language processing (NLP) evaluation assesses models on downstream tasks along four dimensions: 1) basic capabilities, including simple understanding, knowledge application, and reasoning; 2) advanced capabilities, including special generation and contextual understanding; 3) comprehensive capabilities, including general comprehensive capability and domain comprehensive capability; and 4) security and values.

The NLP evaluation currently includes the following tasks:

Click a task name to view its details, which cover the task introduction, the evaluation data, the evaluation metrics, and the evaluation prompts.