This is the data used to train machine learning models, for example sentences annotated with entity types.
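As an illustration of what one such labeled sentence might look like, the sketch below stores a sentence together with character-offset entity spans. The class names (`EntitySpan`, `LabeledSentence`), the label names (`PERSON`, `LOCATION`), and the example sentence are all hypothetical, chosen only for this sketch. Real annotation schemes and file formats vary by tool and dataset.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EntitySpan:
    """One annotation: a character range in the sentence plus an entity type."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str  # entity type, e.g. "PERSON" (hypothetical label set)


@dataclass(frozen=True)
class LabeledSentence:
    """A single training example: raw text and its entity annotations."""
    text: str
    entities: Tuple[EntitySpan, ...]

    def labeled_substrings(self) -> List[Tuple[str, str]]:
        """Return (surface text, label) pairs for each annotated span."""
        return [(self.text[e.start:e.end], e.label) for e in self.entities]


# Hypothetical example sentence with two annotated entities.
example = LabeledSentence(
    text="Ada Lovelace worked in London.",
    entities=(
        EntitySpan(0, 12, "PERSON"),
        EntitySpan(23, 29, "LOCATION"),
    ),
)

print(example.labeled_substrings())
```

Storing offsets rather than copies of the entity text keeps each annotation tied to an exact position, which matters when the same word appears more than once in a sentence.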