A corpus is a collection of texts assembled for analysis. It may consist of books, articles, or any other written material, and it can be drawn from public sources, such as Wikipedia or news articles, or built for a specific domain, such as medical or legal texts.
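As a rough illustration, a corpus can be held in memory as a plain list of document strings and analyzed with a simple word count. This is only a minimal sketch: the two sample sentences, the `tokenize` helper, and the `word_frequencies` function are hypothetical names and data chosen for this example, and the regex tokenizer is a simplification that handles only unaccented Latin letters.

```python
import re
from collections import Counter

# Hypothetical toy corpus: each string stands in for one document.
corpus = [
    "The patient was given aspirin.",
    "The court ruled in favor of the patient.",
]


def tokenize(text):
    """Lowercase the text and return its runs of letters a-z as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(documents):
    """Count how often each token occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


freqs = word_frequencies(corpus)
print(freqs.most_common(2))
```

A real corpus would usually be read from files or a dataset rather than written inline, but the same shape (a collection of documents passed to an analysis function) carries over.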