The study of language structure, usage, and variation through the statistical analysis of large collections of texts.