This involves using large collections of texts (corpora) to identify linguistic patterns and features, such as how often words occur and which words tend to appear together.
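As a rough illustration of what identifying patterns in a corpus can look like in practice, the sketch below counts word frequencies and adjacent word pairs (bigrams) in a tiny made-up corpus. The three sentences, the function names `tokenize` and `corpus_patterns`, and the simple regex tokenizer are all assumptions chosen for the example; real corpus work typically relies on much larger text collections and more careful tokenization.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def corpus_patterns(texts, top_n=3):
    """Return the most frequent words and the most frequent bigrams."""
    word_counts = Counter()
    bigram_counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        word_counts.update(tokens)
        # Pair each token with its successor to form bigrams.
        bigram_counts.update(zip(tokens, tokens[1:]))
    return word_counts.most_common(top_n), bigram_counts.most_common(top_n)


# Hypothetical toy corpus, invented purely for illustration.
toy_corpus = [
    "The cat sat on the mat.",
    "The dog sat on the rug.",
    "A cat and a dog sat together.",
]

words, bigrams = corpus_patterns(toy_corpus)
print("Most frequent words:", words)
print("Most frequent bigrams:", bigrams)
```

Even on a corpus this small, the counts surface recurring combinations such as "sat on", which is the kind of pattern that becomes informative at scale.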