Text corpus data analysis, with full support for international text (Unicode).
Provides functions for reading data from newline-delimited 'JSON' files,
normalizing and tokenizing text, searching for term occurrences, and computing
occurrence frequencies for single terms and n-grams.
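
The operations listed above can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not this package's API: every function name here (read_ndjson, normalize, tokenize, find_term, ngram_counts) is hypothetical, and the choices of NFKC normalization, case folding, and a simple word-character tokenizer are assumptions made for the example.

```python
import json
import re
import unicodedata
from collections import Counter


def read_ndjson(lines):
    """Parse newline-delimited JSON: one JSON value per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def normalize(text):
    """Apply Unicode NFKC normalization, then case-fold (assumed choices)."""
    return unicodedata.normalize("NFKC", text).casefold()


def tokenize(text):
    """Split normalized text into runs of Unicode word characters."""
    return re.findall(r"\w+", normalize(text))


def find_term(docs_tokens, term):
    """Return (document index, token index) for each occurrence of a term."""
    term = normalize(term)
    return [
        (d, i)
        for d, tokens in enumerate(docs_tokens)
        for i, tok in enumerate(tokens)
        if tok == term
    ]


def ngram_counts(docs_tokens, n):
    """Count n-grams within each document; n-grams never span documents."""
    counts = Counter()
    for tokens in docs_tokens:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Hypothetical sample data: two records, one JSON object per line.
sample = [
    '{"text": "The café is open. The CAFÉ is busy."}',
    '{"text": "Is the café closed?"}',
]

docs = read_ndjson(sample)
docs_tokens = [tokenize(doc["text"]) for doc in docs]
hits = find_term(docs_tokens, "Café")
unigrams = ngram_counts(docs_tokens, 1)
bigrams = ngram_counts(docs_tokens, 2)
```

In this sketch, case folding makes "café", "CAFÉ", and "Café" match the same term, and bigrams are counted per document so that the last token of one record is never paired with the first token of the next.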