In my earlier post, I linked to the Project Gutenberg n-gram data I had computed for e-books in all languages. If you are interested only in the English data, get these files instead.
These two files are the two parts of a single split compressed file, which contains all of the Project Gutenberg English e-books downloaded about a week […]
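Split downloads like this usually have to be joined back into one file before they can be decompressed. Here is a minimal sketch of that step in Python. It assumes the parts were produced by a plain byte-wise split, so that concatenating them in order restores the original archive. The function name `join_parts` is hypothetical and not part of any tool mentioned in this post.

```python
import shutil
from pathlib import Path


def join_parts(part_paths, output_path):
    """Concatenate the given part files, in order, into output_path.

    Assumes a plain byte-wise split: the original file is exactly
    part 1 followed by part 2 (and so on), with nothing added.
    Returns the size of the joined file in bytes.
    """
    with open(output_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                # Stream in chunks so large parts never sit fully in memory.
                shutil.copyfileobj(src, out)
    return Path(output_path).stat().st_size
```

For example, with placeholder names, `join_parts(["part1", "part2"], "joined-archive")` would write the two parts back to back into `joined-archive`. The order of the list matters, and the result should then be opened with whatever decompression tool matches the archive format.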













