A Natural Language Processing Benchmark Framework

Bruce H. Cottman, Ph.D.
3 min read · May 14, 2021

I introduce KILT, a benchmark framework for natural language models. I also show how to retrieve close to one million public text and PDF documents. Some of these documents are raw text, some are clean text, and some include categorical labels.

Thousands of PDF, Word, and Text Documents to Download for your NLP Project. Photo by Emil Widlund on Unsplash

List of Lists of Public NLP Datasets.


Bruce H. Cottman, Ph.D.

I write my blog utilizing decades of experience in investment, programming, and data science.