( to ) ? be : ! be;
  • Project Gutenberg Ngram data: English only

    In my earlier post, I’d posted links to the Project Gutenberg Ngram data I had computed for e-books of all languages. If you are interested in only the English data, get these files instead.

    These two files are splits of a compressed file which contains all of the Project Gutenberg English e-books downloaded about a week […]

  • Topic extraction using Wikipedia data

    In an earlier article, I mentioned that I was trying to use Wikipedia data to cluster news articles so that news feeds would be easier for me to follow. I have made some progress. I’ve written an algorithm to produce a list of Wikipedia articles relevant to the input text. Input text […]

  • Ways to process and use Wikipedia dumps

     
    Wikipedia is a superb resource for reference (taken with a pinch of salt, of course). I spend hours at a time spidering through its pages and always come away amazed at how much information it hosts. In my opinion, it ranks among the defining milestones of mankind’s advancement.
    Apart from being available through http://www.wikipedia.org, the […]


Prashanth Ellina is powered by WordPress
