I recently learned that NCERT provides all of its textbooks (classes 1 to 12) for download. However, I found their interface hard to browse, so I wrote a crawler in Python to copy all the books to my web server.
Along with storing the books, I generated a thumbnail for every chapter and navigation pages that make the books easy to browse. You can access this dump here.
The crawler code is here.
These are the steps to run it yourself.
# scrape ncert site to get details about book pdf urls
python get_ncert_books_data.py > ncert_books_data
# create a directory to store all downloaded files
mkdir ncert_books
# download all the pdfs and cover page images
cat ncert_books_data | python download_ncert_books.py ncert_books
# generate thumbnails for each pdf (requires ImageMagick)
find ncert_books/ -iname "*.pdf" > all_pdf_files
sed -i "s/^/0\t/" all_pdf_files
python generate_thumbnails.py
# resize the book cover images
python resize_image_thumbnails.py ncert_books
# generate the navigation pages
python generate_navigation_pages.py ncert_books
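To make the download step more concrete, here is a minimal Python 3 sketch of what such a step could look like. It is not the code of download_ncert_books.py. It assumes the input holds one PDF URL per line (the real format of ncert_books_data may differ), and the names `local_path_for` and `download_all` are made up for this illustration.

```python
import os
import urllib.request
from urllib.parse import urlparse


def local_path_for(url, dest_dir):
    """Map a URL to a path under dest_dir, mirroring the URL's directory layout."""
    # Assumption: the path part of the URL is a sensible layout for the local copy.
    relative = urlparse(url).path.lstrip("/")
    return os.path.join(dest_dir, *relative.split("/"))


def download_all(lines, dest_dir):
    """Download every URL in lines (assumed: one URL per line) into dest_dir."""
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        target = local_path_for(url, dest_dir)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        if os.path.exists(target):
            continue  # already fetched on an earlier run
        urllib.request.urlretrieve(url, target)
```

Calling `download_all(sys.stdin, "ncert_books")` after `import sys` would mirror the piped command above. Skipping files that already exist lets an interrupted run be restarted without fetching everything again.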
For the thumbnail generation to work, you will need ImageMagick. Read this for more information.
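As a rough illustration of how ImageMagick can be driven from Python, the sketch below renders the first page of a PDF as a thumbnail by calling the `convert` command. This is not the code of generate_thumbnails.py. The function names and the default width of 200 pixels are assumptions, and ImageMagick typically needs Ghostscript installed to read PDFs.

```python
import subprocess


def thumbnail_command(pdf_path, thumb_path, width=200):
    """Build an ImageMagick command that renders page 1 of a PDF as a thumbnail."""
    return [
        "convert",
        pdf_path + "[0]",          # "[0]" selects the first page only
        "-thumbnail", str(width),  # scale to this width, keeping the aspect ratio
        thumb_path,
    ]


def make_thumbnail(pdf_path, thumb_path, width=200):
    """Run the command; raises CalledProcessError if ImageMagick reports a failure."""
    subprocess.run(thumbnail_command(pdf_path, thumb_path, width), check=True)
```

For example, `make_thumbnail("chapter1.pdf", "chapter1.png")` would write a 200-pixel-wide image next to a hypothetical chapter1.pdf. Passing the command as a list avoids shell quoting problems with file names that contain spaces.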
Tags: crawler, ncert, python, textbooks

slawek wrote,
I got this error when trying to run the first Python script… I need to take a look and see what's going on.
Traceback (most recent call last):
  File "get_ncert_books_data.py", line 97, in <module>
    main()
  File "get_ncert_books_data.py", line 63, in main
    book_links = get_book_urls(listing_page_html)
  File "get_ncert_books_data.py", line 19, in get_book_urls
    script_data = script_data[0]
IndexError: list index out of range
Thanks,
Slawek
Link | February 7th, 2008 at 12:01 am
prashanthellina wrote,
Slawek, download this page and see if it has a "SCRIPT" tag (http://www.ncert.nic.in/textbooks/testing/first.htm). I ran the script again and it works fine, so you must be getting a different version of the page. If you do not find the SCRIPT tag, send the HTML to me at prashanthellina AT gmail DOT com.
Link | February 7th, 2008 at 9:10 am
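The check suggested above can also be done in code. The following Python 3 sketch is not taken from get_ncert_books_data.py, and `ScriptCollector` and `first_script` are hypothetical names. It collects the contents of every SCRIPT tag on a page and returns None when there is none, instead of failing with an IndexError on an empty list.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the text inside every <script> element of a page."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":  # HTMLParser lowercases tag names
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def first_script(html):
    """Return the body of the first script, or None if the page has no SCRIPT tag."""
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    return collector.scripts[0] if collector.scripts else None
```

A caller can then test `first_script(page) is None` and report that the listing page looks different from what the crawler expects.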
Spitfire wrote,
good blog; keep it going!
Link | February 27th, 2008 at 9:10 am
Sreeram wrote,
Nice Blog! and thanks for the NCERT books.
Link | March 1st, 2008 at 7:23 pm
prashanthellina wrote,
Thanks. Glad you like it.
Link | March 1st, 2008 at 8:35 pm
Vinay wrote,
Quite interesting! I don't think you are messing with their copyright by copying the books (kiddin'). Indeed, I found it hard to navigate the site to find the information I was looking for. Good work there, pulling all the data!
Link | March 4th, 2008 at 8:24 am
Mustaqur Rahman wrote,
Many thanks. Much easier to download!! The NCERT site is awesome!!
Thanks again
Link | March 23rd, 2008 at 10:19 pm
prashanthellina wrote,
Glad you found it useful. Enjoy!
Link | March 23rd, 2008 at 10:21 pm
devashish wrote,
useful page
thanks
Link | March 24th, 2008 at 2:18 pm
A.K.Balamurugan wrote,
Dear Mr prashanthellina
I really don’t understand what I should do to get the books. What has to be done with scripts?
Please help
Link | March 29th, 2008 at 1:59 pm
prashanthellina wrote,
A.K Balamurugan, use this link to download the books -> http://www.prashanthellina.com/docs/ncert_books.
Link | March 31st, 2008 at 9:53 pm
Dr.Manoranjan S Sanasam wrote,
Dear Prashanth Ellina,
The link to download the books -> http://www.prashanthellina.com/docs/ncert_books does not contain the new textbooks for class VIII.
Please help.
Link | April 1st, 2008 at 6:44 pm
prashanthellina wrote,
Dr. Manoranjan, you can use this link to get latest uploaded books from NCERT -> http://ncert.nic.in/textbooks/testing/Index.htm .
Link | April 1st, 2008 at 8:14 pm
shubham wrote,
course is very hard
Link | April 5th, 2008 at 7:25 pm
Raghu wrote,
Excellent collection. Thank you very much. I could get all except Science (English medium) for 6th standard and Environmental Studies (English medium) for 3rd standard. You have helped me a lot by giving me a golden opportunity to engage my children during this vacation (of course, a little bit amidst their play).
Link | April 5th, 2008 at 9:51 pm
prashanthellina wrote,
Raghu, I’m glad you found this resource helpful. The credit goes to NCERT for putting up the books for download. I’ve just organized them better. If you have not already used Wikipedia, allow me to introduce you to it. Wikipedia is a vast online encyclopedia that is free to use and maintained by netizens. I would suggest that you introduce your kids to this resource (http://en.wikipedia.org) . I wish them happy learning!
Link | April 6th, 2008 at 11:49 am
Raghu wrote,
I could not find the lesson chapters in Environmental Studies (Looking Around) for 3rd standard and Science (English medium) for 6th standard.
These two are required for my children in the 3rd and 6th standards. They are already familiar with Wikipedia. Thanks. Please repost.
Link | April 7th, 2008 at 4:12 pm
sumit wrote,
i am searching for class 5th ncert text books can u please help me out
Link | April 7th, 2008 at 10:15 pm
prashanthellina wrote,
Raghu, those books are missing from my dumps. Please head over to the ncert site to find them (if they are uploaded at all). Sumit, you will find the books in the link provided.
Link | April 8th, 2008 at 7:16 pm
NIHAL wrote,
Nihal wrote
i am searching for class 5th ncert text books can u please help me out
Link | April 8th, 2008 at 11:33 pm
Anu wrote,
You can download directly from NCERT textbooks. Method: go to Google search > type "ncert" > click on "ncert textbooks".
Link | April 15th, 2008 at 2:13 pm
basavaraj wrote,
I need the 8th standard textbooks online. I want to take a printout of the textbooks.
Link | April 15th, 2008 at 3:17 pm
priti wrote,
THANK YOU so much for the books.
you have no clue how happy i am.
have been looking for them for a while.
danku.
regards
Link | April 18th, 2008 at 1:11 pm
prashanthellina wrote,
Glad you found them here! Enjoy.
Link | April 18th, 2008 at 6:37 pm
Akshay wrote,
SIMPLY BRILLIANT WORK ….
I THANK YOU FOR YOUR SPLENDID EFFORTS .
Link | April 19th, 2008 at 11:00 am
Rahul wrote,
Thanks Dear! God bless you! It is ssssooooo usefulllllllll! Thanks again!
Link | April 21st, 2008 at 3:49 am
akshay wrote,
please bring books at market
Link | April 21st, 2008 at 10:11 pm
rishabh wrote,
can i have a link for various competitive books for iit-jee ,aieee,like hc verma , op tandon etc
Link | April 22nd, 2008 at 5:42 pm
prashanthellina wrote,
Akshay and Rahul, I am happy you found this resource useful. Enjoy! Rishabh, I don’t have this info. Google is your friend.
Link | April 23rd, 2008 at 5:49 pm
angela wrote,
brilliant
Link | April 26th, 2008 at 7:11 pm
kuldeep wrote,
There is very late service of books.
Link | April 30th, 2008 at 2:59 pm
Radhakrishnan wrote,
Dear Sir,
How can I get the mathematics guide for class 9? Please advise me on which site it is available.
with thanks
Rk
Link | May 2nd, 2008 at 2:43 pm
prashanthellina wrote,
Hi Rk, I don’t have information about this.
Link | May 4th, 2008 at 9:57 am
nikhith wrote,
I am searching for the class V EVS textbook. Can you help me?
Link | May 4th, 2008 at 11:37 pm
a qadeer wrote,
I am in desperate need of the science book (English medium) for class VIII. Can anyone help? When I go to download it at NCERT, it just has the contents page. When I load the contents and try to download the chapters, it gives this error: "file://Downloads//book_publishing/class8/science/2.pdf" could not be found. Please help me out.
Link | May 6th, 2008 at 6:25 pm
~R~ wrote,
hey there, thanks a lot!i desperately needed to download the economics book cuz it isnt out in the stores yet….so thanks !
Link | May 8th, 2008 at 4:42 pm
teja wrote,
nice site for the ncert books….thanks prasanth
Link | May 12th, 2008 at 10:04 am
SHREYA GANDHI wrote,
THESE BOOKS ARE RELATED TO MY SCHOOL SYLLABUS . BUT I NEED SCIENCE AND TECHNOLOGY OF CLASS 8 AS IT IS VERY URGENT FOR ME .
Link | May 14th, 2008 at 6:46 pm
nidhi.s harshan wrote,
I am in class 8, studying in Bahrain. The online science text is only in Urdu and Hindi. Can you keep the science text in English also?
Link | May 14th, 2008 at 7:01 pm
nidhi.s harshan wrote,
I need the English version of the class 8 science text.
Link | May 14th, 2008 at 7:04 pm
nidhi.s harshan wrote,
THESE BOOKS ARE RELATED TO MY SCHOOL SYLLABUS . BUT I NEED SCIENCE AND TECHNOLOGY OF CLASS 8 ENGLISH VERSION AS IT IS VERY URGENT FOR ME .
Link | May 14th, 2008 at 7:06 pm
vino wrote,
you need to update the class 8 and 5 books
the books that are there now are old
they are not based on 2005 ncf
Link | May 16th, 2008 at 9:32 am
shahudheen wrote,
I can see people around the globe are using this site. Please let me know how I can get the 8th standard NCERT textbooks. They are not available in Kerala. If there is any solution, please inform me.
thanx and regards
shahudheen
Link | May 16th, 2008 at 11:49 am
nidhi.s harshan wrote,
I want to download the class 8 science text from the site, but on the site I saw only the Urdu and Hindi versions. Please keep the science text in an English version also.
Link | May 17th, 2008 at 12:19 pm
sumathi wrote,
Thanks for the books.Really very helpful.
Link | May 21st, 2008 at 5:51 pm
Sayandipa Choudhury wrote,
i m a student of Class VIII in K.V.Silchar, Assam. the NCERT Books in the whole N.E. Region are not available till now, kindly help me to get these books.
Link | May 21st, 2008 at 11:23 pm
Gaargi wrote,
Hi! Could you please help me get the NCERT Civics textbooks for classes IX and X… here's the hitch… that were in use in schools in 2000? It's for a research purpose and very urgent. Would really appreciate any help/pointers.
Link | May 25th, 2008 at 10:26 pm
Saikumar wrote,
Thanks a lot. The bloody government creates websites which are not user friendly.
I thank you for this stuff (NCERT books).
Link | May 27th, 2008 at 2:19 pm
a qadeer wrote,
I am in desperate need of the social science book (English medium) for class VIII. Can anyone help? When I go to download it at NCERT, it just has the contents page. When I load the contents and try to download the chapters, it gives this error: "file://Downloads//book_publishing/class8/science/2.pdf" could not be found. Please help me out.
Link | May 27th, 2008 at 2:38 pm
K Chaudhary wrote,
many thanks for your downloadable ncert books. been really helpful. keep up your good work.
Link | June 3rd, 2008 at 4:27 pm
Athira wrote,
I am not able to see the science text book for VI standard. Only index is available. Please help
Athira
Link | June 5th, 2008 at 11:18 am
shanthi wrote,
Pls help me to get NCERT Social science text book for 8th std.
Link | June 9th, 2008 at 10:52 am
Suni wrote,
Only Index is available for Class 3 – Environment Studies – Looking around.
Please help
Link | June 11th, 2008 at 1:37 am
SANJAI NANDAKUMAR wrote,
This is the most useful site I have seen in recent times. Kudos for the effort in publishing NCERT textbooks. please maintain this site like this so that more and more people are benefitted. Thank you for such a wonderful effort.
Sanjai
Link | June 12th, 2008 at 3:13 pm
Murruli M N wrote,
Dear prashanth
I am looking for old NCERT textbooks like Bal Bharti Part III, IV and V, because my son's school is still using the Bal Bharti series instead of the RimJim series. Can you help me find these books in PDF format for download? It would be a great help to me.
Thanks
Murruli
Mysore
Link | June 19th, 2008 at 1:49 pm
govind wrote,
Dear Prashanth
This is a very good effort for NCERT books. However, I did not find some of the books online, such as "Hindi Vykaran aur Rachana" for Class VI-VIII. It should be available from NCERT. This book is not available in the market in Mumbai. It's strange that although this book is not available anywhere, or is short in print, it has been recommended for the CBSE course for the last few years. Does anybody know how to get it?
Link | June 23rd, 2008 at 3:06 pm
jay wrote,
I need guides for the NCERT textbooks for class VII. Can you help?
Link | June 26th, 2008 at 8:35 am
Himanshu wrote,
I am getting the same error as slawek
I got this error when trying to run the first Python script… I need to take a look and see what's going on.
Traceback (most recent call last):
  File "get_ncert_books_data.py", line 97, in <module>
    main()
  File "get_ncert_books_data.py", line 63, in main
    book_links = get_book_urls(listing_page_html)
  File "get_ncert_books_data.py", line 19, in get_book_urls
    script_data = script_data[0]
IndexError: list index out of range
Thanks,
Slawek
Link | June 27th, 2008 at 11:10 pm
Ila wrote,
Thank you so much! I am Indian-born but raised in the US and now I’m learning Hindi… the books for little kids are perfect for a beginner like me. I appreciate it so much!
Ila
(in Colorado, USA)
Link | June 29th, 2008 at 11:56 pm
NITIN wrote,
NCERT SITE IS A VERY GOOD SITE [.] I AM NOT ABLE TO GET 6TH STD MATHS BOOK WITH SOLUTIONS {Q & A} , CAN YOU KINDLY SEND TO ME.
Link | July 2nd, 2008 at 4:28 pm
navtej wrote,
how can i download chapters plz help me
Link | July 4th, 2008 at 11:14 pm
Jennefer wrote,
Oh thank God! I found it all here. Thank you very much. You don’t know how happy i am. I just wanna give you an award.
Link | July 5th, 2008 at 8:36 am