I have been playing around with Wikipedia data and tried doing some byte pushing on my Dreamhost web space. Since this is shared web space, the processing power and memory available are limited. I was able to create database tables in MySQL by parsing the Wikipedia XML dump, and with some extra processing I built a few custom derived tables, but I constantly had to write code with the resource constraints in mind. Although this is fun, it detracts from my actual goal (the Wikipedia data). So I decided to build my own “server” for this kind of work, which would double as a “home theatre”.
Tons of memory
I will need a lot of RAM so I can keep data in memory while working on it. This is my primary need. Also, Linux loves extra memory, as it tries to use every last bit to cache programs and data from disk, which is potentially a great way to boost performance.
Reasonable processing capability
Top of the line processors cost a ton and I am not after the best in processors. I want a decent processor that has a good amount of L2 cache and is overclockable.
I will be trying to keep the cost down and pick components that are at the sweet spot on the price-performance plot.
High memory bandwidth and low latency
Since my requirement is to move gigabytes of data into the processor and back to the hard disk, I need the bandwidth to be good along the whole processor - RAM - HDD path.
Good on-board graphics and audio support
A graphics chipset that is well supported by Linux so I can turn on the eye-candy in Ubuntu.
Support for over-clocking
The processor, RAM and motherboard should be amenable to overclocking so I can crank up the heat!
Low power consumption (because I plan to run this machine 24x7)
Since I will be running this machine all the time, I want it to consume as little power as possible. Very few people realize that the cost of electricity is a significant part of the operational cost.
I plan to expose this machine on the internet. To make it more secure, I plan to host the web server in a DMZ, and I want to run a virtual machine (Ubuntu) as the DMZ host. Virtualization support can be handy here.
Ports (FireWire, HDMI, USB, eSATA)
USB 2.0 ports are pretty standard nowadays. I want to have a couple of spare SATA ports available for adding more hard disks at a later date. HDMI will be useful if I buy a flat-panel TV some time.
When I bought a computer a couple of years back, the value choice was obviously AMD. But the equation has changed since then, and after a little research I found that Intel Core 2 Duo processors beat the crap out of AMD’s. The only area where AMD processors did better was memory bandwidth because of HyperTransport.
Intel released a line of quad-core processors recently. Most of these are priced very high, except the Intel Core 2 Quad Q6600, which is available in Bangalore, India for a little over Rs 11,000 (approx \$280). I was very tempted to go for this one but chose the Intel Core 2 Duo E4500. Although it has only 2MB of L2 cache and does not have virtualization support, I picked it for its excellent value (Rs 4800 or approx \$120) and low power consumption (65W peak). Also, I plan to run processes in batch mode overnight, so a few extra minutes taken because of the lower muscle is not an issue for me.
The E4500 model is ideal for overclocking.
Since I am trying to build a low-cost machine, I decided not to consider ECC and buffered RAM (the motherboard would not have supported it anyway). I got 4 x 1GB Transcend 800 MHz sticks for Rs 1350/- (\$34) each so I could use the dual-channel slots on my motherboard to the max. These are quite overclockable.
The criteria for the motherboard were that it should support the processor’s FSB (and more for later) and the memory bus speed. In addition to these, it would have to support at least 4GB of RAM and have 4 SATA ports, FireWire, Gigabit Ethernet, HDMI, at least 5.1 onboard sound and decent onboard graphics. I picked the GA-G33-S2H from Gigabyte, based on the Intel G33 chipset (Rs 4550/- or \$116). The BIOS has support for overclocking (voltage and multipliers can be tweaked).
Seagate Barracuda 320GB 7200 rpm, Samsung DVD/CD read-writer, Microsoft mouse, i-Ball keyboard, Viewsonic Touchscreen monitor.
I put the machine under load by making it crunch some Wikipedia data, and although I have not measured the timings, I know from previous observation that it is blisteringly fast. I have Compiz turned on in Ubuntu and the UI is highly responsive. When processing big files, I have noticed that the files are pre-cached, and this does speed things up quite a bit. I am convinced that, for responsiveness and perceived UI speed, adding more RAM is a better option than going for the best processor on the market. 2GB of RAM should be enough for desktop usage.
The processor, motherboard, RAM and cabinet cost me Rs 19,300 (\$494).
Calculating PI using bc
time echo "scale=5000; 4*a(1)" | bc -l -q
Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
Hardinfo is a handy benchmarking tool for Linux.
AMD Athlon(tm) 64 Processor 3800+
Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
The hardinfo test results are CPU oriented, and I did not build this machine to excel at those. This machine is meant for munching on gigabytes of data, so the bandwidth between CPU - RAM - HDD is a more important factor. I have not found a good benchmark to measure this aspect. Do mail me if you know of one.
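Short of a proper benchmark, a very rough feel for the CPU - RAM - HDD path can be had by writing and then re-reading a scratch file with dd. This is only a sketch under assumptions: TESTFILE and SIZE_MB are hypothetical names and values picked for the example, GNU dd is assumed (as shipped with Ubuntu), and the figures it prints are indicative rather than a rigorous measurement.

```shell
# Rough sequential throughput sketch using dd.
# TESTFILE and SIZE_MB are hypothetical choices for this example.
TESTFILE="${TMPDIR:-/tmp}/dd_throughput_test.bin"
SIZE_MB=64

# Write test: conv=fsync makes dd flush the data to disk before it reports,
# so the figure is not just the speed of writing into the page cache.
dd if=/dev/zero of="$TESTFILE" bs=1M count="$SIZE_MB" conv=fsync 2>&1 | tail -n 1

# Record how many bytes actually landed in the file.
written_bytes=$(wc -c < "$TESTFILE")

# Read test: straight after the write, the file is typically still in the
# page cache, so this figure reflects RAM speed more than disk speed.
dd if="$TESTFILE" of=/dev/null bs=1M 2>&1 | tail -n 1

# Clean up the scratch file.
rm -f "$TESTFILE"
```

For a disk-bound read figure, the file would need to be larger than the available RAM, or the page cache would have to be dropped between the two steps.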
Overall, I am extremely happy with this configuration and definitely recommend it if you have similar requirements.
How to choose hardware components
CPU comparison charts at Tom’s Hardware
Computer Warehouse, Bangalore - Price List
Note: I looked up the prices on the Computer Warehouse website and bought the components at Ankit Infotech on S.P. Road, Bangalore.