
It's been almost a year since my last update on Project:Photon, and progress has been depressingly slow.
So for both new readers and those without a perfect memory, a quick reminder of Project:Photon's primary objective: a small and silent device for high-capacity storage with full backup facilities (mirroring and versioning), accessible from both Mac OS X and Windows.
This post is an update on where I've got to.
Okay, let's start with a photo:
As you can see, it's still using the lid of a shoe box, with a cornflakes box providing very crude protection from dust/etc.
I spent a fair amount of time looking for a suitably sized container, but I've given up on that. At some point I'll buy some sheets of acrylic and build a box myself.
So, on the small front, I've been reasonably successful. I would have liked it more slimline, but it's looking to be 2-2.5 inches, which is good enough for me.
But is it silent? Not at the moment. The hard drives are audible, though I'm hoping that will go away once I get them in a case. The drives are a pair of WD5000KS SE16, and according to silentpcreview.com, that model is the quietest 3.5" desktop drive, so I couldn't really do much better there.
Once I have a case I might investigate if I want to get any sound-proof materials added, but I doubt that will be necessary.
Next part: high-capacity. Well, at the time I bought the components, half a terabyte was uncommon and quite impressive. Nowadays, everyone and their dog has 500GB drives. Still, it will be plenty of space for me to store my data for the next few years.
That's 3/3 thus far. Unfortunately it falls down at the next two points: mirrored backups and continuous versioning. So I'll go into a little more depth on where I've got to with these two subjects...
The purpose of buying two drives was to implement mirrored backup, known as RAID-1. This means that every file is written to both disk drives at the same time, and in the event of a disk dying, data is backed up on the other. It has the added benefit of making reading data faster, as half of the data can be read from each disk at once.
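To make the mirroring idea concrete, here is a toy sketch in Python. It is not how mdadm or a RAID controller works internally (those operate on disk blocks, not files); the two "disks" are just temporary directories and the function names are made up for illustration.

```python
import os
import tempfile


def mirrored_write(name, data, disks):
    """Write the same bytes to every 'disk' (here, just a directory)."""
    for disk in disks:
        with open(os.path.join(disk, name), "wb") as f:
            f.write(data)


def read_surviving(name, disks):
    """Return the file contents from the first disk that still has it."""
    for disk in disks:
        path = os.path.join(disk, name)
        if os.path.exists(path):
            with open(path, "rb") as f:
                return f.read()
    raise FileNotFoundError(name)


# Two temporary directories stand in for the two physical drives.
disk_a = tempfile.mkdtemp(prefix="disk_a_")
disk_b = tempfile.mkdtemp(prefix="disk_b_")

mirrored_write("photo.raw", b"precious data", [disk_a, disk_b])

# Simulate disk A dying by removing its copy.
os.remove(os.path.join(disk_a, "photo.raw"))

# The data is still readable from disk B.
recovered = read_surviving("photo.raw", [disk_a, disk_b])
```

The point is simply that every write lands twice, so losing either copy on its own loses nothing.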
Although there are RAID options in the BIOS, the Jetway J7F2 mainboard does not actually support hardware RAID. Apparently you need to install drivers for the operating system to make it work that way.
Given that I can already do Linux software RAID, there didn't seem to be a significant advantage to attempting to get drivers to work, especially since the Linux distribution I'm using is not one of the major ones and so drivers are less likely to work smoothly.
So far I've had mixed success with mdadm (the Linux RAID tool), and I'm not certain what my next steps here will be.
Part of me is thinking that buying a hardware RAID controller is the way to go. However, these are not cheap, and one would mean slotting a big PCI card into my device, making it bigger than I'd like.
With my current disk space problems (my Windows partition has just 85MB free!!!) I am very much thinking of skipping mirroring for now, and perhaps just implementing a regular sync.
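For what a "regular sync" might look like, here is a minimal one-way sync sketch in Python. This is an assumption about the approach rather than something I have running on the box: the function names are hypothetical, it only copies files that are missing, newer, or a different size, and it never deletes anything from the destination. In practice a tool like rsync does this job far more thoroughly.

```python
import os
import shutil
import tempfile


def needs_copy(src_path, dst_path):
    """True if the destination is missing, older, or a different size."""
    if not os.path.exists(dst_path):
        return True
    s, d = os.stat(src_path), os.stat(dst_path)
    return s.st_size != d.st_size or s.st_mtime > d.st_mtime


def sync(src, dst):
    """One-way sync from src to dst; returns the relative paths copied."""
    copied = []
    for root, _dirs, files in os.walk(src):
        rel_root = os.path.relpath(root, src)
        target_root = os.path.join(dst, rel_root)
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src_path = os.path.join(root, name)
            dst_path = os.path.join(target_root, name)
            if needs_copy(src_path, dst_path):
                # copy2 also copies the modification time, so an
                # unchanged file is skipped on the next run.
                shutil.copy2(src_path, dst_path)
                copied.append(os.path.normpath(os.path.join(rel_root, name)))
    return copied


# Demo with two temporary directories standing in for the two drives.
src = tempfile.mkdtemp(prefix="primary_")
dst = tempfile.mkdtemp(prefix="backup_")
with open(os.path.join(src, "a.txt"), "w") as f:
    f.write("hello")

first_run = sync(src, dst)   # copies a.txt
second_run = sync(src, dst)  # nothing has changed, so nothing is copied
```

Run from cron every so often, something like this would give most of the disk-failure protection of mirroring, minus whatever changed since the last run.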
My second backup methodology is continuous data protection. I find it both alarming and upsetting that I always have to explain what this is to people. CDP is instant file versioning. If you're familiar with version control (CVS/SVN/etc), think of it as an auto-commit every time you save a file.
The reason for CDP/versioning is simple - whilst mirroring data protects against disk failure, it does not protect against user error: accidentally deleting or overwriting a file. CDP does protect against this.
There are many devices/software available that claim to implement CDP, but what they actually implement is snapshotting - backing up data at a regular interval, maybe every minute, maybe once an hour. That is not true continuous data protection - as the name states, with CDP, the protection is continuous, not in intervals. With CDP, every individual change is backed up immediately, and there is no chance of missing any changes.
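The difference is easy to show with a small simulation. This Python sketch is purely illustrative: the save times, contents and function names are made up, and a real product would obviously work on files rather than strings.

```python
def snapshot_backup(saves, interval):
    """Versions captured by snapshotting every `interval` seconds.

    `saves` is a list of (time_in_seconds, content), sorted by time.
    Each snapshot only sees whatever the file contains at that moment.
    """
    captured = []
    if not saves:
        return captured
    last_save_time = saves[-1][0]
    t = interval
    # Keep snapshotting until one has happened after the final save.
    while t < last_save_time + interval:
        current = None
        for when, content in saves:
            if when <= t:
                current = content
        if current is not None and (not captured or captured[-1] != current):
            captured.append(current)
        t += interval
    return captured


def cdp_backup(saves):
    """Versions captured by CDP: every single save is kept."""
    return [content for _when, content in saves]


# Four saves of the same file within 70 seconds.
saves = [(5, "draft 1"), (20, "draft 2"), (50, "draft 3"), (70, "draft 4")]

snapshots = snapshot_backup(saves, interval=60)
versions = cdp_backup(saves)
```

With a one-minute snapshot interval, "draft 1" and "draft 2" are never backed up, because they had already been overwritten by the time the first snapshot ran. With CDP all four versions survive.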
So, how do I plan to implement CDP in Linux?
Initially I looked at a couple of virtual file systems (Wayback and CopyFS) that sit over the real filesystem and manage data going in and out of it. I didn't like this concept, so I moved on to looking for software which actually hooks into the kernel disk write events. I found and discounted FAM and gamin. FAM used dnotify, which was inefficient and buggy. Gamin used inotify, the successor to dnotify, but was essentially a replacement for FAM. I considered for a while using inotify directly, through inotify-tools. However, there is a fatal flaw with using inotify for CDP - all it does is notify. If you change a file three times you will get three notifications, but there is no guarantee that, by the time you deal with the first notification, the third change has not already occurred.
Continuous Data Protection requires write-queueing, so that I can deal with events in the order they happen, and be certain that I'm looking at the right data.
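Here is a small Python simulation of why notification alone is not enough. It does not use inotify or Dazuko at all; FakeFile and run are hypothetical stand-ins, and the "worst case" shown is the one where all three writes finish before the backup process handles its first event.

```python
from collections import deque


class FakeFile:
    """Stand-in for a file on disk: it only holds its latest content."""

    def __init__(self):
        self.content = ""


def run(writes, capture_data):
    """Apply all writes, then process the event queue.

    capture_data=False models notify-only: each event just says
    "the file changed", so the handler has to re-read the file.
    capture_data=True models write-queueing: each event carries the
    data that was written at that moment.
    """
    f = FakeFile()
    events = deque()

    # All writes happen before the handler gets a chance to run.
    for data in writes:
        f.content = data
        events.append(data if capture_data else None)

    backed_up = []
    while events:
        event = events.popleft()
        if event is None:
            backed_up.append(f.content)  # sees whatever is there now
        else:
            backed_up.append(event)      # sees what was written then
    return backed_up


notify_only = run(["v1", "v2", "v3"], capture_data=False)
queued = run(["v1", "v2", "v3"], capture_data=True)
```

In the notify-only case the handler backs up "v3" three times and never sees "v1" or "v2". With the data queued alongside each event, all three versions are captured in order.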
So, I am now attempting to get Dazuko working. Dazuko hooks in at a very low level and gives the queueing that I need. Unfortunately, with it being at such a low level, it requires a kernel rebuild to get it going, and this is something I have no experience with.
I am again faced with a decision to deliberate over: I need my NAS up and running ASAP, so do I leave the CDP to one side for now?
I want to give a quick bit of information about the Linux distribution I am using, Puppy Linux.
Puppy has made many things more difficult, but it is also the only distro that I've made real progress with.
Puppy runs entirely in RAM, but can store data and preferences in a file which gets written back to disk at shutdown (this minimizes writes and extends the life of USB pen drives and flash disks).
Other live disk distros I've used seemed to prefer taking over the entire 512MB CompactFlash disk, and required another disk to be mounted in order to store changes.
Whilst Puppy is able to use .deb and .rpm packages, these often seem to make assumptions about things which exist on their target system, but not in Puppy, and I've spent a lot of time searching for various dependencies in order to get simple tools working.
It would be nice if there was another Linux distro that offered the same advantages as Puppy, but was more mainstream and easier to get software for, more likely to get help with, and so on. So far I've not found anything.
So, that's a quick update of where I currently am with Project:Photon. Any comments/questions/advice/whatever are welcome.