Posted

Several people are working on collaborative P2P backup software.

 

The general idea is to use a distributed hash table (sometimes called a tuplespace) in conjunction with erasure coding. This means that peers housing a copy of a portion of the data you've backed up can go down, but thanks to the erasure coding none of your data is lost.

 

You might check out http://allmydata.com. They're doing something similar.

 

In terms of "danger", you'd hope their software signed (and hopefully encrypted) your data with a key held by you. Otherwise, yes, there is potential danger with data tampering.

Posted

the array will lose data if two drives fail. the odds of that happening before i can get a replacement drive are acceptably low. also, getting 2/3 of the space made more financial sense.

 

how exactly is it more likely to fail than a single drive anyway?

Posted

RAID 5 stripes data across several drives. There is no redundancy. The chance of failure of the whole cluster is greater with each disk you add, because one failure renders the cluster useless.

 

RAID 1 mirrors data, which lowers your chance of failure.

Posted

i'm just saying RAID 5 gives me some added read performance and a bit more data security.

 

sure, it isn't as good as mirroring the data entirely but i don't expect a double drive failure unless something pretty extreme happens. and if there was a double drive failure it would bork a RAID 1 as well.
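To put rough numbers on the comparison being argued here, below is a minimal sketch. It assumes every drive fails independently with the same probability `p` over some period, which is an assumption made only for illustration (drives bought together may well not fail independently). The function names and the 5% figure are hypothetical, not taken from anyone's actual setup.

```python
def striped_no_parity(p: float, n: int) -> float:
    """Plain striping: the array is lost if ANY of the n drives fails."""
    return 1 - (1 - p) ** n


def mirror(p: float, n: int = 2) -> float:
    """Mirroring: data is lost only if ALL n copies fail."""
    return p ** n


def single_parity(p: float, n: int) -> float:
    """Single parity: the array survives zero or one failure, and is lost on two or more."""
    survive = (1 - p) ** n + n * p * (1 - p) ** (n - 1)
    return 1 - survive


if __name__ == "__main__":
    p = 0.05  # hypothetical per-drive failure probability for the period
    print(f"single drive:              {p:.5f}")
    print(f"3 drives, striping only:   {striped_no_parity(p, 3):.5f}")
    print(f"3 drives, single parity:   {single_parity(p, 3):.5f}")
    print(f"2 drives, mirrored:        {mirror(p, 2):.5f}")
```

Under these assumptions a three-drive array with single parity loses data less often than one drive on its own, and more often than a two-drive mirror. The numbers do not account for the window needed to replace and rebuild a failed drive.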

Posted
  insane_alien said:
how exactly is it more likely to fail than a single drive anyway?

 

Very high. If you're like most people you bought matched drives from the same batch, and probably installed them new.

 

RAID-5 ensures drives are exposed to the same environmental conditions and the same sort of read/write load.

 

Because everything is so symmetrical between all the drives, there's a good chance drive failures will be too. Whatever problem caused one drive to fail is likely exhibited by the other drives in the same batch, especially if they've been subject to the same environmental conditions and load.

 

Couple this with another problem: after installing a replacement drive, all drives in the array will be under load as all data across the entire array is read off the original disks and the parity data recalculated and written to the new disk. Placing disks under load for such a prolonged period of time, especially when an extremely similar drive just failed, is a great opportunity to bring about failures in other drives.
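As a rough illustration of why a rebuild touches every surviving disk, here is a minimal sketch of XOR parity, assuming a toy three-drive array where each drive is just a list of equal-sized blocks. The helper names `xor_blocks` and `rebuild_drive` are hypothetical, and a real controller does far more than this.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_drive(survivors: Sequence[Sequence[bytes]]) -> list[bytes]:
    """Recompute a lost drive stripe by stripe.

    Every block of every surviving drive has to be read to do this.
    """
    return [xor_blocks(stripe) for stripe in zip(*survivors)]


# Two stripes across three toy drives, with the parity block on a
# different drive in each stripe.
parity0 = xor_blocks([b"AAAA", b"BBBB"])
parity1 = xor_blocks([b"CCCC", b"DDDD"])
drive0 = [b"AAAA", b"CCCC"]
drive1 = [b"BBBB", parity1]
drive2 = [parity0, b"DDDD"]

# drive1 dies; its contents are recomputed from the other two drives.
rebuilt = rebuild_drive([drive0, drive2])
assert rebuilt == drive1
```

Because the XOR of all blocks in a stripe is zero, any one missing block equals the XOR of the rest, whether it held data or parity. The cost is a full read of every remaining drive.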

 

All that said: RAID is no replacement for backups. If, for whatever reason, a software problem corrupts your filesystem, your RAID will happily write out the corruption along with parity to ensure that your corrupt filesystem is safe from a single drive failure. But it's still a corrupt filesystem...

Posted
  bascule said:
You might check out http://allmydata.com. They're doing something similar.

They're doing something costly

 

http://www.download.com/P2P-Backup/3000-2242_4-10496612.html

is doing something free

 

Is there a cheap raid that I can plug any combination of disks I happen to have into and it will tell me if one fails and one failing won't result in me losing anything?

 

No comments, no comments on CNet either (http://www.cnet.com.au/downloads/0,239030384,10496612s,00.htm), and nobody has rated it on CNet. Is it that new? Or has nobody had the courage/foresight/stupidity to be the first to try it yet?

 

I'd like to try it but can't find any evidence at all from anyone anywhere on the web to say it's even real.

Posted
  alan2here said:

 

When it comes to preserving the integrity of my data, I wouldn't go with a random free service from a no name person you can't even contact.

 

Peer to peer backup is hard. Tahoe, which is the software underlying Allmydata, breaks your data up into chunks and typically sends (I believe) 12 encoded pieces of each chunk to individual peers, only 4 of which you actually need to get that chunk of your data back (it uses erasure coding).
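To give a feel for what a "need any 4 of 12" scheme buys you, here is a minimal sketch. It takes the 12 and 4 from the recollection above (not checked against Tahoe itself) and assumes each peer is independently reachable with the same probability, which is a simplification. The function name `recovery_probability` and the 50% availability figure are made up for illustration.

```python
from math import comb


def recovery_probability(n: int, k: int, p_up: float) -> float:
    """Probability that at least k of n independent peers are reachable."""
    return sum(
        comb(n, i) * p_up**i * (1 - p_up) ** (n - i)
        for i in range(k, n + 1)
    )


if __name__ == "__main__":
    p_up = 0.5  # hypothetical: each peer is online only half the time
    print(f"one full copy on one peer: {p_up:.3f}")
    print(f"any 4 of 12 pieces:        {recovery_probability(12, 4, p_up):.3f}")
    print(f"storage overhead:          {12 / 4:.1f}x")
```

Under these assumptions, flaky peers that are each up only half the time still give back a chunk far more often than a single copy on one such peer would, at the cost of storing three times the original data.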

 

With this software, you have no clue what they're actually doing, beyond it being P2P, and as far as I can tell there's no one to vouch for it whatsoever.

 

  Quote
Is there a cheap raid that I can plug any combination of disks I happen to have into and it will tell me if one fails and one failing won't result in me losing anything?

 

You might check out Drobo, but it certainly doesn't fall into the cheap category:

 

http://www.drobo.com/

Posted

you could put a bunch of disks together with lvm2. i don't think you can set up hotswapping and redundancy of the type seen with the drobo (i am slightly suspicious of that video; those first two drives must have been pretty small), but it would make the disks operate as one regardless of their size/speed/cache. you could even chuck a usb/CD/DVD/firewire drive on the lvm array.

 

EDIT: hmm, just looking around on the site, with 4 1TB drives you would get 2.7TB of space for data storage. the rest is redundancy. i suspect it uses something like lvm with a virtual RAID.
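One possible reading of the 2.7TB figure, offered as a guess rather than anything the vendor states in this thread: one drive's worth of space goes to redundancy, and the remaining 3 TB (decimal, as sold) is reported in binary units. The sketch below just does that arithmetic; `usable_tib` and its parameters are hypothetical names.

```python
def usable_tib(drive_count: int, drive_tb: float, redundancy_drives: int = 1) -> float:
    """Usable space in binary terabytes (TiB), given drive sizes in decimal TB.

    Assumes exactly `redundancy_drives` drives' worth of space is spent on
    redundancy, which is an assumption, not a documented Drobo behaviour.
    """
    usable_bytes = (drive_count - redundancy_drives) * drive_tb * 10**12
    return usable_bytes / 2**40


if __name__ == "__main__":
    print(f"{usable_tib(4, 1):.2f}")
```

If that reading is right, roughly a quarter of the raw space goes to redundancy and the rest of the shortfall is just the decimal versus binary unit difference.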

Posted
  insane_alien said:
i am slightly suspicious of that video those first two drives must have been pretty small

 

From what I understand the device is actually filesystem aware and only copies filesystem data and metadata, rather than being just a dumb storage device which ensures the entire disk is properly mirrored whether there is data on it or not.

 

So it's not so much that the disks were small as there wasn't that much actual data on them.

 

  Quote
EDIT: hmm, just looking around on the site, with 4 1TB drives you would get 2.7TB of space for data storage. the rest is redundancy. i suspect it uses something like lvm with a virtual RAID.

 

It's a lot closer to Sun's ZFS:

 

http://opensolaris.org/os/community/zfs/

Posted
  Quote
So it's not so much that the disks were small as there wasn't that much actual data on them.

 

ahh, it's just that he was saying it's at 85% capacity a lot.

 

as for ZFS, it looks awesome. wonder if/when it'll be ported (properly) to linux

Posted
  insane_alien said:
as for ZFS, it looks awesome. wonder if/when it'll be ported (properly) to linux

 

Since ZFS is licensed under the CDDL, a proper port (i.e. linked into the kernel) isn't possible because the CDDL is incompatible with the GPL.

 

Someone's trying to run it on FUSE, but last I read he was encountering data corruption (due to bugs in FUSE itself) and beyond that, painfully slow performance.

 

Your best bet for ZFS is probably FreeBSD 7.0 (RC1 is out now) or Nexenta (Ubuntu/OpenSolaris)

Posted

yes, i know about the licence incompatibilities, but i also heard solaris was considering a switch to GPL and that the GPLv3 would allow it (linux kernel is GPLv2, well, parts of it at any rate).

 

personally, i don't see what the fuss is. they are both open source, so why should there be a problem? (okay, i admit right now that i have a naive view on legislation regarding this sort of crap. it has never made much sense to me and i doubt it ever will, as i have a very utilitarian viewpoint.)

 

i'm sticking to arch, might try the FUSE port later seeing as ntfs-3g functions very well for me. for the time being i'll stick with my current setup but i'm getting a craving for that drobo thing, looks cool if nothing else. maybe when i fill up my drives(again).
