Finally have a use case where I'd actually benefit from a raid cache :o

Started by Tom, February 13, 2017, 08:20:39 AM


Tom

I've been mucking with CI for myself, and for work. On my local box, I've run into a bit of an IOPS bottleneck :(

I've got it set up to build multiple docker images in parallel. Not only are they installing quite a few large things, but docker seems to be pretty inefficient when building images. TONS of random accesses and tar commands... Just how it works, I suppose.

Linux md has recently gained support for a raid5 cache (a write journal), which can run in two modes: write-through and write-back. The former writes data to both the cache device and the array before returning an OK, while the latter returns an OK once it's committed to the cache. Assuming a fast enough cache device with power-loss protection (a battery, or on-board capacitors on the SSD), data loss from an ungraceful shutdown shouldn't happen too often in write-back mode.
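
For anyone curious, here's roughly what that looks like with mdadm. This is just a sketch: the device names are placeholders, and it assumes a kernel/mdadm new enough to have the journal and write-back support.

# create a raid5 with a dedicated write journal (cache) device
mdadm --create /dev/md0 --level=5 --raid-devices=4 \
    --write-journal /dev/nvme0n1 /dev/sd[b-e]1

# journals default to write-through; flip the mode to write-back
echo write-back > /sys/block/md0/md/journal_mode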

Being that I'm broke, and we're trying to save, I definitely won't be getting any cache disks for a while. So I'm going to try a raid10-far config instead. Right now the LVM volume group is defragging itself, and then I'll shrink it and convert the raid5 to a raid10-far using the online conversion support in md raid :D
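
The shrink half of that is just LVM bookkeeping; something like this, with a made-up target size, once the defrag/pvmove has packed the extents toward the front:

# see how much of the raid5 PV is actually in use
pvs -o pv_name,pv_size,pv_used,pv_free /dev/md0

# shrink the PV so the tail end of the array is free for the reshape
pvresize --setphysicalvolumesize 1.5T /dev/md0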

My poor poor 4x1TB raid5 that I use for VM disks on my big server is struggling pretty hard :(
<Zapata Prime> I smell Stanley... And he smells good!!!

Melbosa

Good luck!  At least you're only dealing with the 4-disk scenario.  I've seen re-levels on our large SANs take anywhere from weeks to a month to complete.
Sometimes I Think Before I Type... Sometimes!

Tom

Quote from: Melbosa on February 13, 2017, 09:19:34 AM
Good luck!  At least you're only dealing with the 4-disk scenario.  I've seen re-levels on our large SANs take anywhere from weeks to a month to complete.
Yeah, I've had a big raid5 or 6 array take days to reshape/repair.

It'd probably be faster if I just slapped some disks on there and copied things over. But meh, we'll see how long this LVM defrag takes.
<Zapata Prime> I smell Stanley... And he smells good!!!

Tom

OK, so I lied. I just slapped two old 1TB drives in there as a raid0, added it to the vm0 LVM volume group (where the VM volumes go), and told LVM to move all data off the old raid5.
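
For the record, the sequence was roughly this (assuming /dev/md0 is the old raid5, /dev/md1 is the scratch raid0, and the two spare drives are sdf/sdg; your device names will differ):

# build a scratch raid0 out of the two old 1TB drives
mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdf1 /dev/sdg1

# make it a PV and add it to the vm0 volume group
pvcreate /dev/md1
vgextend vm0 /dev/md1

# migrate every allocated extent off the old raid5
pvmove /dev/md0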

This will take significantly less time. The old pvmove I did was doing a bunch of copying and causing all kinds of seeking and random access. Since I first posted it had only gotten to like 25%? Felt like it was taking too long. The new process will happen a lot quicker: it's going at a nice 50-130MB/s vs the old ~10-20MB/s. Already at 3% and I just started it.
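
pvmove prints its progress every so often on its own (tunable with -i), and you can also peek at the temporary mirror LV it creates:

# the hidden [pvmove0] LV shows how far along the copy is
lvs -a -o lv_name,copy_percent vm0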

Once that's done, I'll remove the old md0 raid5 array. Maybe (maybe not...) replace those drives with the 2TB drives and create a new raid10-far array for the VMs to live on. It should give (near) raid0 performance with the security of raid1. It's a special raid10 too: it's integrated, so there's no raid0+1 or raid1+0 layering going on, and it's directly and easily extendable.
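
Tearing the old array down once LVM is off it should just be something like this (same caveat, placeholder device names):

# drop the now-empty raid5 PV from the volume group and wipe the PV label
vgreduce vm0 /dev/md0
pvremove /dev/md0

# stop the array and clear the md superblocks off the members
mdadm --stop /dev/md0
mdadm --zero-superblock /dev/sd[b-e]1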

I was going to set up my little external SAS enclosure, but the cable I have for it is /so/ short, and I can't reorganize things atm such that the enclosure will fit close enough to the big VM server for that to work. :(

Ah well.

P.S. coming up on 5% now :D
<Zapata Prime> I smell Stanley... And he smells good!!!

Melbosa

Sometimes I Think Before I Type... Sometimes!

Tom

Thanks :)

In the end I hope I notice a real difference. But that raid5 read-modify-write penalty is pretty harsh, so a raid10 should help a lot for small reads/writes and IOPS in general.
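
(The math there: a small write on raid5 is four disk ops, read old data, read old parity, write new data, write new parity, whereas raid10 just writes the two mirror copies.) Something like a quick 4k random-write fio run should show the difference; the test LV name here is made up:

# 4k random writes, direct I/O, queue depth 32, for 60 seconds
fio --name=randwrite --filename=/dev/vm0/fio-test --rw=randwrite \
    --bs=4k --direct=1 --ioengine=libaio --iodepth=32 \
    --runtime=60 --time_based --group_reporting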
<Zapata Prime> I smell Stanley... And he smells good!!!

Tom

Dang. Just created the raid10-f2 array, and it'll be another ~11 hours before it's done resyncing. Wish I could tell it to skip that step and assume everything is zeros, but that'd require logic it just doesn't have (keeping track of every single block and which ones have already been "allocated").

I technically could just zero all four drives and pass "--assume-clean" when I create it, but meh. This'll be done by tomorrow morning, so I'm not too worried.
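
For reference, the create was roughly this (placeholder device names again), and the zero-everything shortcut would just be the same command with --assume-clean tacked on:

# raid10 with the "far 2" layout across the four drives
mdadm --create /dev/md2 --level=10 --layout=f2 --raid-devices=4 /dev/sd[b-e]1

# watch the initial resync crawl along (shows speed and ETA)
cat /proc/mdstat

# the skip-the-resync version; only safe if the drives really are all zeros
# mdadm --create /dev/md2 --level=10 --layout=f2 --raid-devices=4 --assume-clean /dev/sd[b-e]1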
<Zapata Prime> I smell Stanley... And he smells good!!!

Melbosa

Sometimes I Think Before I Type... Sometimes!

Tom

No kidding. Not quite finished yet: 84%, with another couple of hours to go at least. But as we all know, spinning rust gets slower the further in you go.
<Zapata Prime> I smell Stanley... And he smells good!!!

Tom

<Zapata Prime> I smell Stanley... And he smells good!!!

Tom

And now I'm pvmove'ing the old raid0 over to the new raid10. Woo. It'll probably be a while yet again, lol.
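
Same dance as before, just pointed the other way, and then the scratch raid0 can be retired (still assuming /dev/md1 is the raid0 and /dev/md2 is the new raid10):

# add the new raid10 to the VG and move everything onto it specifically
pvcreate /dev/md2
vgextend vm0 /dev/md2
pvmove /dev/md1 /dev/md2

# once it's empty, retire the scratch raid0
vgreduce vm0 /dev/md1
pvremove /dev/md1
mdadm --stop /dev/md1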
<Zapata Prime> I smell Stanley... And he smells good!!!

Tom

Man, it was going pretty good, except now it's down to 9MB/s :(
<Zapata Prime> I smell Stanley... And he smells good!!!

Lazybones

Quote from: Tom on February 14, 2017, 05:06:38 PM
Man, it was going pretty good, except now it's down to 9MB/s :(
If you are using spinning rust, the speed will depend on whether it is reading mostly inner or outer tracks, as well as how much seeking is needed.

Tom

Quote from: Lazybones on February 14, 2017, 05:37:42 PM
Quote from: Tom on February 14, 2017, 05:06:38 PM
Man, it was going pretty good, except now it's down to 9MB/s :(
If you are using spinning rust, the speed will depend on whether it is reading mostly inner or outer tracks, as well as how much seeking is needed.
Yeah, but the inner-track issue wouldn't slow it down this much. I'm not sure why it'd drop this low; there should be little, if any, random seeking involved.
<Zapata Prime> I smell Stanley... And he smells good!!!

Tom

So wow, performance on this raid10 array is ATROCIOUS. I don't exactly know why it's so bad. It's worse than single-disk performance for sequential writes. Going to have to spend some time figuring that out.
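
First things to check, I guess: whether the array actually got the layout and chunk size I intended, and what each disk is doing during a sequential write (the far layout does make writes seek between the near and far copies, so some penalty is expected, just not this much). Something like:

# confirm the level, layout, and chunk size the array actually got
mdadm --detail /dev/md2

# watch per-device throughput and utilization while writing
iostat -xm 2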
<Zapata Prime> I smell Stanley... And he smells good!!!