Archive for November, 2012

06 Nov 2012

Swapping Raid Disks – Part 3: Something crazy with dd

OK, so in the earlier parts of this series I showed how I migrated two RAID partitions off a disk I wanted to use exclusively for data. The original disk topology looked as follows:

Disk /dev/sda: 243031 cylinders, 255 heads, 63 sectors/track

Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

Device Boot Start End #cyls #blocks Id System
/dev/sda1   0+ 45689- 45690- 367001600 fd Linux raid autodetect
/dev/sda2 * 45689+ 45702- 13- 102400 fd Linux raid autodetect
/dev/sda3   45702+ 243031- 197329- 1585042432 83 Linux
/dev/sda4   0 - 0 0 0 Empty

/dev/sda1 & /dev/sda2 have now been migrated away and can be reclaimed. The problem is that I don't want three separate partitions; I just want one partition covering all of the space. For the benefit of the reader: the disk is a 1.8 TB disk; the first two partitions take up 300 GB and the last partition holds 1.5 TB of data. I want to end up with a single 1.8 TB partition, and I don't want to delete any of the data on the data partition (which is about 90% used).

I want to resize an ext2/3/4 partition, so my first port of call is resize2fs. I know that I can change the disk topology and get resize2fs to grow the filesystem; however, I have never had to grow a partition backwards towards the start of the disk, so off to the man pages. It didn't take long to hit a pretty big warning:

“make sure you create it with the same starting disk cylinder as before! Otherwise, the resize operation will certainly not work, and you may lose your entire filesystem”

OK, so it looks like resize2fs is out unless I can first move the sdc3 partition to the beginning of the disk. Hanging around in IRC I had a lot of people telling me to give (g)parted a go, although most were sceptical about whether it was possible at all. I took a look at (g)parted; however, as far as I can tell, (g)parted is only able to copy a partition from one location to another. If I had 1.5 TB of free space at the beginning of the disk I would have been able to copy /dev/sdc3, change the last cylinder, run resize2fs and everything would have been sorted. However, as mentioned, we only had 300 GB available.

The night was getting late and the hacker in me started to consider some more exotic, dangerous and, some would say, downright stupid solutions. I eventually arrived at dd. I thought perhaps I could just do something like:

dd if=/dev/sdc3 of=/dev/sdc bs=1M conv=noerror

In theory this would leave the block device /dev/sdc containing the filesystem that currently lives on /dev/sdc3. The obvious worry is that dd starts overwriting the beginning of the disk (and eventually the beginning of /dev/sdc3 itself) before it has finished copying all of the data out of /dev/sdc3. Would this work? I figured the kernel already has the partition table in memory, so /dev/sdc3 keeps pointing at the same 300 GB offset for the whole copy; dd just reads sequentially from /dev/sdc3 and writes sequentially to /dev/sdc, and because the read position is always 300 GB ahead of the write position it never reads a block it has already overwritten. dd itself neither knows nor cares that we, or more accurately it, are overwriting the beginning of /dev/sdc3.
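
Before trying this, a sanity check I would suggest (my addition, not part of the original run) is to confirm exactly where /dev/sdc3 starts and how big it is, since the whole trick relies on that offset keeping the read position ahead of the writes:

# start sector and size of sdc3 in 512-byte sectors
cat /sys/block/sdc/sdc3/start
cat /sys/block/sdc/sdc3/size
# or list the whole partition table in sectors
sfdisk -l -uS /dev/sdc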

At this point I should put in a disclaimer:

THIS IS VERY DANGEROUS, POSSIBLY STUPID.  

DO NOT DO THIS UNLESS YOU KNOW WHAT YOU ARE DOING

AND YOU DON'T MIND LOSING YOUR DATA

OK, with that out of the way, I gave it a go and to my surprise it seems to have worked. If you are doing this on a remote machine I strongly recommend you use tmux or screen; if the dd is interrupted you have almost certainly lost your data.
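
For example (my own suggestion, not from the original session), running the copy inside a named screen session means it survives a dropped SSH connection, and GNU dd will print its progress if you poke it with SIGUSR1 from another terminal:

screen -S ddcopy
dd if=/dev/sdc3 of=/dev/sdc bs=1M conv=noerror
# detach with Ctrl-a d, reattach later with: screen -r ddcopy
kill -USR1 $(pidof dd)   # from another shell, prints dd's progress so far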

[root@server ~]$ df -h | grep sdc
/dev/sdc3 1.5T 1.4T 98G 94% /data/disk1
[root@server ~]$ umount /dev/sdc3
[root@server ~]$ dd if=/dev/sdc3 of=/dev/sdc bs=1M conv=noerror
[root@server ~]$ mount /dev/sdc /data/disk1
[root@server ~]$ df -h | grep sdc
/dev/sdc 1.5T 1.4T 98G 94% /data/disk1
[root@server ~]$ umount /dev/sdc
[root@server ~]$ e2fsck -f /dev/sdc
[root@server ~]$ resize2fs /dev/sdc
[root@server ~]$ mount /dev/sdc /data/disk1
[root@server ~]$ df -h | grep sdc
/dev/sdc              1.8T  1.4T  462G  75% /data/disk1

So there we are; it looks like it wasn't so insane after all. Comments most welcome.

UPDATE: I had a problem mounting the new disk via its UUID. tune2fs -l showed the same UUID that was present in /etc/fstab; however, mounting by UUID didn't work.

[root@server ~]# tune2fs -l /dev/sdd | grep UUID
Filesystem UUID:          67966bfd-92b5-47b8-a545-277a4bea8be5
[root@server ~]# grep 67966bfd-92b5-47b8-a545-277a4bea8be5 /etc/fstab 
UUID=67966bfd-92b5-47b8-a545-277a4bea8be5 /data/disk2             ext4    noatime,nodiratime,nodelalloc        1 2
[root@server ~]# mount /data/disk2/
mount: special device UUID=67966bfd-92b5-47b8-a545-277a4bea8be5 does not exist
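
If you hit the same thing, it is worth checking whether udev has actually created the by-uuid symlink and what blkid reports for the device (a diagnostic suggestion; I didn't capture this at the time):

ls -l /dev/disk/by-uuid/ | grep 67966bfd
blkid /dev/sdd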

I fixed this by running the following steps; I'm not sure which one actually fixed it and will test on the next system.

[root@server ~]# cd /dev/disk/by-uuid/
[root@server ~]# ln -sv ../../sdd 67966bfd-92b5-47b8-a545-277a4bea8be5
[root@server ~]# mount /data/disk2/
[root@server ~]# umount /data/disk2/
[root@server ~]# rm 67966bfd-92b5-47b8-a545-277a4bea8be5
[root@server ~]# blockdev --rereadpt /dev/sdd
[root@server ~]# partprobe /dev/sdd
[root@server ~]# mount /data/disk2/
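
My suspicion is that udev was simply working from stale information after the partition table was overwritten, and that asking it to re-process the block devices would have been enough on its own; something like the following (untested in this case) should recreate the by-uuid symlink without making it by hand:

udevadm trigger --subsystem-match=block
udevadm settle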

Also note that this server hasn’t been rebooted yet. Will update after it has.

Edit: The system rebooted without issue. I haven't re-tested mounting by UUID (or I don't remember the results :)); however, I have performed this procedure a couple of times since without issue.

06 Nov 2012

Swapping Raid Disks – Part 2: Fixing /boot

In the previous post, Swapping Raid Disks – Part 1: MDADM, we showed how to completely swap all the disks in a Linux software RAID array. One of the RAID partitions we swapped was the /boot partition. We now need to ensure that grub is installed on the new partitions so the system can still boot once we destroy the old disks. You can use a device map file and grub-install to do this; however, I will be using the grub CLI.
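
For reference, the grub-install route would look roughly like this (I didn't use it here, so treat it as an untested sketch; --recheck re-probes the disks and rebuilds the device map before installing):

grub-install --recheck /dev/sdi
grub-install --recheck /dev/sdj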

The first thing to do is run grub and use the find command to see how grub addresses the disks holding the /boot partitions. The (hdX,X) values will more than likely be different on your system. You will also need to do this as root.

[root@server ~]# grub
Probing devices to guess BIOS drives. This may take a long time.
GNU GRUB  version 0.97  (640K lower / 3072K upper memory)
[ Minimal BASH-like line editing is supported.  For the first word, TAB
lists possible command completions.  Anywhere else TAB lists the possible
completions of a device/filename.]
grub> find /grub/stage1
find /grub/stage1
(hd0,1)
(hd1,1)
(hd8,1)
(hd9,1)

From this we can see that grub can see four /boot partitions ((hd0,1), (hd1,1), (hd8,1) & (hd9,1)). We now need to ensure that grub is installed on all of these partitions. If you know which disks are the new disks and which are the old disks you can get away with installing grub just on the new disks, but it will do no harm to install it on all of them.

Installing grub requires three steps: telling grub which device we will be working on with a device mapping, telling grub which partition is the boot/root partition, and installing grub.

grub> device (hd0) /dev/sda
device (hd0) /dev/sda
grub> root (hd0,1)
root (hd0,1)
 Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd0)
setup (hd0)
 Checking if "/boot/grub/stage1" exists... no
 Checking if "/grub/stage1" exists... yes
 Checking if "/grub/stage2" exists... yes
 Checking if "/grub/e2fs_stage1_5" exists... yes
 Running "embed /grub/e2fs_stage1_5 (hd0)"...  26 sectors are embedded.
succeeded
 Running "install /grub/stage1 (hd0) (hd0)1+26 p (hd0,1)/grub/stage2 /grub/grub.conf"... succeeded
Done.

Here is an example for hd9 (/dev/sdj) as well:

grub> device (hd9) /dev/sdj
device (hd9) /dev/sdj
grub> root (hd9,1)
root (hd9,1)
 Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd9)
setup (hd9)
 Checking if "/boot/grub/stage1" exists... no
 Checking if "/grub/stage1" exists... yes
 Checking if "/grub/stage2" exists... yes
 Checking if "/grub/e2fs_stage1_5" exists... yes
 Running "embed /grub/e2fs_stage1_5 (hd9)"...  26 sectors are embedded.
succeeded
 Running "install /grub/stage1 (hd9) (hd9)1+26 p (hd9,1)/grub/stage2 /grub/grub.conf"... succeeded
Done.

You will also need to fix the grub.conf/menu.lst file to ensure it specifies the correct root drive to use. In my case I know my boot partitions are on hd8 & hd9; however, as above, you can create entries for all the hdX partitions shown by the find command. Below is a copy of my modified menu file.

title CentOS
	root (hd8,1)
	kernel /vmlinuz-2.6.32-220.17.1.el6.x86_64 [Removed for simplicity]
	initrd /initramfs-2.6.32-220.17.1.el6.x86_64.img
title CentOS (if the first disk in the array dies you will need to use this)
	root (hd9,1)
	kernel /vmlinuz-2.6.32-220.17.1.el6.x86_64 [Removed for simplicity]
	initrd /initramfs-2.6.32-220.17.1.el6.x86_64.img
title CentOS (backup hd0, won't work after you format the old disks)
	root (hd0,1)
	kernel /vmlinuz-2.6.32-220.17.1.el6.x86_64 [Removed for simplicity]
	initrd /initramfs-2.6.32-220.17.1.el6.x86_64.img
title CentOS (backup hd1, won't work after you format the old disks)
	root (hd1,1)
	kernel /vmlinuz-2.6.32-220.17.1.el6.x86_64 [Removed for simplicity]
	initrd /initramfs-2.6.32-220.17.1.el6.x86_64.img

If you are using a system with grub v2 you can use the search command to set the root instead of having four separate menu items; a rough sketch of what such an entry might look like is below. CentOS does not support grub v2 yet and I have not needed to do this on another system; however, Arch Linux has a good wiki article on grub v2 which should help.
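
For illustration only (I have not tested this, and the UUID is a placeholder you would replace with the UUID of your /boot filesystem), a grub v2 entry using search might look something like:

menuentry "CentOS" {
	search --no-floppy --fs-uuid --set=root <uuid-of-boot-filesystem>
	linux /vmlinuz-2.6.32-220.17.1.el6.x86_64 [Removed for simplicity]
	initrd /initramfs-2.6.32-220.17.1.el6.x86_64.img
}
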
At this point you should reboot and check that at least the first two menu options work and allow you to boot. If they do not, the last two options should still work; review the steps you have taken to see if you have made any errors. If the first two come up successfully you should be able to format the original partitions and reuse them.

In my situation I wanted to keep all the data and reclaim the space at the beginning of the disk, so I decided to do something crazy with dd (part 3 coming soon).

06 Nov 2012

Swapping Raid Disks – Part 1: MDADM

We have two disks (/dev/sda & /dev/sdb), each containing two software RAID partitions and one data partition. The software RAID partitions were used for /boot and /; the data partition was used as a Hadoop data partition. Hadoop is designed to perform best when it can make sequential reads and writes from a disk. With the OS partitions on the same disk as the data partition we noticed we were causing some Hadoop performance issues, so we decided to move the OS partitions to dedicated disks (/dev/sdi & /dev/sdj).

This was relatively simple. The new disks were installed and we used sfdisk and mdadm to configure them.

The first thing to do was to dump the partition table of the current disk(s):

sfdisk -l /dev/sda -O partition

This produced the following partition table:

Disk /dev/sda: 243031 cylinders, 255 heads, 63 sectors/track

Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

Device Boot Start End #cyls #blocks Id System
/dev/sda1 0+ 45689- 45690- 367001600 fd Linux raid autodetect
/dev/sda2 * 45689+ 45702- 13- 102400 fd Linux raid autodetect
/dev/sda3 45702+ 243031- 197329- 1585042432 83 Linux
/dev/sda4 0 - 0 0 0 Empty

However, we did not want the data partition on the new disks, so we had to modify the partition table a little first so that it looked like this:

Disk /dev/sdi: 243031 cylinders, 255 heads, 63 sectors/track

Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

Device Boot Start End #cyls #blocks Id System
/dev/sdi1 0+ 45689- 45690- 367001600 fd Linux raid autodetect
/dev/sdi2 * 45689+ 45702- 13- 102400 fd Linux raid autodetect
/dev/sdi3 0 - 0 0 0 Empty
/dev/sdi4 0 - 0 0 0 Empty

We configured our new disk as follows:

sfdisk -I partition --force /dev/sdi
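
For reference, the more commonly documented way of copying a partition table with sfdisk is the dump/restore form; something along these lines should achieve the same result (the dump file name is just an example, and you would edit it to drop the data partition before applying it):

sfdisk -d /dev/sda > sda.parts
# edit sda.parts to remove the data partition, then apply it to the new disk
sfdisk --force /dev/sdi < sda.parts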

We then updated the RAID configuration to add the new partitions and fail out the old ones:

mdadm --manage /dev/md0 --add /dev/sdi2
mdadm --manage /dev/md1 --add /dev/sdi1
mdadm --manage /dev/md0 --fail /dev/sda2
mdadm --manage /dev/md1 --fail /dev/sda1
mdadm -D /dev/md0
mdadm -D /dev/md1

Wait until the sync has completed. You can monitor progress with the command below:

watch cat /proc/mdstat
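
If you would rather block until the resync is finished (handy in a script), mdadm can wait for you; this is a suggestion I have not used in this procedure:

mdadm --wait /dev/md0 /dev/md1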

Once all data has synced to both RAID partitions, repeat the above steps for sdb/sdj. Once the arrays are configured such that sdi and sdj are the two active members and have completely synced, you can remove sda and sdb from the RAID configuration:

mdadm --manage /dev/md0 --remove /dev/sda2
mdadm --manage /dev/md0 --remove /dev/sdb2
mdadm --zero-superblock /dev/sda2
mdadm --zero-superblock /dev/sdb2

mdadm --manage /dev/md1 --remove /dev/sda1
mdadm --manage /dev/md1 --remove /dev/sdb1
mdadm --zero-superblock /dev/sda1
mdadm --zero-superblock /dev/sdb1
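
If your system keeps an /etc/mdadm.conf, this is also a good time to check that its ARRAY lines still match the new layout; mdadm can print the current arrays for you to merge in by hand (check the output before relying on it):

mdadm --detail --scan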

Continue to Swapping Raid Disks – Part 2: Fixing /boot for info on how to fix grub and the /boot partitions.