Tuesday, November 21, 2006

RAID setup!

Well, after quite a battle I finally got RAID to work on my desktop machine!

I love Gentoo, and every time I try something new it turns into one big adventure, sometimes with a steep learning curve.

Not to be outdone, the documentation for setting up RAID is very precise and to the point, but there is a catch with Gentoo: no two installations are the same, so the documentation is necessarily generic. If the popo hits the fan, it means hours of Googling and searching various forums to find out where the problem is.

In my case the first setup was perfect, except that udev did not play nice, and I only found this out three days later after countless hours trying various ways to force my system to boot with a broken udev.

With the latest unstable udev-103 I got one long list of errors stating that something was not right with my udev installation. After many reboots, several attempts at reinstalling every system package (including udev, of course) and reconfiguring my system, I finally read on the udev mailing list that I was not alone. They suggested downgrading to udev-096, but still no joy.
There were fewer errors, but something still did not quite fit...

The third and final day I reverted back to the traditional install (I have a stage-4 backup of my entire system so it takes less than 30 minutes to reinstall) after yet another failed attempt at RAID, only to be greeted by the exact same error!

All along I had been convinced that it was a RAID/udev issue, and seeing the same error on the traditional install is what led me to the solution! Searching was now much easier, since the error can be replicated on various setups, and I finally found the very simple fix.

The udev-103 symptom was a never-ending stream of errors being written to /var/log/syslog:

udevd[826]: get_ctrl_msg: unable to receive user udevd message: Socket operation on non-socket
udevd[826]: get_netlink_msg: unable to receive kernel netlink message: Socket operation on non-socket
udevd[826]: get_ctrl_msg: unable to receive user udevd message: Socket operation on non-socket
udevd[826]: get_netlink_msg: unable to receive kernel netlink message: Socket operation on non-socket
udevd[826]: get_ctrl_msg: unable to receive user udevd message: Socket operation on non-socket
udevd[826]: get_netlink_msg: unable to receive kernel netlink message: Socket operation on non-socket
etc.


This output would keep scrolling by until you hit reboot. The simple cause was that udev did not populate /dev after the initial install. And the rather simple solution?

Boot from LiveCD
create the RAID arrays
mount /dev/md(x) /mnt/gentoo
cp -rp /dev/* /mnt/gentoo/dev/
reboot
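
For anyone who wants to check before copying, here is a rough sketch of the copy step as a small shell helper. It is not part of my original fix: the function names are made up for illustration, and treating "console" and "null" as the entries that must exist is my assumption about what early boot needs, not something I verified on every setup.

```shell
#!/bin/sh
# Sketch only. Function names and the node list (console, null) are
# illustrative assumptions, not taken from the Gentoo documentation.

# Return 0 if the directory holds the device entries assumed to be
# needed before udev starts, 1 otherwise.
dev_looks_populated() {
    dev_dir=$1
    for node in console null; do
        # -e is true for any file type, so it also matches real device nodes
        [ -e "$dev_dir/$node" ] || return 1
    done
    return 0
}

# Copy the LiveCD's device entries into the target only when they are missing.
populate_dev_if_needed() {
    src_dev=$1
    dest_dev=$2
    if dev_looks_populated "$dest_dev"; then
        echo "already populated: $dest_dev"
    else
        # Same cp invocation as in the steps above
        cp -rp "$src_dev"/* "$dest_dev"/ && echo "populated: $dest_dev"
    fi
}

# Example, run as root from the LiveCD after mounting the array:
#   populate_dev_if_needed /dev /mnt/gentoo/dev
```

The check first, copy second order just avoids overwriting a /dev that is already in good shape; if yours is empty like mine was, it ends up doing the same copy as the manual step.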


As simple as that! Once that was done the machine booted up with no problems, and I can finally shout: "Eureka!"

For further reading:
BIOS RAID setup
Software RAID setup
RAID0 with lilo
The solution to my problem
I read a lot more articles than those mentioned, but these are the ones that put me on the right track.

Happy RAIDing!
