[This is really an HP-UX question, so if anyone can point me to a better forum in which to post it, then please do!]
Hi list,
One of our B180L controllers crashed out the other day during a large digital compilation (in excess of 270,000 vectors, plus two large homing loops of 65,536 times 30 vectors, plus 'debug' compile option). I strongly suspect the failure is associated with this (maybe a large temporary file somewhere filling up the disk?), but when the system re-booted, it spotted a 'bad cylinder group' on the disk and attempted a 'repair'. The automatic repair wasn't able to fix it, so it booted into root, to allow a 'manual fix' using 'fsck'. We have tried this several times, and on one occasion it indicated that the bad files were in a directory we simply don't care about, but the 'fix' hasn't worked.
Thus we cannot get the system to boot into the normal windows 'common desktop environment' in order that we might check and/or delete any trash directories/whatever - the logical volume affected is the one containing not only the directory for the windows system ('X11'), but also the main 'var/hp3070' stuff etc.
We really can't believe that we are facing the prospect of either a full system re-build from scratch, or from a system recovery tape, for want of knowing a specific command to tell the operating system to simply 'ignore' the bad section of disk, so that it will boot up as normal (hell, if this was MS Windows you probably wouldn't even know about it, it would just ignore the problem automatically).
Any suggestions or help gratefully received!
Thanks, Tim
What we did, in outline, was: 'tar'ed the entire logical volume onto the other server; removed the volume using the 'lvremove' command; re-created it using 'lvcreate'; re-made the file system using 'mkfs'; and extracted the contents of the volume back from the tar file. It looks as though the process of re-creating the logical volume has ignored the bad sector, and so the machine now boots up as you'd expect it to. It is still a little 'fragile', but this may be because the disk is in a poorly state - time will tell!
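For anyone wanting to follow the same route, the outline above might look roughly like the sketch below. It is a dry run: it only prints the commands in order and does not execute them. The volume group, logical volume name, mount point, size, file system type and backup path are all placeholders assumed for illustration, not the values from our system, and the exact options should be checked against the HP-UX man pages before running anything for real.

```shell
#!/bin/sh
# Dry-run sketch of the rebuild outline: prints each command, runs none of them.
# Every value below is a hypothetical placeholder, not taken from the real system.
VG=vg00                                   # assumed volume group
LV=lvol8                                  # assumed logical volume name
MNT=/var                                  # assumed mount point of the damaged volume
SIZE_MB=1024                              # assumed size in megabytes
FSTYPE=hfs                                # assumed file system type
BACKUP=/net/otherserver/backup/volume.tar # assumed path reachable on the other server

plan_rebuild() {
    # 1. Archive the whole volume to the other server.
    echo "cd $MNT && tar cf $BACKUP ."
    # 2. Unmount and remove the damaged logical volume.
    echo "umount $MNT"
    echo "lvremove /dev/$VG/$LV"
    # 3. Re-create the logical volume and make a fresh file system on it.
    echo "lvcreate -L $SIZE_MB -n $LV $VG"
    echo "mkfs -F $FSTYPE /dev/$VG/r$LV"
    # 4. Mount it again and restore the contents from the archive.
    echo "mount /dev/$VG/$LV $MNT"
    echo "cd $MNT && tar xf $BACKUP"
}

plan_rebuild
```

Printing the plan first makes it easy to check the device paths before committing to 'lvremove', which destroys the existing volume.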
Tim