Well. . . .
It looks like I'm going to be busy!
I am going to peruse my list of previously saved Tech-Tips and import, (make that read: "brute-force cut-and-paste-and-mangle-and-cram-and-push-and. . ."), them into this blog.
I am going to try (!!) to insert them based on the date they were originally published. Hopefully that will work. If it doesn't, there are going to be a WHOLE BUNCH of Tech-Tips published today. (. . . and tomorrow, and the next day, and the day after. . .)
Unfortunately, most of these Tech-Tips are saved in M$ Outlook's Rube Goldberg version of "HTML" which makes importing or copying them into any other HTML based application a Non-Trivial task. . . . . It's doable, but it's a lot like making your way through a thick jungle - you slash-and-cut your way through the thickest web of obscure HTML code imaginable. So - if the spacing is a bit weird, or things don't align "just so" - you'll understand why. As I get better at "this blogging thing", I will try to go back through and clean up what I can.
Update: I have discovered a very useful tool - especially when you're fighting HTML filled with M$ cruft - called HTML Tidy, located over at SourceForge. It's a command-line utility that is definitely worth a look.
When I am finished, I hope to have a reasonably comprehensive collection of Tech-Tips here for your viewing pleasure.
Thanks for visiting!
(p.s.)
If you've seen previous versions of these posts - perhaps as e-mails when they were originally distributed - you may notice that they're not exact word-for-word imports. Since it was necessary for me to go to what were already hysterically great lengths just to get the content IN here, I took the liberty to edit for clarity and/or meaning whenever I felt the articles would be improved therefrom.
Jim
Technical information of interest to the Software QA community as well as others interested in the odd things that can happen with computers
Welcome to the QA Tech-Tips blog!
Some see things as they are, and ask "Why?" I dream things that never were, and ask "Why Not".
Robert F. Kennedy
“Impossible” is only found in the dictionary of a fool.
Old Chinese Proverb
Monday, December 28, 2009
Holiday Edition: The Case of the Vanishing Internet!
Hello everybody!
I hope you all have had (are having!) a wonderful Holiday Season and an Excellent New Year!
This tech-tip discusses a problem that might crop up when a new computer moves into the house. . . .
Here’s the scene:
The Holidays have come-and-gone, and that shiny new laptop, desktop, or wicked game console that you’ve had your eye on has finally shown up under your Christmas Tree. Fancy video, latest and greatest Windows operating system (or maybe even OS/X), the latest Wireless, Gigabit, and Bluetooth – it’s all there. Everything is right with the world except for one teeny-tiny thing. . . .
All of a sudden – for no apparent reason – your connection to the Internet dies.
You scratch your head, check out a few things, and then reset your router and/or modem (cable or DSL). Once you do that, things are just fine again for a while. But then, without warning, your internet connection vanishes yet again!
This is odd. . . . You never had that problem before, right? Even when the Internet connection dies, you (usually) can still reach any other machine on your network; but the connection to the Internet is just plain-'ole-gone. It might die in a matter of hours, maybe a day or two, but it dies – seemingly at random. And a router (or modem) reset seems to always bring it back.
The issue here is – in all probability – your new computer.
Not that your new computer has anything wrong with it, but it may be causing you network problems because of the new IP addressing protocol – IPv6. This is because many of the newer operating systems come with IPv6 enabled by default. Even good ‘ole Windows XP isn’t immune. The latest round of Service Packs for XP often installs – and enables! – IPv6.
Unfortunately, a lot of the older hardware – routers, modems, etc. – isn’t equipped to handle the new addressing protocol yet, and it chokes. It’s not even just “older” hardware – new stuff can have the same issues, even if you’ve installed the latest-and-greatest firmware updates.
Additionally, it’s possible for older, (pre-existing), computers to experience problems with IPv6, causing them to behave strangely, or lose their network connection for no apparent reason.
The solution is equally simple: Go to the network configuration page for each networking adapter your computers use – both Ethernet and Wireless if your computer has both – and disable the IPv6 protocol. Once you’ve checked, (and disabled IPv6 as needed), on all the computers on your network, give your router and/or modem one last reset and your problem should be over. That is, until the next new computer shows up!
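Before you go hunting through adapter settings, it can help to confirm which machines actually have IPv6 addresses assigned. Here is a minimal Python sketch of that check. It assumes you have already copied the address strings off each adapter's status page (or out of ipconfig / ifconfig output); the sample values are made up for illustration, and the function name is my own.

```python
import ipaddress

def ipv6_addresses(addresses):
    """Return the entries in `addresses` that parse as IPv6 addresses."""
    found = []
    for text in addresses:
        cleaned = text.strip()
        # Drop any zone/scope suffix such as "%11" or "%eth0" before parsing.
        bare = cleaned.split("%", 1)[0]
        try:
            parsed = ipaddress.ip_address(bare)
        except ValueError:
            continue  # not an IP address at all, so ignore it
        if parsed.version == 6:
            found.append(cleaned)
    return found

# Hypothetical sample values, as might be copied from an adapter's status page.
sample = ["192.168.1.20", "fe80::21b:77ff:fe12:3456%11", "not-an-address"]
print(ipv6_addresses(sample))
```

If the list comes back empty for a machine, IPv6 is probably already off on that adapter and you can move on to the next one.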
Have a Wonderful Holiday Season!
Jim "JR" Harris
Thursday, October 29, 2009
Microsoft is not alone when it comes to fouling things up!
Originally published on 10/29/09 as
"WOW! Did Fedora pooch THAT revision. . . ."
(Shaking my head in wonder. . . .)
I decided - *after* doing a soup-to-nuts backup, and verifying I could bare-metal restore - to upgrade Storage3 (my RAID box) from Fedora 10 to Fedora 11.
I do the download of the update files, let Fedora reboot into the "update" part of the install, and. . . Zippiddy-doo-da! I get as far as "searching for hardware devices" and it goes brain-dead.
So. . . Figuring there might be a real simple workaround, I did a web search, and found Fedora Bug 507411 - which, (in the developer's own reply to the bug!), states:
Reference citation:
https://bugzilla.redhat.com/show_bug.cgi?id=507411
=========== begin inserted text ==============
Comment #4 From Dave Lehman 2009-08-14 12:54:42 EDT -------
(In reply to comment #0)
> Description of problem:
> Install fails at "finding storage devices". Last week I tried the install with
> the install DVD when I ran into the anaconda bug, but I had now way to get the
> report out of the install environment. Now I've tried the install with the Live
> CD and "Install to Hard Drive" with the same result. I've attached the anaconda
> bug report.
Fedora 11 does not support the use of partitioned md devices.
We are working on at least not crashing when we find them for F12.
============ end inserted text ==============
The user’s reply – immediately after this post – was a model of diplomatic caution. He refrained – manfully – from making the usual references to Microsoft, Vista, and their apparent ability to eff-up a wet-dream.
However, he made it abundantly clear that "not support[ing] the use of partitioned md [RAID] devices" was an absolute masterpiece of technological advancement.
WHAT?! They "[do] not support the use of partitioned md devices". . . . Holy smokes Batman!!
What is even *MORE* a "Holy Smokes, Batman!!" is the "Oh, we want to make it so we don’t crash when we find them – for Fedora *12*"
Don’t "crash"? Didn’t you guys test the upgrade path with, at least, *simple* RAID devices installed? I would have expected Fedora, at least, to have handled the upgrade seamlessly.
Especially in my case since my machine's system drive is most definitely NOT a RAID device. . . .
There were several other comments like this from people who had "md" devices on their F-10 boxes – and when they tried to update to F-11 - they got the classic Zippo (with the Red Hat Emblem)!
The bad news is that I have half an upgrade sitting on my machine – somewhere.
The good news is that it appears to NOT trash the existing install. I guess what I’ll have to do is reboot back into F-10, edit out the F-11 installation startup from Grub, and then manually remove all the artifacts – hopefully they are all in /tmp or such like – where I can get to them without effing up the entire system.
Looks like I’m staying at F-10 for a while!!
Jim
Wednesday, August 12, 2009
Windows 7 - The Right Stuff? (Part 1)
Originally published on 8/12/09 as
Windows 7 - The Right Stuff? (Part 1)
Today’s topic: Windows 7.
By now everyone knows that Microsoft’s release of Windows Vista was, shall we say, something less than stellar.
It’s slow. It’s a resource hog. It takes massive amounts of disk-space to install. It doesn't play well with others. And – my biggest beef – it’s *WAAAAY* too noisy visually. Microsoft promised us the Mercedes with a 12 cylinder engine – and we ended up getting a little grey Volkswagen with three flat tires.
With Windows 7, they come a lot closer to their original ideal. It may not be a 12 cylinder roadster, but it’s certainly not something I’d be ashamed to be seen with.
My System:
My target system is a Compaq Presario FP5000 series laptop, with an AMD-64 X2 processor and two gigs of memory. It’s not the slowest, smallest system out there, but it’s also not a drool-inspiring ultra-gamer beast either. IMHO, it represents a close approximation to a "reasonable" machine that would be owned by mere mortals like us that don’t have thousands to spend on their PC’s.
I downloaded (from Microsoft TechNet):
- Windows 7 Ultimate
- Windows 7 Professional
- Windows 7 Home Premium
- Windows 7 Home Premium (x64)
I installed:
- Windows 7 Professional – all by itself
- Windows 7 Home Premium, Home Premium (x64), and Ultimate as a multi-boot install.
The one "non-default" setting I chose was to setup the Windows Update settings later – and when I *DID* set it up, I set it to simply notify of available downloads.
I installed Avira Anti-Virus, (http://www.free-av.com/), and EasyBCD, (a Vista / Windows 7 boot options editor – http://neosmart.net/), into each of the installs.
The Install Process:
The install for Windows 7 is a typical Windows install, reminiscent of XP, except you don’t get 30 minutes of commercials telling you how wonderful Microsoft and Windows 7 will be for you.
The nice thing about the install is that – even without custom drivers installed – Windows does an excellent job of detecting hardware and configuring sane options.
Another nice point: On my Compaq laptop, installing either XP or Vista was a multi-multi-step process:
- Install the base OS
- Find and install the drivers for your machine’s specific hardware.
This is a non-trivial exercise – many drivers are simply NOT available!
- Update
- Install updates to the drivers.
- Re-update.
- Install more updates.
- Re-update again.
- Etc.
On Windows 7, the first update after install provides all the drivers that the base install missed – like the Synaptics touchpad drivers, NVIDIA graphics and motherboard drivers, etc., for both the 32 and 64 bit versions. (It should be noted that driver support for the 64 bit versions of Vista was sparse, to say the least.)
Additionally, the updates are *small* - the 32 bit systems took about 90 megs of updates, and the 64 bit system’s update was 105 megs or thereabouts. Unlike Vista or XP, you don’t have to wait hours and hours for all the updates to download and install.
After a reboot – and a short pause to set things the way I like ‘em – I was ready to go.
Multi-boot Installs:
Windows 7 does one thing that – depending on your point of view – is either a good thing or a bad thing. The initial install of the first instance of Windows 7 to a clean hard drive creates *TWO* partitions:
- A 100 meg "System" (boot) partition.
- An install partition for the operating system.
The advantages of a separate system partition may be a lot more subtle.
The folks at Microsoft, in all probability, decided on a separate "boot" partition to overcome the issues that Vista had with installing multiple instances of itself, or installing it with other operating systems – like XP or Linux.
To be perfectly fair, a "typical" (one OS per box) user won’t even notice the difference.
I do have one gripe: Though Windows 7 handles multiple installs relatively well – they missed the boat as far as labeling the systems was concerned. I am quite sure that Microsoft – and its installers – know how to determine the version of a previous OS install, so providing a better label than "Windows 7" for every Windows 7 instance it finds should not be difficult. They could - at the very least - have labeled them as "64 bit" or "32 bit". Or maybe used partition-labels, if they existed?
The workaround to this is to manually edit the boot labels using BCDedit from the command line – or download a boot manager program like EasyBCD.
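For the command-line route, the relabeling boils down to one "bcdedit /set" call per boot entry. The Python sketch below only builds that command line so you can look it over before running anything. The helper name is hypothetical, the label text is just an example, and "{current}" stands in for whichever identifier "bcdedit /enum" reports on your own machine.

```python
def bcdedit_rename(identifier, label):
    """Build the argument list that relabels one boot entry with bcdedit."""
    if not (identifier.startswith("{") and identifier.endswith("}")):
        raise ValueError("boot entry identifiers are written in braces")
    if not label:
        raise ValueError("the new label must not be empty")
    # Equivalent to typing: bcdedit /set {identifier} description "label"
    return ["bcdedit", "/set", identifier, "description", label]

# Example only: relabel the currently booted entry.
command = bcdedit_rename("{current}", "Windows 7 Home Premium (64 bit)")
print(" ".join(command))
```

To actually apply it, you would hand the list to subprocess.run() from an elevated (Administrator) prompt. I left that step out on purpose, since a mistake in the boot store is no fun to recover from.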
Install Footprint:
After the install, setup and update, a check of the hard disk properties shows that the 32 bit versions weigh in at just less than 10 gigs total disk used, and the one 64 bit version used just a tad over 11 gigs of hard drive space. Compare this with Vista which used closer to 20 or 30 gigs of space on my machine after install.
Next: Using Windows 7
Windows 7 - The Right Stuff? (Part 2)
Originally published on 8/12/2009 as
Windows 7 - The Right Stuff? (Part 2)
In part one we looked at the installation and update process for Windows 7. Now, let’s fire it up!
Using Windows 7:
The first thing you notice when you launch Windows 7 is that it is not as visually noisy as Vista is. No gadgets. No “side-bar” hogging precious desktop space. You also do not get the plethora of system-tray icons demanding your attention.
The gadgets and side-bar are not gone – they are just not enabled by default in Windows 7.
Likewise, a lot of the resource-hogging eye candy has been toned down by default. It’s still there; you just have to ask for it.
What you *DO* notice is a clean, uncluttered workspace done in a more muted, pastel color-scheme that borders on minimalist. Even the task-bar is a muted blue-grey with the START button being the only real splash of color on it.
It’s almost as if Microsoft, knowing how they shot themselves in the foot with all the hoopla and eye-candy surrounding Vista, decided to tone the visuals waaaay down in Windows 7 – and let the operating system itself do the talking.
Start-up:
The one big difference you will notice when you start up Windows 7 is the new splash screen. Four illuminated color sprites dance around the center of the screen, eventually morphing into an illuminated Windows logo that seems to pulsate with energy. I don’t know who the graphic artists were, but they obviously watched way too many Sci-Fi movies. It’s actually not bad, but it is kind-of weird in an eerie New Age / Alien way.
Start-up speed is good. It’s not really any slower than a cold XP start-up. Your computer is up and running and you are either presented with your desktop – or your login prompt, (if you use passwords), in a reasonable amount of time.
Performance:
My setup, (a Compaq Presario FP5000 series laptop, with an AMD-64 X2 processor and two gigs of memory), was rated by Windows 7 with a “Windows Experience Index” of about 2.5 out of a possible 8, which is about the same as Vista rated it. However the difference in performance is striking:
- Application launch times are visibly faster.
- Applications run more smoothly and evenly.
- Applications that were originally written for XP seem to play better with Windows 7 than Vista.
- Media – played through the new Media Player on Windows 7 – runs smoothly, even if being accessed from a network share.
- The disk tools – error checking and defragmentation – are considerably faster in Windows 7 than in Vista.
When I originally installed Vista Ultimate – and then did a defrag – it took over thirty hours to complete. The same defrag, using Windows 7 Ultimate, was done in a matter of minutes. Repeating a defrag on a previously defragged drive in Vista did not take any less time than the original. However re-defragging a previously defragged drive in Windows 7 takes much less time – as would be expected.
- Performance in Ultimate was a tad more sluggish than either of the Home Premium installs, but not by much. Most of these comments were taken on the Ultimate install – and then confirmed on the two Home Premium installs later on.
Windows 7 still uses the Vista security model including the UAC. However the UAC paradigm has been tweaked to make it much less annoying. For example, if you click on a control panel option that requires Admin privilege – and you are an administrative user – you don’t always get the UAC. Many of the more commonly used features silently elevate privilege for you. Serious system changes – such as installs or explicit privilege elevation – still require the UAC, but it’s much less annoying.
[Update: Windows 7 appears to be able to "remember" privilege elevation in the same way that many Linux distributions do - i.e. if you do something that requires Admin - and you have successfully elevated in the last "X" amount of time - it doesn't require elevation again.]
The 64 bit versions of Windows 7, (as judged by my experience with Home Premium 64), are actually useful. They run well, drivers and updates are available, and there seems to be no problem with either 64 or 32 bit software.
System and network performance – as judged by the media testing I did above – are vastly superior in Windows 7 when compared to the performance under Vista.
Windows 7 also introduces a new network grouping paradigm – Homegroups – which are a special kind of network group exclusive to Windows 7 that, (supposedly - I've not tested it), use 128 bit encryption and secure tunneling to connect computers in a homegroup. Note that homegroups are *NOT* workgroups. That’s a separate setting. It's buried - but it's still there.
Homegroups allow multiple computers running Windows 7 to create what is supposed to be a tighter and more secure group session between them.
On the subject of media, two applications that come with Windows 7 are the new IE-8, and Media Player 12, both of which have been vastly improved over their Vista-era counterparts.
Internet Explorer 8:
Internet Explorer 8 – compared to IE-7 and IE-6 – is actually a usable web browser. Many of the annoying features of IE-7 have been cleaned up. It gives you much more control over plug-ins and application extensions, and is all-in-all much more civilized. It’s also much faster, comparable in speed to IE-6 or Firefox. In my opinion, it gives me everything I use IE-6 for - compatibility with those sites that insist on a M$ browser - with the advantages of tabbed browsing and an improved user interface.
Media Player 12:
Media Player 12 is also a much improved version of the Windows Media Player. Besides supporting streaming – it now natively supports most common media types, including DivX and Xvid files – so you don’t have to go searching for additional codecs, or download additional media players just to play files from your AVI library.
What I have not tested:
- I have not tried an “Upgrade” install from either Vista or XP so I cannot comment on it.
- I have not tried it with a wider selection of applications yet – but based on what I see so far, I don’t expect problems.
- I have not tried placing either Ultimate or Professional on a domain. That’s on my “to-do” list, but I want to thoroughly work it in a stand-alone mode first.
- I have not tested to see if they have fixed the “Vista Virtual Store does not sync” bug yet. That’s also on my “to-do” list.
Based on my initial experience with Windows 7 – comparing it to both Vista and Windows XP – I believe that with Windows 7, Microsoft has finally hit the mark they were trying for with Vista. It is cleaner, faster, smaller, less annoying and – overall – a much better computing experience, especially if you want to migrate to 64 bit code.
Performance wise, I believe it’s more like Windows XP than Vista. You start it, it’s there, and it does the job.
There are still things that may need work – or maybe I just have to get used to the way they work in Windows 7. These are – by comparison to the incredible annoyances with Vista – virtually microscopic, the kinds of things you would normally expect to bump into when changing operating systems.
Is Windows 7 the answer to a maiden’s prayer? I guess it all depends on the maiden in question.
In other words, it depends on who is using it, and what they’re using it for. What I can say is that it appears to be the successor for Windows XP that Microsoft has been looking for. If you liked WinXP or Win2k – you might want to give Windows 7 a try.
Jim
Sunday, July 26, 2009
Large Disk Support?
Originally published on 7/26/2009 as
QA Tech-Tip for July - large disk support?!
While perusing the Ext2 Installable File System (for Windows) web site, (http://www.fs-driver.org/), I ran into an interesting factoid.
This problem rears its ugly head again from time to time – I may have even mentioned this before – but it’s worth mentioning again: Support for "very large" (larger than 128 / 137 gigabytes), ATAPI disks is not necessarily enabled in either Windows 2000 – or XP! – even if you have the latest-and-greatest service packs installed. (http://www.fs-driver.org/troubleshoot.html - go to the bottom of the page)
Note: SATA drives, (serial ATA – the newest drive type), don’t have this problem because the way they handle access to the disk is completely different. If you have purchased a new computer within the last two or three years, especially if it’s a laptop/notebook, it probably has SATA drives, and you don’t need to worry about this – unless you plan to add older ATAPI, (also known as EIDE or PATA), drives to the system later on.
Also, Windows Vista, (and later versions of Windows), handles this correctly *if* the hardware supports LBA / LBA-48 properly. Virtually any computer new enough to have a Pentium III or better processor should support LBA / LBA-48. Older computers, given large hard drives, either "hang" (refuse to boot) or display something weird for the drive’s size, which is a dead-giveaway something is terribly wrong.
Linux (usually!) handles this cleanly – even if the original installation was done with "small" (less than 128 / 137 gigabytes), hard drives installed – because it keeps track of the actual, visible, space on the drive.
Earlier versions of Windows, however, do not handle this cleanly at all. To make things worse, Windows does *not* check to see if the drive’s size is a "supported" size, it just reads the ATAPI drive parameter block stored in the drive’s firmware – and reports whatever is returned as if everything is happy.
This causes a problem, as you might well guess.
- The Windows NTFS file system is designed to support up to four terabytes of drive space per drive. (This has been true since Windows NT.)
- Windows reads the ATAPI parameter block, gets the reported drive size, and returns it *without checking* to see if the drive is larger than can be used safely.
- Windows *assumes* (bells should be ringing here!) that any reported drive size less than four TB is readable in its entirety, *without regard* to whether or not ATAPI LBA / LBA-48 support is enabled.
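In case you are wondering where the odd-looking "128 / 137 gigabytes" figure comes from: it is the same limit counted two different ways. The older ATA addressing scheme uses 28-bit sector numbers, and a sector is 512 bytes. The little Python sketch below just does that arithmetic; the function name is my own, and the 28-bit and 512-byte figures are background facts about ATA rather than anything taken from the Microsoft articles.

```python
SECTOR_BYTES = 512  # classic ATA sector size

def lba_limit_bytes(address_bits):
    """Largest drive size, in bytes, addressable with this many LBA bits."""
    return (2 ** address_bits) * SECTOR_BYTES

limit28 = lba_limit_bytes(28)
print(limit28)            # 137438953472 bytes
print(limit28 / 10**9)    # decimal gigabytes: roughly 137.4
print(limit28 / 2**30)    # binary gigabytes: 128.0

# With LBA-48 the ceiling moves far out of reach of any 2009-era drive.
print(lba_limit_bytes(48) / 2**50)  # binary petabytes: 128.0
```

So "137 gigabytes" is what the drive makers print on the box, and "128 gigabytes" is the same number of bytes the way an operating system usually counts them.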
Checking for and correcting this problem involves editing the registry as described in Microsoft’s Knowledge Base articles KB305098 (for Windows 2000) and KB303013 (for Windows XP).
Viz.:
- Please begin by updating to Windows 2000 Service Pack 3 (or higher) or Windows XP Service Pack 1 (or higher), if necessary.
- Start regedit.exe.
- Go to the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Atapi\Parameters and add, (or modify if it is already present), a registry key value:
key value: EnableBigLba
type: DWORD
value: 1
- Reboot your computer
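If you have more than one machine to fix, typing this into regedit by hand gets old quickly. One alternative is a small .reg file you can double-click on each box. The Python sketch below just generates the text of such a file for the key and value named above. Treat it as an illustration rather than a tested deployment tool: the function and file names are my own, and you should still back up the registry first.

```python
KEY_PATH = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Atapi\Parameters"

def big_lba_reg_text(enabled=True):
    """Return the contents of a .reg file that sets EnableBigLba to 1 or 0."""
    value = 1 if enabled else 0
    lines = [
        "Windows Registry Editor Version 5.00",  # header used by Windows 2000 / XP
        "",
        "[" + KEY_PATH + "]",
        '"EnableBigLba"=dword:%08x' % value,      # DWORD values are written as 8 hex digits
        "",
    ]
    return "\r\n".join(lines)

# Write the file out; "enable_big_lba.reg" is just an example name.
with open("enable_big_lba.reg", "w", newline="") as handle:
    handle.write(big_lba_reg_text())
```

After merging the file, the reboot is still required, exactly as with the manual edit.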
Monday, February 9, 2009
Resurrecting a Failed Hard Drive (on a Linux system)
Originally published on 2/9/09 as
QA Tech-Tip - "Resurrecting the Dead!"
Author's note:
This article describes how I managed to recover and restore a corrupted Western Digital My Book World II NAS device which used a variant of Linux as the base operating system.
It originally appeared as an article I contributed to the Hacking the My Book World Wiki site: (http://mybookworld.wikidot.com/rescue-procedure-take-2) and was subsequently re-published as a link in the February 2009 QA Tech-Tip, "Resurrecting the Dead!"
Despite the fact that this article primarily involves the use of a Linux system and Linux system commands to recover the data, I have also used Linux systems - and the associated rescue utilities like dd_rescue - to recover data from other operating systems as well. Since these tools and techniques are - essentially - independent of the file system being recovered, they should be useful to anyone wishing to recover data potentially lost to a file-system crash.
It is my hope that the information contained herein, and the knowledge gained from it, may help you with your own data recovery efforts.
To this end, I dedicate this article.
Jim
How to recover data from a crashed MBWE-II
Acknowledgments:
I want to acknowledge the help given me by Gabriel (who sure earned his name this time!) along with everyone else on these fora who posted their own experiences with the MBWE. Without your help I would have been SO SCREWED it would not be funny.
As we all know, there's really no excuse for inadequate backups. And of all people, I know better.
However, there I was with 30+ years of accumulated experience, tools, tricks, tips, software, etc. on a single drive - just waiting for Good 'Ole Mr. Murphy to come in and ball it up. This data was both critical and irreplaceable, so "failure is NOT an option!"
There was no choice, I had to recover that data "regardless of cost or loss!" - even if it meant I had to go through those disks byte-by-byte with a disk editor.
I was damned lucky.
I was able to recover about 99% of my data, with the lost data being (relatively) easily replaced.
It did cost me though. I went through about $700.00, four tanks of gasoline, and a number of trips to my local (!!) Micro-Center to get parts and materials. Not to mention two weeks of acid-reflux.
I am taking the trouble to document what eventually succeeded for me - in the hope that it will help others avoid some of the mistakes *I* made.
Lastly, please excuse the length of this article. Even though I will make it as brief as possible, it was a long time in the telling, and it won't be told here in three lines.
Hardware Requirements:
- Your hard drive must still be spinning, with the potential for recovering data
Obviously if your drive's platters have frozen solid and don't spin, or the drive is suffering from a gross mechanical defect - such as pieces rattling around inside - your chances of success plummet like a rock.
- You will need a computer that you can exclusively dedicate to this task for awhile
"Awhile" might be measured in days, or even weeks. It took me two weeks of trial-and-error to get my data fully recovered.
- You will need at least twice as many drives as there were drives in your MBWE
My device had two 500 gig drives, so I purchased four drives to rebuild data on.
- Each new drive will need to be at least twice the size of the drive you're trying to recover
Since I had two 500 gig drives, I purchased four 1T drives.
- You will need a controller card - or available SATA space on your recovery system's MoBo - for the extra drives in addition to the drive(s) already in the system
- You may need a replacement drive for the one that failed
Try to get as exact a replacement as possible. Western Digital, same size, same model series if possible, etc.
Software Requirements:
- You will need a flavor of Linux compatible with your system and controller
- Some people recommend the use of a "Live CD" for the recovery. I don't. I found it very convenient to be able to save log files, as well as some of the smaller data files, to my desktop. It's not so easy to do this with a "Live" CD.
- Since you will need to download, install, save test artifacts and files, etc. etc. etc. I found it much easier to just do a flat "install from scratch" on the recovery system.
- Additionally, the "Live" CD's I did try, (Ubuntu, Fedora, Knoppix), did NOT want to work with the SATA (RAID) card I bought. Chip revisions change, and sometimes the older drivers don't like the newer boards. I was able to get newer drivers, but only for Fedora, and they'd ONLY work on an "installed" system from the full-up install DVD - not the "Live CD" install.
(N.B. I have since upgraded / reinstalled from Fedora 8 (which the drivers were for) to Fedora 10 (the latest stable release as of this writing), and the additional drivers were not required. The Fedora 10 "Live CD" however did not work. Maybe they have to leave things off like "unusual" drivers on the CD? :-) )
- You will need ddrescue / dd_rescue
You will need to find, or download, a copy of the program "ddrescue" or "dd_rescue". (Despite the similar names these are two separate programs with different command-line syntax, and which one you get depends on your distribution.) If your distribution does not come with one already, download and install it via your distribution's package manager.
- You will need mdadm
This is commonly included in most recent distributions. If it's not included, you can download it via your distribution's package manager.
- You will need a recent copy of the Western Digital Data Lifeguard Tools CD to make a boot floppy of the Western Digital Data Lifeguard "Diagnostics".
- You will need to be on excellent terms with Lady Luck!
Or, as Scripture says: "The fervent effectual prayer of a righteous man availeth much."
And I'm not kidding. If you're reading this, you are probably already in Deep Sneakers, and sinking fast. Luck, prayer, whatever, will be a primary constituent of your success.
Notes:
- You need to be logged in as root to do any of this stuff.
- Be EXTREMELY CAREFUL with the "dd" and "dd_rescue" commands - they are extremely powerful and useful commands - but a tiny typo could render your drives, or your computer, a quivering wasteland.
- For brevity, I have NOT included examples of every possible command used (i.e. "mount", "umount", "ls", etc.). If you are not sure how to do this stuff, (or are not that familiar with Linux), get help!
My MBWE-II Configuration and Status as of the time of the repair
- My system was set up as a LINEAR array - that is the two 500 gig drives in my system appeared to be one 1 terabyte drive.
- Because of this - it is actually a RAID-0 I think - the data was striped across both drives. In this case, the failure of any one drive means the entire data store was garbage.
- To recover this - if both drives are spinning! - I needed to copy off the data from both drives to somewhere I could work on it, and then try to "stitch" the two array halves back together again.
- The Web Setup (admin) page for my system showed "Drive 'A' Failed"
- When I used dd_rescue (described below) to image the "failed" drive, the system partitions showed a number of "bad blocks" - in this case, it turned out that they were not truly defective, but just corrupted beyond the ability of the operating system to repair them. However, because the system partitions had bad blocks, I had to hope that the system partitions on drive B (my hopefully "good" drive), would be intact enough to recover from.
- In my case, it turned out that my "B" drive was "still good" - and as dd_rescue proved further down - I had no bad clusters on that drive - so I could try to use the system partitions from that drive to re-create the partitions on the "bad" drive.
- I was able to prove - using the Western Digital Drive Diagnostics - that the "A" drive was actually not truly defective. That saved me from having to actually replace the drive. However, if that had been needed, the only difference would be to substitute the NEW hard drive for the OLD one when you begin the drive "A" rebuild process.
Recovery Steps:
Rule #1: Don't Touch That Drive!
You are already in trouble. Dinking around with the drive - potentially changing its contents - will only make it worse.
Prepare the new drives to receive the recovery data
- Open a terminal session - or two! - and SU to root.
- You will need to be ROOT (super-user) for any of this to work.
- Each time you shutdown and restart the system, you need to re-open your terminal sessions and re-su to root.
- Attach all the new drives, create one single partition on each, and format as ext3.
- You can do this one-at-a-time, or you can attach all four of the new recovery drives to the controller, and format them all up there.
- Shutdown and remove all formatted drives and set them aside carefully.
Copying the data off the damaged drive.
- Install the drive that is NOT damaged, and view the partition table with Gparted or QTParted and verify that the partition table is intact.
- Your partition table should look like this:
- Unallocated space. (This space is used to store individual system specific data, such as MAC address, serial number, etc.)
- Partition #1, formatted as ext3. (This is the boot partition, with /boot, /root, etc. on it.)
- Partition #2, formatted as swap (This is the system paging file.)
- Partition #3, formatted as ext3 (This is the rest of the O/S, /var, etc.)
- Partition #4, unknown format. (This is the data-store, don't modify or change this!)
- Using dd_rescue, copy the "un-damaged" drive to a file on one of the new drives.
- This will take a fairly long while - measured in hours.
- Take note of any failed blocks. (cut-and-paste to a text file.)
- Shutdown the system, turn it off, remove the new drive with the file, label it, and put it somewhere safe.
- Attach another new drive.
- Reboot.
Commands to do the above:
dd_rescue -l /home/**uname**/Desktop/B-logfile.txt -o /home/**uname**/Desktop/B-bbfile.txt -v /dev/sdb /recover/b/b-recover-disk
Don't type the "asterisks" (**)
**uname** = Your username (this is the path to your desktop)
l = logfile output
o = bad-block logfile output (you need both of these for repairs)
/dev/sdb = The physical device the drive is on
/recover/b/b-recover-disk = the output filename for the extracted disk image.
I mounted my "recovery" drives at a mount-point called "/recover" on my system,
and the recovery drives were mounted as "a" and "b", so I had "/recover/a"
and "/recover/b" as the two recovery drives on my system.
- Using dd_rescue, copy the last partition from the "undamaged" drive to a file on the new drive.
- This will also take a long while. Almost exactly as long as the first copy, since this is where most of the data lives.
- Again, take note of any failed blocks. Hopefully you won't find any on the "2nd" drive during either copy.
- Shutdown the system, turn it off, remove both the new drive (mark it and put it somewhere safe), and the "B" drive, label and put somewhere else safe.
Commands to do the above:
dd_rescue -l /home/**uname**/Desktop/B-logfile.txt -o /home/**uname**/Desktop/B-bbfile.txt -v /dev/**sdb4** /recover/b/b-recover-data
/dev/sdb4 = The 4th partition on device "sdb". You can copy any partition by enumerating it here.
/recover/b/b-recover-data = the output file containing the **data** partition from drive "B".
- Add the failed drive to the system and attempt to verify partitions
- Attach the failed drive ("A"), to the controller where the "B" drive was, and re-run the Gparted or QTParted partition verification step as noted above.
- Shut down and turn off the system.
IF the "failed" drive's partition table is NOT OK, continue with the steps below.
- Use dd to copy the first 512 bytes from the disk with the good partition table.
- Copy that file to the first 512 bytes of the "bad" disk to see if we can recover valid partition data.
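Here is a rough sketch of those two dd steps. The device names in the usage lines (/dev/sdb for the disk with the good partition table, /dev/sda for the "bad" one) and the file name b-mbr.bin are only examples, and the two function names are made up for illustration. Verify your own device names before running anything. Note that the first sector also holds the boot code and disk signature, so the "bad" disk ends up with a complete copy of the good disk's first sector, not just its partition table.

```shell
# Sketch only - the device and file names in the usage lines are examples.

# save_mbr SOURCE FILE: copy the first 512-byte sector of SOURCE into FILE.
save_mbr() {
    dd if="$1" of="$2" bs=512 count=1
}

# restore_mbr FILE TARGET: write FILE over the first 512 bytes of TARGET.
# conv=notrunc leaves everything after the first sector untouched
# (it matters when TARGET is an image file rather than a device).
restore_mbr() {
    dd if="$1" of="$2" bs=512 count=1 conv=notrunc
}

# Example usage, as root, AFTER double-checking the device names:
#   save_mbr /dev/sdb /recover/b/b-mbr.bin
#   restore_mbr /recover/b/b-mbr.bin /dev/sda
```

If you already have full-disk image files from the earlier steps, you can try both functions on copies of those files first, before pointing them at a real drive.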
Attempt to recover data from the failed drive
- Attach the failed drive ("A"), to the controller where the "B" drive was, and attach another new drive.
- Reboot the system.
- Using dd_rescue, copy the last partition of the "A" drive to a file on the new disk.
- Again, this will take a long while.
- Also, take careful note of any bad blocks.
- Shutdown the system, turn it off, remove and label the new drive, and put it away safely.
- Attach the last new drive and reboot.
- Attempt to copy data from the entire disk to a file on the last new hard disk
- Allow dd_rescue to copy about half the disk contents to a file, then abort it with CTRL-C.
- Hopefully, one of the two disks had the system partitions without errors.
- Shutdown the system, turn it off, remove and label the last new drive, and put it away safely, leaving the potentially defective drive attached.
At this point, you should have all the images you need.
Verify if the "failed" drive is really bad
- At this point, the system should be shut down, with all the new drives removed, and the one failing drive still attached.
- Boot the system using the "Diagnostics" floppy you created from the Western Digital Data Lifeguard CD.
- Select the correct drive in your system.
- Run the "Quick Test".
- It is not necessary to run the "full" test.
- If the drive passes the "Quick" test, repeat it a few times to verify that it always passes.
- Ideally, each pass will return an error code of "0000"
- If the drive passes, mark it so, and put it away.
- If the drive fails, mark it so, and set it aside where you won't pick it up to use it.
- The magnets out of a failed H/D make GREAT 'fridge magnets!
- Replace it with the replacement drive you purchased, or go purchase one. Remember to get as exact a replacement as humanly possible.
- Repeat this same exact procedure, substituting the other MBWE drive to verify it is OK.
Attempt to rebuild the damaged data array
- Re-attach the data image drives and prepare to recover
- Shutdown and turn off the system if not already shutdown.
- Attach the two drives that have the two data-partition images on them in positions 1 & 2 on the controller.
- Attach a blank drive - if available - as position #3.
- Restart the system.
- Mount the three drives in a convenient location
- I will assume /recover/a, /recover/b, and /recover/c are the mount points.
- I am also assuming that the drive with the drive "A" data image is first, the drive "B" data image is second.
- Loop-mount the recovered data image files created before
- I will assume that they're named "a-recover-data" and "b-recover-data"
- Execute the following commands to loop-mount the two image files:
Commands to do the above:
losetup /dev/loop0 /recover/a/a-recover-data
losetup /dev/loop1 /recover/b/b-recover-data
This creates two "fake" (virtual) drives mounted on loop0 and loop1 that contain the contents of these two files.
Trick: You can loop-mount ANY valid file-system image - including things like cd/dvd ISO images, etc.
- Merge the images into a copy of their original array
- Execute the following command to re-create the original MBWE array structure:
Commands to do the above:
mdadm --assemble /dev/md1 --force /dev/loop0 /dev/loop1
This command takes the two loop-mounted array parts and (hopefully!) merges them into an array image similar to the one on the MBWE that the two drives came out of.
Hopefully the array built - and started! - correctly. If it didn't, I don't know how to help you here.
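One way to check whether the array really started is to look in /proc/mdstat, where the kernel lists each md device and its state. A minimal sketch, assuming the array is md1 as above; the function name and the optional file argument (handy for trying it on a saved copy of the listing) are my own inventions:

```shell
# array_is_active NAME [MDSTAT_FILE]
# Succeeds if NAME (e.g. md1) is listed as "active" in the mdstat listing.
# MDSTAT_FILE defaults to /proc/mdstat.
array_is_active() {
    grep -q "^$1 : active" "${2:-/proc/mdstat}"
}

# Example usage:
#   if array_is_active md1; then echo "md1 is up"; else echo "md1 did NOT start"; fi
```

Running "cat /proc/mdstat" by hand shows the same information, along with which loop devices ended up in the array.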
Assuming the array built correctly - mount /dev/md1 wherever convenient. (Let's assume /recover/md1)
Navigate to the mount point, and view the contents of the root of that "drive". If all has gone well, at this point you should see a filesystem containing folders and data - as you had it on the original MBWE.
If you successfully see a filesystem - congratulate yourself, take a deep breath, and perhaps take a short break.
If you don't have a filesystem here - I am not sure how to fix this. Not without messing with it myself.
Make a "backup" of the filesystem's apparent content.
- Very Important!
- Using "cp -R", copy the entire contents of the /dev/md1 mount point to the empty drive you have mounted at your third hard drive mount point.
- This will take a while. Take careful note of any files that generate errors.
- We do this because when we try to repair the two partition images, things might get destroyed.
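A sketch of that backup copy which also keeps a record of any files that generate errors. It assumes the array is mounted at /recover/md1 and the blank drive at /recover/c, as above; the log file name and the function name are made up for illustration.

```shell
# backup_tree SRC DEST LOG
# Recursively copy everything under SRC into DEST, preserving ownership,
# permissions and timestamps where possible (-p), and append any error
# messages from cp to LOG so they can be reviewed later.
backup_tree() {
    cp -Rp "$1"/. "$2"/ 2>>"$3"
}

# Example usage (as root):
#   backup_tree /recover/md1 /recover/c /root/md1-copy-errors.txt
#   cat /root/md1-copy-errors.txt   # anything listed here did not copy cleanly
```

Copying from "SRC/." rather than "SRC/*" picks up hidden files and folders as well.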
Attempt to repair / recover the partition images
- Check array partitions for consistency
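- If /dev/md1 is still mounted from the backup step, un-mount it first; running fsck on a mounted filesystem can do further damage.
- Execute the following command to verify the structure of the array partition's filesystem.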
Commands to do the above:
fsck -t ext3 /dev/md1 -- -n -f -v
-n = Don't actually fix anything
-f = Force scan, even if screwy.
-v = Tell us a lot about what you see.
- Again, remember to take careful note of any errors or issues seen.
- In my case, there were a lot of "inode hash" errors
- Try a "real" fsck to clean up issues
- This will discover if any of the issues disclosed were "serious" issues. (They probably are, but we can see if we get lucky…)
- Execute the following command:
Commands to do the above:
fsck -t ext3 /dev/md1 -- -D -p -f -v
D = consolidate and re-index directories.
p = "Preen" (auto-repair) non-critical problems.
f = Force checking
v = Tell us what's happening.
- You may get a "/dev/md1: Adding dirhash hint to filesystem" message when you start the "real" fsck. This is indicating that fsck is updating the partition to handle indexing properly. This is a non-problem.
- When I did this, it still bailed out on me because "inode hash" issues are considered "critical" problems. What will happen is that - if you force fix, and you will need to, trust me - the directories and/or files with the inode hash errors will be deleted and the space consumed returned to the free pool.
- Retry fsck forcing it to fix all errors found
- We will need to absolutely clean up the issues found, so we must (at this point) force fsck to fix things.
- Execute the following commands to do this:
Commands to do the above:
fsck -t ext3 /dev/md1 -- -y -f -v
(note, we're omitting the "-D" here deliberately.)
y = force auto fix (answer any question "yes!")
- Re-execute the same command again to verify all issues have been resolved.
- Repeat until there are no more errors found.
- Once everything is OK, re-run fsck again to optimize and re-index directories.
Commands to do the above:
fsck -t ext3 /dev/md1 -- -D -y -f -v
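To know when you can stop repeating the fsck passes, it helps to look at fsck's exit status, which is a sum of bit flags documented in the fsck man page (0 = no errors, 1 = errors corrected, 4 = errors left uncorrected, and so on). A small sketch that decodes it; the function name is hypothetical:

```shell
# describe_fsck_status CODE
# Print the meaning of each bit set in an fsck exit status.
describe_fsck_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    [ $((code & 1)) -ne 0 ]   && echo "filesystem errors corrected"
    [ $((code & 2)) -ne 0 ]   && echo "system should be rebooted"
    [ $((code & 4)) -ne 0 ]   && echo "filesystem errors left uncorrected"
    [ $((code & 8)) -ne 0 ]   && echo "operational error"
    [ $((code & 16)) -ne 0 ]  && echo "usage or syntax error"
    [ $((code & 32)) -ne 0 ]  && echo "checking canceled by user request"
    [ $((code & 128)) -ne 0 ] && echo "shared-library error"
    return 0
}

# Example usage, right after an fsck pass:
#   fsck -t ext3 /dev/md1 -- -y -f -v
#   describe_fsck_status $?
```

A pass that reports "no errors" is the one you are waiting for; "filesystem errors corrected" means run it again.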
- Un-mount /dev/md1, and stop the array
Commands to do the above:
umount /dev/md1
mdadm --stop /dev/md1
Stop and take stock of things
Where we should be now
- We should have two partition image files loop-mounted.
- We should have them successfully assembled into an array.
- We should have successfully run fsck on the array partition and cleaned up any errors.
- We should have at least ONE good disk out of the two that came from the MBWE.
- We should have at least ONE good system image from the two drives.
- If you don't, you will need to download one and follow instructions to install it at a later step.
Begin rebuilding the two drives for the MBWE.
- I am assuming that the "B" drive contained no bad blocks - and if there were, they are in the data partition, not the system partitions.
- I am also assuming that we have a good drive "A", or a replacement, that may not have a good system image on it.
- If this is not true - that is, you do not have ANY good system images - skip the single step below, download a system image, and follow the instructions to install it on the two drives, creating the last (fourth) partition.
- Using dd_rescue, copy the entirety of drive "B" to drive "A". This will replace the bad/missing system partitions, and re-create the 4th partition for the data.
- After this is about 1/2 done, stop the copy with CTRL-C.
- Using dd_rescue, copy the drive "A" data partition image that we fixed-up before, back to partition 4 of drive "A".
- We use dd_rescue instead of "dd" - because dd_rescue will properly detect the end of the drive/data and will make sure every byte gets written. "dd" - when it reaches the end of the drive - would simply fail, and not write the last few blocks of data.
- Using dd_rescue, copy the drive "B" data partition image that we fixed-up before, back to partition 4 of drive "B".
- Once that is done, completely shut-down and turn off power.
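For what it's worth, the two data-partition copy-back steps above might look something like the sketch below. Everything here is an assumption to be checked against your own system: that drive "A" shows up as /dev/sda and drive "B" as /dev/sdb, and that the image files are named and located as earlier in this article. The helper (a made-up name) only prints the commands unless you set DRY_RUN=0, so you can read them over first.

```shell
# copy_back IMAGE TARGET
# Print (or, with DRY_RUN=0, run) the dd_rescue command that writes IMAGE
# back onto TARGET. Device names in the usage lines are examples only.
copy_back() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        dd_rescue -v "$1" "$2"
    else
        echo "dd_rescue -v $1 $2"
    fi
}

# Example usage (as root):
#   copy_back /recover/a/a-recover-data /dev/sda4
#   copy_back /recover/b/b-recover-data /dev/sdb4
#   DRY_RUN=0 copy_back /recover/a/a-recover-data /dev/sda4   # really do it
```

Printing first is deliberate: with the source and target swapped, the same command would overwrite the repaired image with whatever is on the drive.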
Rebuild the MBWE
- Re-install the hard drives
- Replace the two side-rails on each hard drive (if you removed them)
- Re-insert the two drives into the MBWE, remembering that drive "A" goes in the slot closest to the controller electronics.
- Re-connect all connectors removed during MBWE tear-down.
- Reconnect network and power
- Re-attach the network cable to the MBWE.
- Re-attach the power connector to the MBWE.
- FIRE THAT PUPPY UP!! (and pray…)
- Re-connect power.
- Carefully monitor the front-panel lights.
Note: If you replaced the system partitions with downloaded partition data, you may have to re-configure the MBWE to your needs.
Verify correct operation
- Attempt to access the web setup page
- Verify that the web-setup page works, and that the drive status is "OK"
- Re-configure any settings that you need to change.
- Attempt to access the pre-existing shares on the MBWE
- Verify that the original shares on the MBWE exist, you can access them, and you can read-and-write data to them.
- Note that any files or directories that were "corrected" during the fsck of the partition array above may not be there - you may have to replace this data. THAT is why I asked you to take notes!
Verify everything's correct, replace any lost data, and return to service
- Satisfy yourself that everything is back to normal, by shutting down the MBWE, re-booting it, etc.
- You will probably notice that the MBWE is booting up - and serving files - much faster now than it ever did before.
- This is a result of both cleaning up all the cruft and problems, as well as the consolidate, optimize, and re-index steps that we performed during the FSCK operations above.
- Replace any necessary lost data
- Replace any necessary lost data as noted during the FSCK passes above.
- Return to Service
- Return the MBWE to normal operational status.
Congratulate Yourself on a Job Well Done!
Jim
Monday, January 19, 2009
January '09 - New Year's Presents!
Originally published 1/19/2009 as
QA Tech Tip - January '09 - New Year's Presents!
This month’s Tech-Tip celebrates the New Year with a couple of New Year’s freebies – they may be free, but they still pack a wallop!
#1:
Avira’s AntiVir Personal (http://www.free-av.com/)
They offer:
- An absolutely free (personal use) anti-virus solution for your Windows boxes.
- Absolutely free product and virus definition updates for as long as you have AntiVir installed.
- A free “Rescue CD” program that will build a rescue CD ISO file and, (optionally), write it to CD if it recognizes your recorder. The Rescue CD is a stand-alone Linux system-on-CD that runs AntiVir on your hard drives while they are “quiet”, (neither mounted nor active under Windows), to make finding – and removing! – bogus programs easier. Both the virus definitions and the rescue CD image creator program are updated daily.
- AntiVir is NOT a resource hog, and does NOT bring your system to its knees just because you installed it. The corresponding Norton and McAfee products quickly reduced every machine I installed them on to the performance equivalent of a 66 MHz ‘486 of days gone by.
- AntiVir doesn’t cost an arm-and-a-leg to purchase. The comparable commercial products (from Norton and McAfee) would cost something like $60+/seat. If you have more than one or two computers to support – as I do – then this can become a seriously significant expense.
- AntiVir doesn’t annoy you with a “subscription” based update system.
(I don’t think this needs elaboration. . . .)
I got the answer to that question when I ran Avira’s AntiVir Rescue Disk on my computer. Despite running several different versions of both Norton and McAfee on my machine at various times – all of which gave it a clean bill of health – Avira’s product found two, count ‘em, TWO root-kit droppers / worms neatly tucked away into a couple of perfectly innocent e-mail attachments. At that point, it was up to me to go get the e-mails in question and delete them as AntiVir was reluctant to just stir around inside my Outlook mail-files.
To be perfectly fair, Norton found the installed root-kits / worms, but choked on the removal process which I had to complete manually. It did NOT discover the source of these infections. AntiVir did.
The one “bad” point about the AntiVir program is that every time it updates, (usually once daily), it opens a pretty darn large dialog on your desktop extolling the virtues of their paid products and providing a convenient link to their on-line purchase page. The good part of this is that it is entirely optional – you can safely dismiss the dialog and get on with your life. And I really can’t blame them for the attempted up-sell. Even Anti-Virus developers have to eat sometime!
All in all – I rate it, (especially the rescue CD), as a definite “Must Get”.
#2:
Ubuntu Linux (http://www.ubuntu.com/)
“Ubuntu” is an ancient African philosophy meaning, (in essence), “Humanity toward humanity” or “Humanity toward others”. While I won’t get into a philosophical discussion on ubuntu, (though there is an excellent Wikipedia article here), Ubuntu as a Linux/GNU operating system comes very close to this ideal.
They offer:
- Ubuntu “Desktop” – designed for the “typical” desktop user with Firefox, Evolution Mail, Open Office, etc. pre-installed for you.
- Ubuntu “Server” (Enterprise, clustering, whatever. . .) – is designed for those who wish a more server-oriented install. It should be noted that - unlike other Linux distributions - the fancier versions, such as Enterprise, Clustering Server, etc. are - all of them - free for the taking.
Of course, you can purchase commercial support - which might not be a bad idea in a production or mission-critical environment - but you don't have to. There are lots of ways to get questions answered and problems solved even if your IT budget is NOT bottomless.
(Note that all of my observations below are based on the “desktop” version of Ubuntu.)
Ubuntu Linux has a number of characteristics that I believe make it stand above the crowd:
- Ubuntu Linux is designed, first and foremost, to be used by people. Note that I said “people” and not “techies”. In support of that, Ubuntu has made great strides in the area of user experience and just plain old usability. If you can use Windows, you can use (and install!) Ubuntu.
- Ubuntu expands on the concept of “usability by non-techies” with a well thought-out installation process.
In a word it is “slick” – even more so than Windows. There are five-or-six dialogs in the installation process that basically ask you “who you are” (along with “what do you want to name your computer”), and “where to put it” on your hard drive. The defaults offered are all reasonable and sane, choices are clearly shown, and if you really want to go behind the scenes and diddle, you can do that too.
Starting with a stone-cold system, it takes less time to complete the Ubuntu installation dialogs, (and start the install running!), than it takes for Vista’s installer to boot and load. This is made even sweeter because Ubuntu’s ability to correctly detect – and configure – a machine’s hardware configuration is as good as, or maybe even a tad better than, anything I’ve seen up to this point. Unlike many other Linux installers, Ubuntu’s installer is truly “plug and play”. You pop-in the CD, answer a few simple questions and you are on your way.
- More important than that, Ubuntu doesn’t arrogantly assume that you want to throw away all the other operating systems on the machine. If there’s another operating system present, Ubuntu will work hard to make itself fit in without disrupting the other system – and the Grub boot manager provides a clean boot process for all of the operating systems installed.
- Ubuntu strives to be as completely Open Source as possible – but does not become religious or pedantic about it. Ubuntu will cheerfully make non-open-source drivers or applications available to you – after telling you that there are either licensing restrictions that prevent it from being purely open source, or other issues that you may need to be aware of.
Example: Both ATI and NVIDIA have released a number of their video drivers to the Linux community – as binaries – but still retain copyright and ownership. Ubuntu makes them available to you, but tells you that they’re not “pure” open source. Other Linux versions, Fedora chief among them, go to great lengths to “forbid” (or even obstruct) the use of non-open-source drivers or software - despite the potential consequences to their users.
- Because Ubuntu is based on the Tried-And-True Open Source Linux platform – it has available for quick download a HUGE library of free applications and utilities – from the simple, (roving eyeballs for your task-bar), to the complex, (Scribus, the open source replacement for Quark), to the more esoteric, (Q-draw, an open source AutoCAD replacement); it’s all there, waiting for you to find a need for it. Since it is based on Debian and uses the equally tried-and-true “.deb” package installation process, adding applications or features is as painless as anything I’ve seen.
- Ubuntu is about giving the user choices. You can go this way, or that way, and both ways are just fine by them. If you like things nice-and-simple, that’s perfectly OK. If you want to play uber-geek and mess with the more technical aspects of Linux, that’s all there too.
Their user security model is very similar to Vista’s (or is it vice-versa? ;-) ), where Ubuntu will let you do whatever you wish – until it would affect the system as a whole – then it asks you to confirm by typing in your own password.
I, for one, think that’s a great idea – rather than the classic Windows’ “everyone is Admin/root/God”, model which has been the bane of Windows users, and a boon for malware writers. Or their “Restricted User” model, which is nearly useless. Though Ubuntu’s system is far from bullet-proof, it goes a long way toward making it darn difficult to “accidentally” pooch your system beyond repair.
You can even run Windows apps on it, (at least in theory, I have not tried it yet), by running them inside a “Windows emulator” called Wine.
The real telling characteristic is this: Ubuntu is the first Linux system that I would seriously consider putting on my wife’s computer, or even my mother’s, (a lady of nearly 80 years), confident that they would be able to use Ubuntu with an absolute minimum of difficulty.
Are there issues? Of course there are. Some portions of Ubuntu are less forgiving than others, (God help you if you accidentally set your monitor resolution or refresh rate wrong!), but this is true for any operating system out there – especially the various ‘nix systems and their near brethren. (And I won’t even discuss the headaches Windows or Vista have given me!)
My bottom line is this: If you’ve been thinking of trying out Linux, but were afraid of all the “techie” aspects of it; go ahead, take the plunge and give Ubuntu a try.
That’s it! Now go give these a try and have a wonderful New Year!
Jim