Welcome to the QA Tech-Tips blog!

Some see things as they are, and ask "Why?"   I dream things that never were, and ask "Why Not".  
Robert F. Kennedy

“Impossible” is only found in the dictionary of a fool.  
Old Chinese Proverb

Saturday, March 19, 2011

Linux vs Windows?
Which to use and when


Over the years, I have messed with a wide variety of both Linux and Unix type systems.  I have also used virtually every Microsoft operating system - ranging from the venerable DOS and its kin, through the germinal "Windows" versions, all the way to Windows 7, Server 2008, et al.

The Linux vs Windows debate is, (IMHO), one of those pointless "religious wars" that have no real resolution.

Like the "Ford vs Chevy" or the "Mac vs Everybody Else" kind of debate - it's tantamount to arguing which is better, Vanilla or Chocolate.  There are legions of advocates and supporters on both sides, each convinced of the Ultimate Rightness of their respective opinion.

So?  Here goes. . . .

Which do *I* think is better?  Wanna be perfectly honest?  Neither one.

There are things that can be done in Linux with trivial ease that are virtually impossible to do in Windows; and vice-versa, you can do things in Windows that would tie Linux into knots.

Oh, yeah - I can hear it now:  Microsoft wears a black helmet and has amplified breathing, whereas Linux and the whole Free Software Movement are allied with the Good Side of The Force.  And when I hear someone spouting that - I have just two words for them:

Grow up!

With all due respect to Mr. Stallman and the Free Software Foundation, I personally believe that pedantic, polarized religious thinking is counterproductive.  Neither Windows nor Linux is going away in the reasonably foreseeable future and the continuing back-biting, blame-throwing and finger-pointing isn't doing anyone any good.  Except for those periodicals, web-sites, advertisers and other Yellow Journalists that have always profited from roiling up strife and discord.

Comparing Linux to Windows is like comparing a pumped up hot-rod to a Buick.  The hot-rod is a much more powerful and fun machine to drive, but that power and fun is purchased at the cost of greater administrative responsibility.  You have to do more "tinkering" with high-powered cars than you do with the Buick.  You also have to really pay attention when you are driving one; much more so than with the Buick.

By comparison the Buick is more comfortable and easy to use, but it sacrifices power and flexibility to achieve that ease of use.  This is not to say that a stroked-and-bored Street Beast is inherently better, or worse, than the Buick.  It's a matter of personal desire, taste, and need.

Setting aside the fact that the Windows user interface is inarguably the most well-known UI in the world - except, perhaps, for the UI of the automobile - (Sorry, Mac!), I believe that each system does things that ultimately complement the other.  To express it differently:  both Windows and Linux can, and should, coexist peacefully.



The one place where Windows absolutely excels is in large enterprise deployments, where global policies and granular permissions need to be propagated throughout the infrastructure in a seamless and efficient manner.  Within Windows it is possible to grant administrative authority for a very limited range of actions in a very specific and granular way.  Allowing specific users to administer a very limited and specific set of printers in their department is one example.  The ability to limit who can send faxes via the corporate fax-server is another.

By comparison, Linux is more of an all-or-nothing situation.  A default "sudo" configuration grants virtually unrestrained root access.  Though you can create specific user-groups that have specific authorities, the grant of authority - even in a specific situation such as administering printers - is often either uncontrollably wide and vast, or unreasonably specific and limited.
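To be fair, "sudo" can be scoped more tightly than its defaults suggest.  Here is a rough sketch of a sudoers entry, (the "printeradmins" group name is my own invention, and command paths vary by distribution), that restricts a group to the CUPS printer-administration commands:

# Added via "visudo" - members of printeradmins may run
# ONLY the CUPS administration commands, as root:
%printeradmins ALL = (root) /usr/sbin/lpadmin, /usr/sbin/cupsenable, /usr/sbin/cupsdisable

Even then, the grant covers every printer CUPS knows about; there is no clean way to scope it to just the printers in one department.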

Windows allows you to tailor authority over a certain very specific group of resources, the printers in a local department for example, without granting a broad and sweeping authority over printers in general.

Windows also supports the concept of group policies in general, and global group policies in particular.  With these policies, you can grant a very specific authority in a more generalized way.  For example, you can grant to QA or development departments - wherever they are in the organization, even if in remote locations - the authority to change the computer's system time, while forbidding everyone else.

Likewise you could set a policy that establishes a global standard workday - 08:00 to 17:00 local time with an hour's lunch from 12:00 to 13:00 - except in Dubai where there are also four or five ten-minute intervals blocked out for the required Islamic prayer-periods.

Permissions and policies - though set for regional needs or preferences - can be made portable.  This way the executive from Dubai who is in Chicago can have his machine automatically adjust for local time - while still maintaining the prayer-period time blocks he needs.

Windows allows a machine to be added to a particular group or class of users - a new employee hire, for example - and once that user is placed in a particular department or group, the appropriate enterprise permissions and restrictions are automagically applied to his system without further intervention.

Policies can be designed to apply to a particular class of computers regardless of where they might be at a particular point in time or who is using them.  Likewise permissions or restrictions can be applied on a user-by-user basis, regardless of what machine this particular person might be using.

Additionally, local groups or departments can be delegated specific authority to administer their own policies.  Stock traders or investment brokers within an organization may be subject to legal or administrative restrictions that would not normally apply to the average user.  Or vice-versa.

With Linux, you would have to manually propagate the policy from machine to machine, group to group, user to user and hope you didn't exclude someone who needs it, or include someone who doesn't.  Of course, you could automate that task with shell scripts, but with 'nix, every time a machine is added, removed, or changes assignment within the enterprise, the configuration process has to be done all over again.
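A minimal sketch of what that propagation looks like, (assuming password-less SSH, a hand-maintained hosts.txt, and a made-up policy file and daemon - all placeholders):

#!/bin/bash
# Push the local policy file to every machine in hosts.txt,
# then tell the (hypothetical) policy daemon to re-read it.
while read host; do
    scp /etc/mypolicy.conf "root@${host}:/etc/mypolicy.conf"
    ssh "root@${host}" "/etc/init.d/mypolicyd reload"
done < hosts.txt

And the moment a machine is added, renamed, or re-assigned, hosts.txt is stale - which is precisely the bookkeeping Active Directory does for you.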

In a nutshell, when it comes to the ability to granularize permissions and authority, Windows beats 'nix hands-down.



On the other hand Linux systems are remarkable for their flexibility and their ability to adapt to varied and varying roles with little or no cost.  Though Linux systems are making inroads into the desktop user-space, the place where Unix in general, (and Linux in particular), excels is in server-based applications.

Virtually any old piece of hardware you may have lying around can be adapted to a wide range of useful purposes using Linux.  I have built multi-terabyte file servers using systems that had reached the pinnacle of their capabilities with Windows 98.

And I am not talking about some crufty version of Linux from the time of Wooden Ships and Iron Men; I am talking about modern distributions, fully updated with all the latest security patches.  Of course, they might be running a text console instead of a full-fledged GUI, but it can still be done.  By adding Samba, plus a minimal installation of Apache and SWAT, you can have a fully functional file server with a manifestly capable administrative web interface.
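To give you an idea of how little it takes, a bare-bones Samba share is only a handful of lines.  A sketch, (share name and path are placeholders):

# /etc/samba/smb.conf - minimal file-server setup
[global]
    workgroup = WORKGROUP
    server string = Recycled file server

[archive]
    path = /srv/archive
    read only = no
    guest ok = no

Restart Samba, (/etc/init.d/smbd restart, or your distribution's equivalent), and the share pops up on the Windows network.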

You can take ancient laptops that have long outlived their usefulness in the Windows world, (and I have), and adapt them for use as Linux boxes.

An excellent example would be to take an old - but quite capable - laptop, install Linux on it, and use it as a portable network test-set.  Tools like Wireshark, Netmon, and a whole host of others - all free for the asking - let you convert that ancient laptop into the equivalent of a multi-thousand-dollar portable network analysis tool.  And the cost is virtually de minimis; all it requires is a tiny bit of research and a few moments of your time.
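For a taste of what such a test-set can do, here's a quick sketch using tcpdump, (interface name and filter are just examples - adjust for your own hardware):

# Capture everything to or from port 80 on eth0, into a file
# that Wireshark can open later for full protocol analysis:
sudo tcpdump -i eth0 -w webtraffic.pcap 'port 80'

# Or get a quick on-screen summary of who's talking to whom:
sudo tcpdump -i eth0 -nn -q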

Unlike the multi-thousand-dollar network analysis tool, your tool is adaptable and upgradable as your network or needs change - without investing additional thousands of dollars in upgrades of dubious merit.

In a similar vein, Vyatta provides an Open Source enterprise firewall/router/VPN/etc. system that is easily the equivalent of the best that Cisco has to offer.  Even on older hardware it Eats Cisco's Lunch, and if you invest in a multi-core Beast System, Cisco is not even in the same solar-system.  (Though there is the risk of that Cisco box coming back in the distant future, surrounded by a huge electro-magnetic field, looking for its master. . .)

Another place where Linux excels is in the granularity of the installation.  In pretty much the same way that Windows excels in granularity of permissions, Linux allows you to install, or create, an installation environment tailored to exactly and precisely what you need.  With Windows any installation is, (virtually), an all-or-nothing situation.  Not only do you get the entire circus, you get the elephants thrown into the bargain.

Linux by comparison allows you to install, add, remove and otherwise tailor an installation to a specific need.  With regard to that multi-terabyte file-server, you can include 100% of precisely what you need with little to no extra fat.  This is one of the main reasons why Linux can be used so successfully on older systems; you can make that ancient laptop into a lean, mean, network munching machine; without the fat, cruft, bloat, and gobbledygook that other operating systems drag in their wake.

Just as important as Linux's inherent flexibility is the extensibility of Linux.  If there is an application or use for Linux that you desperately need, it's a virtual lead-pipe cinch that someone has already created it for you.  In the unlikely event that what you need doesn't exist, there are tools - again, free for the asking - that allow you to create what you need for yourself.  These tools range from the simplest of shell-scripts to the most extensive development and version-control systems imaginable.

By comparison, the cost of development systems for Windows is often a significant portion of the total software development cost.

Should you need help, it's there waiting for you on the Internet.  Unlike the Windows assistance and training that is available, (for the mere pittance of multi-thousands of dollars, per person), most Linux help is available for the asking.  There are a multitude of fora, groups, blogs, local meetings, events, shows, and other things that - if not absolutely free - are available for a fraction of the cost of the corresponding Microsoft/Windows offerings.  Even those companies that provide payware solutions often provide free webinars, podcasts or RSS feeds to help keep you abreast of the latest developments.



What about interoperability?

Windows' ability to play nice with others in the sandbox - though limited - is slowly improving.  Microsoft, having come to the startling realization that - maybe, just maybe - they aren't the only fish in the pond, ( !! ), is beginning to make efforts to interoperate more effectively with other systems and platforms.

The biggest strides toward interoperability have been made - as one would expect - by the Linux and Open Source / Free Software community.

A shining example of this is the Samba software suite, which allows Linux-based systems to fully participate in Windows networks.  This is not restricted to just file services - though it does this remarkably well - it also includes participation as Active Directory capable member servers, domain controllers, enterprise level role masters, and global enterprise repositories.  Implementing DNS, WINS, and mail services - including "Exchange"-capable mail services - is also doable within Linux.
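Joining such a Linux box to a Windows domain is, at its core, a two-step affair.  A rough sketch, (realm and workgroup are placeholders, and it assumes Kerberos and winbind are already configured):

# The relevant lines in /etc/samba/smb.conf:
[global]
    security = ads
    realm = EXAMPLE.COM
    workgroup = EXAMPLE

# Then join the domain using a privileged AD account:
sudo net ads join -U Administrator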

In fact, because Microsoft server licensing is - shall we say - somewhat expensive, even larger enterprises with deeper pockets tend to place Windows servers in key locations within the architecture and fill in the rest with Linux machines participating in the Windows enterprise network.  Smaller organizations sometimes completely forgo the Windows servers altogether, using Linux equivalents to administer their Windows desktop machines.



In Summary:
Each platform has both strengths and weaknesses.  Each platform is like a specific tool - designed and useful for certain specific uses.  Which you use, and how you use it, is entirely up to you.

Tossing out Windows - as some advocate - simply because it's Windows is tantamount to throwing the baby out with the bathwater.  Likewise, avoiding Linux simply because it's NOT Windows is similarly narrow-minded.  You need to keep a varied and flexible set of tools in your tool-box to meet all your needs.

Both Windows and Linux deserve a place of honor in your tool-box, alongside all the rest of your tools.

What say ye?

Jim

Wednesday, March 16, 2011

An Update to my open letter to Ubuntu


The following was posted on Launchpad - Ubuntu's support and question tracker - as question 149330
https://answers.launchpad.net/ubuntu/+question/149330 



Issue:

Ref: My blog post titled "An Open Letter to Canonical and the Ubuntu Team."
http://qatechtips.blogspot.com/2011/03/open-letter-to-cannonical-and-ubuntu.html
(Please read and comment)

Ubuntu's Claim to Fame - and what has lifted it to the top of the popularity list for Linux distributions - was its primary emphasis on usability instead of the Latest and Greatest whizz-bang features.

The Linux community is both broad and vast - there is a distribution for just about every taste imaginable - from the micro-Linux to the monolithic "everything but the kitchen-sink" monster distros; from the most experimental "bleeding edge" distributions for the most daring Uber Geek, to those distributions that focus on usability.

I have tried many different Linux distributions for varying reasons over the years, and I settled on Ubuntu for one simple reason:  I have a job to do - and it's often difficult enough to do what is needed without having to dodge the roadblocks and jump through the hoops imposed by distributions that don't know better, or just don't care.

Until recently, Ubuntu has been my favorite distribution because "it just works".   Period.  In fact, I praised Ubuntu in a previous posting on my blog as the *ONLY* Linux distribution that I would be willing to install on my wife's computer - or even the computer run by my sainted mother of 70+ years.

And why?  Like I said before, it just works.  You didn't have to be an uber geek to use it.  Of course, if you wanted to get your hands dirty and poke around under the hood, that was available too.

Unfortunately, in its latest releases Ubuntu has sadly fallen away from this high standard of excellence.  In fact, perusing the various blogs and posts, I have noticed an increasing disdain for "dumbing down" Ubuntu.

There seems to be an increasing emphasis on moving toward a more "edgy" (bad pun!) distribution model, sacrificing the usability that has been Ubuntu's hallmark for years.

I have a number of beefs with Ubuntu, but I will place at the Ubuntu Community's feet the two that I think are the worst of the bunch:  Grub2, and the new GUI.

Note that I am referring to my own installed distribution - 10.04 LTS.


Grub2:

Back when Men were Men, and Linux was Linux, we had LILO as the primary boot-loader.  It was difficult, annoying, and a pain in the tush, but it was what we had; so we sucked-it-up and did the best we could with a bad situation.

Then, in a Stroke of Genius, someone came up with the Grub boot loader.  Not only was it a miracle of simplicity compared to the abomination that was LILO, it was a miracle of simplicity in its own right.  Edits and configuration changes were as simple as editing a few lines in the menu.lst file.

Its basic simplicity and ease-of-use resulted in virtually Every Distribution Known To Man immediately deprecating LILO and switching en masse to Grub.

In fact, over 99.9999999(. . . . .)99999% of the existing distributions *STILL* use Grub for just that reason.  Even the most experimental and Bleeding Edge distros still use Grub.

Unique among all distributions, Ubuntu - and Ubuntu alone - has decided to switch to Grub2, despite the fact that Grub2 is probably one of the most difficult boot-loaders I have ever had the misfortune to come across.

It resurrects everything that was Universally Hated and Despised about LILO, and it does it with a vengeance!

Not only does one have to edit obscure files located in remote parts of the file-system, one has to edit - or pay attention to - several different files located in different places, presumably doing different tasks in different ways.  And one cannot simply edit a menu list; one has to create an entire shell script to add a single boot entry.  Even LILO wasn't that gawd-awful.
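To illustrate: adding a single custom entry means editing a shell script, /etc/grub.d/40_custom, and then regenerating the real configuration.  A sketch, (kernel version and device names are hypothetical):

#!/bin/sh
exec tail -n +3 $0
# Everything below this line is copied verbatim into grub.cfg:
menuentry "My rescue kernel" {
    set root='(hd0,1)'
    linux /boot/vmlinuz-2.6.32-generic root=/dev/sda1 ro single
    initrd /boot/initrd.img-2.6.32-generic
}

And then you must remember to run "sudo update-grub" - edit /boot/grub/grub.cfg directly and your changes are silently thrown away on the next kernel update.  Compare that with changing one line in menu.lst.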

It is so bizarre that even Dedoimedo - author of what is arguably the definitive Grub tutorial - warns in his tutorial on Grub2:
Warning!  GRUB 2 is still beta software.  Although it already ships with Ubuntu flavors, it is not yet production quality per se.
When discussing the question of migrating to Grub2, he says:
Currently, GRUB legacy is doing fine and will continue for many more years.  Given the long-term support by companies like RedHat and Novell for their server distributions, GRUB legacy is going to remain the key player. . . . .
And to put the cherry on top of the icing, on top of the cake, he says:
Just remember that GRUB 2 is still beta. . . . so, you must exercise caution.  What's more, the contents and relevance of contents in this tutorial might yet change as GRUB 2 makes [it] into. . . production.
(Ref: http://www.dedoimedo.com/computers/grub-2.html )

This is oh, so true!  Even the existing Ubuntu tutorials on Grub2 don't match current, shipping configurations - which makes attempts to edit Grub2's boot configuration more difficult, even for seasoned pros at configuration edits.

Why, oh why, Ubuntu had such an absolutely asinine brainstorm is totally beyond me.


The new GUI:

The ultimate goal of any Linux distribution - especially Ubuntu - is to encourage cross-over adoption by users of other - proprietary - operating systems.  And when we talk about cross-over adoption from other operating systems there are only two others of significance: Windows and Mac.

Mac users don't see their platform as a computer or an operating system; to them it is virtually a religion - with the rest of us being the poor, pitied, un-saved heathen that we are.  Expecting them to drop Salvation according to Jobs in favor of Linux is just silly.  Especially now that they can crow that they have their own 'nix O/S.

So, the best and most obvious choice for cross-over adoption are those users who use the various flavors of Windows.

Microsoft's licensing and activation paradigms have become so onerous and expensive that entire national governments, as well as several states here in the US, (Massachusetts, for one), have completely abandoned Windows in favor of Open Source solutions.

"It is intuitively obvious. . . .", (as my Calculus professor used to say), that Ubuntu should be in a position to garner the lion's share of these cross-over users, right?  And the obvious move to encourage this would be to make the target interface as friendly and familiar as possible.   Right?

So - what does Ubuntu do to encourage Windows user cross-over?  They have gone to great lengths to make their user interface as Mac-like as they possibly can, short of being sued by Apple!  As if Mac-izing the GUI will cause legions of Apple users to abandon The True Faith and jump on the Ubuntu bandwagon. . . . .

Brilliant move Ubuntu!  Encourage Windows cross-overs by plopping them into a completely alien user interface!

In Summary:

Ubuntu's original claim to fame was the attempt to de-mystify Linux and make it increasingly usable by heretofore non-Linux users.  The move by Canonical and Ubuntu's leadership away from these ideals is, in my humble opinion, a huge mistake with potentially disastrous consequences for both Ubuntu in particular and Linux as a whole.

What say ye?
Jim

Thursday, March 10, 2011

Terabyte or not Terabyte
That is the question


In my last post, The 2000GB Gorilla, I discussed some of the issues surrounding the newer 2 terabyte hard drives.  Partition table types, allocation unit sizes and partition alignment all had to be taken into account.

I've been working with a pair of 2TB drives for the past week - and I've become increasingly frustrated.  They'd work wonderfully one minute - I could torture-test them into the ground without so much as a hiccup.  Next thing I know, they blow up leaving dead bodies all over the Peking Highway.

Yet, it's not absolutely reproducible.

The lack of reproducibility - and the fact that most of the drive "failures" occur at the highest-numbered eSATA port - leads me to believe that there is more to this than meets the eye, so I began looking at the controllers themselves.

Looking online I notice that SATA controllers come in three distinct "flavors":
  • 1.5 Gb/s
  • 3.0 Gb/s
  • 6.0 Gb/s
Along with two major versions:
  • 48 bit LBA (logical block address)
  • 64 bit LBA
Where the "64 bit LBA" cards claim to be able to handle the 2T+ drives.

Now WHY should 64 bits of Logical Block Address be required for drives larger than a single terabyte or so?

If I look at the addressing range for a 48 bit LBA, (2^48), I get 281 trillion bytes, (2.81 x 10^14, or 281 TB), if we assume that the smallest item addressed by the logical block address is a single byte.

However – these are logical BLOCK addresses, not byte addresses, and the smallest addressable unit is the sector, (or allocation unit), which is, (usually), 512 bytes.  So what you should have is 281 trillion sectors of 512 bytes each - roughly 144 petabytes - which is a pretty doggone big number.  Even if we ignore sectors and count just bytes, we still have more than 281 terabytes to play with.
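A quick sanity-check on those numbers, straight from bash arithmetic:

$ echo $(( 2 ** 48 ))            # 48-bit LBA: addressable blocks
281474976710656
$ echo $(( (2 ** 48) * 512 ))    # times 512-byte sectors = bytes
144115188075855872

Two hundred eighty-one trillion blocks; one hundred forty-four quadrillion bytes.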

Just for grins and giggles, if I assume that this is 281 (plus-or-minus) tera-BITS – we’d divide by 8, which still gives us well over 35 tera-BYTES of storage.

The only thing that comes even close to conventional numbers is dividing the 281T by 512 – which gives us right at 550 gigs.

Again, this does not make sense.

First:  There is no logical reason to divide by the sector byte count.

Second:  If that were true, there would be no 1TB drive on the planet that could possibly work.  They would all puke at 550 gigs due to wrap-around.

Looking at a 64 bit LBA, 2^64 equals roughly 18.4 exabytes (1.84 x 10^19 bytes), which should be plenty enough bytes to last anyone for at least the next year or so until the googolplex-sized drives come out.

So, no matter how we slice it, there should be plenty of bytes to go around and I suspect that the 64 bit LBA is more of a marketing tour de force  rather than a real hardware requirement.

So. . . .  What is the real limiting factor?

My money is on “the controller card and its memory”.

The SATA controllers that talk to both the drives and the computer bus itself have to do a significant amount of data translation – parallel to serial, as well as serial to parallel.  Addresses, as well as data received, need to be assembled and disassembled somewhere, and the serial ATA controllers have to have registers large enough to handle the data widths.

My suspicion is that the controller card memory - which was more than plenty when handling drives of 1TB or smaller - becomes a critical resource when handling 2TB drives.

I also suspect that my specific controller card, (I am assuming it was spec’d for four 1TB drives, max), depends on the fact that at no time will all four drives be sending data absolutely simultaneously, since the controller can “duty-cycle” the data streams to keep things in-bounds.

Two 2TB drives running at the same time are the equivalent of all four 1TB drives talking at once – and that becomes a juggling act the controller may have trouble keeping up with, since it cannot duty-cycle a single 2TB data-stream as if it were two separate 1TB streams.  And when a hard-drive controller starts dropping the balls, well. . . .  Let’s just say that it’s not a pretty sight.

So – IMHO – the real limiting factor here is that the existing hardware SATA controllers have been outgrown by their respective drive sizes; requiring us to either limit the number of 2T drives, (or not use them at all), OR upgrade to a more modern controller that is equipped to handle the larger drive sizes.

What say ye?

Jim

Tuesday, March 8, 2011

The 2000 Gigabyte Gorilla


Here's the scenario:

You have a computer that supports SATA / eSATA - or an external drive enclosure that supports SATA - and you decide you want a huge drive to fill it.

You snoop around and find a really good price on 2+ terabyte hard drives, so you buy a couple-or-five, depending on your cash situation.

You bring them home, carry them lovingly to your computer, hook them up, and proceed to partition and format them in the way you usually do.

Unknown to you, there's a 2000GB Gorilla in the room with you.  And that's when the fun begins!



In my case, I wanted to hook them up to the Linux box I am using for my primary file store so that I could make space on my RAID array.  I was planning to move less critical files to a more "near line" storage device, so I needed a very large drive to accommodate them.

So, I did exactly that.  I plugged one in, partitioned and formatted it in the usual way and started copying almost a full terabyte of data over to it.

Unfortunately, about halfway, (maybe two-thirds of the way), through the copy, the drive errored out and remounted as read-only, causing the entire copy process to go straight to hell in a hand-basket.

I tried everything.  I changed interface adapters, I used a different power supply to power the drive, I even hooked it up directly to my computer's eSATA port.

No difference.  It would still error out about half way through the copy.

So I'm thinking:  "$^%#*&@!! - stinkin' hard drive's bad. . .!" and I get out the second one I bought. (I bought two, so I'd have a spare.)

I repeat the entire process and - sure enough - the drive fails about half way through the bulk copy.

I look on the Internet and I see a whole host of articles complaining that these drives, (from Western Digital), are pieces of GAGH!  Everybody's having issues with them and not a few unkind things were said to - or about - Western Digital.  Not to mention a whole host of other drive manufacturers who appear to be having the same issues.  Even my buddy, Ed, at Micro Center says they're all junk.

Hmmm. . . . .  Is EVERY two terabyte hard drive garbage?  This doesn't make sense to me.  Western Digital, Samsung, Hitachi, Seagate and all the rest of the hard drive manufacturers might be crazy, but one thing is absolutely certain:  They are NOT stupid.  I cannot believe that any reputable manufacturer would deliberately ship crates and crates of drives that are known garbage to an unsuspecting public.

Of course, the "conspiracy theorists" are having a field day:  It's all a conspiracy to get us to buy solid-state drives!

But it doesn't make sense to me.  Why would any reputable manufacturer risk his good name and reputation for the sake of a "conspiracy"?

I still couldn't see the 2000GB Gorilla, but I decided to dig a little bit deeper anyway.



Let's pause for a short trip down memory lane. . . .

Back at the Dawn of Time - when Men were Men, and Hard Drives were Hard Drives, (and starting one sounded like the jet engines on a B-52 winding up), hard drives used a very simple geometry known as "CHS" - Cylinders, Heads, and Sectors.  Any point on the drive could be addressed by specifying the cylinder, (the radial position of the heads), which of the many heads to use, and what sector on that particular platter is desired.

Once hard drives started to get fairly large - larger than about 512 megs - the old CHS scheme had troubles.  In order to address a particular sector on the drive, the number of cylinders and heads had become larger than the controllers could handle, so there were BIOS updates that allowed the drives to report a fictitious CHS geometry which would add up to the correct drive size.

Again, when hard drives became relatively huge, (around 8 gigs or so), there was another issue:  The CHS system could not keep up.  So hard drives, and the respective computer BIOS programs, addressed this issue by switching to Logical Block Addressing, (LBA), where each sector was numbered in ascending order.  And that kept people happy for a while. . . .  But not for long, because hard drives were getting bigger, and bigger, and bigger, and . . . . . .

Enter the 137 gig problem:  We've run out of bits to address all the logical blocks on a large drive - 28 bits of LBA tops out at 2^28 x 512 bytes, or about 137 gigs.  So there was another hack: Extended LBA, (also known as LBA-48), that increased the bit-count even more.  This allowed the IDE/ATA interface to accommodate larger and larger drive capacities.

At around 500-or-so gigs, the LBA addressing scheme, (as well as the entire ATA architecture), was straining at the seams.  There were architectural issues that could not be solved simply by throwing bits at them.

This time - instead of hacking what was rapidly becoming an old and crufty interface - they decided to go in an entirely different direction:  SATA, (Serial ATA).  It was faster, it was neater to install because the cables were smaller, and it allowed, (theoretically), a virtually unlimited addressing range.

As a plus, because of the smaller cable arrangement with fewer pins to accommodate, drives could be added externally to the computer - hence eSATA.  Drives were still using LBA addresses, but now the addressing range was much greater.

And. . . .  just to make things even more interesting. . . . .

For the longest time hard drives, and their manufacturers, were leading a double-life.

In public they still supported both the CHS and LBA geometries, but secretly they were re-mapping the "public" geometry to a hidden geometry that had no real relationship to the public one.  And what a life it was - on the outside they had the stodgy, old and conservative wife, but secretly they had the young, sexy mistress making things nice for them.

In fact, this had been going on since the original 512 megabyte limit issue, when the drives started reporting fictitious geometries that would keep the BIOS happy.

"All good things must come to an end" and if you're living a double life you eventually get found out.  Which, by the way, is exactly what happened.



Fast forward to the present day as drives keep getting bigger and bigger.

Somewhere between the 1.5 TB and 2TB drive sizes, the drive manufacturers reached a crisis.  Trying to keep up the "512 byte sector" facade was becoming more and more difficult.  Making things worse was the fact that most every operating system had given up addressing things in "sectors" long ago.  Operating systems started allocating space in terms of "clusters"; groups of sectors that were treated as a single entity.  The result was that for every request to update a cluster, a multitude of sectors had to be read, potentially modified, and then written back - one by one.

Early attempts were made to solve this bottleneck by allowing read and write "bursting"; asking for more than one sector at a time and getting all of them read - or written - all at once.

Increasingly large amounts of cache memory on the hard drive were used to mitigate the issue by allowing the computer to make multiple requests of the drive without actually accessing the drive platters themselves.  Since, for a fairly large percentage of the individual drive requests, the O/S would be addressing the same or near-by locations, the drive's cache and internally delayed writes allowed the drives to keep up with the data-rate demands.

Later still, hard drives adopted "Native Command Queueing", a technique that allowed the drive - internally - to shuffle read and write requests so that the sequence of reads and writes made sense.  For example, if the computer read a block of data, made changes, wrote the changes, then made more changes and wrote them again; the hard drive could choose to skip the first write(s) since all the changes were within the same block of data.

Likewise, if multiple programs were using the disk, and each wanted to read or write specific pieces of data; the drive, (recognizing that all these requests were within a relatively short distance from each other), would read all the data needed by all the applications as one distinct read, (or write as the case may be), saving significant amounts of access time.
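If you're curious whether your own drive negotiates NCQ, hdparm will tell you, (device name is an example, and the output varies drive-to-drive):

$ sudo hdparm -I /dev/sda | grep -i queu
        Queue depth: 32
           *    Native Command Queueing (NCQ)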

However. . . . .  There's still the 2000GB Gorilla.

When you get up into the multiples of terabytes, keeping track of all those sectors becomes hugely unwieldy.  Translation tables were becoming unreasonably large, performance was suffering and the cost of maintaining these huge tables, as well as the optimization software needed to make them work, was becoming excessive.  Both the cost of the embedded hard drive controller chip's capacity and speed, as well as the sheer manpower needed to keep it all working, had become a significant expense.

What happened is what usually happens when manufacturing and engineering face a life-and-death crisis:  All the engineers got together, went to a resort somewhere, and got drunk . . . .

After they sobered up, they came up with a solution:  Drop the facade, and "come clean" with respect to drive geometry.  The result was the new Advanced Format Drive, (AFD), geometry that abandoned the idea of 512 byte sectors, organizing the drive geometry into larger "sectors", (now called "allocation units" or "allocation blocks"), that are 4 KB in length.

And I am sure you can guess what happened next.  It's what usually happens when someone comes clean about a sexy young mistress - the stodgy old wives had a fit!

The BIOS writers were / are still using the "Interrupt 13", (Int-13), boot process - a fossilized legacy from the days of the XT - and maybe even earlier.  And this boot process requires certain things:
  1. The hard disk will report a "sane" CHS geometry at start up.
  2. The Int-13 bootstrap would see 512 byte sectors for the partition table, boot code, and possibly even the secondary boot loader.
. . . . and it's kind-of hard to square a 4k allocation unit size with a 512 byte sector.

So, to keep the stodgy old wives happy, the hard drive manufacturers did two things:
  1. They allow the first meg-or-so of the drive to be addressed natively as 512 byte sectors.  This provides enough room for the MBR, (Master Boot Record), and enough of the bootstrap loader so that the Int-13 boot process can get things going.
  2. The drives would still accept requests for data anywhere on the drive based on 512 byte sectors with two caveats:  There would be a huge performance penalty for doing so, and YOU had to do more of the work to keep track of the sector juggling act.  And God help you if you dropped the ball!
And this is exactly the crux of the problem:  Many operating systems, (surprisingly, later versions of Windows are a notable exception), depend on sharing the juggling act with the hard drive itself.  Even Linux's hard-drive kernel modules assume that the drive will shoulder some of the load when using the legacy msdos partition table format.

I am sure you can guess what happens when HE expects you to be shouldering the entire load, and YOU expect him to shoulder his share.

This, my friends, is the 2000GB Gorilla and if he's not happy, things get "interesting". . . .

So, how do you go about taming this beast?

Interestingly enough, there has been a solution to this all along.  It's only now - when capacities large enough to require the AFD geometry have appeared on single-unit drives - that things have come to a head.

First:
The old "msdos" type of partition table makes assumptions about drive geometry that are no longer true.  Not to mention the fact that the msdos partition table can't handle exceptionally large drives.  Not without jumping through hoops or some really ugly hacks that we really don't want to think about.

The solution is to just abandon the msdos partition type, as there are a host of other partition types that will work just as well.  One in particular, GPT, is especially designed to work with more advanced drive geometries.

You do it like this:
(I'm using GNU parted, so that you can actually see what's happening.)
# parted
(parted)  select /dev/[device]
(parted)  mklabel gpt
(parted) [. . . . .]

Presto!  A non-msdos partition table structure that is compatible with the newer drive geometries.

Second:
You have to make sure that the partition table's clusters, (allocation units), are set up so that the logical allocation units, (where the partition thinks the clusters are), and the actual, physical allocation units on the hard drive itself, are aligned properly.

If you fail to do this you could suffer the same massive performance penalty as if you were addressing 512 byte sectors; because for every allocation unit you read or write, multiple physical allocation units may have to be individually read, updated, and/or written.  Fortunately the Linux partitioner, parted, will complain bitterly if it notices that things aren't aligned properly.

The solution - when using parted - is to skip the first meg of the drive so that physical and logical allocation units align correctly.

Like this:
(parted)  mkpart primary ext4 1 -1
(parted) [. . . .]
(parted) quit

Here you make a partition that is primary, preset as an ext4 partition, starting at a 1 meg offset from the beginning of the drive and stopping at the very end, (-1).  Of course, you can set the partition to ext2, ext3, or whatever.  I haven't heard of this being tried with xfs, Reiser, etc., so Your Mileage May Vary.

By the way, this works with the Western Digital drives I purchased and, ideally, other manufacturers should map their drives the same way.  However if you get a warning that the partition is not aligned correctly - look on the web, try different offsets and keep plugging at it until you get the geometry lined up just right.

Update:
I finally had the chance to try this with a couple of 2TB Seagate drives and the partitioning scheme mentioned above worked like a champ with them as well.  So there's a really good chance that, whatever brand of 2TB hard-drive you buy, this fix will work just fine for you too.
/Update

If you are creating multiple partitions, you have to check alignment for each and every partition from beginning to end.  Fortunately, if the first partition is aligned properly, there's a good chance that subsequent partitions will align properly too.
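Recent versions of GNU parted will verify this for you, partition by partition, (device name is an example):

# parted /dev/sdb
(parted)  align-check optimal 1
1 aligned
(parted)  align-check optimal 2
2 aligned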

Once you do that, you can use mke2fs, (or whatever), to create the actual file system in the normal manner.  And once that is done you should notice that the drive access times are MUCH faster than before and you don't get a mid-drive logical crash!
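Something along these lines, (device name, label, and mount point are placeholders):

# mkfs.ext4 -L nearline /dev/sdb1
# mkdir -p /mnt/nearline
# mount /dev/sdb1 /mnt/nearline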

It may appear more complicated now but I suspect, very highly, that those who work with these fundamental drive utilities and drivers will rapidly bring their software up-to-date so that this stuff is handled transparently in future releases.

There is, unfortunately, one caveat with all this:  You can kiss backwards compatibility with legacy versions of Linux goodbye - as well as compatibility with legacy non-Linux operating systems when you switch to any kind of advanced partitioning scheme.

Of course this is not news.

When drives switched from CHS to LBA, from LBA to LBA-48, or from parallel ATA to serial ATA, backward compatibility for the newer drives was also lost.  You could regain it if needed, but not without using some butt-ugly hacks or specialized hardware adapters.

And my money's on the almost certain possibility that - in a few years, when hundreds-of-terabytes or petabyte hard drives become mainstream - the AFD geometry will need a major update too.

What say ye?

Jim

Sunday, March 6, 2011

An Open Letter to Canonical and the Ubuntu Team


As Mark Twain once said:  "You shouldn't criticize where you, yourself, cannot stand perpendicular." (or something like that. . .)   Anyway, the message should be clear, take the 4x4 pressure-treated beam out of your own eye, before trying to remove a splinter from someone else's.

So - I really hate to criticize someone else's work - especially if I'm not a "contributor" to that work.

However, noting the current trend in the Ubuntu development, I feel compelled to make my feelings and opinions known.



Dear Canonical,

When I switched to Ubuntu from Red-Hat / Fedora, I was especially attracted by the Ubuntu slogan: "It's all about giving the user choices."

And this is a great concept. If a particular user wants a system that is essentially an "appliance", it's there; a distribution that is simple to configure and easy to use.

Likewise, if a user wants to "get under the hood" and get his hands dirty, that's available too.

Some of the features are absolutely unprecedented in the world of 'nix operating systems, such as the automagic "apropos" feature where a mistaken or mis-typed command is rejected - and the errant user is supplied with "did you mean. . . .?" suggestions. Even to the relatively experienced sysadmin, this feature is both welcome and useful.
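If you haven't seen it, the effect looks something like this, (the package is just an example):

$ sl
The program 'sl' is currently not installed.  You can install it by typing:
sudo apt-get install sl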

I also absolutely love the graceful way Ubuntu now handles device or mount errors in fstab - instead of puking its brains up with a kernel panic and dropping the user into a very limited shell, you tell the user "Such-and-so didn't mount or isn't ready yet." and offer the choice of doing an immediate fix-up, or just continuing without that device.

The ability for me to say "Yes, I know about that, just keep on going." is invaluable.  And it is especially invaluable for people like myself who often work with multiple possible configurations at the same time. Even if there is a real problem, (Oops! I forgot to update the UUID!), this is much easier to take care of from within the GUI, than from within a severely limited shell environment.

Unfortunately, both on the forums and within the distributions themselves, there is an increasing disdain for "dumbing down" the distribution.

Folks, I hate to break the bad news to you, but it is exactly and precisely this; the "dumbing down" as it were, that makes Ubuntu such a popular distro - you don't have to be an uber-geek to use it.  In fact, I mentioned in an earlier article on this blog that "Ubuntu is the first Linux distro that I would seriously consider installing on my wife's computer, or even my mother's."

My wife is the quintessential anti-geek, and my mother thought that Windows 98 was the best thing that ever happened.  They don't want stacks of Hollerith punch cards, or lists of cryptic commands - they want a system they can turn on, use for something useful and be done with it.  Just like a toaster or microwave.  These are people for whom today's multi-button TV / Satellite / DVD / Home Entertainment remote control is beyond their technological grasp.

But! "You don't want to dumb-down Ubuntu". And that, in my humble opinion, is a great loss for both the distribution in particular and those people who might be convinced to use it.

Secondly: Ubuntu is, (supposedly), all about giving the user "choices". . . . .

I don't know about you, but my understanding of the word "choice", (as in "choices"), means that I get the option of choosing between more than one alternative; that I get an active say in what, and how, my system is organized and configured, when it is being organized and configured.

Unfortunately, that credo has - apparently - gone by the boards at Ubuntu.

A couple of cases in point:

Someone, somewhere, had the brilliant revelation that Ubuntu should switch to Grub2 from the venerable, stable, and well understood Grub boot loader.

Again, in my humble opinion, Grub2 represents a throwback to everything that was universally hated and despised in the LILO loader. It's difficult to configure because the user has to find - and edit - an obscure "template" file and then run a special command that makes the changes for him. This is what made LILO such a (ahem!) "popular" loader - you couldn't just go edit a config file somewhere, you had to jump through hoops and pray to the Blessed Virgin that you didn't inadvertently bork things up beyond all recognition.

It seems to be an incredible coincidence that Ubuntu is the only distribution that has embraced Grub2. Even the very experimental distributions that seek to be at the Bleeding Edge of the curve have stayed away from Grub2 in droves.

But the most disturbing aspect was this: I wasn't offered a choice. Nowhere in the installation or upgrade process was I asked "do you want to use Grub2 or Grub as your boot loader?" Grub2 is the default. And in my opinion, it's clearly "default" of whoever had that brilliant idea in the first place.

More recently someone had the amazing brain-storm that the default GUI should suddenly transition from the familiar Windows-like interface, to a much more Mac-ish design with tiny, difficult to see and use, Mac-like buttons all on the left hand side.

It is important to remember when designing a GUI, that not everyone is 20 years old and not everyone has 20/20 eyesight. Buttons, especially these fundamental control buttons, should be big and bright so that they are easy to find and easy to use.

Again, this was not something that I had the opportunity to select or not as I saw fit. Instead, it suddenly and magically appeared.

This change, more than any other change Ubuntu has foisted upon us, has me shaking my head in absolute wonder.

Grub2? If I make an incredible technological stretch of my imagination, I can - maybe - see some sense in supporting the newly invented EFI boot protocol; despite the fact that the only PCs that required it were based on the now-defunct Itanium processor.

However, the move to Mac-ize the GUI is absolutely beyond my comprehension, no matter how far I stretch my imagination. Does Ubuntu seriously believe that by this change in the GUI, that they can convert legions of Mac users to Ubuntu?

You forget two essential facts:
  • To the Apple product user, the Mac isn't a system; it's more like a religion - with the rest of us being the "unsaved heathens". Switching to any other operating system would be sacrilege of the highest order! Mac users may - under duress - use other operating systems at work because they are forced to; but they complain endlessly about it.
  • Over 95% of the personal computers in use today, (as well as a substantial percentage of the servers), use Windows of one flavor or another. The Windows GUI paradigm is, unarguably, the most popular and well known GUI on the planet. And I strongly suspect that should we ever venture to Mars or other planets in our Solar System, Windows will be in the vanguard of that venture.

So, if we assume that one of the main thrusts of the Linux community is to attempt to broaden the Linux user-base, where do you think these users are going to come from? The Mac? Who are you kidding?! They already crow about having their own 'nix based system - FreeBSD - with the Mac GUI pasted on top of it. A 'nix based Mac was inevitable, but by slapping the Mac GUI on it they keep their religious sensibilities and the purity of their beliefs.

No, the real market for cross-over users are those that use Windows. The Microsoft licensing model is becoming increasingly onerous. The real cost of implementation is becoming increasingly expensive to the point that entire governments, both state and national, have eschewed Windows in favor of Open Source solutions.

Changing the basic GUI paradigm from a familiar Windows-like paradigm to a much less popular and more difficult to use Mac type interface only serves to drive away users that might be tempted to make the switch. Linux is already different enough in many respects, why make it even more alien?

Allow me to offer the following suggestions:
  • Lose Grub2.
    I don't know of a single Sysadmin using Linux who would, willingly, get within a hundred yards of Grub2. And it's the Sysadmins' recommendations on what operating system to use that drive the implementation of Linux in general and Ubuntu in particular.
  • Forget the "pseudo-Mac" interface.
    The easier you make it for the Windows user to make the transition, the more Windows users will actually want to make it. Again, it's the power-user that is at the forefront of evangelizing Linux. The more you frustrate them, the less likely it is that they will recommend the switch.
  • Keep It Simple, Stupid! (The "KISS" rule)
    "Dumbing-down" Ubuntu to make it more easily within reach of the average user is absolutely the primary key toward the goal of getting people to transition away from Windows.
  • Don't forget to actually give the user a CHOICE.
    Most distributions, prior to making a radical change, "deprecate" the original method for several releases before making the actual change itself.
    First of all, it puts people on notice that a fundamental change is in the works.
    Secondly, it gives users a chance to "try before they buy" - and weigh in on the proposed change. Does this change annoy 90+% of your user-base? Uh, maybe we should rethink it. . . .
  • Fork the distributions.
    Make the ".0x" distributions focus on what works, not on what can be changed. Avoid making radical changes in the design - unless absolutely, positively, inescapably necessary. This gives the user an important continuity of design that is essential in production environments. This continuity of design also eases the transition shock of those switching from other operating systems.

    Make the ".1x" distributions the "experimental" distributions - where new things are tried and eventually proven or discarded. Eventually, when a new feature or other change is sufficiently proven and useful, it can then be merged with the main-line ".0x" releases.

In essence, Ubuntu would actually consist of two, separate and distinct, release paths. The first for those users who want long-term stability and the other for those users who want to be on the Bleeding Edge.

It would also have the advantage of giving each prong of the fork a one-year release cycle, as opposed to the current release cycle of one every six months.

You could have two separate teams, each working with enough time to make their distributions the best there is. It would also give time for a few "Official Beta Releases" to test the waters, so to speak.

Please remember that it is ultimately the user-base itself that decides if a particular distribution sinks or swims. Right now Ubuntu is riding high - just don't forget that it's a long hard fall when you get toppled from your ivory pillar.

What say ye?

Jim