UK Rio Engineer wrote on Mar 13, 2004 @ 04:47 PM:
*Personal opinion!*
From what we've been able to tell, the "dead" HDDs most people report have one failure - head on disk. This is when the hard disk has stopped rotating, but the heads (which should have been moved to the parking ramp and locked) are still on the disk platter.
Hard disk heads "fly" over the surface of the disk on a cushion of air, generated by the movement of the platters. The surface of the disk is very sensitive, and it's not designed to have things landing on it. As both the disk and the head are ultra-smooth, when they touch they sort of bond (this is what people call stiction, and you can witness it with any two incredibly smooth surfaces that are put together).
The bond is so strong that the hard disk can't get spinning the next time it wants to spin up, as in effect, the head is holding the disk.
An ATA HDD is a complete system in itself, and there's no situation where it should turn off the HDD motor without the heads being safely parked. Even in the worst-case scenario (hard disk power unexpectedly removed) it has a system which uses stored charge on a capacitor to park the heads when there's no power going to the drive - it does this instantly, while the platter is still spinning. Mobile drives (maybe desktop ones too, nowadays) also have a head lock solenoid, which physically locks the head in the parked position, to guard against shocks when the drive isn't running.
Obviously, our firmware follows Hitachi's guidelines about turning the hard disk off - you issue a standby command, wait for this to finish successfully, then remove power.
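As a rough illustration of that ordering, here is a minimal Python sketch against a simulated drive. Everything in it is hypothetical: `FakeDrive`, `safe_power_off` and the method names are made up for the example and are not the Karma firmware or any Hitachi interface. The cache flush before standby is included because flush cache/standby commands are mentioned further down; the exact sequence the real firmware uses is not stated here.

```python
class FakeDrive:
    """Simulated drive; a hypothetical stand-in, not a real ATA interface."""

    def __init__(self):
        self.spinning = True
        self.heads_parked = False
        self.powered = True
        self.log = []

    def flush_cache(self):
        # Pretend the write buffer was committed to the platters.
        self.log.append("flush_cache")
        return True

    def standby(self):
        # A healthy drive parks the heads before stopping the motor.
        self.heads_parked = True
        self.spinning = False
        self.log.append("standby")
        return True

    def remove_power(self):
        self.powered = False
        self.log.append("remove_power")


def safe_power_off(drive):
    """Flush, issue standby, and cut power only if standby succeeded."""
    if not drive.flush_cache():
        return False
    if not drive.standby():
        return False  # leave power on rather than risk unparked heads
    drive.remove_power()
    return True


drive = FakeDrive()
ok = safe_power_off(drive)
print(ok, drive.log)
```

The point of the sketch is only the ordering: power is never removed unless the standby step has reported success.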
Note that the heads being on the disk is also a possible result of a *large* shock when the drive is not operating, causing the head to fly out of its parked/locked position and land on the disk. IMO this would be more serious, as the head is likely to have scratched the platter radially. The one drive I've seen in this state (which was dropped down two floors in a stairwell) did come up after being hit, but had a huge number of bad sectors and simply wasn't well.
My *personal* theory is that there is a bug in the drive firmware which means that under some circumstances (we don't know what they are, otherwise this would become a very easily replicable problem and so rather easier to prove), the drive will stop spinning without parking the heads, meaning that the next time it tries to power up you just hear the "click click click" noise from the drive, which is the motor driver energising the platter motor trying to get it spinning.
When people hit their drives when they are in this state, the shock breaks the bond between the head and the platter, and the disk spins up and carries on working. The issue here is that the head landing on the platter may have caused surface damage to the disk, ie bad sectors. Like any ATA drive, the one in the Karma deals with this automatically, and will reassign a good sector to replace the bad one from the reserved areas of the disk the next time this sector is written. I believe that when people hit their player, then see it rebuilding the database when it comes back up, a sector in the database has been marked as bad and hence the player rebuilds it and saves it to disk, hence reassigning the bad sector and returning the system to full function.
Note that a violent "twist" of the player may also work, as in - hold it in your outstretched arm with the screen towards you and twist your wrist back and forth as quickly as you can. Possibly this is a less frightening thing to do with a non-functional player than hitting it, which is obviously something neither I, Rio nor Hitachi would recommend.
The reassignment process doesn't make your drive "smaller" - it uses a non-system-accessible area of the disk for remappings. Obviously, there are only a finite number of reserved sectors, and so at some point the drive will not be able to reassign and will have to just permanently record the sectors as bad. At this point, things will start to fall over unrecoverably.
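The reassignment behaviour described above can be modelled with a toy Python class. This is a sketch under assumptions, not how the drive firmware is actually implemented: `RemapModel`, its fields and the sector counts are invented for illustration.

```python
class RemapModel:
    """Toy model of bad-sector reassignment; all numbers are illustrative."""

    def __init__(self, user_sectors, spare_sectors):
        self.user_sectors = user_sectors   # capacity the host sees
        self.spares_left = spare_sectors   # reserved, not host-visible
        self.pending = set()               # bad sectors awaiting a write
        self.remapped = {}                 # logical sector -> spare index
        self.permanently_bad = set()

    def mark_bad(self, lba):
        self.pending.add(lba)

    def read(self, lba):
        # A pending or permanently bad sector cannot be read.
        return lba not in self.pending and lba not in self.permanently_bad

    def write(self, lba):
        # Reassignment happens on the next write to a pending sector.
        if lba in self.pending:
            self.pending.discard(lba)
            if self.spares_left > 0:
                self.spares_left -= 1
                self.remapped[lba] = len(self.remapped)
            else:
                self.permanently_bad.add(lba)
                return False
        return lba not in self.permanently_bad


disk = RemapModel(user_sectors=1000, spare_sectors=1)
disk.mark_bad(10)
disk.mark_bad(20)
print(disk.read(10))       # unreadable until it is rewritten
print(disk.write(10))      # the write triggers the remap to a spare
print(disk.read(10))       # readable again after the remap
print(disk.write(20))      # fails once the spare pool is exhausted
print(disk.user_sectors)   # visible capacity never changes
```

With a single spare, the first bad sector is healed by a rewrite and the second is recorded as permanently bad, while the capacity the host sees stays the same throughout.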
So, results of hitting the drive may include:
- No harm at all
- Bad sector(s) generated in the database area: A database rebuild (player self-heals)
- Bad sector(s) generated in the music area: A track being marked as bad (if a sector in the track is unreadable). You would have to delete then re-upload the track, which would reassign the sector.
Hitachi are investigating this problem, and have been sent drives that have shown this behaviour. We are also trying to come up with ways that we think we might be able to modify our firmware so as not to provoke the possible firmware bug in the drive itself.
IN MY OPINION it's a drive firmware bug that happens to be provoked by the order of read/write commands we send to the drive, and the resulting write buffer levels at the time of the flush cache/standby commands. However, getting a component supplier to admit liability is hugely unlikely, so we're seeing if there are any workarounds we can come up with. Problem is, with the situation being incredibly rare on any one player (ie, it has a good chance of NEVER happening on a single player - it's never happened on mine, though it has happened twice in our office, where we have had maybe 30 players for the last year) it's also hard to work out whether your changes have improved the situation at all!
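To put a rough number on why that is hard to judge, here is a back-of-envelope sketch in Python. It assumes failures arrive independently at a constant rate (a Poisson model, which is an assumption made for this sketch and not something that has been measured), and it takes the office figures of about 2 incidents across about 30 players in a year at face value, even though a rate estimated from two events is itself very uncertain.

```python
import math

incidents = 2    # failures seen in the office (figure from the post)
players = 30     # approximate number of players (figure from the post)
years = 1.0

# Estimated failures per player per year under the assumed model.
rate = incidents / (players * years)


def p_no_failures(n_players, n_years, rate_per_player_year):
    """Poisson probability of zero failures over the given exposure."""
    return math.exp(-rate_per_player_year * n_players * n_years)


p_single = p_no_failures(1, 1, rate)
p_fleet = p_no_failures(players, 1, rate)

print(f"one player, one year, no failure: {p_single:.3f}")
print(f"{players} players, one year, no failure: {p_fleet:.3f}")
```

Under these assumptions a single player has about a 94% chance of getting through a year without the problem, and all 30 office players together would still see a clean year about 13% of the time with no fix at all. So a year without incidents after a firmware change would be only weak evidence that the change helped.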
Just to reiterate: these are my opinions, not those of Rio.