Yesterday morning our primary system disc showed excessive Seek_Error_Rate errors … almost a billion. The disc had been running continuously for the last 4.7 years, 24 hours a day, 7 days a week. I ran these numbers past our web hosting service and they recommended that we replace the drive as it was in pre-fail mode. I communicated with Saker and he approved the outage to replace the disc. A 30-minute warning was posted, then the server was shut down. I expected only a 1-2 hour outage, but I sure was wrong. The drive was removed from the server and the data was cloned to a new drive by the hosting staff. This took hours, as explained in detail here. After the cloning was complete and the new drive installed, the system failed to fully boot. After a few reboots by the hosting staff it was found that an additional manual console step was required. Once that was done the system booted up fully, and thesaker.is is now back with a brand new disc that hopefully will run for another 4.7 years.
UPDATE: Some of the commentators have suggested RAID-1 technology for the site, so I asked our hosting provider for a quote on RAID-1 storage. Software RAID-1 is $204/year or $17/month … sounds like a good price. Hardware RAID-1 is much more expensive at $459/year or $38.25/month. Extending these numbers to the 4.7 years that we used the last drive, the RAID insurance would have cost us $959 for the software RAID and $2,157 for the hardware RAID. I personally feel that this level of continuity and uptime is not required for a non-revenue-generating site.
Regards
Herb Swanson (webmaster)
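For anyone who wants to check the RAID-1 cost figures in the update above, here is a minimal Python sketch of the arithmetic. The prices and the 4.7-year lifetime come from the post; the function and constant names are just illustrative.

```python
# Quoted prices from the post, in US dollars per year.
SOFTWARE_RAID1_PER_YEAR = 204
HARDWARE_RAID1_PER_YEAR = 459
DRIVE_LIFETIME_YEARS = 4.7  # how long the replaced disc ran


def monthly(per_year):
    """Yearly price spread evenly over 12 months."""
    return per_year / 12


def lifetime_cost(per_year, years=DRIVE_LIFETIME_YEARS):
    """Total cost over the drive's lifetime, rounded to whole dollars."""
    return round(per_year * years)


if __name__ == "__main__":
    for label, price in (
        ("software", SOFTWARE_RAID1_PER_YEAR),
        ("hardware", HARDWARE_RAID1_PER_YEAR),
    ):
        print(
            f"{label} RAID-1: ${monthly(price):.2f}/month, "
            f"${lifetime_cost(price):,} over {DRIVE_LIFETIME_YEARS} years"
        )
```

This only reproduces the quoted numbers; it says nothing about the cost of downtime, which is the other side of the trade-off.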
I am so enormously relieved.
I thought you guys were taken down.
Thank you Herb.
Me too!
Really relieved to see it back and working!
What OS? With the latest Windows Server, what would you get, 2009 minutes, if you are lucky? Granted it's a client OS, but Windows 9x was good for around 95-98 minutes in my experience. But then I used it to access Flash-dependent sites through AOHell.
PS, I am well aware that in the mainframe days 10 years was possible, and that as a server OS Linux and the BSD flavors are relatively stable compared to Windows, which like glass, breaks easily.
Yes, I also would like to know how long the disc would have lasted. It ran for 4.7 years and was still working flawlessly. However, to find out we would have had to run it until failure (which would be unacceptable on a production website). Our hosting firm reviewed the disc's SMART attributes and recommended replacement. I have to respect their recommendation as I’m sure they have seen many disc failures.
Herb
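For readers curious what a review of SMART attributes can look like, here is a rough Python sketch. It assumes the attribute table uses the column layout printed by `smartctl -A` (ID, name, flag, value, worst, threshold, type, updated, when failed, raw value). The sample row is made up for illustration and is not the real reading from the replaced disc, and the helper names are hypothetical. The check shown is the usual SMART convention that an attribute has tripped once its normalized value falls to or below its threshold; a hosting provider will likely weigh more than this single rule.

```python
# Made-up sample row in `smartctl -A` column layout (NOT the site's real data).
SAMPLE_ROW = (
    "  7 Seek_Error_Rate         0x000f   060   055   030"
    "    Pre-fail  Always       -       987654321"
)


def parse_smart_line(line):
    """Split one attribute row into a dict of the fields used below."""
    parts = line.split()
    return {
        "id": int(parts[0]),
        "name": parts[1],
        "value": int(parts[3]),    # normalized current value
        "worst": int(parts[4]),    # lowest normalized value seen so far
        "thresh": int(parts[5]),   # vendor failure threshold
        "type": parts[6],          # "Pre-fail" or "Old_age"
        "raw": " ".join(parts[9:]),
    }


def has_tripped(attr):
    """True when the normalized value is at or below a non-zero threshold."""
    return attr["thresh"] > 0 and attr["value"] <= attr["thresh"]


if __name__ == "__main__":
    attr = parse_smart_line(SAMPLE_ROW)
    print(attr["name"], "raw:", attr["raw"], "tripped:", has_tripped(attr))
```

Note that a large raw value alone does not trip this check; the raw number's meaning is vendor-specific, which is one reason to lean on people who have seen many drives fail.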
Hi,
15 plus years web hosting experience here.
You should definitely look into a RAID mirror setup. Properly set up, if one disk fails the other still operates and you can replace the bad one without any downtime.
As you may already have bought a new disk, you can use these reports from Backblaze for future reference. They run a backup company and they publish long-term stats on hard disk models and failure rates.
https://www.backblaze.com/blog/backblaze-drive-stats-for-q2-2021/
IMO, 4.7 years is very long in the tooth; you should be prepared for a shorter time between replacements.
HTH,
BB
Thanks for the explanation, Herb, and the wonderful work you, the Saker and the rest of the team are doing at thesaker.is
Blessings and long may you run!
Hi guys!
Technical idea: what about having your system disk mirrored (RAID 1) across disks that are not the same brand, with the hot-swap feature on the server? (I am using this on my servers, some of them HP ProLiant G8, pretty cheap second hand.)
This would allow you to replace a system disk without any worry and having to shut down the server.
Same goes for the data disks (RAID 1 or 5).
In any case, thank you very, very much for the quality data you have on your site!!
Best regards,
MR
This is a relief – I have been worrying as well that something other than downtime for a HD replacement had happened.
I second the recommendation for a RAID 1 or, better, a RAID 5 array – with hot-swap drives, in RAID 5 you can replace one dying disk even while the machine is up and running online. A synchronized drive in a second computer or NAS would be a good idea as well. Power supply units tend to break down after some years, and sometimes they roast your main board in the process. Then you have several problems at the same time since, depending on your OS, your drives may not boot on your new main board. The best solution with still reasonable hardware requirements is a web server with a separate boot disk and a RAID 5 array for the data only, ideally running Linux, with the data drive synchronized to an external NAS drive as backup. Having a spare power unit on your shelf is not too expensive either.
Good to see you back online… thought you might be in the cross hairs and taken down.
A Big cheers to the tech team.
Thank you. Appreciated.
Who does the hosting/colocation…?
I only ask because they seem to have responded to the problem in a very positive and helpful way.
Well done!
Tried to log in late at night to no avail. Thought it could be some mischief by known enemies of the Saker Blog. Good you are back.
That’s what RAID is for, to eliminate downtime. Better still, keep a hot spare to replace a failed disk automatically, and keep redundancy. Even better, a self-healing file system, with snapshots copied periodically to backup.
Server grade HW, with redundant supplies, and error-correcting memory is highly recommended for 24/7 operation. Higher cost solutions with even higher resiliency, and better capability to survive are also available.
Consult a HW/system guy, those SW types have no clue of the innards.
BTW, SSDs are cheap enough nowadays, even those high endurance, and performance ones can be got cheaply SH.
Good thing you stopped at the first sign of a problem. I had a hard drive die (only once in 30+ years with multiple computers). I probably could have copied the data when it first showed signs of trouble, but I didn’t and ended up with it as responsive as a brick. It could have been worse. I got 99.9% of my data back with the help of a data recovery firm, for $1900 and almost 2 weeks' time. It was a good lesson about keeping up-to-date backups.
Good thing I did not e-mail the saker to inform about the site being under attack.
Servers can be configured with RAID mirrored drives. Even desktop motherboards have SATA controllers that support RAID. As an IT Director I required all servers in my division to be configured with RAID-1 at minimum.
I thought it was being shut down by censorship from “you know who”.
Apparently there is a backup plan for that, not sure it's been tested.
Happy birthday Debian 11!, stable since 14 Aug. 2021.
Thanks so much Herb for your efforts!
We’re all on a learning curve in life.
Are there alternate sites in case the censorship screws tighten on this website?
The old blogspot ( https://vineyardsaker.blogspot.com/ ) domain may be even less reliable, and I noticed other saker domains (NZ/Oceania, Serbia) are down too.
Such a relief as we desperately need this website to inform us on the current events.
Best wishes and good health to my fellow Saker Community!!!
You cannot get rid of me that easily!
I had a panic attack!
michael lacey / Prospector / Mad Serbian and most of us, LOL!
I was driven to MOA site to see comments and found Per/Norway and Sean making inquiries; on Smoothie site, Larchmonter to my relief explained it was a system maintenance issue.
Last year there was a bit of predictive programming by Public Enemy called, “What You Gonna Do When The Grid Goes Down?”
What will WE do when the AZ control freaks nuke the internet, the penultimate point before the Götterdämmerung?!
Where to get our fix?!
We should begin a gentle self-controlled withdrawal process.
Just a note from me. The Saker Site has amazing uptime (for an alternative and relatively small site in size, but massive in its reach and influence).
That does not come without smarts. Herb has done a hell of a job here, and I am in agreement with his no Raid-1 decision for a site of our size and a site that does not have a profit-making objective. It is a fine decision and we’re always broke as well!
@everybody: I TOTALLY agree with amarynth, I think Herb has done an ABSOLUTELY SUPERB job running, maintaining and upgrading this website. He is, BY FAR, the best webmaster I have had (and the others were pretty good too!).
As for the Raid-1 decision, he consulted with me and I fully agreed with his arguments.
Cheers
The Saker
On the cheap? Kind of a professional hobby of mine!
Snapshot or some sort of delta dumps of the database, encrypt it, rsync it nightly over to a cloud provider. Linode has a pretty good low-cost service. I pay $10/month for mine. Maybe deltas each night until a non-busy night for a full dump. So you’d have to move all of that for a db restore, but it would get you back to the last night.
The rest of the system would be pretty static, so that could even be a system image burned to DVD. You’d have to update that each time you change the css/html etc.
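The nightly routine described above could be sketched in Python roughly like this. Everything specific here is an assumption for illustration: the quiet night being Sunday, the placeholder destination `backup@example.com:/srv/backups/`, the file naming, and the helper names. The sketch only decides between a full and a delta dump and builds the rsync command for an already encrypted dump file; it does not run anything, and the dump and encryption steps themselves are left out because they depend on the database in use.

```python
import datetime

# Hypothetical settings, not taken from the site's real setup.
FULL_DUMP_WEEKDAY = 6  # Sunday (Monday is 0), assumed to be the quiet night
REMOTE = "backup@example.com:/srv/backups/"  # placeholder destination


def dump_kind(day):
    """Return "full" on the assumed quiet night, "delta" on other nights."""
    return "full" if day.weekday() == FULL_DUMP_WEEKDAY else "delta"


def dump_filename(day):
    """Name of the encrypted dump file for a given night (made-up scheme)."""
    return f"db-{dump_kind(day)}-{day.isoformat()}.sql.gpg"


def rsync_command(local_file, remote=REMOTE):
    """Build (but do not run) an rsync-over-ssh call for one dump file.

    -a preserves file attributes; --partial keeps a partly transferred
    file so that a rerun after a broken link does not start from zero.
    """
    return ["rsync", "-a", "--partial", "-e", "ssh", local_file, remote]


if __name__ == "__main__":
    today = datetime.date.today()
    print(" ".join(rsync_command(dump_filename(today))))
```

A restore would then need the most recent full dump plus every delta since, which is the cost of keeping the nightly transfers small.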
Thank you so much for your time and energy, Mr. Swanson!
How about database replication via ssh?
Rsync runs on top of ssh and adds the synchronize part. Very nice for database dumps should a link break in the middle of a transmit. Rerun and you don’t have to start from zero. Ssh itself though, what a lovely utility! (I don’t wax lyrical about *nix utils every day but these two happen to have a special place in my heart.) LOL