Shrek.kde.org is a powerful server that hosts git.kde.org and svn.kde.org, both of which run as virtual machines. Shrek’s operating system was dated and no longer received security updates, so it needed to be upgraded. We had planned that for the end of the week, and this morning we got confirmation that the work could be done today.
We took down the virtual machines and upgraded Shrek without problems. Then the two virtual machines were started again to resume normal operations. A few seconds later it became clear that the virtual machines had file system corruption. We took the machines down again and ran fsck on the images. It reported doubly used blocks for ~300 files. The result is that some git repos are damaged beyond repair on the master server. SVN has been fixed and is running fine.
You would probably think that it’s not that bad, since we have about four anongit mirrors around the world from which we can restore. But there is a problem. Because the git server was powered up for a few minutes, the anongit mirrors synced the corrupted repos, so the repos on the mirrors are corrupted as well.
Luckily, it seems we can restore bits and pieces from each mirror, helped by other tricks our git experts are implementing. All the sysadmins who have knowledge about this are on it and working hard to complete this difficult task. This includes Dirk Muller, Nicolas Alvarez, Jeff Mitchell and Ben Cooksley.
I’ll try to give more updates via identi.ca: http://identi.ca/kdesysadmin