git.kde.org down…

Shrek.kde.org is a powerful server that hosts git.kde.org and svn.kde.org. Both are virtual machines. Shrek’s operating system was dated and had run out of security updates, so it needed to be upgraded. We had planned that for the end of the week, and this morning we got confirmation that the work could be done today.

We took down the virtual machines and upgraded Shrek without problems. Then the two virtual machines were started again to resume normal operations. A few seconds later it became clear that the virtual machines had file system corruption. We took the machines down again and ran fsck on the images. It reported double-used blocks for ~300 files. As a result, some git repos are damaged beyond repair on the master server. SVN has been fixed and is running fine.

You would probably think that it’s not that bad, since we have about four anongit mirrors around the world from which we can restore. But there is a problem: because the git server was powered up for a few minutes, the anongit mirrors synced the corrupted repos, so the repos on the mirrors are now corrupted as well.
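As an illustration of how a mirror could guard against pulling in damaged data, here is a minimal sketch that checks each repository with `git fsck` and only marks the healthy ones as safe to sync. It is not what the KDE sysadmins run. The function names `git_fsck_ok` and `repos_safe_to_sync` and the repository paths are hypothetical, and a clean `git fsck` only shows that the object store is internally consistent, not that the history is complete.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def git_fsck_ok(repo: Path) -> bool:
    """Return True if `git fsck --full` exits with status 0 for `repo`."""
    result = subprocess.run(
        ["git", "-C", str(repo), "fsck", "--full"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def repos_safe_to_sync(
    repos: Iterable[Path],
    check: Callable[[Path], bool] = git_fsck_ok,
) -> Tuple[List[Path], List[Path]]:
    """Split repositories into (healthy, suspect) using the given check."""
    healthy: List[Path] = []
    suspect: List[Path] = []
    for repo in repos:
        # A repository that fails the check is held back instead of mirrored.
        (healthy if check(repo) else suspect).append(repo)
    return healthy, suspect
```

A sync job built on this would mirror only the first list and raise an alert for the second, so that a corrupted master does not overwrite the last good copy on every mirror.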

Luckily, it seems we can restore bits and pieces from each mirror, along with other tricks our git experts are implementing. All the sysadmins who have knowledge of this are on it and working hard to complete this difficult task. This includes Dirk Muller, Nicolas Alvarez, Jeff Mitchell and Ben Cooksley.

I’ll try to give more updates via identi.ca: http://identi.ca/kdesysadmin

4 thoughts on “git.kde.org down…”

  1. Wow, seems like you guys need a proper storage system… if you need some reliable NFS (and/or iSCSI/Fibre) storage let me know, I can get you some hardware (for free). It’s not the latest & greatest but it’s very reliable

    If you want to know more feel free to contact me

    -Michael

    1. Hey Michael, thanks for the offer. Actually, I am sure we can find a very good place for such a device.
      Could you tell us a bit about the specs (space, bandwidth, etc.)? And also, some encryption maybe?

  2. Wow-wee. The phrase “last known valid state” sends shivers down my spine. Mirrors are no good if they all sync from a single source. If the single repo goes south, then everything is FUBAR.

    Hate to say it, but should have had redundant backups.
