Saturday, January 12, 2013

Recover a broken Amazon EC2 instance

Sometimes shit happens, accidentally.
Especially when you lose SSH access to your virtual server (e.g. due to errors in ~/.ssh/authorized_keys, /etc/sudoers, /etc/group, /etc/shadow or another important config file).

When you catch errors like:

No supported authentication methods available

or

sudo
sudo: >>> /etc/sudoers: syntax error near line 1 <<<
sudo: parse error in /etc/sudoers near line 1
sudo: no valid sudoers sources found, quitting
sudo: unable to initialize policy plugin


Don't worry, inside the Amazon cloud you can repair almost anything.

Follow these simple steps to solve the problem:
  1. Via AWS console:
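    1. Launch a new temporary EC2 micro instance in the same availability zone as the broken one (an EBS volume can only be attached to an instance in its own availability zone).
    2. Stop the broken instance.
    3. Detach the root EBS volume from the broken instance.
    4. Attach this EBS volume to the temporary instance (as /dev/sdf).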
  2. Via SSH console from the temporary instance:
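    1. Mount this EBS volume (attached as /dev/sdf, it usually shows up inside the instance as /dev/xvdf; if the volume has a partition table, mount the partition instead, e.g. /dev/xvdf1):
      sudo mount /dev/xvdf /mnt
    2. Fix all problems on the mounted file system. The broken files now live under /mnt, e.g. /mnt/etc/sudoers (sudo visudo -c -f /mnt/etc/sudoers checks its syntax).
    3. Unmount this EBS volume:
      sudo umount /mnt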
  3. Via AWS console:
    1. Detach this EBS volume from the temporary instance.
    2. Attach this EBS volume to the broken instance (as /dev/sda1).
    3. Start the broken instance.
  4. Check that the broken instance is now healthy (available via SSH and everything is functioning normally).
  5. Then you can stop the temporary instance (or terminate it).
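
If you prefer the command line to the AWS console, steps 1.2-1.4 and 3.1-3.3 can be sketched with the AWS CLI roughly as below. This is only a sketch under assumptions: the instance and volume IDs are hypothetical placeholders, the helper names (run, move_volume_to_rescue, return_volume_to_broken) and the DRY_RUN switch are made up for illustration, and the AWS CLI is assumed to be installed and configured for the right region.

```shell
#!/bin/sh
# Sketch only: every ID below is a hypothetical placeholder, not a real resource.
BROKEN_ID="${BROKEN_ID:-i-0123456789abcdef0}"    # the broken instance
RESCUE_ID="${RESCUE_ID:-i-0fedcba9876543210}"    # the temporary instance
VOLUME_ID="${VOLUME_ID:-vol-0123456789abcdef0}"  # root EBS volume of the broken instance
DRY_RUN="${DRY_RUN:-1}"                          # 1 = only print the commands

# Print the command, and execute it only when DRY_RUN is not 1.
run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Steps 1.2-1.4: stop the broken instance and move its root volume
# to the temporary instance as /dev/sdf. Stops at the first failing command.
move_volume_to_rescue() {
    run aws ec2 stop-instances --instance-ids "$BROKEN_ID" &&
    run aws ec2 wait instance-stopped --instance-ids "$BROKEN_ID" &&
    run aws ec2 detach-volume --volume-id "$VOLUME_ID" &&
    run aws ec2 wait volume-available --volume-ids "$VOLUME_ID" &&
    run aws ec2 attach-volume --volume-id "$VOLUME_ID" \
        --instance-id "$RESCUE_ID" --device /dev/sdf &&
    run aws ec2 wait volume-in-use --volume-ids "$VOLUME_ID"
}

# Steps 3.1-3.3: give the repaired volume back as /dev/sda1
# and start the broken instance again.
return_volume_to_broken() {
    run aws ec2 detach-volume --volume-id "$VOLUME_ID" &&
    run aws ec2 wait volume-available --volume-ids "$VOLUME_ID" &&
    run aws ec2 attach-volume --volume-id "$VOLUME_ID" \
        --instance-id "$BROKEN_ID" --device /dev/sda1 &&
    run aws ec2 wait volume-in-use --volume-ids "$VOLUME_ID" &&
    run aws ec2 start-instances --instance-ids "$BROKEN_ID"
}
```

With the default DRY_RUN=1, calling move_volume_to_rescue or return_volume_to_broken only prints the commands so you can review them first; set DRY_RUN=0 to really run them. Mounting and fixing the files (step 2) still happens over SSH on the temporary instance. If your instance used a root device name other than /dev/sda1, attach the volume back under that name instead.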

I hope this article will make someone a little bit happier :)

5 comments:

  1. Saved my bacon. It's silly that I ended up here (who woulda thunk that a simple yum update would render my AWS free tier RHEL server unbootable?? Oh yeah, custom kernels for AWS...) but I did end up needing these exact steps. Thanks for posting these up

  2. +1 very helpful. Was kind of surprised there was no java console (not via ssh) or a way to access a virtual serial port via CLI or web UI. Thanks for sharing.

  3. Excellent process that helped me.

  4. can you please explain how to do the step

    1.4 Attach this EBS to the temporary instance
