As a daily routine, I was trying to login ( ssh ) to one of my staging instances ( AWS ec2 ) but was unable to login to it with my credentials. After doing basic troubleshooting, including checking if my network is choked, and checking if CPU/network of the instance is getting choked because of some rogue process ( BTW AWS provides good dashboard for basic monitoring without login into instance ), and verifying that ssh client is properly getting authenticated. I realised that none of these issues were causing the problem.
I went back to “IT Crowd” mode and tried “switching off and switching on” the instance, it worked, …for once. I was able to login once but once my session got expired, I was not able to login again. I can’t keep restarting the my instance whenever I want to connect so I went back to troubleshooting mode.
If there is some answer, it must be in the logs of the system, So, I restarted the system again and connected and checked ssh logs. To my surprise there are around 12K failed login attempts, As I was checking the logs, that instance keep getting more and more login attempts with most used username password combinations. The reason for me not able to connect is that the number of login attempts are a lot and increased “btmp” log to very large size. Because of this server is not able to start my session. Lucky for us, the login attempts will be unsuccessful thanks to the fact that ec2 instance is authenticated by pem file ( private key ) instead of password.
First thing to do for protecting my instance is to block all requests. AWS provides excellent security options by creating security group by default in which I can configure which ports to be open to public and whitelist IPs which can access that particular port. when I created this instance, there were multiple developers accessing this instance from multiple locations and hence it didn’t make sense to restrict ssh login from one IP. Now that there is attack going on, First thing I did is restrict port 22 to be accessed only from my IP. This blocked the requests going to instance. I took back up of the logs and cleared all logs on that instance. and I was able to login subsequently.
I wanted to investigate the attack. I took the backed up logs and started analysing them. log has information about IP, username, timestamp of attempt. I assumed that there may be couple of IPs from which the requests are coming from. After looking at the logs I realised that requests are coming from at least 8K different IPs from different countries. A zombie network is trying to brute force ssh login of my instance with public IP. I sorted the IPs based on number of requests and found that there are 3 IPs which are sending at least 20% of the requests. Rest of the IPs seemed to be sending one or two requests. The top three IPs are from different locations and different cloud providers. I reported them to the cloud provider and waited for their response.
For an automated attacks, everything accessible on internet is a target, irrespective of their importance to the owner. It could be happening to lots of instances that accessible from internet. I could not have thwarted this attack without tools provided by AWS. there are lot of cloud providers who don’t have these tools at disposal. I have recently witnessed an attack on GoDaddy’s wordpress hosting which hosts lot of websites on shared instance got hacked. The website owner had to rely on GoDaddy’s support service for doing anything. Even with more than 30+ years of internet experience, we are still struggling to keep our infrastructure safe from attackers.
With the kind of growth of internet economy it is important for us to focus on security of our infrastructure. Dev/Ops needs to be equipped with better tools for defending themselves.