Yesterday the entire One Month platform went down for over four hours during our big Cyber Monday sale. Anyone heading to our website got an error page instead of our regular site. For us at One Month, it meant that not only was our main website down, but so were many of the secondary websites that we use to run our company, including RubyGems, Travis-CI, and Meldium!
When we discovered that the site had gone down, we ran some routine checks to figure out what was happening. It turned out that the outage was caused by an attack, not on our website specifically, but on a separate provider that runs our domain name system (DNS) and points people to our website.
The site is back up today, and we thought we’d share an overview of what happened as a learning experience for people curious about hacking and web technology. In addition, we’re extending our Cyber Monday sale for anyone who had difficulty getting to our site yesterday.
Why did it happen, how did it happen, and how did it get fixed?
Domain Name System (DNS): The Internet’s Phone Book
For a website to work, it uses a technology called DNS to connect domain names (like onemonth.com) to IP addresses (a computer’s “phone number”). DNS is essentially the “phone book” of the internet. Just like there are competing phone book providers, there are competing DNS providers as well.
The service we use for DNS is called DNSimple, which keeps a record of the internet address where One Month lives, and DNSimple went down. Anyone else using DNSimple would have been down as well. Since DNSimple was unreachable, your computer said “I’m sorry, I can’t find onemonth.com.” Without the DNS provider, requests for the website never find their way to our servers and get lost in cyberspace.
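To make the phone book idea concrete, here is a toy sketch in Python. It is not how DNSimple or real DNS software works; the `lookup` function, the `ProviderDown` error, and the IP address (taken from a range reserved for documentation) are all made up for illustration.

```python
# Toy model of DNS: a "phone book" mapping domain names to IP addresses.
# The address is from a documentation-only range, not One Month's real IP.
PHONE_BOOK = {"onemonth.com": "203.0.113.10"}


class ProviderDown(Exception):
    """Raised when the (hypothetical) DNS provider cannot be reached."""


def lookup(domain, provider_up=True):
    # If the provider is unreachable, nobody can read the phone book,
    # even though the website's own servers are perfectly healthy.
    if not provider_up:
        raise ProviderDown(f"I'm sorry, I can't find {domain}")
    return PHONE_BOOK[domain]


print(lookup("onemonth.com"))
try:
    lookup("onemonth.com", provider_up=False)
except ProviderDown as err:
    print(err)
```

The point of the sketch is that the website itself never appears in the failure path: the lookup fails before any request reaches our servers.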
There are a few more layers to DNS (which we’re happy to go into) — but this is a basic overview for people new to what DNS is and how it works. (If you want to geek out with me, join us in a class and send me a message and we can get into deep web stuff).
So, how did they do it?
In order to take down our DNS provider, DNSimple, hackers employed what’s known as a “DoS attack”.
DoS stands for “Denial of Service,” and it basically means that someone intentionally created a lot of fake traffic and flooded the target’s servers with it. When you create that much traffic, the servers can’t handle all of the requests and most of them go unanswered, so the website doesn’t load.
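A rough way to picture this is a server that can only answer a fixed number of requests per second. The sketch below is a simplified model with made-up numbers (they are not DNSimple’s real capacity or traffic), and it assumes the server can’t tell fake requests from real ones, so every request has the same chance of being answered.

```python
def answered_fraction(legit_rps, attack_rps, capacity_rps):
    """Fraction of incoming requests the server can answer each second."""
    total = legit_rps + attack_rps
    if total <= capacity_rps:
        return 1.0  # enough capacity for everyone
    # Overloaded: only capacity_rps requests out of the total get an answer.
    return capacity_rps / total


# Hypothetical numbers, chosen only to illustrate the effect.
normal = answered_fraction(legit_rps=200, attack_rps=0, capacity_rps=1000)
flooded = answered_fraction(legit_rps=200, attack_rps=99_800, capacity_rps=1000)
print(f"normal day: {normal:.0%} answered")
print(f"under attack: {flooded:.0%} answered")
```

In this model the real visitors are still knocking, but they are crowded out by the fake traffic.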
As a result, the attack took down the website responsible for our DNS, which in turn took down our website along with many other websites relying on the same DNS provider.
If you are curious about what these attacks look like, you can take a look at http://map.ipviking.com/, which shows a near real-time attack map based on internet traffic analysis.
Successfully executing a Denial of Service takes more than a small influx of traffic. You wipe out the target with the equivalent of a citywide flood that brings an entire geographic area to a halt.
In order to take down DNSimple for 4+ hours (the amount of time that our site was down yesterday), we estimated that you’d need nearly 180 Terabytes of data, or 38,000 DVDs. (For comparison, Netflix has about 10,000 titles available for streaming.) That’s how much data it would take to throw at DNSimple in order to flood it and overwhelm the system.
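For anyone who wants to check the back-of-the-envelope math, here is a small sketch. It assumes decimal units (1 TB = 10^12 bytes), a single-layer 4.7 GB DVD, and traffic spread evenly over exactly four hours. Those assumptions are ours for the sake of the arithmetic, and the resulting traffic rate is simply what the 180 TB estimate implies, not a measured figure.

```python
TOTAL_BYTES = 180e12           # our 180 TB estimate, in decimal terabytes
DVD_BYTES = 4.7e9              # assumed single-layer DVD capacity
OUTAGE_SECONDS = 4 * 60 * 60   # assumed exactly four hours

# How many DVDs would hold that much data?
dvds = TOTAL_BYTES / DVD_BYTES

# Average rate if the data arrived evenly over the outage, in gigabits/second.
avg_gbps = TOTAL_BYTES * 8 / OUTAGE_SECONDS / 1e9

print(f"about {round(dvds):,} DVDs")
print(f"about {avg_gbps:.0f} Gbit/s on average")
```

Under those assumptions the DVD count comes out a little over 38,000, which matches the figure above.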
Who did it?
We can’t figure out who ran the attack, because the attack wasn’t on our website but on a provider that supports multiple websites. One thing we can guess is that this was an orchestrated attack by someone seriously bent on taking down a website or multiple websites, because of the volume of traffic that was sent at a very specific time.
DNSimple has been sharing updates on their report log, and they’ll likely be working for a few more sleepless nights to continue to fix the issue. Their team spent most of the night working on it.
Now we can assume that some entity with a serious grudge attacked our DNS provider. The question we want to answer is: how can we create resiliency and prevent this from happening in the future?
How do we stop this from happening in the future?
We could switch DNS providers, but they all will hit issues like this at some point or another. It’s not common, but sometimes DNS providers get attacked and websites go down.
More complex solutions (like Master-Slave, Hidden Master, or having multiple DNS providers) are all options, and we’re going to weigh the pros and cons of this additional complexity internally.
Implementing these systems takes a bit of time, and in the interim, we think the best option is to be transparent with our community and our users. Good student support is what helped us get through this, and it’s the best way to handle a “once every few years” problem.
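To show why multiple DNS providers help, here is a toy sketch of the fallback idea. In real DNS this happens through the nameserver records for a domain rather than application code, so treat this as an illustration only; the function names, provider names, and IP address are all hypothetical.

```python
def resolve_with_fallback(domain, providers):
    """Try each (name, resolver) pair in order; return the first answer."""
    errors = []
    for name, resolver in providers:
        try:
            return name, resolver(domain)
        except LookupError as err:
            # This provider failed; remember why and move on to the next one.
            errors.append(f"{name}: {err}")
    raise LookupError("all providers failed: " + "; ".join(errors))


def down_provider(domain):
    # Stands in for a provider that is under attack and unreachable.
    raise LookupError("unreachable")


def backup_provider(domain):
    # Stands in for a second provider holding a copy of the same records.
    return {"onemonth.com": "203.0.113.10"}[domain]


providers = [("primary", down_provider), ("backup", backup_provider)]
print(resolve_with_fallback("onemonth.com", providers))
```

The trade-off is the extra work of keeping records in sync across providers, which is the kind of added complexity we need to weigh.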
The Cyber Monday sale is now extended for a week!
One of the frustrating things for our company was creating a big sale for everyone and then not having the site be available for people. Because so many people reached out to us and we were flooded with support tickets, we’re keeping the sale up and running through the end of this week (it will end Friday, December 5 at midnight). If you were looking to snag one or more of our classes at massive discounts, you can do so all week.
On the upside, events like this are learning opportunities, and we love sharing what happens and explaining it.
Happy holidays from us at One Month!
And if you have any questions about what happened, let us know.