Final Summary
It is 3:00 PM PDT on Monday, August 6. As of 1:15 PM PDT, everything here at digital.forest is back to normal. At that time we brought up our connection with "Network X", and it has been stable ever since. As promised, I will provide a short timeline of events over the weekend, a summary of what we did to mitigate this attack, and what we have done to prevent future attacks from having a similar effect. This data, together with what we posted last week (below), should serve as a complete recap of the entire event.
We have had varying levels of success working with our peer networks. One has been excellent, one has been awful, and the other just "OK". For this reason I have chosen to make them anonymous, using "Network *" as a substitute for their names. The one network which provided excellent support is one we have been working with for many years. The other two are both up for contract renewal before the end of the year, and the one we are very unhappy with, "Network X", is unlikely to keep our business. Please keep that in mind as you read the following.
The attack, which started in the early morning hours on Friday, was best mitigated by identifying the attack source and destination, and configuring the routers in between to ignore that traffic. This requires coordination with the networks we connect to directly, and ideally the networks that connect to the attack sources directly. We started by making those configuration changes on our routers, then contacting our network peers and requesting they make a similar configuration change. The NOC staff at "Network Y" were VERY responsive and immediately made the changes we requested. This is what allowed us to come back online at 7:56 AM PDT Friday morning. Half of the attack traffic was coming in via the connection with "Network X", and we could not get a positive response from their NOC. We had opened a trouble ticket with them, but nothing was done on that ticket through all of Friday. The circuit stayed down until 7:15 PM PDT on Saturday. Unfortunately, when that circuit came back up, the attack resumed. We once again shut it down on our end. "Network X" stayed offline over the entire weekend.
Our Network Manager, Kyle Murray, performed some forensic analysis on the data we collected during the attack. The attack seemed to be coming from a single IP address allocated to a company in New York which appears to have been out of business since 2004. The source address was likely a forgery, as the amount of traffic we saw coming inbound was impossible to generate with a single computer. It was also coming into our network over several network peers. This is what led us to believe that in reality it was a distributed attack. The source network of that IP was being announced to the Internet routing table by a network we'll call "Network C". We contacted their NOC to let them know about the attack we were seeing, which theoretically came from their network. They agreed to null route that address as well. Kyle was hoping to hear back from them today with more information, but they are based on the east coast and their Security Staff have already left for the day.
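As a rough illustration of the reasoning above, the sketch below groups flow records by claimed source address and lists the sources that arrived over more than one peer link. The record layout, the function names, and the sample addresses (taken from documentation ranges) are assumptions made for this example; they are not the actual tooling or data used in the analysis. A source showing up on several ingress links is only one hint of forgery, not proof.

```python
from collections import defaultdict

def ingress_peers_by_source(flows):
    """Map each claimed source IP to the set of peer links it arrived on.

    flows: iterable of (src_ip, ingress_peer) tuples (hypothetical layout).
    """
    peers = defaultdict(set)
    for src, peer in flows:
        peers[src].add(peer)
    return dict(peers)

def suspicious_sources(flows, min_peers=2):
    """Return, sorted, the sources seen on at least min_peers distinct links."""
    by_source = ingress_peers_by_source(flows)
    return sorted(src for src, seen in by_source.items() if len(seen) >= min_peers)

# Made-up sample records for illustration only.
sample = [
    ("198.51.100.7", "Network X"),
    ("198.51.100.7", "Network Y"),
    ("203.0.113.5", "Network Z"),
]
print(suspicious_sources(sample))  # ['198.51.100.7']
```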
Today at 1:15 PM PDT we finally brought up our BGP connection with "Network X" and the attack has been completely blocked.
We have fingerprinted the attack profile and created an alarm that pages us if any traffic matches the behavior of this attack traffic. We are building some automated systems to detect and null route such traffic.
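To give a feel for what such an alarm might look like, here is a minimal sketch that flags any destination receiving more flows in one second than a chosen threshold. The input format, the name `find_alarm_seconds`, and the threshold are hypothetical; the real fingerprint we built is not reproduced here.

```python
from collections import Counter

def find_alarm_seconds(flows, threshold):
    """Return sorted (second, dst_ip) pairs whose flow count exceeds threshold.

    flows: iterable of (second, dst_ip) tuples, one per observed flow
    (an assumed, simplified record layout).
    """
    counts = Counter(flows)
    return sorted(key for key, n in counts.items() if n > threshold)

# Made-up sample: 5 flows to one host in second 0, 1 flow to another host.
sample = [(0, "192.0.2.10")] * 5 + [(0, "192.0.2.20")]
print(find_alarm_seconds(sample, threshold=3))  # [(0, '192.0.2.10')]
```

In practice the returned pairs would be handed to a paging system rather than printed.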
digital.forest was hit with a massive distributed denial of service (DDoS) attack this morning. We are working with our network peers to mitigate this as much as possible. Please be patient while we dedicate all available resources to resist this attack.
Update 8:25am As of 7:30 am we have good connectivity with one of our network peers. Another is partially up, with attack traffic coming in, but at a lower volume than before. One network connection is still down. We are working with the NOC staff of all our external networks to resolve this issue as best we can.
Update 9:51am PDT The DDoS against our network this morning has been stopped, and our network has returned to normal operating status. All sites and servers should again be functioning normally and accessible. If you are still experiencing any difficulties getting onto your website or server, please give us a call at our technical support line 877-720-0483 option 3.
Final Analysis and Time Line:
Some specific details are still under investigation, however we have a very good understanding of what happened early this morning and are prepared to share in general terms the following information.
* Around 3:45 AM PDT a Denial of Service attack, directed at a single IP address inside our network began. At first it was not very large.
* By 4:10 AM PDT it had grown large enough to set off alarms in our network monitoring systems. Emergency pages went out to NOC staff and our Network Manager.
* At 4:27 AM PDT we lost the BGP session with one of our network peers, we'll call them "Network X".
* BGP was reestablished with Network X at 4:28 AM PDT.
* 4:35 AM PDT Network Manager was awake and gathering data from d.f NetFlow server. Recognized the traffic patterns as a Denial of Service Attack. Was in contact with NOC staff on site at digital.forest.
* 4:40 AM PDT Discovered the target of the attack via NetFlow reports. DoS traffic now at 30,900 flows per second.
* 4:44 AM PDT added route to black hole DoS target to Boundary Router 1
* 4:59 AM PDT added route to black hole DoS target to Boundary Router 2
In our past experience, this step has stopped every other attempted denial of service attack on our network. Once we tell the world that the target does not exist, the attack usually stops. What followed instead was more of the same. The attack continued, and in fact intensified.
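For readers unfamiliar with black hole routes, the sketch below shows the general idea: a more specific route for the target wins the longest-prefix match and points at a discard interface, so traffic for that one address is dropped while its neighbors still route normally. The route table, the interface name `null0`, and the addresses are illustrative assumptions, not the configuration of our boundary routers.

```python
import ipaddress

NULL = "null0"  # hypothetical name for the discard interface

def next_hop(routes, dst):
    """Longest-prefix match: return the next hop for dst, or None if no route."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, hop in routes:
        net = ipaddress.ip_network(prefix)
        # Keep the matching route with the longest prefix seen so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None

routes = [
    ("0.0.0.0/0", "upstream"),
    ("192.0.2.0/24", "core"),
    ("192.0.2.10/32", NULL),  # black hole for the attacked host
]
print(next_hop(routes, "192.0.2.10"))  # null0
print(next_hop(routes, "192.0.2.11"))  # core
```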
* 5:10 AM PDT BGP with Network X goes down.
* 5:11 AM PDT BGP with Network X reestablished.
* 5:16 AM PDT BGP with Network X goes down.
* 5:17 AM PDT BGP with Network X reestablished.
"Blackholing" the attack target has had no effect. Our attempts to get attack source data from our NetFlow server are fruitless; the server is unable to keep up with processing as flows begin to exceed 50,000 flows per second/3,000,000 flows per minute. If we can get source data, we can start making attempts to block the source, or work with our peer networks to block the attack.
* 5:40 AM PDT Boundary Router 1 goes non-responsive.
* 5:43 AM PDT On site tech restarts BR1 under direction from Network Manager.
* 5:52 AM PDT BR1 up again.
* 5:52 AM PDT BGP session with another provider we'll call "Network Y" is lost. This network is terminated on a separate router, Boundary Router 2.
* 5:53 AM PDT BGP with Network Y restored.
While we maintained BGP connectivity with one of our three providers ("Network Z") throughout the event, the attack traffic at times consumed 100% of the CPU of one or both routers, causing such high latency that we were, for all intents and purposes, not passing traffic. At this time Network Manager calls the Vice President of Technical Operations and informs him of the situation. Ops VP starts calling technical support staff to have them get to the office and assist with telephone calls. Also informs the CEO and VP of Sales.
* 6:08 AM PDT Boundary Router 2 goes non-responsive and is restarted by on site staff.
* 6:12 AM PDT Boundary Router 2 is back up. Attack traffic has effectively blinded both routers. NetFlow server records over 70,000 flows per second/4.2million flows per minute before it goes non-responsive as well.
* 6:20 - 7:30 AM PDT Network Manager and Ops VP contacting the NOCs of peer networks to have them assist in DoS Mitigation. Network Manager logs trouble tickets with Network Y and related Metro Ethernet provider, proceeds to datacenter from home. Ops VP logs tickets with Network X and Network Z... (after much phone tree navigation... way too much with "Network X")
* 7:37 AM PDT Network Manager on site at d.f and making very good progress with NOC staff of Network Y.
* 7:56 AM PDT BGP session back up with Networks Y & Z. We are back "on the air" again, though down by one provider. Attack continues, but is being mitigated actively by Network Y. Network Z is up, steady and not included in the attack. Network X is still down. Most of the tech & customer service staff is on site, taking calls from customers.
* 8:03 AM PDT BGP with Network Y lost.
* 8:10 AM PDT BGP with Network Y restored.
* 8:50 AM PDT Ops VP leaves home headed for digital.forest.
* 8:58 AM PDT BGP with Network Y lost.
* 9:03 AM PDT BGP with Network Y restored.
* 9:06 AM PDT BGP with Network Y lost.
* 9:07 AM PDT BGP with Network Y restored.
* 9:10 AM PDT BGP with Network Y lost.
* 9:11 AM PDT BGP with Network Y restored. Attack now completely blocked. Traffic on Network Y stabilizes and returns to normal. Network Z has remained up and stable since 7:56 AM PDT. Network X still down.
* ~10:00 AM PDT Server which was the target of the attack is brought back online.
As of now, 11:55 PM PDT, Network X, the first of our network peers to be lost, is still down. We have been calling their NOC and have a trouble ticket logged. We strongly suspect that the port on their equipment we connect to in downtown Seattle has failed an auto-negotiation. Hopefully we'll have this resolved soon.
Our other circuits, Networks Y & Z are stable and handling all our traffic normally.
We would like to thank our clients for their patience and understanding during this event. We will continue to work on this issue with the intent of learning as much as we can. We have been subjected to denial of service attacks before, but in each of those cases we have been able to successfully mitigate them, usually before they had any noticeable impact on our network. This was the first attack on our network since December 23, 2001 that had more than a few minutes' impact on our ability to stay online. We've spoken to a number of DoS Mitigation experts today, and will continue to do so. We've made some configuration changes and will continue to harden our infrastructure against attacks.
As always, if you have questions or concerns, feel free to contact us.
Regards,
--Chuck Goolsbee
V.P. Technical Operations
digital.forest, Inc.