Tonight, starting at midnight, we're going to apply kernel updates and reboot a number of systems, including several internal application and SQL servers, as well as public-facing clusters that handle services like mail and webmail. Overall impact should be minimal, but customers may experience delays or brief outages when accessing affected services while the systems are rebooted.
We've had ongoing issues for the past few months with our recursive name servers being used in DNS amplification attacks. Yesterday the attacks were severe enough that normal usage and performance were affected, even with the rate limits and other mitigation techniques we have in place. We finally had to resort to blocking the most popular DNS queries used in the attacks in order to prevent any impact to our regular services. Customers may have noticed slow DNS requests, most likely experienced as slow loading of web pages, off and on until early afternoon. We also expect to block all off-net access to our recursive DNS servers sometime in the next few days. Once that is complete, it should prevent this from being an issue moving forward.
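For the technically curious, the off-net block amounts to answering recursive queries only for clients inside our own address space. The following is a minimal sketch of that check, not our actual resolver configuration: the prefixes are reserved documentation ranges standing in for real customer networks, and the function names are hypothetical.

```python
import ipaddress

# Assumption: placeholder on-net prefixes (reserved documentation ranges),
# not real customer address space.
ON_NET_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_on_net(client_ip: str) -> bool:
    """Return True if the client address falls inside any on-net prefix."""
    addr = ipaddress.ip_address(client_ip)
    # Membership is False when the address and network versions differ,
    # so IPv4 and IPv6 prefixes can share one list.
    return any(addr in net for net in ON_NET_PREFIXES)


def should_answer_recursive(client_ip: str) -> bool:
    """Hypothetical policy hook: recurse only for on-net clients."""
    return is_on_net(client_ip)
```

Because amplification attacks work by sending queries with a forged source address from elsewhere on the Internet, dropping recursive queries that arrive from outside the network removes most of the attack's usefulness while leaving normal customer lookups untouched.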
In addition, we're working on identifying customers who appear to have compromised ("zombie") systems that are participating in the botnets responsible for the attacks.
Sorry for the MOTD delay.
-Kelsey and William
This evening, beginning at 11:59PM, we will be performing intrusive maintenance on equipment serving Fusion and FlexLink Ethernet customers in the Forestville area. Expected downtime for affected customers is less than 15 minutes.
Update: Maintenance has been completed successfully.
This evening, beginning at 11:59PM, we will be performing maintenance on equipment serving a small subset of Legacy DSL subscribers in the Bay Area. Expected customer downtime is less than 15 minutes.
Update: Maintenance has been completed as planned.
Tonight, April 17, starting at 11:59PM, we will be performing maintenance on some of our backbone infrastructure in the San Jose area. Customers may experience brief routing changes but we do not expect any lasting impact.
Update: Maintenance completed as planned.
-Tomoc, Nathan, and Robbie
We are currently investigating an outage that is affecting Fusion and FlexLink customers in the Santa Cruz area. This post will be updated as we investigate further.
Update: AT&T has confirmed a major fiber cut in the area and has dispatched technicians for repair. We will update further when we receive an ETR.
Update 2: AT&T technicians continue to work on the cut fiber but are unable to provide an ETR at this time.
Update 3 (11:15AM PST): AT&T has determined that, due to the extent of the damage, new cable will need to be brought in and installed to work around the affected area. The current rough ETR is 5 hours, or around 4:00PM PST.
Update 4 (4:10PM PST): AT&T has pulled new cables and has begun splicing. Still no ETR.
Update 5 (7:00PM PST): AT&T continues construction to repair the damage. The latest estimate received from the splicing crew for restoral of service is 1AM to 2AM.
Update 6 (3:10AM PST): AT&T has completed splicing in the replacement cables and all affected areas are now back online.
Update 7: Additional details on this outage are available in the following news article: http://sanfrancisco.cbslocal.com/2013/04/16/gunshots-cause-oil-spill-at-san-jose-pge-substation/