Santa Rosa Datacenter UPS Maintenance

The UPSes for our Santa Rosa Datacenter are scheduled for their regular maintenance on Tuesday, Feb. 14th starting at 10:00AM.  Maintenance is fully scripted with our vendor and should not impact any services.  However, the maintenance does involve bypassing the UPS and running loads directly on utility power.  The backup generator will be kept running for the duration of the maintenance to protect against utility power failure.

– Sonic Facilities and System Operations

San Francisco Fusion Fiber Outage

Today, starting at 12:08am, equipment serving a subset of Fusion Fiber customers in the San Francisco area appears to have suffered a hardware failure. We are working to replace the equipment and restore service as quickly as possible.

Update(2:40am): Estimated time to restoration (ETR) is 2 hours.

Update(4:59am): Affected equipment has been replaced, and we believe service has been restored to all affected customers.

-Tomoc

Intermittent connectivity for copper-based customers in San Francisco

As of approximately 10:40am, a small subset of copper-based customers in San Francisco may be experiencing intermittent connectivity issues.  We are working to identify the cause of this problem and repair it.  More information to follow.

Update(1:02pm): We have narrowed this issue down to a single device in our network and are no longer experiencing any instability. We are continuing to investigate the root cause.

-Sonic NOC

Fusion Fiber Maintenance

Starting on Monday, February 13, we will be performing maintenance on equipment serving Fusion Fiber customers around the greater Bay Area. This maintenance will take place over the course of the week and will affect the following areas:

Monday: Sebastopol

Tuesday and Wednesday: San Francisco

Thursday: Brentwood

Maintenance will begin at midnight, and the expected customer downtime is 15 minutes.

Update(4:49am): Maintenance is complete; however, a small subset of customers experienced an equipment failure during the upgrade. Affected customers can expect our support department to reach out in the morning to assist in restoring service.

Update(2/16 6:00am): Maintenance is now complete for the San Francisco area. Some customer ONTs may require a power cycle to restore service. Affected customers can expect a call from our support department in the morning to assist in restoring service.

Update(2/16 5:18pm): Brentwood maintenance has been postponed.

-Tomoc and Brandon

System Maintenance

Update: Maintenance complete.

Tonight at 11:59pm, System Operations will be running maintenance updates on several customer-facing systems. The following services may experience brief interruptions:

  • Customer-hosted websites

The maintenance period is expected to last 1 hour.

-SOC

Authoritative Name Server Migration

Sonic maintains three geographically diverse and redundant authoritative domain name servers, in NY, CA and TX.  These servers are used for our own domains as well as for customer domains hosted on our network.  Early this morning, c.auth-ns (NY) suffered a complete hardware failure, and later in the day we took the opportunity to move its services to a different colocation provider in NJ.  Neither the failure nor the subsequent migration should have had any noticeable impact on our services.

-Dexter and Kelsey

Network Connectivity Outage

Update 4:15pm – Today at 12:28pm, Sonic experienced a network instability event whose origin was unknown at the time.  Through troubleshooting, we narrowed the issue down to two transport links running from one of our datacenter locations to core network equipment.  These links were receiving replicated traffic that the transport provider's network equipment was sending back at us. The replicated traffic overloaded the CPU on our core routers. We have taken steps to prevent the replicated traffic from affecting our network, and we have contacted our provider for further diagnosis. Apologies for the delay in getting this issue resolved; it was a very difficult problem to troubleshoot, and we have never seen anything like this happen before.

Update 3:42pm – We believe everything is restored.  We will release an RFO (reason for outage) soon with more specific information.

Update 2:46pm – We are still working to mitigate the DoS. All of our engineering staff is currently engaged in this issue. We will post more details as they become available.

As of 12:30pm, the Sonic network is experiencing reachability issues to the outside world.  A large DoS attack is the suspected cause but we are still working to identify and mitigate the problem ASAP.  More information will follow.

-Sonic NOC

Phone Switch Maintenance

Update: 3:10 – Maintenance is complete.

We will be performing scheduled maintenance on our phone switch tonight at midnight. We do not expect this maintenance to impact service.

-Network Engineering