File Server network issue.
Incident Report for Amplifi.io
Postmortem

The team and I at Amplifi.io want to take a moment to apologize for the disruption to your work on April 6th.

Root Cause:

A code issue in our virtualization vendor's software was exposed.

Early in the morning on April 6th, our development operations team confirmed that the Upload/Download service had become unresponsive due to a condition that exposed a code issue in our virtualization system.

Resolution and Recovery:

The team worked with the vendor to understand the problem and deploy a temporary solution. We decided that the fix would be faster than a fail-over event. The team cleaned up the "stalled" containers responsible for the routing issues, then recreated the service stack (portal, api, indexing listener, upload, download) as part of the recovery. The fix took approximately 5 hours.

Corrective and Preventative Measures:

Our team has completed maintenance checks and reviewed the outage with the vendor. The vendor has now released an update to permanently rectify this issue. The Amplifi.io team will implement this update to prevent the issue from recurring.

We are planning several short overnight maintenance windows (Pacific Time) over the next two weekends to implement the vendor’s update.

Related Upcoming Planned Maintenance Windows:

April 10/11 - 3 hours around midnight Pacific Time.
April 17/18 - 3 hours around midnight Pacific Time.

We expect upload/download services to be offline during these windows. Please plan your bulk uploads and downloads to avoid these time frames.

Thanks for your understanding and continued support as we better our technology and offerings.

Best Regards,

Ken Garff, CPO/CEO

Posted Apr 09, 2021 - 12:37 PDT

Resolved
This incident has been resolved. We will report here at a later time as to the cause and any further information or maintenance windows that may be required.
Posted Apr 05, 2021 - 19:55 PDT
Monitoring
The file server connection to the web interface is restored, and we are continuing to work on the uploading, transcoding, and downloading functions. The issue had to do with our Docker container virtualization software. More details will be provided after our post-incident analysis. The team working on this issue projects that all systems will be fully operational by Tuesday AM ET.
Posted Apr 05, 2021 - 16:53 PDT
Identified
The issue has been identified and a fix is being implemented.
Posted Apr 05, 2021 - 11:33 PDT
Investigating
We are currently investigating issues uploading and downloading media to the file storage service. The network team is working on this issue. We recommend holding work until this is resolved. We will send another notification when it is resolved.
Posted Apr 05, 2021 - 08:09 PDT
This incident affected: Media Transcoding, File Server, and Uploader.