Service disruption during routine database upgrade
Incident Report for DataGrail
Postmortem

On Friday, June 11, 2021, DataGrail experienced a service disruption of privacy intake forms and application downtime lasting from 8:08PM PST until 8:43PM PST. This was caused by a routine upgrade to apply security and performance improvements to DataGrail’s core database infrastructure.

DataGrail Engineering had previously performed similar upgrades across its test environments and had not observed a service interruption. 

The DataGrail team has identified several measures to prevent this failure from occurring in the future. These include implementing a two-phase migration process for database upgrades that will avoid downtime due to delayed maintenance as well as enhanced failure mode benchmarking in our test environments.

Please contact support@datagrail.io if you have any questions or concerns.

Posted Jun 15, 2021 - 11:11 PDT

Resolved
The incident has been resolved. Please contact support@datagrail.io if you have questions or would like a copy of our root cause analysis (once available).
Posted Jun 10, 2021 - 20:58 PDT
Monitoring
The service disruption has been resolved and all systems are back online. We are actively monitoring.
Posted Jun 10, 2021 - 20:45 PDT
Identified
A service disruption started around 8:11 PM PT during a routine database upgrade. We are actively resolving the issue and should be back up shortly. Updates to follow.
Posted Jun 10, 2021 - 20:30 PDT
This incident affected: Request Manager, Live Data Map, and Intake Forms.