[EU Region] Presto performance improvement release
Scheduled Maintenance Report for Treasure Data
Completed
The scheduled maintenance has been completed.
Posted Jun 14, 2020 - 21:14 PDT
Verifying
Verification is currently underway for the maintenance items.
Posted Jun 14, 2020 - 21:08 PDT
In progress
Scheduled maintenance is currently in progress. We will provide updates as necessary.
Posted Jun 14, 2020 - 21:01 PDT
Update
We will be undergoing scheduled maintenance in one hour at 6:00 AM CEST.
Posted Jun 14, 2020 - 20:01 PDT
Update
We have postponed the release to Monday, June 15th, between 06:00 and 06:30 CEST.
Posted Jun 10, 2020 - 13:17 PDT
Scheduled
On June 15th, 2020 from 06:00 to 06:30 CEST, we will release a hotfix for the Presto query engine.

The release rolls forward the changes that were introduced on May 7th to address the performance degradation experienced by queries in specific scenarios.

The May 7th release was rolled back on May 22nd because it had a bug causing sporadic write inconsistencies in CREATE TABLE AS, INSERT INTO, and DELETE FROM queries. This new version, based on the May 7th release, contains an additional fix for the write inconsistency issue.

Both performance degradation and write inconsistency issues are described in detail in the postmortem at https://status.treasuredata.com/incidents/mrnh2jc0kmqb.

The release should have no impact on running queries, which will transparently be transferred to the new Presto query engine version.

As communicated in the last postmortem, because the Presto write inconsistency affected the data integrity of the platform, we have taken the following additional precautions:

* Remediation
We removed the code responsible for the aggressive write optimization that caused the write inconsistency bug.

* Verification
We have reproduced the write inconsistency issue and built a reliable set of tests to confirm the fix is effective.

* Detection
We implemented additional application logic to detect potential race conditions when writing into our Plazma storage. Should a race condition ever occur (not expected), the logic will raise an exception, forcing the query to error out, and send an alert to our staff. Upon receiving an alert, our staff will investigate the situation and, when warranted, reach out to the customer to recommend data recovery.

* Monitoring
We improved our Presto monitoring and alerting. Our team will follow an on-call duty rotation to monitor the health of the system for an extended period after the release (96 hours or 3 days) and catch any anomalies.
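
For readers who want a concrete picture of the Detection item above, the following is a minimal sketch of a write path that fails loudly when two writers race. It assumes a version check at commit time, which is one common way to catch such conflicts; this notice does not describe how the actual detection logic is built. All names (VersionedStore, WriteConflictError, the alert callback) are hypothetical and are not part of Plazma or Presto.

```python
import threading


class WriteConflictError(Exception):
    """Raised when the target changed between the read and the commit."""


class VersionedStore:
    """Hypothetical in-memory stand-in for a storage layer (not Plazma's API)."""

    def __init__(self, alert):
        self._lock = threading.Lock()
        self._version = 0
        self._rows = []
        self._alert = alert  # assumed hook: any callable that accepts a message

    def snapshot_version(self):
        """Return the version a writer should pass back to commit()."""
        with self._lock:
            return self._version

    def commit(self, expected_version, new_rows):
        """Append new_rows only if nobody else committed in the meantime."""
        # The check and the write share one lock, so the check cannot go stale.
        with self._lock:
            if self._version != expected_version:
                message = (
                    f"write conflict: expected version {expected_version}, "
                    f"found {self._version}"
                )
                self._alert(message)  # notify staff
                raise WriteConflictError(message)  # force the query to fail
            self._rows.extend(new_rows)
            self._version += 1
            return self._version

    def rows(self):
        with self._lock:
            return list(self._rows)


# Two writers start from the same snapshot; only the first commit succeeds.
alerts = []
store = VersionedStore(alert=alerts.append)
snapshot = store.snapshot_version()
store.commit(snapshot, ["row-a"])
try:
    store.commit(snapshot, ["row-b"])  # stale snapshot: rejected
except WriteConflictError:
    pass
```

In this sketch the second write is refused rather than silently merged, so the stored rows stay consistent and the alert hook records the anomaly for follow-up.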

# Communication

Beyond this notice, we will provide updates approximately 1 hour before the beginning of the maintenance window, at the start and completion of the operation, and once the verification is completed. At that time, all systems will have returned to full functionality and the Scheduled Maintenance will be closed.

If you have any questions or concerns about this maintenance, please feel free to reach out to our Support team at support@treasure-data.com.
Posted Jun 10, 2020 - 10:44 PDT
This scheduled maintenance affected: EU (Presto Query Engine).