[US Region] Presto Query Execution Failures
Incident Report for Treasure Data
Resolved
Presto queries run between 21:05 and 21:39 PDT on Apr 25 (13:05 to 13:39 JST on Apr 26) failed because of a stale cluster configuration.

Jobs that reported the error "Name or service not known" were automatically retried and no further action is required.

The following types of jobs cannot be automatically retried and must be manually rerun or resubmitted:
1. Presto JDBC/ODBC API jobs
2. Presto jobs submitted through the CDP Segmentation UI
3. Other jobs submitted by console, CLI, or API that reported another non-retryable error
Posted Apr 25, 2019 - 23:19 PDT
Monitoring
We've fixed the problematic configuration. New Presto jobs should now work correctly. Customer jobs that failed with the "Name or service not known" error will be retried. Other jobs that failed with an "undefined method" error need to be rerun by customers.
Posted Apr 25, 2019 - 22:00 PDT
Identified
Our cluster configuration storage was overwritten with stale, invalid values due to misconfigurations. We are replacing those invalid values with the correct ones.
Posted Apr 25, 2019 - 21:42 PDT
Investigating
We are currently investigating an issue where Presto query executions are failing.
Posted Apr 25, 2019 - 21:28 PDT
This incident affected: US (Presto Query Engine).