[US Region] Query Engine - Service Degraded Performance

Incident Report for Treasure Data

Postmortem

We experienced a temporary overload on the storage layer.
It started at 16:15 PDT and was fixed at 18:15 PDT.
The major impact was performance degradation for the data ingestion components (Streaming Import REST API, Mobile/Javascript REST API, Data Connector) and the Hive and Presto query engines. Some queries executed on Hive and Presto failed because of the degraded storage performance.

Posted Oct 01, 2024 - 21:35 PDT

Resolved

This incident has been resolved.
Posted Oct 01, 2024 - 18:30 PDT

Update

We are continuing to monitor for any further issues.
Posted Oct 01, 2024 - 18:15 PDT

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Oct 01, 2024 - 18:01 PDT

Identified

The issue has been identified and a fix is being implemented.
Posted Oct 01, 2024 - 17:41 PDT

Update

We are continuing to investigate this issue.
Posted Oct 01, 2024 - 17:07 PDT

Update

We are continuing to investigate this issue.
Posted Oct 01, 2024 - 17:06 PDT

Investigating

We're experiencing an elevated level of API errors and are currently looking into the issue.
Posted Oct 01, 2024 - 17:04 PDT
This incident affected: US (Streaming Import REST API, Mobile/Javascript REST API, Data Connector Integrations, Hadoop / Hive Query Engine, Presto Query Engine, Presto JDBC/ODBC Gateway).