All Systems Operational

About This Site

This is Treasure Data's status page.
We believe that trust starts with full transparency.

US Operational
Web Interface Operational
REST API Operational
Streaming Import REST API Operational
Mobile/Javascript REST API Operational
Data Connector Integrations Operational
Hadoop / Hive Query Engine Operational
Presto Query Engine Operational
Presto ODBC Gateway Operational
Tokyo Operational
Web Interface Operational
REST API Operational
Streaming Import REST API Operational
Mobile/Javascript REST API Operational
Data Connector Integrations Operational
Hadoop / Hive Query Engine Operational
Presto Query Engine Operational
Presto ODBC Gateway Operational
System Metrics
US - REST API - Response Time
US - REST API - Error Rates
US - Streaming Import REST API - Response Time
US - Queued Streaming Import requests
US - Mobile/Javascript REST API - Response Time
US - Web Interface - Response Time
Tokyo - REST API - Response Time
Tokyo - Streaming Import REST API - Response Time
Tokyo - Queued Streaming Import requests
Tokyo - Web Interface - Response Time
Past Incidents
Apr 27, 2017

No incidents reported today.

Apr 26, 2017

No incidents reported.

Apr 25, 2017
Resolved - This incident was resolved.

At 21:48 PDT we updated the DB configuration for a subset of API servers. The change was intended to isolate the streaming import DB from the query job DB on the old streaming import API, but an incorrect configuration caused DB connectivity issues for the old streaming import API. The problem then propagated to the other REST APIs. We reverted the change at 21:58 PDT, which resolved all problems. Queued streaming imports were processed after 21:58 PDT. If you observed API error 500 during the incident, please re-run the affected CLI command.

We are working to isolate API clusters by purpose this quarter to prevent this kind of error propagation. Please accept our apologies for any inconvenience this has caused.
Apr 25, 22:39 PDT
Monitoring - We found an incorrect DB endpoint configuration in today's API release and reverted the application at 21:58 PDT. The API is now working normally.
Apr 25, 22:05 PDT
Identified - We have been observing an elevated API error rate since 21:50 PDT. We have identified the cause and are fixing it now.
Apr 25, 21:57 PDT
Apr 24, 2017

No incidents reported.

Apr 23, 2017

No incidents reported.

Apr 22, 2017

No incidents reported.

Apr 21, 2017
Resolved - All systems have been working normally since 08:07 PDT. This incident is resolved.
Apr 21, 10:52 PDT
Monitoring - Following the previous incident, we observed the same problems again from 06:10 to 06:16 PDT and from 07:50 to 08:05 PDT. The cause is the same: a storage IO problem on one of our backend DB servers.
We have implemented an interim mitigation on the API servers.
Apr 21, 08:20 PDT
Resolved - All systems have been working normally since 04:57 PDT. This incident is resolved.

From 04:43 to 04:57 PDT our API server showed a high error rate due to slow backend DB responses. Mobile import chunks were queued at our server and processed quickly after the API recovered. td-agent has a retry mechanism, so no data loss should have occurred. Scheduled jobs also have a retry mechanism, and all queued scheduled jobs were processed as well.

The td command may have failed with an API error response during this incident. Please retry the command.
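
For scripted use, the retry advice can be automated. The following is a minimal sketch, not an official Treasure Data tool: the helper name `run_with_retry` and its parameters are hypothetical, and it assumes the command signals a transient API error with a non-zero exit code.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run cmd (a list of arguments), retrying on a non-zero exit code.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    Assumption: a non-zero exit code means the failure is worth retrying.
    """
    result = None
    for attempt in range(attempts):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return result
        # Back off before the next attempt, but not after the last one.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return result
```

Pass the failed command as an argument list and check `returncode` on the returned result. Retrying is only safe for commands that can be repeated without side effects.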

Sorry for the inconvenience this incident has caused.
Apr 21, 05:34 PDT
Monitoring - Starting at 04:43 PDT, one of our backend DB servers experienced slow storage IO. The IO problem was resolved at 04:57 PDT.
Apr 21, 05:08 PDT
Investigating - We are observing a high API error rate and are investigating now.
Apr 21, 04:58 PDT
Apr 20, 2017

No incidents reported.

Apr 19, 2017

No incidents reported.

Apr 18, 2017

No incidents reported.

Apr 17, 2017

No incidents reported.

Apr 16, 2017

No incidents reported.

Apr 15, 2017

No incidents reported.

Apr 14, 2017

No incidents reported.

Apr 13, 2017

No incidents reported.