Dear Treasure Data customers, this is the postmortem for the streaming data import delay that occurred on July 21, 2015, affecting td-agent v1.1.20 or earlier.
From 2015-07-21 14:30 to 2015-07-22 11:45 (UTC), our API import endpoint (https://api-import.treasuredata.com/) did not accept SSLv3 connections.
Customers using td-agent1 (old stable) version 1.1.20 or earlier experienced data import delays, with an error message like the one below in the log (/var/log/td-agent.log).
SSL_connect returned=1 errno=0 state=SSLv3 read server hello A: sslv3 alert handshake failure
This error indicates that td-agent was trying to upload over SSLv3, but our API server refused to establish the connection.
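To check whether a host was affected, the log can be searched for the handshake failure quoted above. The following is only a minimal sketch: the helper names `find_sslv3_failures` and `scan_log` are hypothetical, and because the exact layout of each td-agent log line is not assumed, only the OpenSSL message text itself is matched.

```python
import re

# Matches the OpenSSL message quoted in this postmortem. The rest of the
# log line (timestamp, level, plugin name) is deliberately not assumed.
SSLV3_FAILURE = re.compile(
    r"state=SSLv3 read server hello A: sslv3 alert handshake failure"
)


def find_sslv3_failures(lines):
    """Return (line_number, text) pairs for lines containing the error."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if SSLV3_FAILURE.search(line)
    ]


def scan_log(path="/var/log/td-agent.log"):
    """Scan a log file on disk; the default path is the one named above."""
    with open(path, errors="replace") as log_file:
        return find_sslv3_failures(log_file)
```

Calling `scan_log()` on an affected host should return one entry per failed upload attempt; an empty list suggests that the agent did not hit this error.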
Although the incoming data should have been buffered on disk, data imports were significantly delayed. At 2015-07-22 11:45 (UTC), we re-enabled SSLv3, and the problem was resolved.
It was recently discovered that SSLv3 contains weaknesses in its ability to protect and secure communications. This is well known as the POODLE vulnerability.
These weaknesses have been addressed in Transport Layer Security (TLS), which is the replacement for SSLv3 and the new default for most operating systems and clients.
Consistent with our top priority of protecting Treasure Data customers, we had a plan to support the more modern TLS versions rather than SSLv3.
Originally, we planned to notify customers of the SSLv3 deprecation by email.
However, when we modified our load balancer (Elastic Load Balancer) configuration, SSLv3 was disabled by default by the underlying cloud provider, and we did not notice it. After the load balancer configuration change, our endpoint started rejecting SSLv3 connections.
We will work more closely and quickly to mitigate any potential security issues.
We will also implement further server-side monitoring to check customers' data import rates more strictly. This will allow us to raise a critical alert when multiple customers' import rates drop dramatically.
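As an illustration of the kind of check described above, the sketch below compares each customer's current import rate against a baseline and signals a critical alert only when several customers drop at once. It is not our actual monitoring implementation: the function names, the 50% drop ratio, and the two-customer minimum are all hypothetical values chosen for the example.

```python
def dropped_customers(baseline, current, drop_ratio=0.5):
    """Return customers whose current rate fell below drop_ratio * baseline.

    baseline and current map a customer id to an import rate
    (for example, records per minute). Both are assumed inputs.
    """
    dropped = []
    for customer, base_rate in baseline.items():
        if base_rate <= 0:
            continue  # no meaningful baseline to compare against
        # A customer missing from `current` is treated as importing nothing.
        if current.get(customer, 0.0) < base_rate * drop_ratio:
            dropped.append(customer)
    return sorted(dropped)


def should_alert(baseline, current, drop_ratio=0.5, min_customers=2):
    """Signal a critical alert only when multiple customers drop together."""
    affected = dropped_customers(baseline, current, drop_ratio)
    return len(affected) >= min_customers
```

Requiring more than one affected customer is a deliberate choice in this sketch: a single customer's rate can fall for reasons on their side, while a simultaneous drop across customers points at the service.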
We will also contact affected customers and ask them to stop using SSLv3. Our customer success team will reach out to you to recommend upgrading to td-agent2 (current stable) or td-agent1 (old stable) v1.1.21.
Again, we want to apologize. We know how critical our services are to our customers' businesses. We will do everything we can to learn from this event and use it to drive improvement across our services.
Sincerely, The Treasure Data Team