Identified - The issue has been identified and a fix is being implemented.
Apr 19, 2024 - 15:32 PDT
Update - We are continuing to investigate. The issue seems to have started on April 14th, when we applied an AWS-required MySQL 5.7 to 8 upgrade to our Subscription service database. This has apparently caused some unforeseen performance issues when running multiple sites' machine learning jobs.

We will be upgrading the database instance size in order to sidestep the space issue temporarily. Our hypothesis is that this will buy us some time and (hopefully) allow our big jobs to continue running.

Meanwhile, we will be investigating how to make the disk usage more efficient, or resolve the issue overall.

Apr 19, 2024 - 15:32 PDT
Investigating - We are currently investigating, but it seems one of our databases has run out of the temporary disk space it uses to unload large tables to our machine learning algorithm. Not all sites are affected; the issue seems to be primarily limited to larger sites (many millions of users).

We are working to remediate this, and we are also trying to find out why it started so suddenly, given that we haven't changed much on the database server side. We will update here with more findings as we have them.

Apr 19, 2024 - 14:20 PDT

About This Site

This is the ReSci status page, where you can always find updated information on how our systems are doing. We will post here if there are interruptions to service.

As always, if you are experiencing any issues, don't hesitate to get in touch with us at http://help.retentionscience.com and we'll get back to you as soon as we can.

Cortex Application (Main Dashboard) Partial Outage
99.84 % uptime over the past 90 days
Users API Operational
Smart Blast / Promo Blast Sends API Operational
Transactional Sends API Operational
Recommendations API Partial Outage
Filtering and Segmentation Service Operational
Data Imports Service Operational
Campaign Reporting and Analytics API Operational
Events and Tracking API Operational
Support Operational
Template Tools Operational
Reporting Operational
SMS Operational
Past Incidents
Apr 20, 2024

No incidents reported today.

Apr 19, 2024

Unresolved incident: Database server out of temporary disk space causing certain sites' recommendations to fail.

Apr 18, 2024

No incidents reported.

Apr 17, 2024

No incidents reported.

Apr 16, 2024

No incidents reported.

Apr 15, 2024

No incidents reported.

Apr 14, 2024

No incidents reported.

Apr 13, 2024

No incidents reported.

Apr 12, 2024

No incidents reported.

Apr 11, 2024
Resolved - Date Started: March 27th
Date Resolved: April 11th

Cause of Incident:

A migration/upgrade to MySQL 8, per Amazon AWS requirements, occurred on March 26th.
When the migration happened, there was a small bug in the code/query around NULL date values. This caused rows in our AI predictions output to go missing. Any row whose date or timestamp was 0, NULL, or outside the normal range was removed entirely from the output. The new version of MySQL is sensitive to these values in a way the old version was not.
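
For readers who want a concrete picture of this class of bug, here is a minimal sketch (not our production code) of one defensive approach: normalize unusable dates to a missing value so the row is kept rather than dropped. The function names, the `last_order_date` column, and the date bounds are all hypothetical and chosen only for illustration.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical bounds for a "normal" date; the real range is an assumption.
MIN_DATE = date(1970, 1, 1)
MAX_DATE = date(2100, 1, 1)


def normalize_date(value) -> Optional[date]:
    """Return a valid date, or None for zero, NULL, malformed, or out-of-range input."""
    if value is None or value == 0 or value == "":
        return None
    if isinstance(value, datetime):  # check datetime first: it subclasses date
        parsed = value.date()
    elif isinstance(value, date):
        parsed = value
    else:
        try:
            parsed = datetime.strptime(str(value)[:10], "%Y-%m-%d").date()
        except ValueError:
            return None  # e.g. a zero date such as "0000-00-00"
    if not (MIN_DATE <= parsed <= MAX_DATE):
        return None
    return parsed


def clean_rows(rows):
    """Keep every row; replace unusable dates with None instead of dropping the row."""
    return [
        {**row, "last_order_date": normalize_date(row.get("last_order_date"))}
        for row in rows
    ]
```

With this kind of normalization in place, a row with a zero or out-of-range date still appears in the output with an empty date, so a row count before and after the step should match.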

How we will prevent this from happening in future:

A database migration is not a frequent occurrence; it happens once every 4 years or so.
For future migrations, we will strengthen our QA by involving the Client Success team, who will check a few of their largest clients to make sure no anomalies are occurring with the stages and engagement.

Apr 11, 11:00 PDT
Apr 10, 2024

No incidents reported.

Apr 9, 2024

No incidents reported.

Apr 8, 2024

No incidents reported.

Apr 7, 2024

No incidents reported.

Apr 6, 2024

No incidents reported.