Resolved -
We’ve traced the issue to a bug in our reporting engine (a third-party component) that caused one server to become overloaded during startup. Our load balancer continued routing some requests to that server, which led to intermittent errors.
What we’ve done:
- Immediately removed the affected server from rotation, restoring stability.
- Collected detailed logs and memory data to confirm the root cause in the reporting engine.
- Updated our server "health checks" to better detect when a server is overloaded.
Current status: The system has been stable since early Saturday morning, and we don’t anticipate further impact. We’ll continue monitoring closely as we finalize a permanent fix with our reporting engine vendor.
Sep 26, 23:15 EDT
Investigating -
On September 26, around 8:30 PM ET, a small number of users experienced slowness and were unexpectedly logged out. There was no full outage, but some sessions were briefly affected.
Sep 26, 20:30 EDT