UpCloud Status / Summary of UK-LON1 compute node crashes
over 2 years ago
Posted Wed, 26 Jun 2019 11:49:25 +0000
Jun 26, 11:49 UTC
Identified - Since June 13th, we have had multiple incidents of compute node crashes in our London data centre. We would like to give some insight into these incidents, as we owe our users an explanation for this unusual situation.
In May this year, a new Intel CPU vulnerability was discovered (https://upcloud.com/blog/mds-vulnerabilities/), which prompted us to swiftly run security updates throughout our infrastructure to protect our users' data. We have had to do this in the past for similar Intel CPU vulnerabilities such as Meltdown (https://upcloud.com/blog/intel-cpu-vulnerability-meltdown/), and did so with great success.
Around the same time, we also rolled out new versions of our infrastructure software, in preparation for new features and products that we are currently working on. Unfortunately, as we discovered, the combination of certain hardware in the London data centre, our new software features, and the Intel CPU vulnerability mitigations caused instability over time in specific situations.
UK-LON1 has been the only data centre to experience these crashes, and we have been working tirelessly to identify the root cause and apply a fix. We are confident that we have now identified the root cause, and have already begun rolling out updates. These updates require no action on our users' part, and will not cause any disturbance to your services at UpCloud.
The rollout is estimated to conclude within the next 36 hours, by the end of Thursday, 27th of June, 2019.
We will post any future updates here, so please subscribe or check back.
We are truly sorry for the loss of service our users have endured.