Crowdstrike deploys fix for massive ongoing outage

19 Jul 2024

The company’s CEO apologised for the global outage and a fix has been shared, but with so much disruption it is unclear how long it will take for the dust to settle.

Travel, banking, healthcare and many more sectors around the world have been facing what could be one of the biggest outages in history, after a Crowdstrike update went wrong and caused Microsoft systems to crash.

The disruption began earlier today (19 July), with the first reports coming from Australia of businesses seeing the infamous ‘blue screen of death’ – the Windows error screen displayed when a PC crashes and is forced to shut down.

The outage was linked back to a flawed update from cybersecurity company Crowdstrike, which has been working to resolve the global issue. The company has since issued a fix and says it is working with affected organisations.

But implementing this fix is an ongoing issue, as organisations around the world are still reporting disruptions to their IT systems. A photo from Sky News’ Ireland correspondent Stephen Murphy shows Belfast Airport falling back to whiteboards as a way to show flight updates. The disruption has impacted various Irish services, including transport apps, airlines and car testing centres.

Crowdstrike’s response

Earlier today, Crowdstrike CEO and president George Kurtz said the outage was caused by a defect found in “a single content update for Windows hosts”. He also said Mac and Linux customers have not been affected.

“This is not a security incident or cyberattack,” Kurtz said. “The issue has been identified, isolated and a fix has been deployed. Our team is fully mobilised to ensure the security and stability of Crowdstrike customers.”

A statement on the Crowdstrike website says the issue came from a software update for Windows users and also notes that it is not a cyberattack. Kurtz apologised for the global incident on NBC’s Today earlier and said the company is “deeply sorry” for the outage.

Meanwhile, Microsoft is also facing what appears to be a separate IT issue, as users said they were “unable to access various Microsoft 365 apps and services”. The tech giant said it rerouted affected traffic to “healthy infrastructure” and has been reporting “continuous improvement” since then.

Possible causes

Despite both companies claiming to have found fixes for these IT issues, businesses around the world are still facing disruptions. Reports are coming in of major airlines around the world facing delays, while banks, train companies, media outlets, telecom companies and supermarkets have all been impacted.

While media outlets scramble to figure out the scale of the Crowdstrike outage and its exact cause, IT experts have shared their views on the incident. Tom Lysemose Hansen, CTO of Promon, believes the issue may have been caused by an “invalidly formatted driver causing Windows to crash”. He also says that fixing such an issue is not very straightforward.

“Crowdstrike’s affected customers will have to effectively break into their own systems to get everything back online by logging into the admin console and booting their systems in safe mode,” Lysemose Hansen said.

“The errors made today will cost the affected organisations millions and leave their reputations significantly damaged due to a compromised experience for their customers.”

Dr Simon Woodworth, a lecturer in business information systems at University College Cork, shared his insight on what might have caused the Crowdstrike outage.

“One possible cause is that the update was inadequately tested and a coding error crept through to the software that was released to users overnight,” Woodworth said. “The fault seems to be with a specific piece of software called Falcon Sensor, which watches for suspicious internet traffic either to or from the Windows PC. It appears that the faulty Falcon Sensor caused Windows to crash when booting up.”

While a fix has been released, Woodworth said the knock-on effects of such a disruption will take “much longer to clean up”. Woodworth also explained that whether a company was affected depended on its update policy, as some businesses choose to delay software updates “for their own reasons”.

“This isn’t an unreasonable thing to do, as this is not the first time software updates have caused problems, though not on this scale,” he said. “Also, not everyone uses Crowdstrike and a lot of systems do not use Windows. Mission-critical systems that control aircraft, for example, do not use Windows at all.”

Big Tech, big problems

It is unclear how long it will take for this global disruption to be resolved. Omer Grossman, CIO at identity security company CyberArk, predicts that the process will take “days” as the systems showing a blue screen of death “cannot be updated remotely and thus the problem must be solved manually, endpoint by endpoint”.

Grossman also said a key issue on the agenda will be finding out exactly what caused the malfunction. Meanwhile, ESET global cybersecurity advisor Jake Moore says the disruption is a reminder of the significance certain Big Tech companies have in modern systems – and the danger of relying on single entities too much.

“It is simply impossible to simulate the size and magnitude of the issue in a safe environment without testing the actual network,” Moore said. “The inconvenience caused by the loss of access to services for thousands of people serves as a reminder of our dependence on Big Tech in running our daily lives and businesses. Upgrades and maintenance can make systems and networks more vulnerable to small errors, which can have wide-reaching consequences as demonstrated today.

“Another aspect of this incident relates to ‘diversity’ in the use of large-scale IT infrastructure. This applies to critical systems like operating systems, cybersecurity products and other globally deployed applications. Where diversity is low, a single technical incident, not to mention a security issue, can lead to global-scale outages with subsequent knock-on effects,” Moore added.

Leigh Mc Gowran is a journalist with Silicon Republic
