Tech Insight : Lessons Learned | Digital Network Solutions

Written by: Paul | July 24th, 2024

Following the massive after-effects of the faulty CrowdStrike update, we take a look at the lessons learned so far in this ongoing situation.

What Happened?

On July 19, 2024, a faulty software update from cybersecurity technology company CrowdStrike affected approximately 8.5 million Windows devices globally, causing chaos across multiple industries as key systems were disabled. The faulty update only impacted Windows machines because it was an update to CrowdStrike’s enterprise security platform designed specifically for the Windows operating system. The update triggered a ‘logic error’, leading to widespread system crashes and blue screens of death (BSOD).

Described by some as the worst cyber event in history (so far), the incident surpassed the scale of the 2017 WannaCry attack and highlighted significant vulnerabilities in modern cybersecurity frameworks.

What Is CrowdStrike?

CrowdStrike, founded in 2011 and headquartered in Texas, provides the Falcon platform, a cloud-based endpoint protection solution used by large businesses and organisations globally.

Lessons Learned from the CrowdStrike Event

Although some of the after-effects are still being felt at the time of writing, the scale and severity of the event have already taught us some valuable lessons. For example:

– Businesses may be over-reliant on the cloud. The CrowdStrike incident has starkly highlighted how heavily we depend on cloud services. Many businesses have embraced the cloud for its scalability, cost-effectiveness and convenience, often integrating critical operations and data storage into cloud platforms. However, this event demonstrated the potential risks of depending heavily on a single cloud provider or a homogeneous cloud environment.

– A re-evaluation of business cloud strategies may now be necessary. The disruption caused by the faulty update has already led some businesses to reconsider their cloud-first approaches in order to avoid single points of failure. Strategies being reported include moving away from a platform-centric approach to a more tailored, nuanced strategy that balances performance, cost, and security while enhancing efficiency and reducing risk. Adopting workload-specific strategies and determining the best platform for each application (e.g. a private cloud, an industry cloud, on-premises data centres, or a multi-cloud architecture) may also now be a more attractive and less risky option.

Broadly speaking, building resilience through diversity (e.g. diversifying cloud providers and also implementing hybrid and multi-cloud strategies) plus ensuring all eggs aren’t just in one basket may now be the way forward (controversially) for some businesses. This approach could mitigate the risks associated with single points of failure, ensure greater operational continuity, perhaps even reduce (long-term) costs, and hopefully contain any cloud chaos in future.

– There is a need for rigorous software testing. The CrowdStrike incident emphasises the critical importance of thorough testing before deploying software updates, especially on a Friday! This event demonstrated that even minor configuration changes could have catastrophic consequences if not properly vetted. Comprehensive testing protocols must now be implemented to prevent such incidents from occurring in the future.

– Robust incident response plans are necessary. Businesses need to ensure they have comprehensive strategies in place to quickly address and mitigate the impact of IT failures. This includes regular drills and updates to incident response protocols to stay prepared for various scenarios.

– Enhanced employee security training and awareness are important. Increased vigilance against phishing and other cyber threats is crucial in the aftermath of such incidents. Businesses must invest in continuous employee training to recognise and respond to cybersecurity threats effectively. This proactive approach can significantly reduce the risk of successful cyberattacks exploiting the situation. For example, reported ways in which cybercriminals have already taken advantage of the situation include:

– Phishing campaigns pretending to offer fixes and updates for the CrowdStrike-related issues. These campaigns aim to trick users into clicking on malicious links, leading to malware infections. The US Department of Homeland Security’s Cybersecurity & Infrastructure Security Agency (CISA) reported that it had “observed threat actors taking advantage of this incident for phishing and other malicious activity”. People have been advised to avoid clicking links in any text or email related to the CrowdStrike or Windows disruption.

– Setting up fraudulent websites claiming to provide legitimate updates and solutions. These sites were designed to distribute malware under the guise of providing help. For example, cybercriminals distributed ZIP archives with names like “CrowdStrike-hotfix.zip” containing the HijackLoader payload (which loads malware). This campaign was reportedly aimed at users and CrowdStrike customers in Latin America.

– Initiating ransomware attacks, taking advantage of the disruption.

– Stealing data. In some cases, attackers have exploited vulnerabilities exposed by the disruption to infiltrate systems and steal sensitive data, compounding the damage caused by the initial outage.

– Continuous and transparent communication makes a big difference in a crisis. CrowdStrike’s swift communication and deployment of a fix were crucial in managing the incident’s fallout. Transparent and continuous updates helped affected organisations understand the issue and implement necessary measures. This event highlights the importance of maintaining open lines of communication between cybersecurity firms and their clients during crises to ensure timely and effective responses.

– Be cautious with third-party services. The CrowdStrike incident underscores the critical risks associated with relying on third-party services. Businesses learned that dependency on external providers for crucial functions can lead to widespread disruptions if those services fail. The incident highlighted the necessity of rigorous vetting processes to ensure third-party providers meet high security and reliability standards. Continuous monitoring and regular audits are essential to identify and mitigate risks promptly.

Diversifying service providers can reduce the risk of a single point of failure, enhancing overall resilience. Companies should ensure contracts with third-party providers include stringent security requirements and clear terms for liability and incident response. This approach helps maintain control and oversight over outsourced services, safeguarding operations and data integrity against potential vulnerabilities introduced by external partners.

Why Was The Aviation Sector So Badly Affected?

The aviation and travel sectors were heavily affected by the CrowdStrike issue due to their reliance on real-time IT systems for critical operations. The system crashes disrupted flight scheduling, booking, and check-in processes, leading to thousands of cancellations and delays at major airports worldwide. Additionally, the outage compromised safety and security monitoring systems, exacerbating the operational chaos and inconvenience for passengers.

Why Was The Healthcare Sector Also Badly Affected?

Hospitals and healthcare systems faced critical disruptions that delayed clinical procedures and impacted patient care. The incident forced many institutions to revert to manual processes, highlighting the vulnerability of healthcare systems to IT failures. Healthcare providers are particularly vulnerable to IT issues like the CrowdStrike incident (and to cyberattacks) because they rely on IT systems for critical patient care functions, such as electronic health records (EHRs), medical devices, and communication systems. Also, the complex IT infrastructure in hospitals, often a mix of legacy and modern systems, creates additional vulnerabilities, as securely integrating these diverse systems is challenging.

This event demonstrates the urgent need for healthcare providers to invest in robust IT infrastructure and emergency protocols to ensure patient safety and continuity of care during technological crises.

What Does This Mean For Your Business?

The CrowdStrike incident is a stark reminder of the inherent vulnerabilities in modern cybersecurity frameworks and the critical importance of robust IT management strategies. For businesses, the event offers many lessons, such as the need for rigorous testing and validation processes for all software updates. Ensuring that updates are thoroughly vetted before deployment can prevent similar catastrophic failures in the future.
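One common way to put "thoroughly vetted before deployment" into practice is a staged (canary) rollout, where an update reaches a small group of machines first and only continues if they stay healthy. The Python sketch below is purely illustrative and is not a description of how CrowdStrike or any other vendor deploys updates. The function name staged_rollout, the apply_update and is_healthy callbacks, and the 1% / 10% / 100% stage sizes are all assumptions made for the example.

```python
from typing import Callable, Sequence


def staged_rollout(
    hosts: Sequence[str],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    stage_fractions: Sequence[float] = (0.01, 0.10, 1.0),  # assumed stage sizes
) -> list[str]:
    """Apply an update in growing stages and halt at the first unhealthy host.

    Returns the list of updated hosts if every health check passes.
    Raises RuntimeError on the first failed check, so the remaining
    hosts are never touched.
    """
    updated: list[str] = []
    for fraction in stage_fractions:
        # Total number of hosts that should be updated once this stage
        # completes (at least one, so the canary stage is never empty).
        target = max(1, int(len(hosts) * fraction))
        for host in hosts[len(updated):target]:
            apply_update(host)
            updated.append(host)
            # Check each host straight after updating it.
            if not is_healthy(host):
                raise RuntimeError(
                    f"halting rollout: {host} unhealthy after update"
                )
    return updated
```

In this sketch a single failed health check stops the rollout, so a faulty update would reach only the hosts updated up to that point rather than the whole estate. A real pipeline would also need rollback and alerting, which are left out here.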

Also, the incident highlights the necessity of developing comprehensive contingency plans to maintain operational continuity during IT disruptions. Businesses should conduct regular drills and update their incident response protocols to prepare for various scenarios, ensuring they can quickly address and mitigate the impact of unexpected failures.

The extensive disruption across various industries illustrates the interconnected nature of modern business operations and the potential widespread impact of a single point of failure. Businesses should, therefore, take a good look not only at their own cybersecurity measures but also closely scrutinise and manage the cybersecurity protocols of their service providers and partners. This includes implementing stringent vetting processes, continuous monitoring, and regular audits of third-party services to ensure high security and reliability standards are maintained.

The legal and financial ramifications of such events also cannot be ignored. The anticipated lawsuits and claims for damages resulting from operational disruptions and customer inconvenience could set significant precedents, influencing future legal standards and liability expectations in the cybersecurity sector. That said, many businesses in the aviation and travel sector may decide to risk arguing that this was an exceptional event, thereby hoping to limit their legal/financial liabilities. Businesses may, however, need to enhance their insurance coverage and legal strategies to mitigate potential similar risks in the future.

Also, the increased risk of cyber-attacks following this incident (and other incidents in the past) should prompt businesses to heighten their vigilance against phishing and other cyber threats. The surge in CrowdStrike-themed phishing websites, for example, demonstrates the opportunistic nature of cybercriminals. Businesses must ensure their employees are well-informed and equipped to recognise and respond to these threats, investing in continuous security training and awareness programs.
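Training can be backed up with simple technical checks. As a minimal sketch, the Python below tests whether a link points to a domain the organisation has already verified, which is one way to catch look-alike "hotfix" sites. The TRUSTED_DOMAINS set and the is_trusted_link function are hypothetical names for this example. The two domains are placeholders, and a real allowlist should be built from independently verified vendor information.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration only: replace with domains
# your organisation has verified independently.
TRUSTED_DOMAINS = {"crowdstrike.com", "microsoft.com"}


def is_trusted_link(url: str, trusted: set[str] = TRUSTED_DOMAINS) -> bool:
    """Return True only for HTTPS links whose host is a trusted domain
    or a subdomain of one."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs are treated as untrusted.
        return False
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower().rstrip(".")
    # Match the exact domain or a true subdomain, so that a host such as
    # "crowdstrike.com.fix-update.example" is rejected.
    return any(host == d or host.endswith("." + d) for d in trusted)
```

For example, is_trusted_link("https://www.crowdstrike.com/blog") returns True, while is_trusted_link("https://crowdstrike.com.fix-update.example/hotfix.zip") returns False. A check like this complements employee awareness and does not replace it, since trusted domains can themselves be abused.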

While the disruption caused by CrowdStrike’s software update was not a cyber-attack, it nonetheless highlights the need for comprehensive cybersecurity strategies. Businesses that learn from this incident and proactively strengthen their cybersecurity frameworks will be better positioned to navigate the complexities of the digital age and safeguard their operations against future disruptions. By diversifying their cloud dependencies, implementing robust incident response plans, and maintaining stringent oversight of third-party services, companies can build a more resilient and secure operational environment.