Above: Illustration by Authorpolygraphus/DepositPhotos
BitDepth#1470 for August 05, 2024
On July 19, cybersecurity firm Crowdstrike sent an automatic update to Microsoft Windows computers that was intended to upgrade the Falcon sensor security solution it sells to enterprise.
The worst possible thing happened. A bug in the code sent the computers that received it into a death spiral of blue screens. The update was just 40 kilobytes in size and was intended to adjust the sensor’s ability to detect malware.
Instead, it caused more than US$6 billion in real-world damage.
Delta Air Lines alone, which deployed the software widely in its computer network, reported losses of more than US$500 million over the week it struggled to normalise operations after the Crowdstrike bug crippled the company’s ability to function.
Crowdstrike has since committed to improving local developer testing, content update and rollback testing, stress testing, fuzzing and fault injection, and to conducting stability and content interface testing.
Microsoft estimates that more than eight million Windows computers were affected by the bug. Crowdstrike quickly deployed a patch that corrected the issue, but for many customers, it fixed nothing, at least not right away.
Falcon is an endpoint sensor widely used in computers that run systems like automated kiosks and customer interface panels that were also secured by Microsoft’s BitLocker encryption software.
On those computers, it was necessary to unlock the encrypted drive, apply the patch, then restart. Roughly 20 minutes of work, multiplied by hundreds of devices.
Delta’s long path to restoring operations was apparently compounded by outsourced IT, which meant fewer people available to “touch” stricken computers.
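To put rough numbers on that arithmetic, here is a minimal Python sketch. Only the 20-minute figure comes from the account above; the fleet size, the technician counts and the function name are hypothetical, chosen purely for illustration.

```python
import math


def recovery_hours(devices: int, minutes_per_device: float, technicians: int) -> float:
    """Estimate wall-clock hours needed to hand-repair every device.

    Assumes each technician fixes one device at a time and that moving
    between devices takes no time (an optimistic simplification).
    """
    if technicians < 1:
        raise ValueError("at least one technician is required")
    # Work proceeds in rounds; each round repairs one device per technician.
    rounds = math.ceil(devices / technicians)
    return rounds * minutes_per_device / 60


# Hypothetical fleet: 600 encrypted kiosks at 20 minutes each.
for staff in (2, 5, 20):
    print(f"{staff:>2} technicians: {recovery_hours(600, 20, staff):.1f} hours")
```

Under these assumptions, two technicians would need 100 hours of continuous work to clear the hypothetical fleet, while twenty would need ten. The number of hands available matters as much as the size of the fix.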
Trinidad and Tobago was largely unscathed by the incident (https://cstu.io/36e5d9), and most organisations affected by the bug reported resumption of transactions within 24 hours.
“Do I think that TT dodged a bullet because Crowdstrike is expensive? Yes,” said cybersecurity specialist Shiva Parasram.
“The fact that Crowdstrike is very popular but very expensive might be one of the factors limiting its impact in Trinidad.”
“But it’s not necessarily a good thing. The reason why there was minimal impact is because we don’t really spend much on cybersecurity.”
“I don’t think that cost was the determining factor why CrowdStrike is not as popular locally,” said Anthony Peyson, president of the Caribbean Chapter of the International Information System Security Certification Consortium.
“Traditionally, we are generally slow to replace older software systems and adopt newer ones like CrowdStrike.”
The cruel reality of Crowdstrike is that it wasn’t a cybersecurity attack. It was a quality-of-service lapse, and the incident puts IT professionals in an odd space, sandwiched between determined and sustained attacks by hackers and ransomware organisations and hastily deployed software that ends up fragging their systems from the inside.
Do IT pros do all recommended updates as they are issued and risk buggy updates like Crowdstrike?
Do they wait a few days and risk compromise because of outdated security measures or unplugged security holes?
Do they create a sandboxed update system to confirm that updates are safe? If so, how practical would that be for typically underpaid, overworked local IT teams?
“Most IT professionals are concerned with service delivery, which means that they are focused on ensuring that services remain accessible to all those who need to use them,” Peyson said.
“In this fast-paced environment, most IT professionals set systems to update automatically. It becomes one less thing to worry about. Best practice suggests that IT professionals use a sandbox or a lab to test updates on a small number of systems before deploying them to the wider network.”
“In practice, this is not considered practical and is another layer of complexity that requires resources which include people, time and money. Having a local IT workforce which is generally overworked, underpaid and not respected adds fuel to the fire. This situation creates an environment where the standard IT professional speeds through tasks without exercising due care or diligence.”
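The best practice Peyson describes, testing an update on a small number of systems before it reaches the wider network, can be sketched in a few lines of Python. This is an illustrative outline only, not any vendor's actual mechanism: `staged_rollout`, `apply_update`, `is_healthy` and the kiosk names are all hypothetical, and a real deployment would also wait and observe the pilot machines before continuing.

```python
from typing import Callable, Iterable, List


def staged_rollout(
    hosts: Iterable[str],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    pilot_size: int = 3,
) -> List[str]:
    """Update a small pilot group first; continue only if every pilot host stays healthy.

    Returns the list of hosts that received the update.
    """
    hosts = list(hosts)
    pilot, remainder = hosts[:pilot_size], hosts[pilot_size:]

    updated = []
    for host in pilot:
        apply_update(host)
        updated.append(host)

    # Stop here if any pilot machine fails its check, sparing the wider network.
    if not all(is_healthy(host) for host in pilot):
        return updated

    for host in remainder:
        apply_update(host)
        updated.append(host)
    return updated


# Hypothetical demonstration: a bad update that breaks every machine it touches.
broken = set()
fleet = [f"kiosk-{n:02d}" for n in range(1, 11)]
touched = staged_rollout(
    fleet,
    apply_update=broken.add,
    is_healthy=lambda host: host not in broken,
)
print(len(touched), "of", len(fleet), "machines received the bad update")
```

In this toy run the faulty update stops at the three pilot machines and the other seven are never touched. The cost is the one Peyson names: someone has to build, staff and maintain the pilot step.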
Parasram believes that sandboxed test systems to confirm updates are something that companies will have to build into their IT management.
“It’s not going to get any easier for TT,” he said.
“But we have a lot more graduates coming out, new professionals who are looking for a start. Companies will have to get serious about disaster recovery and that includes cloud service providers and software as a service.”
“Companies have to do third-party risk assessments on these businesses, ensure that they are certified, that they have qualified teams, that they are on the ground. What is their response time [when disaster strikes]?”
“People don’t take on service level agreements, but you have to look at how much downtime and uptime are guaranteed and if it’s not provided, you are due compensation. Service level agreements and contracts have to be studied quite carefully to ensure that these critical services are supplied.”
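To see what an uptime guarantee in a service level agreement means in practice, it helps to convert the percentage into minutes. The short Python sketch below does that conversion; the function name, the 30-day period and the sample percentages are assumptions for illustration and are not drawn from any specific contract.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted over a period by an uptime guarantee."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


# Sample guarantees, chosen for illustration only.
for guarantee in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(guarantee)
    print(f"{guarantee}% uptime allows about {minutes:.1f} minutes of downtime in 30 days")
```

A 99.9 per cent guarantee, for example, permits roughly 43 minutes of outage in a 30-day month. Anything beyond the contracted figure is where the compensation clauses Parasram mentions would come into play.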
As the immediacy of Crowdstrike disruptions gave way to analysis of the incident, talk of legal liability began to surface.
Unsurprisingly, it came from Delta, but also from Malaysia, whose digital minister pointed out that five government agencies and nine Malaysian companies in aviation, banking and healthcare were affected.
What should Trinidad and Tobago take away from the Crowdstrike bug?
Top of the list is that businesses and government agencies are responsible for the sanctity of their computer systems and every business decision should be predicated on maximising cybersecurity.
Contingency planning must be thorough, exhaustive and well-exercised.
When systems fail, customers and the public don’t actually care and often don’t understand distributed responsibilities, so blaming other companies and services is always going to fall flat.
Everyone knew that a company called Crowdstrike was responsible for service outages, but Delta had to deal with tens of thousands of customers who are still venting on Reddit about their experiences.
A mother travelling with two tired children really doesn’t want to know anything about some other company and some digital problem when they can’t board their flight.
“Organisations are playing Russian roulette if they are operating without a business continuity plan,” said Peyson.
“There is a high dependence by local organisations on third-party vendors like Microsoft and CrowdStrike. Microsoft has suffered major outages over the past couple of weeks and this has affected more local companies than the CrowdStrike outage.”
“The problem is the same: the overall impact due to the dependency on these vendors remains unknown to most organisations, and therefore they are unaware of how much risk they are exposed to when these vendor services are disrupted. What will happen if we suffer a national internet outage?”
“The impact is expected to be devastating; however, we have no idea how much damage organisations will suffer if such an event occurs today.”
While TT customers have a high tolerance for service abuse, they should not be expected to offer eternal grace for digital failures.
TSTT weathered the humiliation of having private customer information exposed on the dark web and later the open internet by offering its CEO and CFO as symbolic public sacrifice.
iGovTT managed to escape public opprobrium after its proud achievement, TTConnect, simply disappeared for months.
With no legal requirement to notify anyone of cybersecurity breaches, other exposures of personally identifiable information remain largely unknown.
What we don’t know can, in fact, hurt us.