What happens when cybercriminals start to use machine learning?


Over the last few years, machine learning threat detection and defence company Darktrace has been something of a rising star in the cybersecurity industry. Its core unsupervised machine learning technology has earned it a reputation as one of the best in AI-enabled security. But what exactly do those on the cutting edge of cybersecurity research worry about?

Computerworld UK met with director of cyber analysis at Darktrace, Andrew Tsonchev, at the IP Expo show in London's Docklands late last month.

"A lot of solutions out there look at previous attacks and try to learn from them, so AI and machine learning are being built around learning from what they've seen before," he said. "That's quite effective at, say, coming up with a machine learning classifier that can detect banking trojans."

But what's the flip side to that? If vendors are taking artificial intelligence seriously in threat detection, won't their counterparts in the criminal world do the same? And are these hackers currently as sophisticated as some vendors would have us believe?

To understand where machine learning might be useful for attackers, it helps to consider some instances where it has demonstrated strong advantages in defence.

"Technologically simple attacks are very effective," says Tsonchev. "We do see a lot of compromises on networks that are not flashy in terms of custom exploit development, bespoke malware that's been designed to evade detection. A lot of the time it's the old fashioned stuff: password theft, phishing, all sorts of these things.

"The problem with those attacks is they're still very effective. But they're quite hard to detect. A lot of times – say you have a situation where an externally facing server is compromised using an existing employee's credentials; or situations where employees aren't good at not using the same passwords for their personal stuff as their work stuff. When there's a data breach and passwords get leaked, they get into these traded and shared databases. There's a good chance these passwords would work on corporate systems.

"There's nothing clever in those attacks, nothing inherently malicious if you look at them. If you're looking for threats by violation of policies, that's not a violation of policy. That's an authentication attack where someone's used a password that's meant to have access to the system, access to files that are meant to be taken out.

"It's unwanted, it's fraudulent, but it's not technically distinguishable as malicious in terms of violating access controls, which makes it hard to detect."

In those instances, technical indicators give way to people simply acting suspiciously, a far harder signal to read than someone trying to get into a network through a backdoor. This is where behavioural understanding and AI come into the equation, to better navigate the often unpredictable and tricky complexities of humans acting like humans.
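To make the idea concrete, here is a minimal sketch of behaviour-based detection for the stolen-password scenario Tsonchev describes. It is an illustration only, not Darktrace's method: the function names, the two features (hour of login and source country) and the scoring weights are all assumptions chosen for brevity. Every login in the example uses a valid password, so no access policy is violated; the only signal is how far a login strays from that user's own history.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_baseline(history):
    """Summarise past logins per user.

    history: iterable of (user, hour_of_day, source_country) tuples.
    These fields are hypothetical; a real system would use far richer features.
    """
    hours = defaultdict(list)
    countries = defaultdict(set)
    for user, hour, country in history:
        hours[user].append(hour)
        countries[user].add(country)
    return hours, countries


def anomaly_score(login, baseline, min_std=1.0, new_country_penalty=3.0):
    """Score how unusual a login is for this user (higher = more unusual).

    Simplification: hour of day is treated as a plain number, so the
    wrap-around at midnight is ignored. The penalty value is arbitrary.
    """
    user, hour, country = login
    hours, countries = baseline
    if user not in hours:
        return float("inf")  # no history at all for this account
    mu = mean(hours[user])
    # Floor the spread so a very regular user does not produce huge scores.
    sigma = max(pstdev(hours[user]), min_std)
    score = abs(hour - mu) / sigma
    if country not in countries[user]:
        score += new_country_penalty
    return score


if __name__ == "__main__":
    baseline = build_baseline([("alice", h, "GB") for h in (9, 10, 9, 10, 9)])
    print(anomaly_score(("alice", 9, "GB"), baseline))  # typical working pattern
    print(anomaly_score(("alice", 3, "XX"), baseline))  # odd hour, unseen country
```

A login at 3am from a country the account has never used scores far higher than a mid-morning login from the usual location, even though both would pass a password check. Where to set the alerting threshold is a judgment call that this sketch leaves open.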

Right now, Tsonchev said, Darktrace hasn't spotted a true machine learning attack in the wild.

"This is something we are super focused on – it's what we do – and we're very aware of the benefits so we are very worried about the stage when there is widespread access and adoption of AI-enabled malware and toolkits for attackers to use," explained Tsonchev.

"The danger with AI is that it threatens to collapse this distinction," Tsonchev said. "In that suddenly, you can use AI-enabled tools to replicate en masse scale, the kind of targeting and tailoring that at the minute is only possible on a case-by-case basis.

"That is because by and large, applications of AI unlock decision-making, and that is what human-driven attacks do. You have an attacker in a network, on a keyboard, and they can case the joint. They can see what the weak points are. They can adapt the attack path they follow to the particular environment they find themselves in, that's why they're hard to detect.

"We're very worried about malware that does that: malware that uses machine learning classifiers to land and observe the network and see what it can do."

This sort of thinking could be applied to all the attacks that we have come to be familiar with. Take the majority of phishing attacks: for the most part these are 'spray and pray' approaches directed at the world in general, and if someone bites, then great.

Spearphishing – its highly targeted cousin – requires the attacker to pay close attention to their target, to stalk their social media accounts, to build a profile of them that they can manipulate with an email that's convincing enough to pass what Tsonchev calls the 'human sanity check'.

"The worry is that AI will be used to automate that process. Custom development, where the AI systems are trained to make phishing emails that pass the suspiciousness test. You can train an AI classifier on a bunch of genuine emails and learn what makes something convincing.

"And once you've got that and if that works and it gets in the wild, then there's no barriers of entry to do this to everybody, to every sized organisation.

"So you might get opportunistic attacks against small and medium sized enterprises that have the level and sophistication that currently only nation states do against high value targets.

"And that's really worrying."


Copyright © 2017 IDG Communications, Inc.
