DLP (data loss prevention) tools can help keep sensitive data from leaving an organization. There are standalone tools, but DLP features are also built into email security solutions as well as security service edge (SSE), secure access service edge (SASE), and endpoint protection platforms.
Vendors are starting to address genAI specifically. Forcepoint DLP, for example, promises to help enterprises control who has access to generative AI tools, prevent the uploading of sensitive files, and prevent sensitive information from being pasted into them.
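DLP products differ widely, but the core mechanism behind features like these is content inspection of data in motion. The sketch below is purely illustrative, not Forcepoint's or any other vendor's actual logic: it pattern-matches outbound text before allowing it to reach a genAI service, and the patterns and function names are hypothetical stand-ins for much richer detection (document fingerprinting, exact-data matching, ML classifiers) that commercial tools use.

```python
# Illustrative DLP-style content inspection (hypothetical, not a vendor implementation):
# scan outbound text for sensitive patterns before it reaches a generative AI service.
import re

# Hypothetical example patterns; real DLP policies are far more sophisticated.
SENSITIVE_PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|key)-[A-Za-z0-9]{20,}\b"),
}

def check_outbound_text(text: str) -> list[str]:
    """Return the names of any sensitive patterns found in the text."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)]

def allow_paste_to_genai(text: str) -> bool:
    """Block the paste/upload if any sensitive pattern is detected."""
    findings = check_outbound_text(text)
    if findings:
        print(f"Blocked: matched {', '.join(findings)}")
        return False
    return True

if __name__ == "__main__":
    allow_paste_to_genai("Summarize this contract for card 4111 1111 1111 1111")  # blocked
    allow_paste_to_genai("Draft a polite meeting reminder email")                 # allowed
```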
According to a Netskope report released in July, many industries are already starting to use such tools to help secure generative AI. In financial services, for example, 18% of companies are blocking ChatGPT, while 19% use data loss prevention tools. In healthcare, the numbers are 18% and 21%, respectively.
It’s too early to tell to what extent these tools will work on embedded generative AI, since the way that the AI is embedded is still evolving. Additionally, the degree to which these AIs can access sensitive data can vary vendor by vendor — and day by day.
‘Enterprise-safe’ generative AI
Addressing unsanctioned genAI use requires a carrot-and-stick approach. It’s not enough to explain to employees why some generative AI is risky and to block generative AI tools. Employees also need to have safe, approved generative AI tools that they can use. Otherwise, if the need is great enough, they’ll figure out a way around the blocks by finding apps that haven’t hit IT’s radar yet or using personal devices to access the applications.
For example, most AI-powered image generation tools are trained on data of questionable provenance and offer no security guarantees to users. But Adobe announced in June that it uses only fully licensed data to train its Firefly generative AI tools, and that no content uploaded by users is used to train its AIs. The company says it's also planning to offer enterprise users legal indemnification against IP lawsuits related to Firefly output.
In July, Shutterstock followed suit, offering similar legal indemnification to enterprise customers. Shutterstock was already using fully licensed images to train its AI, and last fall it announced a compensation fee for artists whose images are used as training data. The company said it has already compensated “hundreds of thousands of artists” and plans to make payments to millions more.
Similarly, with text generation, some enterprise-focused platforms are working hard to build systems that can help companies feel safe that their data will not be leaked to other users. For example, Microsoft’s Azure OpenAI Service promises that customer data will not be used to train its foundational models, and the data is protected by enterprise-grade compliance and security controls.
Some text AI companies have begun signing deals with content providers to get fully licensed training data. In July, for example, OpenAI signed a deal with the Associated Press to get access to part of its text archive.
But these efforts are still in their infancy.
Enterprise-grade productivity tools that add generative AI features — like Microsoft 365 Copilot — are likely to have good data protections in place, according to Insight CTO McCurdy. “If they’re doing it correctly, they’ll use all the same controls they have today to protect your information,” he said.
As of this writing, Microsoft 365 Copilot is in the early access phase. This is a fast-moving area, however, and it’s hard to keep up with which vendors are rolling out which features, and how secure they are.
Some enterprise vendors have already run into problems with their generative AI privacy and security policies. Zoom, for example, updated its terms of service to indicate that it could use customer video calls to train its models. After a public outcry, Zoom backtracked; today the company says none of this information is used to train its AI models, and account owners or administrators can choose to enable or disable its AI-powered meeting summary feature.
Other vendors are getting out in front of this issue. Grammarly, for example, which uses OpenAI for its generative AI, promises that customers' data is isolated so that information won’t leak from one user to another, and says that no data is used by partners or third parties to train AIs.
Continuing complications
But not all vendors are going to be forthcoming about where their training data comes from or how customer data is secured.
Clients need to be able to understand how a particular AI will interact with their data, said Gartner analyst Jason Wong. “But it’s hard to get that level of transparency,” he added. Until there are regulations in place, organizations can try to push their vendors for more information, but they should be prepared for some resistance.
With productivity apps that are embedding new generative AI features, things might get even more complicated, he added. “In an embedded solution, you’re probably going to be using lots of different models,” he said.
Ideally, all software vendors would provide customers with a full AI bill of materials, explaining what models they use, how they are trained, and how customer data is protected.
“There’s some early conversation about that going around, but it will be exceptionally difficult,” said Forrester’s Cunningham. “We have a problem getting traditional vendors to do this with regular software. If you’re a big enough customer, you could ask... but if you ask for specifics, it’s unlikely that they would have the totality of what customers would need because it’s so new and nuanced.”
Often, a company’s security team might not even know that a particular application used by the company has now added generative AI tools, he said. “The majority of folks I’m talking to are pulling their hair out trying to deal with this,” he said. “The Pandora’s Box has been ripped open.”
If all software will now have AI built in, companies will have to review their entire software portfolio, he said. “It’s a real problem.”
Companies can block everything, but that would be a sledgehammer solution — and hurt productivity, he said.
That’s why it’s imperative to educate employees about the risks and liabilities of using generative AI, he added. “It’s perfectly valid for an organization to say, ‘You can use this at home, but you’re potentially endangering the company and your job by using these tools in the wrong way.’”