Image cloaking and prompt poisoning: how to thwart AI copyright theft

Security researchers show how image cloaking (Glaze) and prompt poisoning (Nightshade) can prevent AI copyright theft.
21 November 2023

Image cloaking and prompt poisoning are two adversarial AI approaches that can help creators of original content.

• AI copyright theft is a growing concern as the technology gets more sophisticated.
• Image cloaking and prompt poisoning are innovative techniques that could help.
• Could artists reclaim their creativity from the giant AI models?

Adversarial AI attacks on road signs exploit the fact that image recognition algorithms can be easily confused in ways that wouldn’t faze human drivers. And that algorithmic sensitivity to manipulations of an image that are nearly imperceptible to humans could help creators protect their original artwork from AI copyright theft.

Generative AI text-to-image creators have wowed users with their ability to produce digital art in seconds. No artistic talent is required, just basic typing skills and the imagination to dream up a suitable text prompt. However, not everyone is thrilled by these possibilities.

What to do if generative AI is stealing your images

Commercial artists in particular are concerned about AI copyright theft. And it’s telling that OpenAI has made changes to the latest version of its text-to-image tool, DALL-E 3. “We added a refusal which triggers when a user attempts to generate an image in the style of a living artist,” writes OpenAI in the system card for DALL-E 3 [PDF]. “We will also maintain a blocklist for living artist names which will be updated as required.”

Having somehow digested the works of living artists in their training data sets, generative AI models can produce lookalike images in seconds. And the problem isn’t just the speed at which text-to-image tools operate; it’s the loss of income for the human creators who are losing out to machines – and potentially, those who prompt them. If left unchecked, AI copyright theft enabled through style mimicry will negatively impact professional artists.

What’s more, it’s possible to fine-tune models – by exposing them to additional image samples – to make them even more capable of copying artistic styles, and members of the creative industry have had enough.

If only all AI copyright theft was so blatant… (Image unironically generated by AI).

Presenting a tool dubbed Glaze (designed to thwart AI copyright theft) at the 32nd USENIX Security Symposium, researchers from the SAND Lab at the University of Chicago, US, explained how artists had reached out to them for help.

“Style mimicry produces a number of harmful outcomes that may not be obvious at first glance. For artists whose styles are intentionally copied, not only do they see [a] loss in commissions and basic income, but low-quality synthetic copies scattered online dilute their brand and reputation,” comments the team, which has been recognized with the 2023 Internet Defense Prize for its work.

How image cloaking works

Available for download on macOS and Windows, Glaze protects against AI copyright theft by disrupting style mimicry. The software gives users the option of making slight changes to pixels in the image, which preserve the original appearance to human eyes while misleading AI algorithms into believing that they are seeing artwork in a different style.

The image cloaking tool runs locally on a user’s machine and examines the original file to calculate the cloak that’s required to make the picture appear to be in another style – for example, resembling an old master.

Larger modifications to the data provide greater protection against the ability of generative AI algorithms to steal the artist’s original style. And once the cloak has been added to an image, it protects that content across a range of different models.
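The trade-off between the size of the change and the strength of the protection can be sketched in a few lines of Python. This is a toy illustration only, not Glaze’s actual algorithm: the linear `extractor` matrix, the `budget` parameter and the random stand-in “images” are all assumptions made for the example, in place of a real feature extractor and real artwork.

```python
import numpy as np

def cloak(image, target_features, extractor, budget=0.05, steps=200, lr=0.01):
    """Toy cloak: nudge `image` so its features move toward `target_features`.

    `extractor` is a hypothetical linear feature extractor (a matrix).
    `budget` caps how far any single pixel may change.
    """
    delta = np.zeros_like(image)
    for _ in range(steps):
        residual = extractor @ (image + delta) - target_features
        grad = 2.0 * extractor.T @ residual           # gradient of the squared feature distance
        delta = delta - lr * grad                     # step toward the target style
        delta = np.clip(delta, -budget, budget)       # keep every pixel change small
        delta = np.clip(image + delta, 0.0, 1.0) - image  # stay in the valid pixel range
    return image + delta

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 64)) / 8.0            # invented stand-in for a feature extractor
original = rng.uniform(0.2, 0.8, size=64)     # invented stand-in for the artwork
style_target = rng.uniform(0.2, 0.8, size=64) # invented stand-in for "another style"
target_features = W @ style_target

cloaked = cloak(original, target_features, W)
before = np.linalg.norm(W @ original - target_features)
after = np.linalg.norm(W @ cloaked - target_features)
print(f"feature distance to target style: {before:.3f} -> {after:.3f}")
print(f"largest pixel change: {np.max(np.abs(cloaked - original)):.3f}")
```

Raising `budget` in this sketch lets the features move further toward the target style, at the cost of a more visible change to the picture.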

Returning to adversarial AI attacks against road signs, security researchers discovered five years ago that all it took to confuse deep neural networks used for image recognition was the addition of small squares of black and white tape.

The IoT/CPS security research team – based at the University of Michigan, US – was able to mislead the image classifier into thinking it was looking at a keep right sign when it was actually being shown an 80 km/h speed limit sign. Similarly, 80 km/h speed limit signs could be made to look – in the ‘eyes’ of a deep neural network – like an instruction to stop, just by adding a few sticky squares that would never have fooled a human.

The adversarial attack is successful because certain parts of the scene are more sensitive to manipulation than others. If you can identify those image locations, a small change has a profound effect on the algorithm – and that can help to protect against AI copyright theft too.
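A minimal sketch shows why a handful of well-chosen locations can matter so much. It assumes a toy linear classifier rather than a deep neural network, and the `weights`, `bias` and pixel values are invented for the example. For a linear score the sensitivity of each pixel is simply the size of its weight, so the most sensitive locations can be read off directly.

```python
import numpy as np

def score(image, weights, bias):
    """Toy classifier: a positive score means 'class A', a negative one 'class B'."""
    return float(weights @ image + bias)

def most_sensitive_pixels(weights, k):
    """For a linear score the gradient with respect to the image is `weights`,
    so the k largest absolute weights mark the k most sensitive pixels."""
    return np.argsort(-np.abs(weights))[:k]

def perturb(image, weights, k, step):
    """Change only the k most sensitive pixels, each in the direction that lowers the score."""
    adversarial = image.copy()
    idx = most_sensitive_pixels(weights, k)
    adversarial[idx] -= step * np.sign(weights[idx])
    return np.clip(adversarial, 0.0, 1.0)

rng = np.random.default_rng(1)
weights = rng.uniform(-0.1, 0.1, size=100)   # most pixels barely matter (invented values)
weights[[3, 47, 90]] = [5.0, -6.0, 4.0]      # three pixels the toy model relies on heavily
image = np.full(100, 0.5)
bias = 1.0 - weights @ image                 # chosen so the clean image scores about +1

adversarial = perturb(image, weights, k=3, step=0.4)
changed = int(np.sum(adversarial != image))
print("clean score:", round(score(image, weights, bias), 3))
print("adversarial score:", round(score(adversarial, weights, bias), 3))
print("pixels changed:", changed)
```

In this sketch, altering three of the 100 pixels is enough to flip the sign of the score, which is the same principle as the sticky squares on a road sign.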

Rather than misrepresenting a road sign, the image cloak generated by Glaze fools generative AI tools into thinking the artwork is in a different style, which foils mimicry attempts and helps to defend human creativity from machines.

What’s the difference between Glaze and Nightshade?

Going a step further to thwart AI copyright theft, the SAND Lab group has devised a prompt-specific poisoning attack targeting text-to-image generative models, which it has named Nightshade. “Nightshade poison samples are also optimized for potency and can corrupt a Stable Diffusion SDXL prompt in <100 poison samples,” write the researchers in their paper.

Whereas Glaze cloaks a single image – a process that can take hours on a laptop lacking a GPU – Nightshade operates on a much larger scale and could protect many more digital artworks.

Text-to-image poisoning could be achieved by simply mislabeling pictures of dogs as cats, so that when users prompted the model for a dog, the output would appear more cat-like. However, these rogue training data would be easy for AI models to reject in pre-screening. To get around this, the researchers curated a poisoned data set where anchor and poisoned images are very similar in feature space.

Feeding just 50 poisoned training samples into Stable Diffusion XL was sufficient to start producing changes in the generative AI text-to-image output. And by the time that 300 samples had been incorporated into the model, the effect was dramatic. A prompt for a hat produced a cake, and cubist artwork was rendered as anime.
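The way a growing number of poisoned samples drags a prompt toward the wrong concept can be illustrated with some simple arithmetic. The sketch below is an assumption-laden toy, not how Stable Diffusion XL is trained: it pretends that the model’s notion of a prompt is the average feature vector of every sample captioned with that prompt, and the two-dimensional `dog` and `cat` vectors and the `N_CLEAN` count of 200 are invented for the example.

```python
import numpy as np

def learned_concept(clean_features, poison_features):
    """Toy 'training': average the features of every sample that carries the prompt."""
    return np.vstack([clean_features, poison_features]).mean(axis=0)

def drift(n_clean, n_poison, dog, cat):
    """How far the learned 'dog' concept has moved toward 'cat' (0 = dog, 1 = cat)."""
    clean = np.repeat(dog[None, :], n_clean, axis=0)    # honest dog samples
    poison = np.repeat(cat[None, :], n_poison, axis=0)  # captioned 'dog', cat-like features
    centroid = learned_concept(clean, poison)
    direction = cat - dog
    return float(np.dot(centroid - dog, direction) / np.dot(direction, direction))

dog = np.array([1.0, 0.0])  # invented feature vector for the 'dog' concept
cat = np.array([0.0, 1.0])  # invented feature vector for the 'cat' concept
N_CLEAN = 200               # invented number of clean 'dog' samples

for n_poison in (0, 50, 300):
    print(n_poison, "poison samples -> drift", round(drift(N_CLEAN, n_poison, dog, cat), 2))
```

With these made-up numbers the concept moves a fifth of the way toward the wrong target at 50 poisoned samples and past the halfway point at 300. The absolute figures mean nothing outside the toy; the point is that the effect depends on how few clean samples a single prompt has, not on the size of the whole training set.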

It’s promising news for artists who are concerned about AI copyright theft. And these adversarial options could make AI companies think twice before hoovering up text and images to feed their next-generation models.

Preventing AI copyright theft

The researchers first got interested in confusing AI models when they developed an image-cloaking method designed for personal privacy. Worried about how facial recognition was becoming widespread, the SAND Lab team released a program to protect the public.

The software (Fawkes) – a precursor to Glaze and Nightshade – made small pixel-level changes to each photo, sufficient to alter how a computer perceived the image. And if you are interested in evading facial recognition and defeating unauthorized deep learning models, there’s a whole world of fascinating research to check out.

Google researchers have made adversarial patches that cause an image classifier to label whatever it is looking at as a toaster. And there’s make-up advice (CV Dazzle) available on confusing facial recognition cameras in the street. Plus, you can buy privacy-focused eyewear dubbed Reflectacles that is designed to block 3D infrared mapping and scanning systems.

Big tech firms are powering ahead with their AI development, but – as these examples show – there are ways for artists, the public, and businesses in general to make a stand and resist AI copyright theft and other misuses of deep learning algorithms.