Cybersecurity firms examine ChatGPT threat model

Among the millions of users who have joined OpenAI’s chat bot preview are security experts keen to probe the ChatGPT threat model in detail.
20 December 2022

What could go wrong? Cybersecurity experts have been exploring OpenAI’s ChatGPT to understand the security implications of chat bots based on large language models. Image credit: Iryna Imago /

ChatGPT, an AI-enabled chat bot based on transformers (a neural network architecture capable of processing long sentences and auto-completing input data), has opened the eyes of millions to the capabilities of large language models (LLMs). Among the millions of users who have signed up to OpenAI’s ChatGPT are many cybersecurity experts, and security teams have been busy exploring the world’s most powerful chat bot to understand the ChatGPT threat model in more detail.

Before we dig into their findings, it’s worth recalling what ChatGPT can and can’t do, as well as looking into the protections that OpenAI has in place. “ChatGPT is not connected to the internet, and it can occasionally produce incorrect answers,” writes OpenAI on its FAQ page. “It has limited knowledge of [the] world and events after 2021 and may also occasionally produce harmful instructions or biased content.” Users can give feedback on the model’s output by clicking on thumbs-up or thumbs-down buttons that appear alongside each chat response.

ChatGPT is a fine-tuned version of GPT-3.5 (a series of LLMs trained on ‘a blend of text and code from before Q4, 2021’). A key difference between GPT-3.5 and ChatGPT is that the latter was fine-tuned with feedback from human trainers. These trainers provided inputs to guide the model and shape it so that the chat bot could handle dialogue more capably. ChatGPT has been trained to decline inappropriate requests, according to OpenAI.

During the current beta test, the firm – which is based in San Francisco, US – is reviewing conversations between users and ChatGPT to improve the model. The review also helps OpenAI check that content complies with its policies and safety requirements, which is a good opportunity to bring the ChatGPT threat model into our discussion.

What can go wrong?

Threat modeling, as the working group of the Threat Modeling Manifesto advises, is a good idea for anyone concerned about their system’s privacy, safety, and security. In a nutshell, the process boils down to answering four questions:

  1. What are we working on?
  2. What can go wrong?
  3. What are we going to do about it?
  4. Did we do a good enough job?
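
To make the four questions concrete, here is a minimal sketch of how a team might record its answers for a conversational AI system. The class names, fields, and the sample entries are illustrative assumptions, not part of the Threat Modeling Manifesto or any OpenAI tooling; the sample threats and mitigations simply echo points raised in this article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Threat:
    """One answer to question 2, plus its answers to questions 3 and 4."""
    description: str          # what can go wrong?
    mitigation: str = ""      # what are we going to do about it? ("" = nothing yet)
    verified: bool = False    # did we do a good enough job on this threat?


@dataclass
class ThreatModel:
    """Hypothetical container structured around the four questions."""
    system: str                                        # question 1
    threats: List[Threat] = field(default_factory=list)  # question 2

    def unaddressed(self) -> List[Threat]:
        # Threats that still have no answer to question 3.
        return [t for t in self.threats if not t.mitigation]

    def good_enough(self) -> bool:
        # Question 4: every listed threat is mitigated and verified.
        return bool(self.threats) and all(
            t.mitigation and t.verified for t in self.threats
        )


# Illustrative entries only, based on concerns mentioned in this article.
model = ThreatModel(
    system="Conversational AI chat bot based on a large language model",
    threats=[
        Threat("Chat bot helps develop or correct exploit code",
               mitigation="Train the model to decline inappropriate requests"),
        Threat("Chat bot drafts plausible phishing emails",
               mitigation="Tag suspicious user queries for review"),
        Threat("Lower barrier to entry for unskilled attackers"),
    ],
)

for threat in model.unaddressed():
    print("Still open:", threat.description)
print("Good enough?", model.good_enough())
```

Keeping the mitigation and verification status next to each threat means questions three and four are answered per threat, rather than once for the whole system.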

Thanks to the curiosity of millions of users, the answer to the first question becomes clearer each day as the capabilities of ChatGPT come to light. This leads us to question two and to the findings of a range of cybersecurity firms. Security researchers have been busy exploring the threat landscape associated with conversational AI.

The ability of OpenAI’s chat bot to decipher and explain code to users is a fantastic educational tool. And the same goes for being able to suggest code snippets in response to queries. But what about bad actors? Is ChatGPT capable of recommending malware or other harmful code?

“It can be used positively within an enterprise’s security and development workflow, which increases the defense capabilities above the current (existing) security standards,” comment analysts from Naoris Protocol, a cybersecurity firm. “However, bad actors can increase the attack vector, working smarter and a lot quicker by instructing AI to look for exploits in well-established code and systems.”

Virtual bad actors

The concept of chat bots becoming virtual bad actors, albeit unknowingly, is a common concern. “While ChatGPT is obviously a useful tool to educate, it can also be a useful tool in developing attacks,” notes Damian Archer of Trustwave’s SpiderLabs security team. “Once an attacker has found a vulnerability, ChatGPT can be used to help develop and correct exploits.”

Part of the problem is exactly what makes OpenAI’s chat bot so compelling – the ability to have a meaningful dialogue between human and machine. Users can ask the model for more information, or make corrections to their requests, refining the output that’s provided. Many people who’ve put ChatGPT through its paces will agree that its user experience improves on Google Search, where the information a user wants is often several clicks away, although the search engine does display popular responses directly on its results page.

ChatGPT doesn’t necessarily bring any new information to the table; OpenAI has made clear that the current model is based on data up to Q4, 2021. But the way that the conversational AI can access, make sense of, and serve up those details to users is impressive. Potentially, one of the downsides of making code easier to understand and deploy is that you lower the barrier for adversaries to launch an attack. But it’s worth keeping in mind that – if you know where to look – it’s already possible to purchase malware-as-a-service.

Cybercriminals are already only too happy to help unskilled bad actors, for a fee, so one could argue that ChatGPT may not have shifted things as far as some suggest. But making information more widely available still has the potential to stir up trouble.

Check Point Research – another cybersecurity firm that’s been examining the ChatGPT threat model – has shown that the chat bot could be used by adversaries to create plausible phishing emails. However, the screenshots posted suggest that OpenAI has tagged the user queries as suspicious, which points to some capability for detecting when ChatGPT is being led astray.

Business model brainstorming

There’s also the question of whether ChatGPT will be the victim of its own success. Running the world’s most popular chat bot doesn’t come cheap, as Sam Altman – OpenAI’s CEO – mentioned recently on Twitter. OpenAI has a financial cushion thanks to its partnership with Microsoft. But that money won’t last forever, so make use of that free access while you can. And keep sharing those security insights responsibly.