Can chatbots remember what you type? NCSC issues warning

If chatbots can remember what you type, then ChatGPT might be hoovering up your secrets. Be careful about inputting sensitive info, warns the NCSC.
14 March 2023

Keyboard concerns: chatbots may know more than you wanted them to. Image credit: Shutterstock Generate.


Being careful with whom you share your secrets is always good advice. And that tip applies to advanced chatbots such as OpenAI’s ChatGPT. In fairness, OpenAI does provide user guidelines in its FAQs. “We are not able to delete specific prompts from your history,” writes the developer. “Please don’t share any sensitive information in your conversations.” The advice comes eighth on the list and, a few bullet points earlier, OpenAI mentions that it reviews conversations between ChatGPT and users to improve its systems and to ensure the content complies with the company’s policies and safety requirements. But this may be of little comfort if you’ve overlooked the possibility of chatbots remembering what you type.

The prospect of one of the fastest-growing consumer applications ever developed gobbling up user secrets has got security agencies concerned, including the UK’s National Cyber Security Centre (NCSC). Today, the NCSC issued a warning on the risks of advanced chatbots based on large language models (LLMs) such as ChatGPT. Advanced chatbots have grabbed the attention of firms as well as the public at large. ChatGPT can generate documents and other plausible business correspondence in seconds, but if the prompts used to generate that text would cause problems were they made public, then business users might want to think again.

Take great care, says NCSC

“Individuals and organisations should take great care with the data they choose to submit in prompts,” write the NCSC’s Technical Directors for Platform Research and Data Science – referred to as David C and Paul J, respectively – in a blog post published today. “You should ensure that those who want to experiment with LLMs are able to, but in a way that doesn’t place organisational data at risk.”
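For organisations that do want to let staff experiment, one practical reading of that advice is to sanitise prompts before they leave the building. The sketch below is a minimal, hypothetical illustration in Python of regex-based redaction; the patterns and the redact_prompt helper are illustrative assumptions, not part of the NCSC guidance or any OpenAI tooling, and a real deployment would need a far more thorough approach.

```python
import re

# Illustrative examples of strings an organisation might treat as sensitive.
# These patterns are assumptions for the sketch and are far from exhaustive.
SENSITIVE_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "UK_NINO": re.compile(r"\b[A-CEGHJ-PR-TW-Z]{2}\d{6}[A-D]\b", re.IGNORECASE),
    "API_KEY": re.compile(r"\b(?:sk|key)-[A-Za-z0-9]{16,}\b"),
}


def redact_prompt(text: str) -> str:
    """Replace obviously sensitive substrings with placeholders before the
    text is submitted to any external chatbot or LLM service."""
    for label, pattern in SENSITIVE_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label}]", text)
    return text


if __name__ == "__main__":
    prompt = (
        "Draft a reply to jane.doe@example.com about the leaked key "
        "sk-abc123def456ghi789."
    )
    # The sanitised text, not the original, is what would be sent onwards.
    print(redact_prompt(prompt))
```

The point of the design is simply that redaction happens on the organisation’s side of the boundary, before any prompt reaches a third-party service; which client library or API then carries the sanitised text is a separate choice.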

ChatGPT was launched in research preview mode towards the end of 2022, and reportedly already had 100 million users by January 2023, a 100x increase on the month before. Today, the number of users of public LLMs is likely to be even higher. Given the large number of sign-ups, plus the possibility that advanced chatbots remember what you type, the chances of sensitive information being shared are high – especially if the terms of use and privacy policy details haven’t stuck in users’ minds.

Issues may not surface immediately – a point made by David C and Paul J in the NCSC blog post warning of the potential risks of ChatGPT and LLMs. Advanced chatbots built using LLMs can take months of training and require tens of thousands of GPUs. OpenAI teamed up with Microsoft to create the Azure-hosted supercomputer used to determine the billions of model parameters that keep ChatGPT whirring away today. At busy times, users can be forced to wait if servers are at capacity. The LLM itself is relatively static and, in the case of OpenAI’s ChatGPT, doesn’t query the internet directly. But newer implementations such as Microsoft’s updated Bing search hint at a more dynamic future.

As the FAQs show, OpenAI doesn’t hide the fact that it can review conversations between users and its advanced chatbot. And it’s providing the service for free, which, as the saying goes, makes users the product. Or, more accurately, users are helping to build future LLMs, which justifies the high costs that OpenAI faces in providing AI-generated output to millions of users. Regular search results are cheap to deliver by comparison, and Microsoft has pumped billions of dollars of investment into OpenAI to take LLMs to the next level.

When ChatGPT launched, concerns were raised that the advanced chatbot, which was trained on large amounts of text scraped from the internet, could have digested swathes of copyrighted content along the way. And that text could end up as part of the results served up to users. Similarly, the prompts fed into models today by millions of daily users could one day appear in future LLMs, which is fine only if those keystrokes don’t put company details at risk or reveal thoughts that are best left private.

Can chatbots remember what you type?

Blake Lemoine – who shot to fame when he raised concerns that, as the Washington Post put it, Google’s AI had come to life – offers some insight into the memory that’s built into LLMs. Lemoine, who was part of Google’s responsible AI team and was tasked with putting the search giant’s LaMDA (Language Model for Dialogue Applications) to the test as part of the system’s development, prompted updated versions of the model over several years. And, speaking in interviews, Lemoine says that he could spot signs that the model had remembered conversations he’d had with LaMDA years earlier.

Google appears to have been cautious in its public rollout of LLMs, and whether that caution was well founded, time will tell. From a business perspective, shareholders were unimpressed with Google’s hasty launch of its Bard chatbot in response to ChatGPT. Advanced chatbots powered by LLMs are fascinating tools and have captivated millions of users worldwide, but they also give pause for thought – especially if chatbots remember what you type.