In recent times, ChatGPT has been praised as a powerful ally for business growth and efficiency, with many organisations using it to streamline operations, extract data and write copy. However, as businesses increase their use of the chatbot, concerns have grown over how the platform handles private data and the unknown risks it could pose to users and clients.

In light of these privacy concerns, a recent investigation revealed that, alarmingly, 11% of the data employees paste into ChatGPT is confidential. In response, several large companies, including JP Morgan, Amazon, and Accenture, have restricted their employees' use of ChatGPT.

What is ChatGPT?

Launched in November 2022, ChatGPT is an artificial intelligence chatbot developed by OpenAI. Its main purpose is to enable users to engage in human-like conversations and receive assistance with tasks when prompted.

As a large language model, the chatbot was trained using both supervised and reinforcement learning techniques, enabling it to provide detailed responses when instructed by its user. The model is trained on large datasets of text originating from a variety of credible sources, such as books, journals, articles, and Wikipedia.

In essence, the chatbot works by learning patterns from its training data and reproducing them when prompted, in a way loosely similar to how humans learn and communicate.

Since its release, the language model has been recognised for its ability to enhance business efficiency by streamlining communications, composing copy and emails, writing code and helping to answer customer queries.

What are the privacy concerns associated with ChatGPT?

With the chatbot being praised for its ability to boost business productivity and organisation, there have been increasing concerns over employees negligently submitting sensitive corporate data to it.

Employees submitting sensitive data

A recent investigation by the data security firm Cyberhaven revealed the extent to which employees were pasting private data into ChatGPT. The firm also provided two examples of employees wrongly inputting this type of data.

One instance involved a doctor who submitted their patient's name and medical condition into the chatbot and instructed it to craft a letter to the patient's insurance company. In another, an employee pasted their firm's 2023 strategy document into ChatGPT and asked it to create a PowerPoint deck.

These examples highlight two risks. First, they represent a potential breach of employee confidentiality agreements, which are typically agreed between an employer and an employee or contractor at the start of employment. Second, sharing this private data could result in large-scale leaks of proprietary information belonging to companies and individuals, which could ultimately surface in the public domain.

The nature of generative AI models

Given the very nature of LLMs such as ChatGPT, there has been considerable speculation about how these models might incorporate submitted data into their systems and how that information could later be retrieved, particularly if appropriate data protection and security measures are not in place for the service.

In light of recent instances of employees submitting sensitive data, data breach and security professionals have raised substantial concerns that this information may be ingested into the models as training data and could then resurface when prompted by the right queries.

In the case of the doctor who submitted their patient's personal information, that information could hypothetically resurface in later responses due to the nature of LLMs, posing a potential GDPR breach.

Vagueness surrounding AI

As AI chatbots continue to gain popularity, many employees are getting swept up in the trend and adopting these tools to boost business productivity. A concerning issue arises, however, when well-intentioned staff members unwittingly feed sensitive and confidential data into these chatbots.

The issue at hand is that a significant proportion of users lack a clear understanding of how LLMs work: the models consume data for training purposes and may return that data to others if prompted by a specific query. In addition, these tools offer few cautionary measures or alerts, so users often remain unaware that they are submitting private data.

What is ChatGPT's Privacy Policy?

Beyond concerns about employees inadvertently submitting private data, there has also been controversy surrounding ChatGPT's privacy policy and the amount of data the service can collect on its users. The policy states that OpenAI collects the following:

Data on the user's interactions with the site, including the type of content users engage with, the features they use and the actions they take.

The user's IP address.

The user's browser type and settings.

The user's browsing activities over time and across websites.

OpenAI also states that it may share users' personal information with unspecified third parties, without informing them beforehand, in order to meet its business objectives.

The Need for Greater AI Regulation

On 20th March 2023, ChatGPT suffered a large-scale data breach that made some users' conversation histories, as well as billing and payment data, visible to other users. While the company says it has now fixed the problem, GDPR is likely to have been breached.

In response to incidents like these, governments worldwide are investigating the effects of such occurrences and devising ways to regulate them in order to safeguard user data. The UK is also in the process of establishing a task force to evaluate the implications of large language models for society, and is consulting on a pro-innovation approach to AI regulation.

Ambiguity Over ChatGPT's Compliance

Privacy advocates and security specialists have also raised concerns over the ambiguity of how natural language processing tools like ChatGPT receive and store data, casting doubt on whether these models are GDPR compliant.

Under GDPR, individuals have the right to request that their personal information be removed from an organisation's records. This is known as the "right to be forgotten", and it gives individuals more control over their sensitive data.

However, because natural language processing tools like ChatGPT ingest data for training purposes, sensitive data could potentially resurface when prompted by the right queries. For these reasons, there is a lack of clarity over whether ChatGPT is compliant with GDPR.

How Businesses Can Stay GDPR Compliant with ChatGPT

1.   Assume that anything you enter could later be accessible in the public domain

Don't place any sensitive or confidential information in the chatbot. This includes corporate secrets and personal identifiers such as names, addresses and medical records.

2.   Don't input software code or internal data

ChatGPT could potentially ingest this code and data as training material and then, when given the right prompt, include it in responses to other people's instructions. For technically minded teams, a simple illustrative pre-submission check is sketched after this list.

3.   Revise confidentiality agreements to include the use of AI

Review and amend any legally binding documents that clients are required to sign, to ensure they explicitly address the sharing of private data with AI tools such as chatbots.

4.   Create an explicit clause in employee contracts

Review and alter any employee contracts to ensure there is a specific clause relating to the sharing of sensitive data and the protection of the company's IP rights when using AI.

5.   Hold sufficient company training on the use of AI

As AI is used more frequently in the corporate sphere, it is important to make training a priority. During this training, you should make clear what constitutes private and confidential data and give examples. You should also explain the legal consequences of sharing sensitive data.

6.   Create a company policy and an employee user guide

By crafting policies and procedures dedicated to the handling and appropriate use of AI, you can give employees confidence in what is and is not acceptable use, reducing the likelihood of sensitive data being leaked.
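To support points 1 and 2 above, some organisations also add lightweight technical checks alongside policy. The following is a minimal, hypothetical sketch in Python of a pre-submission filter that flags obviously sensitive patterns (an email address, a UK-style phone number, an API-style secret) before text is pasted into a chatbot; the pattern list and function names are illustrative assumptions, not an exhaustive or production-ready safeguard.

```python
import re

# Illustrative patterns only - a real deployment would need a broader,
# organisation-specific list (client names, case references, NI numbers, etc.).
SENSITIVE_PATTERNS = {
    "email address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "UK phone number": re.compile(r"\b(?:\+44\s?|0)\d{4}\s?\d{6}\b"),
    "API-style secret": re.compile(r"\b(?:sk|api|key)[-_][A-Za-z0-9]{16,}\b", re.IGNORECASE),
}


def flag_sensitive_text(text: str) -> list[str]:
    """Return warnings describing likely-sensitive content found in the text."""
    warnings = []
    for label, pattern in SENSITIVE_PATTERNS.items():
        if pattern.search(text):
            warnings.append(f"Possible {label} detected - remove it before submitting.")
    return warnings


if __name__ == "__main__":
    draft = "Please write to jane.doe@example.com about her insurance claim."
    for warning in flag_sensitive_text(draft):
        print(warning)  # prompt the employee to redact before pasting into a chatbot
```

A check like this cannot catch everything, so it should complement, not replace, the training and policies described above.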

For more information, visit Hayes Connor.