In recent weeks, the most recognizable names in generative AI have come under the microscope in lawsuits contesting their widespread use of copyrighted creative work without consent or compensation. These cases have also sparked broader concerns about AI and data privacy.
According to a recent Washington Post article, George R.R. Martin and Jodi Picoult are among the writers represented in a lawsuit led by the Authors Guild against OpenAI. The lawsuit claims that “at the heart of these algorithms is systematic theft on a mass scale.” Spokespeople for these companies argue that training their systems on virtually everything available on the internet qualifies as fair use.
Even if you aren’t a famous author or creative, these cases highlight the importance of understanding how generative AI systems are trained while the legal system sprints to catch up.
It will take time to establish legal precedents for how AI models may be trained and how transparent companies must be with users about the process. For now, details about what OpenAI, Google, and others do with the information you feed them remain vague.
Generative AI models like ChatGPT and Bard save user input by default and use it to train their systems so they can continue learning. After a bug leaked ChatGPT users’ chat histories and sensitive information in March 2023, the companies behind both models made some changes.
Now, you have the option to disable your chat history, opt out of training, and export the data you’ve shared with them. You can start this process by checking out Bard’s Privacy Help Hub or OpenAI’s brief article about how they use your data.
If you don’t want images, text, usage logs, and personal information stored and used to train these systems, you’ll have to adjust your settings and fill out forms to opt out. Even then, OpenAI retains your ChatGPT content internally for 30 days, and its team may manually review it to check for abuse, according to Ars Technica.
OpenAI’s FAQ page for data controls mentions plans to release a ChatGPT Business subscription aimed at users who want more control over their data. Its key feature will be that “end user’s data won’t be used to train our models by default.” The planned release date, however, remains unclear.
While there are many ways to use generative AI models safely to optimize your processes, it’s vital to be cautious about what information you input while using them. This is especially important if your company handles financial, health, or other private personal data. A well-developed data protection policy and an incident action plan mitigate potential risks and help prevent unauthorized access and data breaches.
If an HR professional asks a public AI system like ChatGPT or Bard to draft a response to an employee who requested medical accommodations, that information is out of your control. Even if the HR professional has disabled chat history, that sensitive information could still be seen by an unintended third party (e.g., the AI developer’s review team).
Always steer clear of adding any sensitive or personally identifiable information to prompts, and educate your team. According to Reuters, “Without proper training and policies on the use of AI systems, employers could find themselves making disclosures in violation of state and/or federal privacy laws.”
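One practical safeguard is to screen prompts before they ever leave your organization. Below is a minimal Python sketch of that idea, assuming a simple regex-based filter; the patterns and the redact helper are illustrative stand-ins, not a vetted PII-detection tool, and a real deployment should rely on a reviewed library and your own data protection policy.

```python
import re

# Minimal sketch of a pre-submission PII filter (hypothetical example).
# These patterns are illustrative, not exhaustive; a real deployment
# should use a vetted PII-detection library and follow your own
# data protection policy.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\(?\b\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(prompt: str) -> str:
    """Replace anything matching a known pattern with a [LABEL] placeholder."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

if __name__ == "__main__":
    raw = ("Draft a reply to jane.doe@example.com about the medical "
           "accommodation she requested; her SSN is 123-45-6789.")
    print(redact(raw))
    # Output: Draft a reply to [EMAIL] about the medical accommodation
    # she requested; her SSN is [SSN].
```

A filter like this won’t catch everything, but running it before every prompt, alongside employee training, reduces the chance that sensitive details end up in a third party’s logs.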
Ultimately, generative AI and data privacy are no longer topics that businesses can ignore. Your organization is responsible for protecting the personal data your clients and employees share with you.