In today’s world you may be wondering whether what you are about to read is human-written, AI-written, or maybe a bit of both. And you should be.
AI and, more specifically, large language models (LLMs) have taken over the collective consciousness of the tech world, and not just that of IT professionals. The topic is complex, not least because this relatively new phenomenon is so intertwined with existing data privacy and data protection frameworks.
An LLM is a type of artificial intelligence (AI) algorithm that uses deep learning techniques and large data sets to understand, summarize, generate, and predict new content. Initial opinions about LLMs range from a “giant exercise in statistics” to calls for abandoning this technology altogether and claims that it “hacked the operating system of human civilization.”
There are many risk considerations, but I’d suggest the first area to focus on is what data is fed to LLMs.
These models can be used to collect, store, and process data, including personal data and other data deemed confidential, at an unprecedented scale and level of detail. Who is responsible for the legitimacy of data used to train generative AI products? How can we make sure that the AI is trained only on data of a certain quality, and that we do not end up in litigation or fined over using LLMs trained on proprietary data or on personal data lacking the mandatory legal basis for processing? The saying ‘garbage in, garbage out’ has never been more true.
A second area of consideration is what, exactly, happens with data within the LLM. Are humans able to understand the intricacies of the processing operations taking place within this black box? How does the recurring tokenization, embedding, attention weighting, and completion of data take place that ends up rendering specific results? How do we prevent or consistently categorize problematic content such as so-called AI hallucinations, observed biases, the growing number of deepfakes, or even outright discrimination? Are we bound to be only reactive in response to such incidents? And can LLMs prevent, or be used to prevent, the issues we are witnessing?
Last but not least, consider what the LLMs will be used for. Are there any existing restrictions on the use of AI, LLMs, and automated decision-making? For what other human or AI-defined purposes will the data be used? From the perspective of data flows, who or what will gain access to the data and results generated by LLMs? Should you believe vendors claiming that AI embedded in their latest solution can help reduce the organization's attack surface or better manage cybersecurity risks? How will your organization be able to address privacy-related obligations when using such tools, for example, in response to data subject requests for access or deletion of data?
This is just a subjective list of considerations that need to be discussed and debated by anyone intending to use AI in their organization.
Can the answers be found in the existing legislative framework?
There is already a complex geography of data protection regulation across Europe – and further afield – and more is yet to come.
Jakub Lewandowski is Global Data Governance Officer at Commvault.
GDPR – The Granddaddy of Data Regulation
But first, let’s start with GDPR. With its fifth anniversary this year, the General Data Protection Regulation (GDPR) already feels like the granddaddy of data regulation in the modern age. That’s not to say it is outdated, though.
GDPR has had a significant impact on data protection in both the EU and the UK (UK GDPR). Rapid technological developments are proving that GDPR is standing up to the test of time, even if it was never meant as an exhaustive regulation of the AI field. Its flexible definitions of processing activities leave no doubt that LLMs are subject to GDPR.
One of its gold standards is the right not to be subject to decisions which produce legal effects concerning the individual or that “similarly significantly” affect the individual, based solely on automated processing. This is an important safety valve: a decision of this nature may be supported by an LLM, but the final say, with limited exceptions, requires human judgement.
GDPR also defines a useful and powerful mechanism of data protection impact assessments, through which organizations need to consider risks to the rights and freedoms of individuals affected by the use of LLMs. Organizations can also seek advice from supervisory authorities, although not many do. An interesting consequence of this is that some supervisory authorities, for instance CNIL in France, are already positioning themselves as regulators and enforcers under yet-to-be-enacted AI legislation. Similar processes can be observed in businesses. Experience gained by privacy professionals during the implementation of GDPR now naturally predisposes the very same people to try to tackle challenges connected with AI.
GDPR contains built-in mechanisms for evaluating its effectiveness, and a planned review of the legislation, which will include a special report prepared by the European Commission, is due next year.
Other areas to watch in the upcoming months include the first deliverables from a newly launched special EU taskforce on ChatGPT and, most importantly, the AI Act, work on which is taking shape, bolstered by the high-profile success of LLMs. In mid-June 2023, the European Parliament adopted a common negotiating position, which will now be subject to the trilogue procedure between the EU’s various bodies.
The UK Data Protection and Digital Information Bill (DPDI Bill)
The DPDI Bill is currently under review by the House of Commons, which means that the provisions described herein may still change. It aims to reduce the administrative burden placed on businesses, promote international trade, and reduce record keeping.
The DPDI Bill is significantly more comprehensive on automated decision-making, in particular regarding the applicable safeguards, including the right to be informed about such decisions, to obtain human intervention in relation to such decisions, and to contest such decisions. In essence though, it does not depart much from GDPR’s approach.
This spring, the UK government also published an AI white paper to guide the use of AI in the UK. It acknowledges that organisations can be held back from using AI to its full potential because a patchwork of legal regimes causes confusion and imposes financial and administrative burdens on businesses trying to comply with the rules. I believe the response to that will be more regulation, so watch this space.
2023 - The start of another legislative offensive
Inevitably, data privacy and AI regulations will overlap to a certain degree. Regulators and legislators are in the midst of a legislative offensive. If you are now considering whether to jump on the AI bandwagon, it may be worth taking a moment while we all get a better understanding of both the risks and legal requirements to come.
And in case you are still wondering – this text was generated by a human.