The GDPR is not intended as a compliance overhead for controllers and processors. It is intended to bring higher and consistent standards and processes for the secure treatment of personal data. It’s fundamentally intended to protect the privacy rights of individuals.

This cannot be more true than in emerging data science, analytics, AI, and ML environments, where due to the nature of vast amounts of data sources, there is a higher risk of identifying the personal and sensitive information of an individual.

So how can organizations leverage GDPR Legitimate Interests Processing for data science?

GDPR Legitimate Interests Processing Requirements

The GDPR requires that personal data be collected for “specified, explicit and legitimate purposes” and also that a data controller must define a separate legal basis for each and every purpose for which, e.g., customer data is used.

If a bank customer took out a bank loan, then the bank can only use the collected account data and transactional data for managing and processing that customer to fulfill its obligations for offering a bank loan. This is colloquially referred to as the “primary purpose” for which the data is collected.

If the bank now wanted to re-use this data for any other purpose incompatible with or beyond the scope of the primary purpose, then this is referred to as a “secondary purpose” and will require a separate legal basis for each and every such secondary purpose.

For the avoidance of any doubt, if the bank wanted to use that customer’s data for profiling in a data science environment, then under GDPR the bank is required to document a legal basis for each and every separate purpose for which it stores and processes this customer’s data.

So, for example, a ‘cross sell and up sell’ is one purpose, while ‘customer segmentation’ is another and separate purpose.

If relied upon as the lawful basis, consent must be freely given, specific, informed, and unambiguous, and an additional condition, such as explicit consent, is required when processing special categories of personal data, as described in GDPR Article 9.

Additionally, in this example, the Loan division of the bank cannot share data with its credit card or mortgage divisions without the informed consent of the customer.

We should not get confused with a further and separate legal basis the bank has which is processing necessary for compliance with a legal obligation to which the controller is subject (AML, Fraud, Risk, KYC, etc.).

Selecting a Legal Basis for Secondary Purpose Processing in Data Science

The challenge arises when selecting a legal basis for secondary purpose processing in a data science environment as this needs to be a separate and specific legal basis for each and every purpose. 

It quickly becomes an impractical exercise for the bank, let alone annoying to its customers, to attempt obtaining consent for each and every single purpose in a data science use case. Evidence shows anyway a very low level of positive consent using this approach.

Consent management under GDPR is also tightening up. No more will blackmail clauses or general and ambiguous consent clauses be deemed acceptable.

GDPR offers controllers a more practical and flexible legal basis for exactly these scenarios and encourages controllers to raise their standards towards protecting the privacy of their customers especially in data science environments.

Legitimate interests processing (LIP) is an often misunderstood legal basis under GDPR. This is in part because reliance on LIP may entail the use of additional technical and organizational controls to mitigate the possible impact or the risk of a given data processing on an individual.

Depending on the processing involved, the sensitivity of the data, and the intended purpose, traditional tactical data security solutions such as encryption and hashing methods may not go far enough to mitigate the risk to individuals for the LIP balancing test to come out in favor of the controller’s identified legitimate interest.

If approached correctly, GDPR LIP can provide a framework with defined technical and organizational controls to support controllers’ use of customer data in data science, analytics, AI and ML applications legally.

Without it, controllers may be more exposed to possible non-compliance with GDPR and the risks of legal actions as we are seeing in many high profile privacy-related lawsuits.

Legitimate Interests Processing is the most flexible lawful basis for secondary purpose processing of customer data, especially in data science use cases. But you cannot assume it will always be the most appropriate.

It is likely to be most appropriate where you use an individual’s data in ways they would reasonably expect and which have a minimal privacy impact, or where there is a compelling justification for the processing.

The Responsibilities of GDPR Legitimate Interest Processing

If you choose to rely on GDPR LIP, you are taking on extra responsibility not only for, where needed, implementing technical and organizational controls to support and defend LIP compliance, but also for demonstrating the ethical and proper use of your customer’s data while fully respecting and protecting their privacy rights and interests.

This extra responsibility may include implementing enterprise class, fit for purpose systems and processes (not just paper-based processes).

Automation based privacy solutions such as CryptoNumerics CN-Protect that offer a systems-based (Privacy by Design) risk assessment and scoring capability that detects the risk of re-identification, integrated privacy protection that still retains the analytical value of the data in data science while protecting the identity and privacy of the data subject are available today as examples of demonstrating technical and organizational controls to support LIP.  

Data controllers need to initially perform the GDPR three-part test to validate using LIP as a valid legal basis. You need to:

    • identify a legitimate interest;
    • show that the processing is necessary to achieve it; and
    • balance it against the individual’s interests, rights and freedoms.

The legitimate interests can be your own interests (controllers) or the interests of third parties (processors). They can include commercial interests (marketing), individual interests (risk assessments) or broader societal benefits.

The processing must be necessary. If you can reasonably achieve the same result in another less intrusive way, legitimate interests will not apply.

You must balance your interests against the individual’s. If they would not reasonably expect the processing, or if it would cause unjustified harm, their interests are likely to override your legitimate interests.

Conducting such assessments for accountability purposes is happily now also easier than ever, such as with TrustArc’s Legitimate Interests Assessment (LIA) and Balancing Test that identifies the benefits and risks of data processing.

Assign numerical values to both sides of the scale and uses conditional logic and back-end calculations to generate a full report on the use of legitimate interests at the business process level.

What are the Benefits of Choosing Legitimate Interest Processing?

Because this basis is particularly flexible, it may be applicable in a wide range of different situations such as data science applications. It can also give you more on-going control over your long-term processing than consent, where an individual could withdraw their consent at any time.

Although remember that you still have to consider managing marketing opt outs independently of whatever legal basis you’re using to store and process customer data. 

It also promotes a risk-based approach to data compliance as you need to think about the impact of your processing on individuals, which can help you identify risks and take appropriate safeguards.

This can also support your obligation to ensure “data protection by design,” performing risk assessments for re-identification and demonstrating privacy controls applied to balance out privacy with the demand for retaining analytical value of the data in data science environments.

In turn, it would contribute towards demonstrating your PIAs (Privacy Impact Assessments) which forms part of your DPIA (Data Protection Impact Assessment) requirements and obligations.

LIP as a legal basis, if implemented correctly and supported by the correct organizational and technical controls, also provides the platform to support data collaboration and data sharing.

However, you may need to demonstrate that the data has been sufficiently de-identified, including by showing that the risk assessments for re-identification are performed not just on direct identifiers but also on all indirect identifiers as well. 

Using LIP as a legal basis for processing may help you avoid bombarding people with unnecessary and unwelcome consent requests and can help avoid “consent fatigue.”

It can also, if done properly, be an effective way of protecting the individual’s interests, especially when combined with clear privacy information and an upfront and continuing right to object to such processing.

Lastly, using LIP not only gives you a legal framework to perform data science it also provides a platform that demonstrates the proper and ethical use of customer data, a topic and business objective of most boards of directors.