
GDPR compliance is essential for any organization that processes personal data. With the rise of data warehousing and modeling approaches like Data Vault 2.0 (DV 2.0), questions often arise about how to handle Personally Identifiable Information (PII) within these frameworks. This article addresses three common questions and provides practical recommendations for GDPR compliance in data warehouses.



Understanding the Challenge

GDPR mandates that personal data must be handled with the utmost care, ensuring individuals’ privacy and security. In the context of data warehousing, this often translates to managing business keys that might contain PII. Let’s dive into some specific questions raised around this topic:

  1. How should activity history be managed when the main hub contains a PII business key?
  2. Is it best practice to use hashed business keys in link tables to improve load performance?
  3. Should artificial keys originate from each business domain, and how should they be managed if not?

Question #1: Managing Activity History with PII Business Keys

The Problem

In a typical data warehouse model, customer records might include PII, such as social security numbers or tax IDs. According to GDPR, it’s crucial that activity history is not traceable back to the individual once they exercise their right to be forgotten.

The Solution

One effective approach is to split descriptive attributes into different satellites—one for personal data and another for non-personal data. This way, when a deletion request is made, only the personal satellite needs to be purged. The non-personal satellite can retain anonymized data, maintaining the integrity of the dataset while ensuring compliance.
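The satellite split can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are hypothetical and not taken from any particular Data Vault implementation; a real satellite would also carry load dates and record sources.

```python
import sqlite3

# Hypothetical schema: one satellite for personal data, one for non-personal data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sat_customer_personal (
        customer_hk TEXT, full_name TEXT, email TEXT
    );
    CREATE TABLE sat_customer_activity (
        customer_hk TEXT, order_count INTEGER, last_channel TEXT
    );
""")
conn.execute("INSERT INTO sat_customer_personal VALUES ('hk_001', 'Jane Doe', 'jane@example.com')")
conn.execute("INSERT INTO sat_customer_personal VALUES ('hk_002', 'John Roe', 'john@example.com')")
conn.execute("INSERT INTO sat_customer_activity VALUES ('hk_001', 12, 'web')")
conn.execute("INSERT INTO sat_customer_activity VALUES ('hk_002', 3, 'store')")


def forget_customer(connection, customer_hk):
    """Purge only the personal satellite rows for one customer."""
    cursor = connection.execute(
        "DELETE FROM sat_customer_personal WHERE customer_hk = ?", (customer_hk,)
    )
    connection.commit()
    return cursor.rowcount


deleted = forget_customer(conn, "hk_001")
personal_left = conn.execute("SELECT COUNT(*) FROM sat_customer_personal").fetchone()[0]
activity_left = conn.execute("SELECT COUNT(*) FROM sat_customer_activity").fetchone()[0]
print(deleted, personal_left, activity_left)
```

After the call, the activity satellite still holds both rows while the personal satellite has lost the forgotten customer's row. The sketch covers only the satellites; a hub whose business key is itself PII needs separate treatment.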


Question #2: Using Hashed Business Keys in Link Tables

The Problem

Hashing business keys is often recommended in DV 2.0 to improve load performance. However, directly using business keys in link tables can pose a challenge, especially when those keys contain PII.

The Solution

In DV 2.0, it is standard practice to use hashed values of business key components rather than the business keys themselves. This improves load performance and keeps the raw business key out of link tables. Here’s how it works:

  1. Hash the Business Key: Use a cryptographic hash function (e.g., SHA-256) to convert the business key into a hashed key.
  2. Use Hashed Keys in Link Tables: The hashed key then serves as the foreign key in link tables, ensuring that PII is not directly exposed.
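The two steps above can be sketched with Python's hashlib. The normalization (trimming and upper-casing), the `||` delimiter, the function name `hash_key`, and the sample keys are assumptions made for illustration; the article itself only specifies a cryptographic hash such as SHA-256.

```python
import hashlib

DELIMITER = "||"  # assumed separator between business key components


def hash_key(*components: str) -> str:
    """Return the SHA-256 hex digest of the normalized business key components."""
    normalized = DELIMITER.join(part.strip().upper() for part in components)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical business keys: a tax ID for the customer hub, a SKU for the product hub.
customer_hk = hash_key("123-45-6789")
product_hk = hash_key("sku-1001")

# The link's own hash key is computed over the business keys of both hubs.
link_hk = hash_key("123-45-6789", "sku-1001")

# The link row carries only hashed keys, never the raw business keys.
link_row = {"link_hk": link_hk, "customer_hk": customer_hk, "product_hk": product_hk}
```

Because the hash is deterministic, every load process can compute the same key independently without a lookup. The trade-off is that anyone who knows the original business key can recompute the hash.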

Question #3: Originating and Managing Artificial Keys

The Problem

There’s a debate on whether artificial keys should be generated within each business domain or within the data warehouse itself. This raises concerns about consistency and management, especially if the artificial key must be derived from PII.

The Solution

Artificial keys should ideally be generated within the data warehouse to maintain consistency and control. Here’s the process:

  1. Generate a UUID: Use a universally unique identifier (UUID) for the artificial key. This ensures randomness and reduces the risk of duplication.
  2. Link Artificial Keys to Business Keys: Establish a relationship between the artificial key and the business key within the data warehouse, ensuring that the artificial key is never exposed in operational systems.
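A minimal sketch of this process, assuming an in-memory mapping stands in for the table that relates artificial keys to business keys. The class name `KeyRegistry`, its method, and the sample business keys are hypothetical.

```python
import uuid


class KeyRegistry:
    """Hypothetical in-memory mapping from business keys to artificial keys."""

    def __init__(self):
        self._by_business_key = {}

    def artificial_key_for(self, business_key: str) -> str:
        # Reuse the existing key so repeated loads of the same record stay consistent.
        if business_key not in self._by_business_key:
            # uuid4 produces a random UUID that carries no information about the PII.
            self._by_business_key[business_key] = str(uuid.uuid4())
        return self._by_business_key[business_key]


registry = KeyRegistry()
first = registry.artificial_key_for("123-45-6789")
second = registry.artificial_key_for("123-45-6789")
other = registry.artificial_key_for("987-65-4321")
```

Here `first` and `second` are the same key because the business key was already registered, while `other` receives a fresh UUID. In a real warehouse the mapping would live in a persistent, access-controlled table rather than a Python dictionary.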

Handling Scenarios Without Artificial Keys

If the business domains do not supply artificial keys, the data warehouse should generate them upon ingestion. This method ensures that all keys are managed consistently and securely.


Ensuring Compliance and Security

Satellite Splitting

By splitting satellites into personal and non-personal data, organizations can easily manage deletion requests without compromising data integrity.

Cryptographic Hashing

Utilizing cryptographic hashing for business keys in link tables enhances both security and performance, crucial for maintaining GDPR compliance.

Artificial Keys Management

Generating artificial keys within the data warehouse ensures consistency and security, reducing the risk of PII exposure.

Regular audits and consultations with legal experts ensure ongoing compliance with GDPR and other regulations. Implementing these practices helps organizations stay ahead of potential compliance issues.


Conclusion

Handling PII in data warehouses requires careful planning and robust solutions. By implementing satellite splitting, cryptographic hashing, and consistent artificial key management, organizations can ensure GDPR compliance while maintaining data integrity and performance. Regular audits and legal advice further bolster these practices, ensuring that data handling processes remain secure and compliant with evolving regulations.

Meet the Speaker

Michael Olschimke

Michael has more than 15 years of experience in information technology. For the past eight years, he has specialized in business intelligence topics such as OLAP, dimensional modeling, and data mining. Challenge him with your questions!

Get Updates and Support

Please send inquiries and feature requests to [email protected]

For inquiries about Data Vault training and on-site training, please contact [email protected] or register at www.scalefree.com.

To support the creation of Visual Data Vault drawings in Microsoft Visio, a stencil has been implemented that can be used to draw Data Vault models. The stencil is available at www.visualdatavault.com.

