Understanding Multi-Active Satellites and Dependent Child Keys in Data Vault

Data Vault is known for its highly structured approach to data warehousing, built on Hubs, Links, and Satellites to capture data lineage, maintain historical accuracy, and ensure scalability. However, specific data scenarios, such as handling different data granularities, often raise questions about multi-active satellites and dependent child keys. This article breaks down these concepts and clarifies their differences and use cases in a Data Vault environment.

What is a Multi-Active Satellite?

A multi-active satellite is designed to manage multiple records for a single business object that are active simultaneously. This scenario arises when a business object, like a customer, can have several active data entries at the same time. For example, a customer could have multiple addresses (home and work), both of which are valid at the same time.

In a typical satellite structure, a business key (e.g., customer ID) combined with a load date timestamp defines the primary key. When several records are active at once, this primary key is insufficient because it cannot distinguish one active instance from another: both arrive with the same business key and the same load date. Instead, an additional attribute, such as an address type (home or work), is added to the primary key to differentiate each record. This approach allows the satellite to hold multiple entries for the same business key without key collisions and captures the data at its finer grain.
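To make the key problem concrete, here is a minimal sketch in plain Python. It does not model a real satellite table; it only indexes rows by a chosen set of key columns. The city values and the `index_rows` helper are hypothetical and exist purely for illustration.

```python
from datetime import datetime

# Hypothetical rows delivered in a single load for customer C123.
load_ts = datetime(2024, 1, 1)
rows = [
    {"customer_id": "C123", "load_date": load_ts, "address_type": "home", "city": "Hanover"},
    {"customer_id": "C123", "load_date": load_ts, "address_type": "work", "city": "Berlin"},
]


def index_rows(rows, key_columns):
    """Index rows by the given key columns; a later row with the same key overwrites an earlier one."""
    indexed = {}
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        indexed[key] = row
    return indexed


# Standard satellite key: business key + load date.
standard = index_rows(rows, ["customer_id", "load_date"])
# Multi-active key: business key + load date + address type.
multi_active = index_rows(rows, ["customer_id", "load_date", "address_type"])

print(len(standard))      # 1: both addresses collide on the same key
print(len(multi_active))  # 2: the address type keeps them apart
```

With the two-part key, one of the addresses is silently lost; with the address type included, both survive.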

Example of a Multi-Active Satellite

Let’s say our source system has a customer with ID C123 who has two active addresses: one for home and one for work. A standard satellite holds only one record per business key and load date. In a multi-active satellite, we store both addresses simultaneously by using an additional identifier (e.g., “address type”) in the primary key:

  • Customer ID: C123
  • Load Date: Timestamp of data load
  • Additional Identifier: Address type (e.g., home, work)

This approach allows multiple entries for a single business object (in this case, customer C123) while maintaining unique records in the satellite table.
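The structure above could be sketched as a table with a three-part primary key. The example below uses Python's built-in `sqlite3` module with an in-memory database; the table name `sat_customer_address`, the `street` column, and the sample values are assumptions for illustration, and a production Data Vault would usually key the satellite on a hash key from the parent hub rather than the raw customer ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical multi-active satellite: address_type is part of the primary key.
conn.execute(
    """
    CREATE TABLE sat_customer_address (
        customer_id  TEXT NOT NULL,
        load_date    TEXT NOT NULL,
        address_type TEXT NOT NULL,
        street       TEXT,
        PRIMARY KEY (customer_id, load_date, address_type)
    )
    """
)

load_date = "2024-01-01T00:00:00"
conn.executemany(
    "INSERT INTO sat_customer_address VALUES (?, ?, ?, ?)",
    [
        ("C123", load_date, "home", "1 Example Street"),
        ("C123", load_date, "work", "2 Sample Avenue"),
    ],
)

# Both addresses are stored for the same customer and load date.
active = conn.execute(
    "SELECT address_type FROM sat_customer_address "
    "WHERE customer_id = ? AND load_date = ? ORDER BY address_type",
    ("C123", load_date),
).fetchall()
print(active)  # [('home',), ('work',)]

# A second 'home' row in the same load violates the primary key.
try:
    conn.execute(
        "INSERT INTO sat_customer_address VALUES (?, ?, ?, ?)",
        ("C123", load_date, "home", "3 Other Road"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```

The primary key admits one row per address type within a load, which is exactly the uniqueness the satellite needs.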

What is a Dependent Child Key?

A dependent child key is used to manage relationships between multiple business objects at a finer granularity level than a standard Data Vault link would allow. Dependent child keys are typically applied in links where we need to track multiple occurrences of a relationship between business objects, such as an order and its line items.

Consider an order containing multiple line items, where each item references a product. Here, the dependent child key (like line item number) uniquely identifies each relationship instance, as it provides additional detail beyond just the order and product identifiers. This allows multiple rows in the link for the same business objects while maintaining unique records.

Example of a Dependent Child Key

Imagine we have an order O123 for a customer C123, which includes two line items for the same product but with different prices or quantities:

  • Order ID: O123
  • Customer ID: C123
  • Product ID: P123
  • Dependent Child Key: Line item number (e.g., 1, 2)

In this case, we create unique rows for each line item, where the line item number differentiates each record. This approach ensures that each entry is stored and tracked individually.
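As a rough sketch of why the line item number matters, the snippet below derives a link key from the participating identifiers, once without and once with the dependent child key. Hashing the delimited key parts is a common Data Vault 2.0 convention, but the `link_hash_key` helper, the `||` delimiter, and the choice of SHA-256 are assumptions made for this illustration only.

```python
import hashlib


def link_hash_key(*parts):
    """Hash the normalized, delimited key parts (hypothetical helper)."""
    normalized = "||".join(str(part).strip().upper() for part in parts)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Two line items of order O123 that reference the same product.
line_items = [
    {"order_id": "O123", "customer_id": "C123", "product_id": "P123", "line_item_no": 1},
    {"order_id": "O123", "customer_id": "C123", "product_id": "P123", "line_item_no": 2},
]

# Without the dependent child key, both line items map to the same link row.
without_child_key = {
    link_hash_key(item["order_id"], item["customer_id"], item["product_id"])
    for item in line_items
}

# With the line item number included, each line item gets its own link row.
with_child_key = {
    link_hash_key(
        item["order_id"], item["customer_id"], item["product_id"], item["line_item_no"]
    )
    for item in line_items
}

print(len(without_child_key))  # 1
print(len(with_child_key))     # 2
```

Without the line item number, the second line item would be indistinguishable from the first at the link level.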

Key Differences Between Multi-Active Satellites and Dependent Child Keys

Although multi-active satellites and dependent child keys both enable handling of finer data granularity, they serve different purposes and are used in distinct contexts:

  1. Multi-Active Satellites
    Applied within a single business object to handle multiple active records at the same time. The additional identifier helps capture simultaneous entries for the same object in a satellite.
  2. Dependent Child Keys
    Used in links between multiple business objects, where the additional key captures the finer detail of each relationship instance, such as line items in an order.

When to Use Each Approach

The choice between using a multi-active satellite or a dependent child key depends on the data granularity and relationships in your data model:

  • Use Multi-Active Satellites when handling multiple active records for a single business object, where each entry is related only to the primary business key (e.g., customer with multiple addresses).
  • Use Dependent Child Keys when tracking detailed relationships between different business objects that require additional identifiers to maintain uniqueness (e.g., order and line items).

Summary

Multi-active satellites and dependent child keys provide solutions for storing data with complex granularities in Data Vault models. While multi-active satellites allow multiple simultaneous records for a single business object, dependent child keys enable unique identification of complex relationships in links. Both approaches maintain Data Vault’s principles of scalability and data integrity by preserving unique records and enabling detailed tracking of business data.

In short:

  • Multi-Active Satellite: For multiple records active simultaneously within a single business object.
  • Dependent Child Key: For relationships across multiple business objects that need finer detail, typically in links.

Meet the Speaker

Marc Winkelmann

Marc works in Business Intelligence and Enterprise Data Warehousing (EDW), with a focus on Data Vault 2.0 implementation and coaching. Since 2016, he has been consulting on and implementing Data Vault 2.0 solutions for industry leaders in manufacturing, energy supply, and facility management. In 2020, he was appointed a Data Vault 2.0 instructor for Scalefree.
