Skip to main content
search
0
Scalefree Knowledge Webinars Data Vault Friday Applying Soft-Deletes in Dimensions

Soft-Deletes in Dimensions

Data Vault practitioners often encounter the challenge of handling soft deletes in dimensional modeling. While source systems may hard delete records, a data warehouse must maintain historical integrity. This is where effectivity satellites and PIT (Point-in-Time) tables become crucial. Let’s explore how to apply soft deletes effectively in Data Vault dimensions.



Why Soft Deletes?

In a data warehouse, deleting a record outright can lead to inconsistencies, particularly when historical data or fact tables reference that entity. Instead of deleting, we mark records as inactive while preserving history.

Effectivity Satellites for Status Tracking

An effectivity satellite helps track changes in a record’s lifecycle, including soft deletes. This satellite contains:

  • A deletion flag (e.g., isDeleted or isActive).
  • A deletion timestamp (deletedDate).
  • Load date timestamps to track changes.

This setup allows toggling between active and inactive states as records are deleted and restored in the source system.

One Satellite per Source System?

Each source system may handle deletions differently. If multiple sources feed the same entity, separate effectivity satellites per source ensure accurate tracking.

Adding Effectivity Satellites to PIT Tables

The PIT table aligns different satellite timelines, allowing efficient query performance. Including effectivity satellites in PIT tables ensures:

  • Accurate status tracking across snapshots.
  • Seamless integration with dimensional views.
  • Faster lookups by pre-aligning timelines.

Virtualized Dimensions Without PIT Tables

Without a PIT table, virtualized dimensions must align effectivity data dynamically. This requires:

  • Joining effectivity satellites on-the-fly.
  • Using window functions like LEAD or LAG for tracking state changes.
  • Referencing a date dimension for time-based joins.

While possible, this approach is less efficient than using PIT tables.

Final Thoughts

Soft deletes in Data Vault dimensions are best managed through effectivity satellites and PIT tables. By storing deletion timestamps and flags, we retain history while allowing records to be marked inactive instead of being removed.

Watch the Video

Meet the Speaker

Profile picture of Michael Olschimke

Michael Olschimke

Michael has more than 15 years of experience in Information Technology. During the last eight years he has specialized in Business Intelligence topics such as OLAP, Dimensional Modelling, and Data Mining. Challenge him with your questions!

The Data Vault Handbook

Build your path to a scalable and resilient Data Platform

The Data Vault Handbook is an accessible introduction to Data Vault. Designed for data practitioners, this guide provides a clear and cohesive overview of Data Vault principles.

Read the Book Now

Leave a Reply

Close Menu