Zum Hauptinhalt springen
Suche
0
Scalefree - Wissen - Webinare - Data Vault Friday - Modelling Salesforce History Tables in Data Vault

Modelling Salesforce History Tables

Salesforce tracks changes to configured attributes by storing them in history tables. This data, which includes record ID, field name, old and new values, and timestamps, presents a unique challenge for Data Vault modeling. In this article, we’ll explore an optimal way to model this data using Data Vault principles.



Understanding Salesforce History Tables

Salesforce allows tracking of specific attribute changes within objects like Contacts. These changes are stored in history tables such as ContactHistory. Each entry logs:

  • Record ID (e.g., Contact ID)
  • Field Name
  • Old Value
  • New Value
  • Timestamp

Challenges in Modeling Salesforce History Data

When designing a Data Vault model for this history data, there are key challenges to consider:

  • Handling multiple changes for the same record within a short time frame
  • Maintaining referential integrity
  • Efficiently querying and pivoting data for reporting

Approach: Multi-Active Satellite

A common initial approach is to model the history table as a multi-active satellite attached to a Contact Hub, with the field name as the dependent key. However, this approach has pitfalls:

  • Duplicates can arise if multiple changes occur for the same field in the same batch
  • Timestamp-based keys are unreliable due to possible duplicate timestamps

To counter this, a unique sequence number should be assigned in the staging area and used as a dependent key.

Instead of a multi-active satellite, a non-historized link can be used to model Salesforce history data more efficiently. Here’s how it works:

  • Create a non-historized link connecting the Contact and User hubs.
  • Store change-related attributes (field name, old value, new value, timestamp) directly within this link.
  • Use the timestamp as an event-based attribute rather than part of the primary key.

This approach avoids the need for complex joins and simplifies querying.

Efficient Data Retrieval: Pivoting

Since history tables are structured in a key-value format, queries often require pivoting. By using database pivot functions, we can restructure the data into a more usable format for reporting without excessive joins.

Alternative Consideration: JSON Storage

Another approach is to store change data as a JSON object in a standard satellite. This method offers flexibility, particularly when dealing with a large number of attributes. However, it complicates querying and should be used only when necessary.

Schlussfolgerung

For most cases, a non-historized link is the optimal way to model Salesforce history tables in Data Vault. It simplifies data storage, reduces the need for extensive joins, and enhances query performance. Multi-active satellites are an alternative but require careful handling of duplicate timestamps and field changes.

Das Video ansehen

Über den Vortragenden

Profilbild von Michael Olschimke

Michael Olschimke

Michael hat mehr als 15 Jahre Erfahrung in der Informationstechnologie. In den letzten acht Jahren hat er sich auf Business Intelligence Themen wie OLAP, Dimensional Modelling und Data Mining spezialisiert. Fordern Sie ihn mit Ihren Fragen heraus!

Eine Antwort hinterlassen

Menü schließen