Zum Hauptinhalt springen
Suche
0

Snapshots in dbt: A Quick Overview

dbt snapshots allow you to “look back” at historical data by capturing changes in your database tables. This is achieved by implementing type-2 Slowly Changing Dimensions (SCDs), which track how a row has changed over time. For example, you can keep track of an order’s status as it moves from ‘pending’ to ‘shipped’ to ‘delivered’.



How Snapshots Work

When you run dbt snapshot, it creates a new table with the same columns as your source table, plus additional metadata columns like dbt_valid_from und dbt_valid_to. On subsequent runs, dbt updates the dbt_valid_to column for changed records and adds new records with dbt_valid_to set to null.

dbt offers different strategies for detecting changes, including comparing timestamps, specific column values, or a unique key. You can also configure dbt to track deletions.

Snapshots Best Practices

  • Store snapshots in a separate schema to keep your historical data organized.
  • Snapshot raw data using the source function and select all columns.
  • Avoid joins in your snapshot queries; instead, create separate snapshots and join them downstream.
  • If you need to perform transformations, do so in ephemeral models before snapshotting.
  • Schedule snapshots to run frequently to capture changes regularly.

By following these best practices, you can ensure that your dbt snapshots are accurate, efficient, and easy to maintain.

Das Video ansehen

Treffen mit dem Sprecher

Dmytro Polishchuk Profilbild

Dmytro Polischtschuk
Senior BI-Berater

Dmytro Polishchuk has 7 years of experience in business intelligence and works as a Senior BI Consultant for Scalefree. Dmytro is a proven Data Vault 2.0 expert and has excellent knowledge of various (cloud) architectures, data modeling, and the implementation of automation frameworks. Dmytro excels in team integration and structured project work. Dmytro has a bachelor’s degree in Finance and Financial Management.

Eine Antwort hinterlassen

Menü schließen