
Data growth summary table - stg_nl_wholesaler

Requirement

Information:

Time travel is a Delta Lake feature, available in Databricks, that lets users query a table as it existed at a specific version or point in time, giving access to historical snapshots of the data. We need to analyze the data growth of the stg_nl_wholesaler table by using Databricks time travel to compare record counts across its version numbers.

 

h3. Requirement:

 

Create an error-free Databricks PySpark script that uses the Databricks time travel feature to fetch record counts for the latest version and the previous version (latest version - 1) of the stg_nl_wholesaler table. If the table has only one version, populate the record count for that version only.
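The version selection and counting step could be sketched as follows. This is a minimal sketch rather than the final script: the helper names `versions_to_count` and `fetch_version_counts` are hypothetical, `spark` is assumed to be the session that Databricks notebooks provide, and the two highest versions in the table history are assumed to still be readable (not removed by log cleanup or VACUUM).

```python
SOURCE_TABLE = "purgo_playground.stg_nl_wholesaler"


def versions_to_count(history_versions):
    """Return the latest version and, if present, the one before it (latest first)."""
    ordered = sorted(set(history_versions), reverse=True)
    # Slicing handles the single-version case: only one entry is returned.
    return ordered[:2]


def fetch_version_counts(spark, table_name=SOURCE_TABLE):
    """Build (version_number, time_stamp, record_count) tuples using time travel.

    `spark` is assumed to be an active SparkSession on Databricks.
    """
    # DESCRIBE HISTORY lists every retained version with its commit timestamp.
    history = spark.sql(f"DESCRIBE HISTORY {table_name}").select("version", "timestamp")
    stamps = {row["version"]: row["timestamp"] for row in history.collect()}

    rows = []
    for version in versions_to_count(stamps.keys()):
        # VERSION AS OF reads the table snapshot at that version number.
        count = spark.sql(
            f"SELECT COUNT(*) AS record_count FROM {table_name} VERSION AS OF {version}"
        ).first()["record_count"]
        rows.append((int(version), stamps[version], int(count)))
    return rows
```

Keeping the version selection in a separate pure function makes the "only one version" rule easy to check without a cluster.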

 

Create a new table, dg_stg_nl, with the following schema to store the results:

 

version_number* (integer): The version number of the table retrieved using Databricks time travel.

time_stamp* (timestamp): The timestamp corresponding to when the version was created.

record_count* (integer): The number of records present in each respective version.
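One way to express this schema in PySpark is a DDL string, which `spark.createDataFrame` accepts as its `schema` argument. The names `DG_STG_NL_COLUMNS`, `schema_ddl`, and `to_results_df` below are hypothetical, and the sketch assumes record counts fit in a 32-bit integer, as the integer type in the requirement implies.

```python
# Column names and types taken from the schema listed above.
DG_STG_NL_COLUMNS = [
    ("version_number", "INT"),
    ("time_stamp", "TIMESTAMP"),
    ("record_count", "INT"),
]


def schema_ddl(columns=DG_STG_NL_COLUMNS):
    """Render the column list as a DDL schema string."""
    return ", ".join(f"{name} {dtype}" for name, dtype in columns)


def to_results_df(spark, rows):
    """Turn (version_number, time_stamp, record_count) tuples into a DataFrame.

    `spark` is assumed to be an active SparkSession; `time_stamp` values are
    expected to be Python datetime objects.
    """
    return spark.createDataFrame(list(rows), schema=schema_ddl())
```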

 

h3. Unity Catalog Information: purgo_playground.stg_nl_wholesaler

 

h3. Expected Output: Databricks PySpark code and the populated purgo_playground.dg_stg_nl table.

 

Prerequisite:

 

  1. Drop the table dg_stg_nl if it already exists, then create it.
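The prerequisite can be sketched as two SQL statements run in order. The helper names `recreate_statements` and `recreate_table` are hypothetical, and `spark` is again assumed to be the active Databricks session.

```python
TARGET_TABLE = "purgo_playground.dg_stg_nl"


def recreate_statements(table_name=TARGET_TABLE):
    """Return the SQL statements that drop and then recreate the results table."""
    return [
        f"DROP TABLE IF EXISTS {table_name}",
        f"CREATE TABLE {table_name} "
        "(version_number INT, time_stamp TIMESTAMP, record_count INT)",
    ]


def recreate_table(spark, table_name=TARGET_TABLE):
    """Run the drop and create statements against the given Spark session."""
    for statement in recreate_statements(table_name):
        spark.sql(statement)
```

After `recreate_table(spark)` has run, a results DataFrame with the same three columns could be appended with `df.write.mode("append").saveAsTable(TARGET_TABLE)`.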
