
Change Data Feed (CDF) for stg_us_wholesaler table

Requirement

h3. Introduction:

 

Audit logs play a crucial role in data engineering by enabling the tracking of changes made to tables. They are invaluable for troubleshooting and backtracking when issues arise. To maintain an audit trail for future reference, it is necessary to capture changes in the stg_us_wholesaler table.

 

h3. Requirement:

 

Write a Databricks PySpark command that creates the stg_us_wholesaler table from the CSV file in the Databricks volume, infers the column data types from the CSV file, and enables Change Data Feed (CDF) on the stg_us_wholesaler table. Additionally, implement a PySpark script that creates the stg_us_audit_log table, extracts all changes from the stg_us_wholesaler table from version 0 through the latest version, and inserts the extracted changes into the stg_us_audit_log table.
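The audit-log part of this requirement could be sketched as below. This is a minimal sketch rather than the expected solution: it assumes stg_us_wholesaler already exists as a Delta table with CDF enabled since version 0, that both tables live in the catalog and schema of the volume path (the ticket does not name them), and that `spark` is the session Databricks provides in a notebook. The helper names are illustrative only.

```python
# Assumption: both tables share the catalog and schema of the volume path.
SOURCE_TABLE = "agilisium_playground.purgo_playground.stg_us_wholesaler"
AUDIT_TABLE = "agilisium_playground.purgo_playground.stg_us_audit_log"


def change_feed_options(starting_version=0):
    """Reader options for a batch Change Data Feed read.

    No endingVersion is set, so the read runs through the latest version.
    """
    return {"readChangeFeed": "true", "startingVersion": str(starting_version)}


def load_audit_log(spark, source=SOURCE_TABLE, audit=AUDIT_TABLE):
    """Append every recorded change of `source` to `audit`."""
    changes = spark.read.options(**change_feed_options()).table(source)
    # The change feed adds _change_type, _commit_version and _commit_timestamp
    # next to the source columns. saveAsTable creates the audit table on the
    # first run if it does not exist yet.
    changes.write.format("delta").mode("append").saveAsTable(audit)


# In a Databricks notebook `spark` is predefined, so the call would be:
# load_audit_log(spark)
```

Leaving out `endingVersion` avoids a separate lookup of the latest version. If an explicit upper bound is preferred, it can be passed as an additional reader option.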

 

Prerequisites:

 

  1. Drop the stg_us_wholesaler and stg_us_audit_log tables if they exist, then recreate them.
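The drop step of this prerequisite might look like the following sketch. The fully qualified table names are an assumption borrowed from the volume path in this ticket, and `drop_statements` and `drop_tables` are illustrative helper names, not part of any library.

```python
# Assumption: the tables live in the same catalog and schema as the volume
# path in this ticket; the ticket itself does not say where they belong.
NAMESPACE = "agilisium_playground.purgo_playground"
TABLES = ("stg_us_wholesaler", "stg_us_audit_log")


def drop_statements(tables=TABLES, namespace=NAMESPACE):
    """Build one DROP TABLE IF EXISTS statement per table."""
    return [f"DROP TABLE IF EXISTS {namespace}.{name}" for name in tables]


def drop_tables(spark, tables=TABLES, namespace=NAMESPACE):
    """Run each drop statement on the given SparkSession."""
    for statement in drop_statements(tables, namespace):
        spark.sql(statement)


# In a Databricks notebook `spark` is predefined, so the call would be:
# drop_tables(spark)
```

`IF EXISTS` keeps the step safe to rerun, since a missing table is skipped instead of raising an error.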

 

Volume Information: /Volumes/agilisium_playground/purgo_playground/stg_wholesaler/stg_us_wholesaler.csv
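Reading this file and creating the table with CDF enabled could be sketched as follows. Several points are assumptions: the CSV has a header row, the target table belongs in the catalog and schema of the volume, and the temp view name is made up for illustration. The sketch sets the table property inside the `CREATE TABLE` statement so that change data is recorded from version 0, which is what a later read starting at version 0 needs.

```python
CSV_PATH = (
    "/Volumes/agilisium_playground/purgo_playground/"
    "stg_wholesaler/stg_us_wholesaler.csv"
)
# Assumption: the table is created in the catalog and schema of the volume.
TARGET_TABLE = "agilisium_playground.purgo_playground.stg_us_wholesaler"
STAGING_VIEW = "stg_us_wholesaler_csv"  # hypothetical temp view name


def ctas_with_cdf(table=TARGET_TABLE, view=STAGING_VIEW):
    """Build a CREATE TABLE AS SELECT statement that enables CDF at creation."""
    return (
        f"CREATE TABLE {table} USING DELTA "
        "TBLPROPERTIES (delta.enableChangeDataFeed = true) "
        f"AS SELECT * FROM {view}"
    )


def create_wholesaler_table(spark, path=CSV_PATH):
    """Read the CSV with an inferred schema and create the Delta table."""
    df = (
        spark.read
        .option("header", "true")       # assumes the first row holds column names
        .option("inferSchema", "true")  # let Spark infer the column types
        .csv(path)
    )
    # Expose the DataFrame to SQL so the CTAS statement can select from it.
    df.createOrReplaceTempView(STAGING_VIEW)
    spark.sql(ctas_with_cdf())


# In a Databricks notebook `spark` is predefined, so the call would be:
# create_wholesaler_table(spark)
```

Enabling CDF afterwards with `ALTER TABLE ... SET TBLPROPERTIES` also works, but changes committed before the property was set are not recorded in the feed.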

 

Expected Output: Databricks PySpark code

