
Data Reconciliation after DBR Upgrades

Requirement

 

Develop a Databricks PySpark script that compares the data in two tables, product_plant and product_plant_v2. The script should compare the value of each column in product_plant with the value of the same-named column in product_plant_v2 and generate a validation column for each pair, set to "Match" when the values agree and "Mismatch" when they differ. Store the validation result as a table.

 

Prerequisites:

 

  1. Alias every column in both tables so that the columns of one table can be distinguished from those of the other.
  2. Drop the pp_validation_results table and recreate it.
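
The two prerequisites could be sketched as follows. This is a minimal illustration, not a prescribed implementation: the suffixes _pp and _pp_v2 and the helper names alias_map, with_aliases and prepare_inputs are hypothetical, since the requirement only asks that the aliases differ per table. The sketch assumes an active SparkSession is passed in as spark (Databricks notebooks provide one). It only drops the old result table; the table is recreated when the validation result is written.

```python
# Hypothetical suffixes: the requirement does not name the aliases.
LEFT_SUFFIX = "_pp"
RIGHT_SUFFIX = "_pp_v2"
RESULT_TABLE = "pp_validation_results"


def alias_map(columns, suffix):
    """Map each original column name to its aliased name, keeping order."""
    return {name: f"{name}{suffix}" for name in columns}


def with_aliases(df, suffix):
    """Return df with every column renamed to <name><suffix>."""
    mapping = alias_map(df.columns, suffix)
    return df.select([df[name].alias(alias) for name, alias in mapping.items()])


def prepare_inputs(spark):
    """Drop the old result table and load both inputs with aliased columns."""
    spark.sql(f"DROP TABLE IF EXISTS {RESULT_TABLE}")
    left = with_aliases(spark.table("product_plant"), LEFT_SUFFIX)
    right = with_aliases(spark.table("product_plant_v2"), RIGHT_SUFFIX)
    return left, right
```

In a notebook this would be called as left_df, right_df = prepare_inputs(spark).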

 

 

 

Note: The pp_validation_results table must keep its columns in a fixed order: for each compared column, the product_plant column, the product_plant_v2 column, and the validation column appear next to each other, in that order. For example, the sequence is product_plant's column1, product_plant_v2's column1, column1_validation, then the same three for column2, and so on.
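
One way to produce this column order, together with the "Match"/"Mismatch" values, is to build the select list column by column. The sketch below rests on several assumptions that the requirement does not state: both DataFrames already carry aliased columns with the hypothetical suffixes _pp and _pp_v2, rows are paired on a hypothetical list of key columns (key_columns) with a full outer join, and two NULLs count as a "Match" because the comparison uses Spark SQL's null-safe equality operator <=>. The function names are illustrative only.

```python
def validation_select_exprs(columns, left_suffix="_pp", right_suffix="_pp_v2"):
    """Build Spark SQL select expressions in the required interleaved order."""
    exprs = []
    for name in columns:
        left = f"`{name}{left_suffix}`"
        right = f"`{name}{right_suffix}`"
        exprs.append(left)   # product_plant's column
        exprs.append(right)  # product_plant_v2's column
        # <=> is null-safe equality: NULL <=> NULL is true (assumed to be a Match).
        exprs.append(
            f"CASE WHEN {left} <=> {right} THEN 'Match' ELSE 'Mismatch' END "
            f"AS `{name}_validation`"
        )
    return exprs


def write_validation(left_df, right_df, columns, key_columns,
                     left_suffix="_pp", right_suffix="_pp_v2",
                     target="pp_validation_results"):
    """Join the aliased inputs, add validation columns, and save the table."""
    # Assumption: rows are paired on key_columns; the requirement names no key.
    join_cond = [
        left_df[f"{k}{left_suffix}"] == right_df[f"{k}{right_suffix}"]
        for k in key_columns
    ]
    joined = left_df.join(right_df, on=join_cond, how="full_outer")
    result = joined.selectExpr(
        *validation_select_exprs(columns, left_suffix, right_suffix)
    )
    result.write.mode("overwrite").saveAsTable(target)
    return result
```

Because the select list is generated in a single loop, the three columns for each source column always sit side by side, regardless of how many columns the tables have. With a full outer join, a row present in only one table yields NULLs on the other side and is therefore reported as "Mismatch".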

 

Expected output:

 

pp_validation_results
