
Inserting new records into the config_master table


h3. Introduction:

 

The config_master table stores metadata about files in S3, including the data landing zones for various countries and regions such as the US, NL, CA, and BE. It serves as a mapping resource for loading data from storage files into the staging table. When a new region is introduced, corresponding records must be added to config_master to ensure seamless data migration and processing.

 

h3. Requirement:

 

Develop a Databricks PySpark script that inserts a new record into the config_master table: retrieve an existing record from the table, replace the required columns with the values specified below, and append the modified record back to the table.

 

h3. Specified Column Values:

 

*src_objt_name and src_sys*: "ID_Sales"

*f_format*: "ID_MON_Sales_"

*s3_landing_path*: "s3a://s3_bucket/landing/ID/ID_Sales/"

*s3_archive_path*: "s3a://s3_bucket/archive/ID/ID_Sales/"

*country, region, affiliate_group, affiliate*: "ID"

*src_layer and target_src_sys*: "ID_Sales"

*delta_stg_tables*: "stg_ID_sales"

*source_path*: "/SecureFtp/-InternalX/ID/IN/DATA/Sales/"

*actual_file_name*: {"ID_MON_Sales_": "stg_ID_wholesaler"}

*dag_id*: "LOAD_SALES_ID"

 

Unity Catalog Information: purgo_playground.config_master table.

 

Expected Output: Databricks PySpark code that appends the new record to the purgo_playground.config_master table.
