
Migrating Vendor S3 Files to the Purgo S3 Folder

Requirement


Create a Databricks PySpark script to transfer all files from the vendor's S3 bucket to the Purgo S3 folder, subject to the conditions below.

 

* A file in the Vendor folder should be transferred only if its name is not already present in either the Purgo S3 folder or the Archive S3 folder (see the sketch after this list).

* It must also verify that only active files (active_flag = “A”) are ingested by checking the active_flag column in ingest_config_master.
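Below is a minimal sketch of the name-based filter from the first condition. It assumes the script runs in a Databricks notebook (so dbutils is in scope), that the three folder paths are read from ingest_config_master as described next, and that each path ends with a trailing slash; files_to_transfer is a hypothetical helper name, not part of the requirement.

```python
# Hypothetical helper: vendor files whose names are absent from BOTH
# the Purgo (landing) and Archive folders. Assumes each path ends with "/".
def files_to_transfer(vendor_path, purgo_path, archive_path):
    vendor_files = {f.name for f in dbutils.fs.ls(vendor_path)}
    purgo_files = {f.name for f in dbutils.fs.ls(purgo_path)}
    archive_files = {f.name for f in dbutils.fs.ls(archive_path)}
    # Keep only names that appear in neither target folder.
    return vendor_files - purgo_files - archive_files
```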

 

The script should retrieve the complete S3 folder paths for Vendor, Purgo, and Archive from the ingest_config_master configuration table, using the values stored in the s3_vendor_path, s3_landing_path, and s3_archive_path columns respectively.
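A minimal sketch of reading those columns, assuming ingest_config_master is registered as a table in the metastore, spark is the notebook's SparkSession, and the active_flag condition above is applied here as a row filter:

```python
from pyspark.sql import functions as F

# Keep only active rows and pull the three folder-path columns.
config_df = (
    spark.table("ingest_config_master")
    .filter(F.col("active_flag") == "A")
    .select("s3_vendor_path", "s3_landing_path", "s3_archive_path")
)

# For each active config row, copy the pending files into the Purgo folder.
for row in config_df.collect():
    pending = files_to_transfer(
        row["s3_vendor_path"], row["s3_landing_path"], row["s3_archive_path"]
    )
    for name in pending:
        dbutils.fs.cp(row["s3_vendor_path"] + name, row["s3_landing_path"] + name)
```

collect() is reasonable here because a configuration table like this is expected to hold only a handful of rows.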


Databricks Secret Information: “access_key” and “secret_key” are stored in Databricks Secrets under the scope “aws_keys”.
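A minimal sketch of retrieving those credentials with the documented dbutils.secrets.get API and handing them to Spark's S3A connector via the Hadoop configuration; whether this step is needed at all depends on how the cluster authenticates to S3 (an instance profile, for example, would make it unnecessary):

```python
# Read the AWS credentials from the "aws_keys" secret scope.
access_key = dbutils.secrets.get(scope="aws_keys", key="access_key")
secret_key = dbutils.secrets.get(scope="aws_keys", key="secret_key")

# One common way to authenticate s3a:// access on the cluster.
spark.sparkContext._jsc.hadoopConfiguration().set("fs.s3a.access.key", access_key)
spark.sparkContext._jsc.hadoopConfiguration().set("fs.s3a.secret.key", secret_key)
```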

