Taking Backup of Onboarded S3 Files
Requirement
Create a Databricks PySpark script to automate the migration of files from the Purgo S3 landing folder to the archive folder. The script must reference the s3_file_process_log table to identify files eligible for archiving.
Only files with file_status = 'SUCCESS' should be moved. The script should dynamically read the s3_landing_path (source folder) and s3_archive_path (target folder) for each file from the same log table. Example of an S3 folder path in the log table: s3://agilisium-playground-dev/filestore/purgo/patient_raw
Databricks secret information: "access_key" and "secret_key" are stored as Databricks secrets under the scope "aws_keys".
Purgo AI Agentic Code
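A minimal sketch of such a script is shown below. It assumes the log table is reachable as s3_file_process_log from the notebook's current catalog and schema, that the table carries a file_name column alongside the two path columns, and that boto3 is available on the cluster; these details are not confirmed by the requirement and should be adjusted to the actual environment. spark and dbutils are the globals Databricks provides in a notebook.

# Sketch: archive files marked SUCCESS in s3_file_process_log.
# Assumptions (not stated in the requirement): the table exposes a
# file_name column, and boto3 is installed on the cluster.
import boto3
from urllib.parse import urlparse

# Read AWS credentials from the "aws_keys" secret scope, per the requirement.
access_key = dbutils.secrets.get(scope="aws_keys", key="access_key")
secret_key = dbutils.secrets.get(scope="aws_keys", key="secret_key")

s3 = boto3.client(
    "s3",
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key,
)

def split_s3_path(path):
    """Split 's3://bucket/prefix' into (bucket, prefix)."""
    parsed = urlparse(path)
    return parsed.netloc, parsed.path.lstrip("/")

# Pull only files marked SUCCESS, with their source and target folders.
eligible = spark.sql("""
    SELECT file_name, s3_landing_path, s3_archive_path
    FROM s3_file_process_log
    WHERE file_status = 'SUCCESS'
""").collect()

for row in eligible:
    src_bucket, src_prefix = split_s3_path(row.s3_landing_path)
    dst_bucket, dst_prefix = split_s3_path(row.s3_archive_path)
    src_key = f"{src_prefix.rstrip('/')}/{row.file_name}"
    dst_key = f"{dst_prefix.rstrip('/')}/{row.file_name}"

    # S3 has no native move: copy to the archive folder, then delete the source.
    s3.copy_object(
        Bucket=dst_bucket,
        Key=dst_key,
        CopySource={"Bucket": src_bucket, "Key": src_key},
    )
    s3.delete_object(Bucket=src_bucket, Key=src_key)
    print(f"Archived s3://{src_bucket}/{src_key} -> s3://{dst_bucket}/{dst_key}")

Driving the copy through boto3 rather than dbutils.fs.mv keeps the credential handling explicit to the "aws_keys" scope; a production version would also update file_status in the log table after each successful move.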