
Partitioned data transfer from a Databricks table to an S3 location

Requirement

Introduction: We are currently using Databricks to process and store large datasets. We need to write the vbak table data to our S3 folder as CSV files. The goal is to optimize the write process by configuring Databricks to commit each batch individually.

 

Requirement:

 

  1. Configure Databricks to write the data to the S3 location in the specified number of partitions.
  2. S3 folder location: s3://agilisium-playground-dev/filestore/vbak/

 

Unity Catalog table: vbak

 

Key Points:

 

Save the table in 4 partitions.
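
The PySpark sketch below shows one way to meet this requirement. It assumes the S3 credentials are already configured for the cluster or session (for example via the secrets described in the next section), and the overwrite mode and header option are assumptions rather than stated requirements.

from pyspark.sql import SparkSession

# In a Databricks notebook, `spark` is already available; this makes the sketch self-contained.
spark = SparkSession.builder.getOrCreate()

# Read the Unity Catalog table; use the fully qualified catalog.schema.vbak name if needed.
df = spark.table("vbak")

# Repartition to exactly 4 partitions so the write produces 4 CSV part files.
(
    df.repartition(4)
      .write
      .mode("overwrite")             # assumption: replace any existing files at the path
      .option("header", "true")
      .csv("s3://agilisium-playground-dev/filestore/vbak/")
)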

 

Databricks secret information: the “access_key” and “secret_key” values are stored as Databricks secrets under the scope “aws_keys”.
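
A minimal sketch of reading these secrets and passing them to Spark’s S3A connector is shown below. This is one common pattern for key-based S3 access from a Databricks notebook, not necessarily how the final agentic code will handle authentication.

# Fetch the AWS credentials from the Databricks secret scope "aws_keys".
access_key = dbutils.secrets.get(scope="aws_keys", key="access_key")
secret_key = dbutils.secrets.get(scope="aws_keys", key="secret_key")

# Make the keys available to the S3A filesystem for this session
# (`dbutils` and `spark` are provided by the Databricks notebook runtime).
spark.sparkContext._jsc.hadoopConfiguration().set("fs.s3a.access.key", access_key)
spark.sparkContext._jsc.hadoopConfiguration().set("fs.s3a.secret.key", secret_key)

This configuration should run before the write step shown under Key Points.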

Purgo AI Agentic Code
