
Data Purging and Retention Strategy for Snapshot Tables

Requirement

Introduction: Snapshot tables collect and retain the data from each daily load, so their volume grows steadily over time. To keep this growth in check, a data clean-up activity removes older snapshots according to a defined data retention policy, which keeps storage use efficient and preserves system performance.

 

Requirement: Identify and purge rows in the f_inv_movmnt table whose crt_dt (creation date) is more than 90 days old. When a row passes the 90-day threshold, its txn_id should be moved to ref_del_sk and held there for a 7-day retention period, during which del_status is set to 'SD' (soft delete). Once the 7-day retention period ends, del_status changes to 'HD' (hard delete) and the data is permanently removed from the system.
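The rule above can be sketched as a small classification step. This is a minimal sketch, not a definitive implementation, and it rests on assumptions the requirement does not spell out: the 7-day retention window is measured from crt_dt (so 'SD' covers ages of 91 to 97 days and 'HD' starts at 98 days), crt_dt is a date column, and the helper names purge_status and with_purge_status are hypothetical. If the retention window should instead start when the row is first soft-deleted, a soft-delete timestamp would need to be tracked.

```python
from datetime import date
from typing import Optional

RETENTION_DAYS = 90  # from the requirement: rows older than 90 days are purged
GRACE_DAYS = 7       # from the requirement: soft-delete window before hard delete


def purge_status(crt_dt: date, as_of: date) -> Optional[str]:
    """Plain-Python reference for the rule: None (keep), 'SD', or 'HD'.

    Assumption: the 7-day window is counted from crt_dt, not from the
    moment the row was first soft-deleted.
    """
    age_days = (as_of - crt_dt).days
    if age_days > RETENTION_DAYS + GRACE_DAYS:
        return "HD"
    if age_days > RETENTION_DAYS:
        return "SD"
    return None


def with_purge_status(df, as_of: date):
    """Add a del_status column to a DataFrame shaped like f_inv_movmnt.

    Hypothetical helper: expects a crt_dt date column; rows still inside
    the 90-day window get a null del_status.
    """
    # Imported here so purge_status() stays usable without a Spark install.
    from pyspark.sql import functions as F

    age_days = F.datediff(F.lit(as_of), F.col("crt_dt"))
    return df.withColumn(
        "del_status",
        F.when(age_days > RETENTION_DAYS + GRACE_DAYS, F.lit("HD"))
        .when(age_days > RETENTION_DAYS, F.lit("SD")),
    )
```

Under these assumptions, the rows to record in ref_del_sk would be those where del_status is 'SD' (selecting txn_id and del_status), and the rows to remove permanently would be those where del_status is 'HD'. How the flagged rows are written back depends on the table format, which the requirement does not specify.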

 

Expected Codebase: PySpark

Purgo AI Agentic Code
