Study Duration Analysis on Clinical Trials
Requirement
Introduction: The goal is to process and transform the data by calculating study durations, aggregating key study metrics, performing date-based calculations, and identifying studies with the most and least progress. These calculations will be leveraged to enhance reporting capabilities and improve decision-making in clinical study management. The analytics will cover various therapeutic areas and study roll types, providing valuable insights for stakeholders.
Requirements: Create PySpark logic that reads the table “purgo_playground.study_duration_analysis” and applies the following steps.
Study Duration Calculation: Add a new column study_duration containing the number of days between first_enrl_dt and last_enrl_dt. If last_enrl_dt is null, use the current date in its place.
Duration Categorization: Add a new column duration_category that classifies each study as High Study Duration if study_duration is greater than or equal to 365 days, or as Low Study Duration if it is less than 365 days.
Final Output: Display all original columns together with the additional columns study_duration and duration_category.
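The two rules above can be pinned down with a small plain-Python reference, which is useful for agreeing on edge cases before writing the Spark job. This is only a sketch under assumptions: the duration is taken as last_enrl_dt minus first_enrl_dt in whole days, and the helper names and the `today` parameter are hypothetical and do not appear in the requirement.

```python
from datetime import date
from typing import Optional

# Threshold taken from the requirement: 365 days or more is "High".
HIGH_DURATION_THRESHOLD_DAYS = 365


def study_duration_days(
    first_enrl_dt: date,
    last_enrl_dt: Optional[date],
    today: Optional[date] = None,  # hypothetical hook so tests can fix "now"
) -> int:
    """Days from first to last enrollment; falls back to today if last is missing."""
    if last_enrl_dt is not None:
        end = last_enrl_dt
    else:
        end = today if today is not None else date.today()
    # Assumption: duration = end date minus start date, in whole days.
    return (end - first_enrl_dt).days


def duration_category(days: int) -> str:
    """Apply the 365-day boundary; exactly 365 days counts as High."""
    if days >= HIGH_DURATION_THRESHOLD_DAYS:
        return "High Study Duration"
    return "Low Study Duration"


if __name__ == "__main__":
    # Completed study spanning a non-leap year: exactly 365 days.
    days = study_duration_days(date(2023, 1, 1), date(2024, 1, 1))
    print(days, duration_category(days))
    # Missing last_enrl_dt: the supplied "today" stands in for the end date.
    days = study_duration_days(date(2023, 1, 1), None, today=date(2023, 3, 1))
    print(days, duration_category(days))
```

The boundary case matters here: a study of exactly 365 days falls in the High category, while 364 days is Low.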
Unity Catalog: “purgo_playground.study_duration_analysis”