top of page

Calculate revenue based on category and customer purchase columns

Requirement

Information

 

The Product Revenue table tracks the income generated from the sale of products within the life sciences industry, including biotechnology, pharmaceuticals, medical devices, diagnostics, and health-related products. This data is crucial for understanding revenue trends across various sectors. For analytical simplicity, product types that don't fall under major categories—such as Pharmaceuticals, Diagnostics, or Biologics—are grouped under the label "Others." This helps consolidate less significant or undefined categories, aiding in clearer revenue analysis.

 

Additionally, the d_product_revenue table contains information on customer purchasing patterns, such as their initial interactions with specific medical products, research kits, or diagnostic tools. The table includes transaction details for purchases of products like pharmaceuticals, medical devices, and diagnostics. To enhance reporting and analysis, it's essential to track the first purchased date, first product purchased, and the corresponding revenue for each customer.

 

h3. Requirements

 

Product Type Classification Update*:

Write a SQL query as per Databricks SQL syntax to update the product_type column in the d_product_revenue table. Assign the value 'Others' to entries where the product_type does not belong to key categories such as Research Kits, Diagnostics, or Consumables.

Additionally, design a view that aggregates total revenue grouped by product_type, including categories like Research Kits, Diagnostics, or Consumables, and Others.

 

Customer First Purchase Details:

 

Write a Databricks SQL query to determine the earliest purchased_date for each customer_id and populate a customer_first_purchased_date column with this value across all records for that customer_id in d_product_revenue* table . Additionally, identify the product_name and revenue corresponding to the earliest purchased_date for each customer_id, naming them customer_first_product and customer_first_revenue.

 

Create a view named customer_first_purchase_vw for distinct customer_id with the following columns:

** customer_id

** customer_first_purchased_date

** customer_first_product

** customer_first_revenue

 

Refer to the attached sample input and output files for clarification.

 

[^d_product_revenue_output (91d98b26-f831-4bcc-ae56-6685fc86e4b0).csv] [^d_product_revenue_input (e5882fbe-7e51-4f52-95af-47186d19ffc8).csv] .

 

h3. Unity Catalog Information

 

d_product_revenue*

customer_first_purchase_vw*

d_product_revenue_vw*

 

h3. Expected Output

 

* Databricks SQL queries for both requirements.

View: customer_first_purchase_vw and d_product_revenue_vw*

Purgo AI Agentic Code

bottom of page