Back to questions
CVS Health is trying to better understand its pharmacy sales, and how well different products are selling.
Write a query to find the total drug sales for each manufacturer. Round your answer to the closest million, and report your results in descending order of total sales.
Because this data is being directly fed into a dashboard which is being seen by business stakeholders, format your result like this: "$36 million".
If you like this question, try out Pharmacy Analytics (Part 4)!
Column Name | Type |
---|---|
product_id | integer |
units_sold | integer |
total_sales | decimal |
cogs | decimal |
manufacturer | varchar |
drug | varchar |
product_id | units_sold | total_sales | cogs | manufacturer | drug |
---|---|---|---|---|---|
94 | 132362 | 2041758.41 | 1373721.70 | Biogen | UP and UP |
9 | 37410 | 293452.54 | 208876.01 | Eli Lilly | Zyprexa |
50 | 90484 | 2521023.73 | 2742445.9 | Eli Lilly | Dermasorb |
61 | 77023 | 500101.61 | 419174.97 | Biogen | Varicose Relief |
136 | 144814 | 1084258.00 | 1006447.73 | Biogen | Burkhart |
manufacturer | sale |
---|---|
Biogen | $4 million |
Eli Lilly | $3 million |
The total sales for Biogen is $4 million ($2,041,758.41 + $500,101.61 + $1,084,258.00 = $3,626,118.02) and for Eli Lilly is $3 million ($293,452.54 + $2,521,023.73 = $2,814,476.27).
The dataset you are querying against may have different input & output - this is just an example!
Goal: Find the total drug sales in million for each manufacturer.
First, we calculate the sum of total sales using the aggregate function and segregate the results by the manufacturer in the clause.
Output showing the 2 randomly selected records:
manufacturer | sales |
---|---|
Eli Lilly | 81641381.27 |
Biogen | 69824472.58 |
The output above shows that Eli Lilly and Biogen each sold drugs with a total sales value of $81,641,381.27 and $69,824,472.58, respectively.
Although each manufacturer’s sales have been calculated, the figures are not in the million-dollar format.
Next, we round up the to the closest million.
To do so, we must first divide the by one million and round them to the closest million using the function. If the decimal place is unspecified, its default value is 0.
Showing the output for Eli Lilly and Biogen:
manufacturer | sales_mil |
---|---|
Eli Lilly | 82 |
Biogen | 70 |
Eli Lilly’s sales of $81,641,381.27 is rounded to the closest million to $82 and Biogen’s $69,824,472.58 is rounded to $70.
The sales data will be fed into a dashboard, thus it has to be formatted like this: "$xx million".
Using the function, we will concatenate the 3 elements: symbol + in million + string. Remember to keep a space in front of .
P.S. It is not necessary to convert into data type as the function accepts both and data types. Bear in mind that the column is now a data type.
Output:
manufacturer | sales_mil |
---|---|
Eli Lilly | $82 million |
Biogen | $70 million |
Finally, sort the results in the descending order of sales.
But hold on — we can't just apply the clause to the new column because this column is a data type.
Hence, we will utilize the clause on to place the highest total sales at the top followed by the least total sales.
Solution 1
Solution 2: Using CTE
PostgreSQL 14