At Cigna, SQL is essential for extracting and analyzing healthcare data, allowing them to create predictive models that identify potential health risks for patients. It is also used for developing effective health management strategies and improving patient care by tailoring services to individual needs, that is the reason why Cigna asks SQL questions during interviews for Data Science, Data Engineering, and Data Analytics jobs.
To help you prep for the Cigna SQL interview, we've curated 11 Cigna SQL interview questions – can you solve them?
Cigna is a health service company that's committed to their customers' long-lasting health and well-being. Assume you have access to their customer database where you have a table that keeps track of all user interactions with their services. Particularly, it records the , , , and of the interaction. Let's say for Cigna, power (VIP) users are defined as users who frequently engage with Cigna's health checkups and consultations, with the total cost of their activities within the last month exceed $5000.
Write a SQL query to identify these VIP users.
activity_id | user_id | timestamp | activity | cost |
---|---|---|---|---|
101 | 1001 | 09/01/2022 00:00:00 | Health Checkup | 750 |
102 | 1002 | 09/01/2022 00:00:00 | Consultation | 100 |
103 | 1001 | 09/05/2022 00:00:00 | Health Checkup | 750 |
104 | 1003 | 09/15/2022 00:00:00 | Vaccination | 50 |
105 | 1001 | 09/20/2022 00:00:00 | Health Checkup | 750 |
vip_user_id |
---|
1001 |
In this solution, we first filter out the to only consider activities from the last month. Then, we limit it to the activities of interest - 'Health Checkup' and 'Consultation'. By grouping the records by and summing the for each user, we can identify those users whose total cost exceed $5000, who are considered as VIP users.
To practice a related customer analytics question on DataLemur's free interactive coding environment, try this recently asked Microsoft SQL interview question:
Dive into Cigna's newsroom to discover the latest updates and initiatives that are shaping the future of health services! Staying informed about their efforts can provide you with valuable insights into how they are addressing the challenges in the healthcare landscape.
Given a table of Cigna employee salary information, write a SQL query to find employees who earn more money than their direct boss.
employee_id | name | salary | department_id | manager_id |
---|---|---|---|---|
1 | Emma Thompson | 3800 | 1 | |
2 | Daniel Rodriguez | 2230 | 1 | 10 |
3 | Olivia Smith | 8000 | 1 | 8 |
4 | Noah Johnson | 6800 | 2 | 8 |
5 | Sophia Martinez | 1750 | 1 | 10 |
8 | William Davis | 7000 | 2 | NULL |
10 | James Anderson | 4000 | 1 | NULL |
employee_id | employee_name |
---|---|
3 | Olivia Smith |
This is the output because Olivia Smith earns $8,000, surpassing her manager, William Davis who earns 7,800.
Test your SQL query for this problem directly within the browser on DataLemur:
First, we perform a SELF-JOIN where we treat the first table () as the managers' table and the second table () as the employees' table. Then we use a clause to filter the results, ensuring we only get employees whose salaries are higher than their manager's salary.
If the code above is hard to understand, you can find a detailed solution here: Employees Earning More Than Their Boss.
In SQL, both and are used to rank rows within a result set. The key difference between the two functions is how deal with two or more rows having the same value in the ranked column (aka how the break ties).
RANK(): When there's a tie, leaves a gap in the ranking. For example, if three rows are tied for 2nd place, the RANK() function will assign a rank of 2 to the first of these rows, a rank of 3 to the second row, and a rank of 4 to the third row.
DENSE_RANK(): For ties, does not leave a gap in the ranking. Instead, it assigns the same rank to all tied rows, and then makes the next row 1 bigger. Confusing, I know, but here's an example to make it more clear: if three rows are tied for 3rd place, the function will assign a rank of 3 to all three rows, and then assign a rank of 4 to the next row.
Assume that you are working for Cigna, a health services company. They have provided you with a table called containing data about their company's claims. Each row in the Claims table represents a claim made by a client.
The Claims table has the following columns:
Write a SQL query that ranks the clients by the total amount they claimed in descending order over the past year (365 days). For clients with the same total claim amount, they should be ranked by the earliest claim date they have. You should return the client ID, the total claim amount, the earliest claim date, and the rank in the output.
ClaimID | ClientID | ClaimDate | ClaimAmount |
---|---|---|---|
8435 | 123 | 2022-06-08 | 500 |
2791 | 265 | 2022-06-10 | 300 |
5284 | 265 | 2022-06-18 | 400 |
3795 | 123 | 2022-07-26 | 800 |
4610 | 362 | 2022-07-05 | 700 |
Expected output would look like this:
ClientID | TotalClaimAmount | EarliestClaimDate | Rank |
---|---|---|---|
265 | 700 | 2022-06-10 | 1 |
123 | 1300 | 2022-06-08 | 2 |
362 | 700 | 2022-07-05 | 3 |
This query is using a window function and a clause to aggregate the of and of by . The claim data is filtered to include only the past year claims. The function is applied on the total claim amount in descending order and for claims with same total, it is ranked by the earliest claim date. In the output, it returns the , , , and .
Pro Tip: Window functions are a popular SQL interview topic, so practice all the window function problems on DataLemur
The and operators are both used to filter data based on certain criteria, but they work in different ways. is used to select values within a range, while is used to select values that match a list.
For instance, if you have a table called that contains the salary of each employee, along with which country they reside in, you could use the operator to find all employees who make between 160k:
To find all employees that reside in France and Germany, you could use the operator:
Cigna, being a health insurance company, consistently handles a large number of healthcare provider records which include details about the providers, their specialization, the services they provide and the corresponding cost for those services. A patient is able to consult any provider linked to Cigna.
Design the table structure required to store this information effectively. Each patient can have numerous provider visits, and each provider visit will have a corresponding service provided and its cost.
provider_id | provider_name | specialization |
---|---|---|
101 | Dr. Adams | Cardiologist |
102 | Dr. Baker | Dermatologist |
103 | Dr. Carter | Orthopedic |
service_id | service_name | service_cost |
---|---|---|
201 | Heart Checkup | $200 |
202 | Skin treatment | $100 |
203 | Bone treatment | $150 |
patient_id | patient_name |
---|---|
301 | John Doe |
302 | Jane Doe |
visit_id | patient_id | provider_id | service_id | visit_date |
---|---|---|---|---|
401 | 301 | 101 | 201 | 2023-01-23 |
402 | 302 | 102 | 202 | 2023-01-24 |
403 | 301 | 103 | 203 | 2023-02-15 |
Given the database design and samples above, suppose we want to query the total cost of all services a patient has received. We can achieve this with a query in PostgreSQL as shown below:
This query will provide us with a table where for each patient, we have a sum of all service costs that patient has received. We first join and tables on the field. Then, we join the table using the from the table. Finally, we group the results by and sum up the .
In SQL, a value is NOT the same as a zero or a blank space. A NULL value represents the absence of a value or a missing value, whereas both zero and blank space are actually values.
To handle nulls, the function is often used to check for a null, and if there is a null, you can assign a default value.
As a Data Analyst at Cigna Corporation, an American worldwide health services organization, you are tasked with calculating the click-through rate (CTR) for the company's digital marketing campaigns. Click-through rate is the ratio of users who click on a specific link to the number of total users who view a page, email, or advertisement. In this context, you are given 'impressions' that is the total number of views on the ads and 'clicks', which is a count of how many people actually clicked on them.
Use the data provided in the 'ad_campaign' table to calculate the daily click-through-rate for each campaign.
date | campaign_id | impressions | clicks |
---|---|---|---|
01/01/2022 | 101 | 1000 | 250 |
02/01/2022 | 101 | 1200 | 300 |
01/01/2022 | 102 | 1500 | 375 |
02/01/2022 | 102 | 1800 | 450 |
01/01/2022 | 103 | 1300 | 325 |
02/01/2022 | 103 | 1400 | 350 |
This SQL query calculates the daily CTR for each campaign by taking the ratio of clicks to impressions and then multiplying by 100 to get it in percentage terms. The output is ordered by date and campaign id.
date | campaign_id | clicks | impressions | ctr |
---|---|---|---|---|
01/01/2022 | 101 | 250 | 1000 | 25.00 |
01/01/2022 | 102 | 375 | 1500 | 25.00 |
01/01/2022 | 103 | 325 | 1300 | 25.00 |
02/01/2022 | 101 | 300 | 1200 | 25.00 |
02/01/2022 | 102 | 450 | 1800 | 25.00 |
02/01/2022 | 103 | 350 | 1400 | 25.00 |
Note, it is assumed that the and fields are numeric types and can be safely converted to float for the division operation.
To practice a similar problem on DataLemur's free interactive coding environment, attempt this Meta SQL interview question:
Each patient at the healthcare company Cigna receives various treatment services. For research and analysis purposes, Cigna wants to know the average cost of each treatment service it provides per year. Write a SQL query to find out the average cost on a yearly basis for each treatment service within the database.
record_id | patient_id | treatment_date | service_id | cost |
---|---|---|---|---|
8761 | 523 | 01/08/2020 | 10001 | 250 |
3412 | 983 | 03/10/2020 | 20052 | 150 |
7893 | 376 | 07/18/2020 | 10001 | 300 |
6354 | 231 | 09/26/2020 | 20052 | 200 |
9335 | 782 | 12/05/2021 | 20052 | 180 |
year | service | avg_cost |
---|---|---|
2020 | 10001 | 275.00 |
2020 | 20052 | 175.00 |
2021 | 20052 | 180.00 |
This query uses the GROUP BY clause to group the data by the year of the treatment_date and service_id. It calculates the average cost of each service for each year. The AVG() function calculates the average cost of the rows that fall into each group, giving us the average cost of each service Cigna provides on a yearly basis.
combines the results from two or more SELECT statements, but only returns the rows that are exactly the same between the two sets.
For example, let's use to find all of Cigna's Facebook video ads with more than 10k views that are also being run on YouTube:
As a Database Analyst at Cigna, you are tasked with filtering out the customer records from the database in order to track customers who have a certain pattern in their email addresses. Specifically, you are to write a SQL query that gives you a list of all customer records who have the '@cigna.com' email extension.
The customer database has the following simplified structure:
customer_id | first_name | last_name | signup_date | |
---|---|---|---|---|
101 | John | Doe | john.doe@cigna.com | 01/01/2022 00:00:00 |
102 | Jane | Smith | jane.smith@gmail.com | 01/05/2022 00:00:00 |
103 | Sam | Brown | sam.brown@cigna.com | 01/10/2022 00:00:00 |
104 | Emily | Johnson | emily.johnson@gmail.com | 01/15/2022 00:00:00 |
105 | Bob | Davis | bob.davis@cigna.com | 01/20/2022 00:00:00 |
Your task is to return a table in the following format:
customer_id | first_name | last_name | |
---|---|---|---|
101 | John | Doe | john.doe@cigna.com |
103 | Sam | Brown | sam.brown@cigna.com |
105 | Bob | Davis | bob.davis@cigna.com |
This query is filtering the customers table to only include rows where the email address ends with '@cigna.com'. The '%' functions as a wildcard character that matches any sequence of characters, ensuring that all email addresses ending with '@cigna.com' are included, regardless of what precedes this string in the email field.
The best way to prepare for a Cigna SQL interview is to practice, practice, practice. In addition to solving the earlier Cigna SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like FAANG tech companies and tech startups.
Each exercise has multiple hints, fully explained answers along with a discussion board to see how others solved it and crucially, there's an online SQL code editor so you can right in the browser run your query and have it executed.
To prep for the Cigna SQL interview it is also wise to solve SQL problems from other healthcare and pharmaceutical companies like:
However, if your SQL skills are weak, forget about going right into solving questions – refresh your SQL knowledge with this SQL tutorial for Data Scientists & Analysts.
This tutorial covers SQL topics like SQL joins with practice exercises and joining a table to itself – both of which pop up often during SQL job interviews at Cigna.
Besides SQL interview questions, the other question categories covered in the Cigna Data Science Interview include:
To prepare for Cigna Data Science interviews read the book Ace the Data Science Interview because it's got:
Don't ignore the behavioral interview – prepare for it using this list of common Data Scientist behavioral interview questions.