At United Internet, SQL is used day-to-day for analyzing user behavior patterns to improve service delivery and managing databases for optimized data storage and retrieval. Unsurprisingly this is why United Internet almost always evaluates jobseekers on SQL query questions in interviews for Data Science, Data Engineering and Data Analytics jobs.
To help you practice for the United Internet SQL interview, this blog covers 9 United Internet SQL interview questions – able to answer them all?
United Internet is a major internet services company that provides a variety of services including email, website hosting, and high-speed connectivity. For the purpose of this exercise, let's look at their email services. The company wants to identify VIP users, which, in this scenario, are defined as users who send more than 1000 emails per month.
You have been given two tables – and . The table keeps track of all registered users along with details such as user_id and registration date. The table keeps a record of all emails sent by these users. Each email sent is logged as a separate entry with its own email_id, sender (user_id), and sent_date.
user_id | registration_date |
---|---|
1 | 2022-01-03 |
2 | 2022-02-15 |
3 | 2022-03-21 |
4 | 2022-04-30 |
5 | 2022-06-05 |
email_id | user_id | sent_date |
---|---|---|
101 | 1 | 2022-07-01 |
102 | 2 | 2022-07-02 |
103 | 3 | 2022-07-03 |
104 | 3 | 2022-07-03 |
105 | 1 | 2022-07-04 |
... | ... | ... |
2000 | 1 | 2022-07-30 |
2001 | 2 | 2022-07-31 |
Your task is to write a SQL query that will identify which users sent more than 1000 emails in the month of July 2022.
Explanation: The SQL query begins by selecting the user_id and number of emails sent () from between 1st of July, 2022 and 31st of July, 2022. This result is then grouped by user_id, effectively creating a subquery that summarizes the number of emails sent per user for July 2022. Finally, the query selects users from this result that have sent more than 1000 emails.
To solve a related customer analytics question on DataLemur's free online SQL code editor, try this recently asked Microsoft SQL interview question:
United Internet is a large internet provider, dealing with millions of customers across a variety of internet products. They would like to gain a better understanding of the usage patterns of their customers.
As a Data analyst, your task is to write a SQL query that will identify the total number of unique customers using each internet product for each month in the last one year.
More specifically, you have to find the total number of unique customers per product for each month. Given the following table , can you write a PostgreSQL query to solve this problem?
record_id | timestamp | customer_id | product_id |
---|---|---|---|
1 | 01/03/2022 12:30:00 | 128 | 501 |
2 | 01/04/2022 23:10:00 | 128 | 501 |
3 | 01/03/2022 15:00:00 | 200 | 702 |
4 | 02/03/2022 10:20:00 | 300 | 501 |
5 | 02/04/2022 20:00:00 | 200 | 702 |
Your output should display the year and month, the product_id, and the number of unique customers for each product each month in the given year.
The function is used to round the to the nearest month, forming the grouping for counting the number of unique customers per product. The is a window function used to get the number of unique customers per product per month.
The clause is used to confine our analysis to the most recent one year of data. We order by and to get results in a chronological and product-wise order.
p.s. Window functions show up super often during SQL interviews, so practice the 27+ window function questions on DataLemur
When using , only rows that are identical in both sets will be returned.
For a concrete example, say you were a Data Analyst supporting the Sales Analytics team at United Internet, and data on potential sales leads lived in both Salesforce and Hubspot CRMs. To write a query to analyze leads created before 2023 started, that show up in BOTH CRMs, you would use the command:
Suppose you're at United Internet, and you wish to analyze customer usage patterns. Specifically, you want to find out the average amount of data usage (in GB) per customer on a monthly basis. Create a query to calculate this.
usage_id | user_id | date | data_in_gb |
---|---|---|---|
101 | 1001 | 06/02/2022 | 10 |
102 | 1001 | 06/15/2022 | 15 |
103 | 1001 | 06/30/2022 | 20 |
104 | 1002 | 06/01/2022 | 8 |
105 | 1002 | 06/14/2022 | 12 |
106 | 1002 | 06/28/2022 | 14 |
107 | 1002 | 07/02/2022 | 16 |
108 | 1003 | 07/01/2022 | 5 |
109 | 1003 | 07/18/2022 | 10 |
mth | user_id | avg_data_usage_gb |
---|---|---|
6 | 1001 | 15.00 |
6 | 1002 | 11.33 |
7 | 1002 | 16.00 |
7 | 1003 | 7.5 |
This query groups the data usage records by month and user_id. It then calculates the average data usage for each user and each month. The function is used to extract the month from the date, and the function calculates the average data usage.
An inner join returns only the rows that match the join condition between the two tables, whereas a full outer join returns all rows from both tables, including any unmatched rows.
Example:
Example:
United Internet, as an internet services company, has a database containing all their customer records. One field in their database is "Email". United Internet is interested in filtering their customer records to find only those customers who have '.com' in their email addresses.
Here is an example of their 'customers' database schema.
customer_id | first_name | last_name | signup_date | |
---|---|---|---|---|
101 | John | Doe | 06/08/2021 | john.doe@gmail.com |
102 | Jane | Smith | 06/10/2021 | jane.smith@hotmail.com |
103 | Mike | Johnson | 06/18/2021 | mike.johnson@yahoo.co.uk |
104 | Emily | Wilson | 07/26/2021 | emily.wilson@company.com |
105 | Daniel | Brown | 07/05/2021 | daniel.brown@unitedint.net |
Your task is to write an SQL query to provide a list of customer_ids who have '.com' in their email addresses.
This query uses PostgreSQL's LIKE keyword to filter down the customer records. The '%' symbol is a wildcard that matches any sequence of characters. In our query, it's used before and after ".com", which helps match any email addresses that contains ".com". This query will return the customer_id of all customers who have '.com' in their email addresses.
A cross-join, also known as a cartesian join, is a type of join that returns the cross-product of two tables. In a cross-join, each row from the first table is paired with every row from the second table, resulting in a new table that has a row for each possible combination of rows from the two input tables.
For example, say you worked on the Marketing Analytics team at United Internet, and needed to understand what advertising copy (text) goes best with what advertising creative (the photo that gets posted with the advertisement copy).
Here is an example of a cross-join between a table of ad_copy and a table of ad_creative:
You could this query to generate all possible combinations of ad copy and ad creative to help you create more effective ads for United Internet. Just be careful: if you had 1,000 rows of ad copy and 1,000 different image creatives, the resulting cross-join would have 1 million rows!
In United Internet there are two important data tables, the Customer table that contains all the customer details and the Purchases table that records all product purchases carried out through their platform. Your task is to write a SQL query to identify the most purchased product by customers of a specific age range. For this specific task, focus on the age range 25 - 30.
For simplification, consider that today's date is '2022-08-01'.
Given the 'customers' table schema is like:
and the 'purchases' table schema is:
This query calculates the age of the customers from the given birthdate by comparing it to today's date (considered as '2022-08-01'). It then selects those customers that fall within the given age range of 25 to 30 years. The query joins the ''customers' table with the 'purchases' table using the customer id and groups by product id to get the count of purchases of each product. Finally, it orders the count in descending order and selects the product with the maximum purchase count.
Because joins come up routinely during SQL interviews, practice this interactive Snapchat JOIN SQL interview question:
United Internet provides a variety of internet-based products and services. Your task is to determine the average daily website visits per user over a one month period.
Given a table that has columns , , .
visit_id | user_id | visit_date |
---|---|---|
1 | 123 | 2022-07-01 10:30:00 |
2 | 321 | 2022-07-01 11:00:00 |
3 | 123 | 2022-07-01 15:45:00 |
4 | 456 | 2022-07-01 16:10:00 |
5 | 123 | 2022-07-02 10:20:00 |
6 | 321 | 2022-07-02 11:00:00 |
7 | 321 | 2022-07-02 15:00:00 |
8 | 456 | 2022-07-02 16:00:00 |
9 | 123 | 2022-07-02 17:15:00 |
10 | 123 | 2022-07-03 10:00:00 |
The objective is to write a SQL query which calculates the average number of website visits per user per day for the month of July, 2022.
This query selects the and the day of the Further, it calculates the average daily visits per user by dividing the total number of visits () by the number of distinct days a user has visited. The date range is limited to the month of July, 2022 by the clause. The results are grouped by , giving us the desired average daily visits per user for the specified month.
The key to acing a United Internet SQL interview is to practice, practice, and then practice some more! Besides solving the above United Internet SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like Microsoft, Google, and Facebook.
Each DataLemur SQL question has hints to guide you, full answers and crucially, there is an online SQL coding environment so you can easily right in the browser your SQL query answer and have it checked.
To prep for the United Internet SQL interview it is also wise to solve SQL questions from other tech companies like:
But if your SQL skills are weak, forget about going right into solving questions – refresh your SQL knowledge with this free SQL tutorial.
This tutorial covers things like HAVING and sorting data with ORDER BY – both of which pop up frequently during SQL job interviews at United Internet.
In addition to SQL query questions, the other types of questions to practice for the United Internet Data Science Interview are:
The best way to prepare for United Internet Data Science interviews is by reading Ace the Data Science Interview. The book's got: