At Schibsted, SQL is often used for analyzing user behavior on their digital publications and optimizing ad targeting based on precise customer segmentation. That's why Schibsted LOVES to ask SQL query questions during interviews for Data Science, Data Engineering and Data Analytics jobs.
So, to help you study for the Schibsted SQL interview, we've curated 11 Schibsted SQL interview questions – how many can you solve?
Schibsted, a global family of digital consumer brands, needs to monitor user interaction with their online products to optimize user experience and improve their products. As a data analyst, you are tasked to analyze user login behavior from their web logs over a period.
Specifically, you are asked to write a SQL query to analyze the following scenario for each user within a 7-day period:
Your task will help Schibsted understand user behavior in terms of sessions.
log_id | user_id | session_date |
---|---|---|
1001 | 123 | 2022-08-01 00:00:00 |
1002 | 123 | 2022-08-02 00:00:00 |
1003 | 123 | 2022-08-04 00:00:00 |
1004 | 123 | 2022-08-07 00:00:00 |
1005 | 333 | 2022-08-01 00:00:00 |
1006 | 333 | 2022-08-03 00:00:00 |
1007 | 333 | 2022-08-05 00:00:00 |
1008 | 333 | 2022-08-07 00:00:00 |
log_id | user_id | session_date | total_logins | login_diff | mean_logins |
---|---|---|---|---|---|
1001 | 123 | 2022-08-01 00:00:00 | 1 | null | 1.00 |
1002 | 123 | 2022-08-02 00:00:00 | 2 | 1 | 1.50 |
1003 | 123 | 2022-08-04 00:00:00 | 3 | 2 | 1.67 |
1004 | 123 | 2022-08-07 00:00:00 | 4 | 3 | 2.00 |
1005 | 333 | 2022-08-01 00:00:00 | 1 | null | 1.00 |
1006 | 333 | 2022-08-03 00:00:00 | 2 | 2 | 1.50 |
1007 | 333 | 2022-08-05 00:00:00 | 3 | 2 | 1.67 |
1008 | 333 | 2022-08-07 00:00:00 | 4 | 2 | 2.00 |
Here is an example PostgreSQL query for this:
The query first partitions the data by the and in each partition, the rows are ordered by . The then counts the number of logins over a 7-day period and the function compares each login day with the previous login day within the 7-day window. The mean number of logins in the 7-day window is then computed using the function.
To solve a related window function SQL problem on DataLemur's free online SQL code editor, solve this Amazon SQL question asked in a BI Engineer interview:
Schibsted owns a conglomerate of digital consumer brands across media, online marketplaces, and technology ventures. Given the hypothetical scenario that you are provided with two tables, and , design a database to track and analyze user activity on Schibsted platforms.
The table contains two columns: (PK) and . The table has (FK), , and .
For this question, assume Schibsted wants to understand the average time spent by users from each country on each platform per day. Design a PostgreSQL query to find this information.
Sample Data:
user_id | country |
---|---|
101 | Norway |
102 | Sweden |
103 | Norway |
104 | France |
user_id | platform_id | activity_date | time_spent_in_hours |
---|---|---|---|
101 | 201 | 2022-07-01 | 2 |
102 | 202 | 2022-07-01 | 1 |
101 | 201 | 2022-07-02 | 1.5 |
103 | 202 | 2022-07-02 | 2 |
104 | 201 | 2022-07-03 | 1 |
Expected Output:
country | platform_id | activity_date | avg_time_spent_in_hours |
---|---|---|---|
Norway | 201 | 2022-07-01 | 2.00 |
Sweden | 202 | 2022-07-01 | 1.00 |
Norway | 201 | 2022-07-02 | 1.50 |
Norway | 202 | 2022-07-02 | 2.00 |
France | 201 | 2022-07-03 | 1.00 |
This SQL query joins the and table on and then uses an aggregation function on to give the average time spent by users from each country on each platform per day. The results are grouped by , and and ordered in the same manner for readability.
A primary key is a special type of column or set of columns in a database table that is used to identify each row in the table. It helps to ensure that all the rows are distinct and prevents null values. Additionally, primary keys play a crucial role in maintaining the accuracy of the data by making sure that there are no duplicate rows. They are also used to establish relationships between tables.
To create a primary key in a SQL database, you can use the constraint. For example, in the table , the column is the primary key.
Given a Schibdsted customer database, write a SQL query to filter customers who have spent more than 1000 units in advertisements and have been a customer for over a year. Assume that the database comprises two tables: one for customer information and another for their transaction history.
The customer information table () is structured as follows:
customer_id | signup_date | category |
---|---|---|
675 | 2019-07-21 | Private |
982 | 2021-01-15 | Commercial |
435 | 2020-03-19 | Private |
291 | 2021-06-07 | Commercial |
113 | 2020-01-10 | Private |
The transaction history table () is structured as below:
transaction_id | customer_id | transaction_date | amount_spent |
---|---|---|---|
9643 | 675 | 2022-07-20 | 500 |
7532 | 982 | 2022-02-21 | 1200 |
6453 | 435 | 2022-08-31 | 1500 |
3952 | 291 | 2022-03-10 | 950 |
7253 | 113 | 2022-09-01 | 1050 |
This SQL query joins the and tables on . It then filters out customers who haven't been with the company for at least a year, and whose total spending in that first year exceeded 1000 units. The output will list the , , and for customers who meet these criteria.
While both types of databases are used to store data (no duh!), relational databases and non-relational (also known as NoSQL databases) differ in a few important ways:
Data model: Relational databases use a data model consisting of tables and rows, while NoSQL databases use a variety of data models, including document, key-value, columnar, and graph storage formats.
Data integrity: Relational databases use structured query language (SQL) and enforce strict data integrity rules through the use of foreign keys and transactions. NoSQL databases may not use SQL and may have more relaxed data integrity rules.
Structure: Relational databases store data in a fixed, structured format, while NoSQL databases allow for more flexibility in terms of data structure.
ACID compliance: Relational databases are typically into shrooms and are ACID-compliant (atomic, consistent, isolated, and durable), while NoSQL databases may not be fully ACID-compliant (but they try their best... and it's effort that counts...or at least that's what my gym teacher told me!)
Schibsted is one of the largest media groups in Europe. Among other things, they own multiple online classified ads platforms where individual publishers can post advertisements.
Given a table of ad postings, can you write a SQL query to find the average number of ads posted by each publisher on a monthly basis?
posting_id | publisher_id | post_date | ad_id |
---|---|---|---|
101 | 1337 | 06/12/2022 | 4001 |
102 | 526 | 06/15/2022 | 4002 |
103 | 1337 | 06/18/2022 | 4003 |
104 | 789 | 07/22/2022 | 4004 |
105 | 1337 | 07/23/2022 | 4005 |
mth | publisher_id | avg_ads |
---|---|---|
6 | 1337 | 2.00 |
6 | 526 | 1.00 |
7 | 789 | 1.00 |
7 | 1337 | 1.00 |
This SQL query first counts the total ads posted by each publisher for every month using a window function. Then it groups the subquery's result by month and publisher_id and calculates the average number of advertisements posted by a publisher in a month. The EXTRACT function is used to get the month number from the post_date column.
SQL constraints are used to specify rules for the data in a table. They can be applied to single or multiple fields in a table when the table is created, or after the table has been created using the ALTER TABLE command.
For example, say you had a database that stores ad campaign data from Schibsted's Google Analytics account.
Here's what some constraints could look like:
Schibsted is a company with many digital subscription types, including newspapers and magazines. In the subscriptions table, there's a record for each subscription transaction with the user's id, the date of the transaction, the type of the subscription, and the price. The question is: Can you write a SQL query that calculates the monthly average revenue by each type of subscription?
Here is the SQL query that would solve this task in PostgreSQL:
In the above query, we first extract the month from the transaction date. Then we group by the month and the subscription type and find the average price (revenue) for each group. We finally order the results by month. This query calculates the average revenue per subscription type for each month. The AVG function calculates the average value of a numeric column.
As a company, Schibsted holds a significant amount of user data. For this exercise, assume that the company wants to target a marketing campaign. They are particularly interested in users who have signed up with a Gmail account. Write a SQL query that would filter all users recorded in the database that have an email registered with Gmail.
user_id | username | created_date | |
---|---|---|---|
201 | john_doe | 2022-05-16 | john_doe@mail.com |
356 | jenny_smith | 2022-06-10 | jenny_smith@gmail.com |
987 | jack_jones | 2022-07-05 | jack_jones@yahoo.com |
104 | sara_white | 2022-07-15 | sara_white@gmail.com |
658 | peter_brown | 2022-07-21 | peter_brown@hotmail.com |
This query uses the keyword in SQL to filter for a specific pattern in the email addresses. The '%' is a wildcard that stands for zero, one, or more characters. This combined with '@gmail.com' ensures we only get email addresses that end with '@gmail.com', returning the user data of those who have registered with a Gmail account.
If you want to return records with no duplicates, you can use the keyword in your statement.
For example, if you had a table of Schibsted employees, and wanted to see what are all the unique job titles that currently worked at the company, you could write the following query:
If had the following data:
f_name | job_title |
---|---|
Akash | Data Analyst |
Brittany | Data Scientist |
Carlos | Data Engineer |
Diego | Data Engineer |
Eva | Data Analyst |
Then the output from the query would be:
job_title |
---|
Data Analyst |
Data Scientist |
Data Engineer |
Find the number of unique customers who bought each product, and the average price of each product. Assume we have two tables, and .
customer_id | first_name | last_name |
---|---|---|
1 | Hans | Hansen |
2 | Karin | Karlsson |
3 | Jens | Jensen |
4 | Anna | Andersson |
5 | Margareta | Magnusson |
product_id | purchase_id | customer_id | price |
---|---|---|---|
101 | 1 | 1 | 50.00 |
102 | 2 | 2 | 75.00 |
103 | 3 | 3 | 100.00 |
104 | 4 | 4 | 125.00 |
101 | 5 | 2 | 50.00 |
102 | 6 | 1 | 75.00 |
103 | 7 | 5 | 100.00 |
104 | 8 | 3 | 125.00 |
101 | 9 | 4 | 50.00 |
102 | 10 | 5 | 75.00 |
Please note, the is for the individual product and a customer can buy the same product more than once.
The above SQL command joins the and tables on the column, which is common to both. Then it groups by . For each , it uses the function to find the number of unique customers, and the function to compute the average price. The result is a table with one row for each product, showing the number of unique customers who purchased that product, and the average price of that product.
Because join questions come up so often during SQL interviews, take a stab at this interactive Snapchat JOIN SQL interview question:
The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. Beyond just solving the earlier Schibsted SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions which come from companies like FAANG tech companies and tech startups.
Each problem on DataLemur has hints to guide you, step-by-step solutions and best of all, there's an interactive SQL code editor so you can right in the browser run your SQL query and have it graded.
To prep for the Schibsted SQL interview you can also be wise to solve SQL problems from other tech companies like:
In case your SQL query skills are weak, forget about diving straight into solving questions – go learn SQL with this SQL tutorial for Data Scientists & Analysts.
This tutorial covers SQL topics like math functions in SQL and joining a table to itself – both of which pop up frequently during Schibsted SQL interviews.
In addition to SQL query questions, the other types of questions to practice for the Schibsted Data Science Interview are:
The best way to prepare for Schibsted Data Science interviews is by reading Ace the Data Science Interview. The book's got: