logo

10 Fastly SQL Interview Questions (Updated 2024)

Updated on

March 3, 2024

At Fastly, SQL is crucial for analyzing vast network traffic data for insights and managing databases for real-time edge computing services. For this reason Fastly almost always evaluates jobseekers on SQL problems in interviews for Data Science, Data Engineering and Data Analytics jobs.

So, to help you prepare for the Fastly SQL interview, we've collected 10 Fastly SQL interview questions – how many can you solve?

10 Fastly SQL Interview Questions

SQL Question 1: Analysis of VIP Users at Fastly

Fastly is a cloud computing services provider. They have a wide variety of customers using their services ranging from individuals to large enterprises. In order to better understand and service their most valuable customers, they need an analysis of their "VIP" users. These VIP users are those who are using the most bandwidth.

Define a VIP user as one who consumes more than 1TB (Terabyte) of bandwidth per month on average. Write a SQL query to identify these VIP users, and provide a list of these users sorted by the total bandwidth they consumed over the last 12 months.

Example Input:
usage_iduser_idmonth_yearbandwidth_used_TB
10112021-08500
10222021-081200
10332021-081000
10412021-091500
10522021-091100
10632021-09900
10712021-10600
10822021-102500
10932021-101000
Example Output:
user_idtotal_used_last_12_months
24800
13600
32900

Answer:

Here's a PostgreSQL query to solve the problem:


In this query, we first filter the rows that are within last 12 months. We then aggregate the rows by and calculate the total bandwidth used within the last 12 months by each user. We use the clause to retain only users who on average use more than 1TB bandwidth per month. Finally, we sort the resulting users based on the total bandwidth they used within the last 12 months.

To practice a similar customer analytics SQL question where you can solve it interactively and have your SQL solution instantly graded, try this Walmart Labs SQL Interview Question: Walmart Labs SQL Interview Question

SQL Question 2: Average Bandwidth Usage per Client

Fastly is a content delivery network (CDN) that helps businesses handle high-traffic websites. One common analytical question might be to understand the bandwidth usage of different clients to aid in capacity planning and pricing adjustment.

Suppose Fastly has a database table tracking client bandwidth usage. Each row records the total bytes a client transferred in a specific month.

Consider the following table:

Example Input
client_idmonth_yeartotal_bytes
1232022-0150000
1232022-0260000
2652022-0170000
2652022-0280000
1232022-0340000

As an interviewer, I would like to ask you to write a SQL query to calculate the average monthly bandwidth usage for each client, ranked by using the window function.

The result should look something like this:

Example Output
client_idaverage_monthly_bandwidth
12350000.00
26575000.00

Answer

The SQL query for this problem might look like this:


This query uses the aggregate function together with the clause to create a window function partitioned by . This means that for each row in the table, the function is computed over the group of rows having the same . Finally, the result is sorted by .

p.s. Window functions show up pretty frequently during SQL interviews, so practice the 27+ window function questions on DataLemur

SQL Interview Questions on DataLemur

SQL Question 3: What does the SQL function do?

The COALESCE() function can be used to replace NULL values with a specified value. For instance, if you are a Data Analyst at a company and are working on a customer analytics project to identify the most engaged customers, you may have access to the following data in the table:

customer_idemail_engagementsms_engagement
101moderately_activenot_opted_in
201un-subscribedNULL
301NULLnot_opted_in
401not_activevery_active
501very_activevery_active
601NULLNULL

Before you could procede with your analysis, you would need to remove the NULLs, and replace them with the default value for email engagement (not_active), and the default sms_engagement level (not_opted_in).

To do this, you'd run the following query:


This would get you the following output:

customer_idemail_engagementsms_engagement
101moderately_activenot_opted_in
102un-subscribednot_opted_in
103not_activenot_opted_in
104not_activevery_active
105very_activevery_active
106not_activenot_opted_in

Fastly SQL Interview Questions

SQL Question 4: Calculate the Average Web Traffic

Fastly is a cloud computing services provider and one of the main services they sell is content delivery network (CDN). Assume your client is a company and they want to assess their daily traffic through Fastly's CDN service for their main website.

You have been tasked with writing a SQL query to calculate the average number of unique daily users on the company website for each month. The data is stored in a table called . Each row represents one unique user's visit on a particular day.

Example Input:

traffic_idvisit_dateuser_id
200012022-06-013345
200022022-06-014321
200032022-06-015564
200042022-06-025564
200052022-06-023345
200062022-06-027667
200072022-07-013245
200082022-07-015564
200092022-07-023345
200102022-07-024344

Example Output:

monthaverage_daily_users
62.00
72.5

Answer:


In this query we first create a subquery to get number of unique users each day. The function is used to get distinct users in a day and this data is grouped by . Then in the outer query we calculate the average daily users by month using function and .

To practice a very similar question try this interactive Facebook Active User Retention Question which is similar for calculating unique active users or this Facebook App Click-through Rate (CTR) Question which is similar for assessing daily traffic.

SQL Question 5: What are the various types of joins used in SQL?

Joins in SQL allow you to combine data from different tables based on a shared key or set of keys.

Four JOIN types are available in SQL. For an example of each one, say you had sales data exported from Fastly's Salesforce CRM stored in a PostgreSQL database, and had access to two tables: and .

  • : retrieves rows from both tables where there is a match in the shared key or keys. For example, an INNER JOIN between the Sales table and the Customers table would retrieve only the rows where the customer_id in the Sales table matches the customer_id in the table.
  • : retrieves all rows from the left table (in this case, the sales table) and any matching rows from the right table (the table). If there is no match in the right table, NULL values will be returned for the right table's columns.
  • : retrieves all rows from the right table (in this case, the customers table) and any matching rows from the left table (the sales table). If there is no match in the left table, NULL values will be returned for the left table's columns.
  • : retrieves all rows from both tables, regardless of whether there is a match in the shared key or keys. If there is no match, NULL values will be returned for the columns of the non-matching table.

SQL Question 6: Calculating Click-Through-Rate

Fastly is a company that provides CDN, streaming and other security services. They want to track the clickthrough rate of its emails regarding a new product update. The clickthrough rate is defined as the number of unique users who clicked the link in the email (clicked = 'yes') and visited the product update page divided by the total number of unique users who received the email (delivered = 'yes'). Can you write a query to calculate this rate?

The 'email_logs' table contains each instance of a user receiving an email. A unique user is identified by their 'user_id' and all fields are assumed to be not null.

Example Input:
user_idemail_iddeliveredclicked
478210678'yes''no'
753310679'yes''yes'
541110680'yes''no'
263810681'yes''yes'
937510682'yes''yes'
189210683'yes''no'

Answer:


This query first counts the distinct number of users who clicked the email link and visited the product update page. It then counts the total number of users who received the email. The two counts are divided to calculate the click-through rate and the result is converted to percentage format by multiplying by 100. The result is finally rounded to two decimal places for presentation.

To solve a related SQL interview question on DataLemur's free online SQL code editor, attempt this SQL interview question asked by Facebook: Facebook App CTR SQL Interview question

SQL Question 7: How would you improve the performance of a slow SQL query?

There's several steps you can take to troubleshoot a slow SQL query.

First things first, figure out why the query is slow! You can use and commands in PostgreSQL to identify any performance bottlenecks. After that, you can start changing your query, depending on what the source of the query performance issue is.

Generally, indexes can help speed up queries. Also de-normalizing your tables might help, to remove slow joins.

Lastly, you could always just upgrade your hardware! Time and money wasted improving query performance could just be better spent on mroe powerful database servers!

While this is a very surface-level approach to optimizing SQL query performance, it should do the trick for Data Analyst interviews and Data Science interviews at Fastly. Data Engineers should know a bit more about the and before the interview.

SQL Question 8: Max Bandwidth Usage per Customer

As an engineer at Fastly, a company that provides cloud computing services to businesses, you are asked to find the maximum bandwidth usage per customer in a specific month for understanding the load on their services.

Use the below schema and data for the problem.

Example Input:
usage_idcustomer_idtime_stampbandwidth_in_GB
011232022-06-08 00:00:0023.5
022652022-06-10 00:00:0030.3
031232022-06-18 00:00:0028.1
042652022-07-26 00:00:0022.2
051232022-07-05 00:00:0020.5
Example Output:
monthcustomermax_bandwidth
612328.1
626530.3
712320.5
726522.2

Answer:

Here is a PostgreSQL query that will provide you the solution.


In the query, we extracted the month from and grouped the data by and the month. We then found the max bandwidth for each group sorted it by and month.

SQL Question 9: Customer Searches on Fastly Network

As a part of Fastly analytics team, you have been tasked to find out all the customers whose 'email' contains the string 'fastly' from the 'Customers' table. This will help in identifying customers that may have created their email ids with the brand name in it and can be potential brand advocates.

Example Input:
customer_idfirst_namelast_nameemail
123JohnSmithjohnsmith@gmail.com
265SarahDoesarahdoe@fastlymail.com
362BrianTaylorbriantaylor@hotmail.com
192KellyLeekelly.fastly.lee@yahoo.com
981AndrewBrownandrewbrown@gmail.ca
Example Output:
customer_idfirst_namelast_nameemail
265SarahDoesarahdoe@fastlymail.com
192KellyLeekelly.fastly.lee@yahoo.com

Answer:

Using PostgreSQL, you can use the keyword to filter records. The query will look like this:


This PostgreSQL query would return all records from the Customers table where the email field contains the string 'fastly'. The '%' on either side of 'fastly' in the SQL query is a wildcard that matches zero or more characters. This means it will match email addresses where 'fastly' appears anywhere in the string.

SQL Question 10: What are the various forms of normalization?

Normal forms are guidelines that are used to help design a relational database in a way that minimizes redundancy and ensures the integrity of the data. The 3 most commonly use normal forms are the 1st, 2nd, and 3rd normal forms. Here's a brief explanation of each:

  • 1st Normal Form (1NF) is all about keeping it simple - each column should only have one value and there should be no repeating groups of data.

  • 2nd Normal Form (2NF) is about organization - your database should already be in 1NF and all the non-key columns should depend on the primary key. This means that each non-key column should be completely dependent on the entire primary key, not just part of it.

  • 3rd Normal Form (3NF) is about independence - if your database is already in 2NF, then all the non-key columns should not depend on each other. They should be self-sufficient and not rely on other non-key columns.

Fastly SQL Interview Tips

The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. In addition to solving the above Fastly SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions which come from companies like FAANG and tech startups. DataLemur Question Bank

Each interview question has hints to guide you, detailed solutions and most importantly, there's an online SQL code editor so you can right online code up your SQL query and have it graded.

To prep for the Fastly SQL interview it is also a great idea to practice interview questions from other tech companies like:

But if your SQL coding skills are weak, don't worry about going right into solving questions – improve your SQL foundations with this SQL tutorial for Data Scientists & Analysts.

DataLemur SQL tutorial

This tutorial covers SQL topics like UNION vs. joins and filtering data with WHERE – both of these pop up frequently in Fastly SQL assessments.

Fastly Data Science Interview Tips

What Do Fastly Data Science Interviews Cover?

Beyond writing SQL queries, the other question categories tested in the Fastly Data Science Interview are:

  • Probability & Statistics Questions
  • Python Pandas or R Coding Questions
  • Product Analytics Questions
  • ML Modelling Questions
  • Behavioral Based Interview Questions

Fastly Data Scientist

How To Prepare for Fastly Data Science Interviews?

The best way to prepare for Fastly Data Science interviews is by reading Ace the Data Science Interview. The book's got:

  • 201 Interview Questions from FAANG tech companies
  • A Refresher on Stats, SQL & ML
  • Amazing Reviews (900+ 5-star reviews on Amazon)

Acing Data Science Interview