At CrowdStrike, SQL is used all the damn time for analyzing cybersecurity threat patterns and optimizing database structures for more efficient threat detection. Because of this, CrowdStrike asks SQL questions during interviews for Data Science and Data Engineering positions.
In case you're studying for a SQL Assessment, we've curated 8 CrowdStrike SQL interview questions to practice, which are similar to commonly asked questions at CrowdStrike – can you solve them?
CrowdStrike is a cybersecurity technology company. You are given a table named which contains data about threats detected by CrowdStrike systems in various customer networks.
Each row in the table represents a unique threat detected at a given time for a particular customer.
The columns in the table are as follows:
: A unique identifier for the threat.
: The time the threat was detected.
: The category of the threat detected.
: The identifier for the customer in whose network the threat was detected.
: The identifier for the location where the threat was detected.
The task is to write a query that will show the total number of threats detected for each customer for every day, ranking each day by the number of threats detected.
Here is some sample data for the problem:
threat_id | detect_time | threat_type | customer_id | location_id |
---|---|---|---|---|
1 | 2022-06-01 00:00:00 | Malware | C123 | L789 |
2 | 2022-06-01 00:10:00 | Ransomware | C123 | L789 |
3 | 2022-06-02 00:12:00 | Phishing | C456 | L123 |
4 | 2022-06-02 00:15:00 | Malware | C456 | L123 |
5 | 2022-06-02 00:00:00 | Trojan | C123 | L789 |
6 | 2022-06-03 00:00:00 | Malware | C456 | L123 |
date_detected | customer_id | total_threats | rank |
---|---|---|---|
2022-06-01 | C123 | 2 | 1 |
2022-06-02 | C123 | 1 | 2 |
2022-06-02 | C456 | 2 | 1 |
2022-06-03 | C456 | 1 | 2 |
The query uses the function to extract the date from the field. It then counts the number of threats detected on each day for each customer.
The window function is used to rank the dates for each customer based on the number of threats detected in descending order. The result is a list of dates ranked by the number of threats detected on each day for each customer.
To solve a similar SQL problem on DataLemur's free online SQL coding environment, solve this Meta SQL interview question:
CrowdStrike has a database of customers who subscribe to its cybersecurity platform. Create a SQL statement that filters down this customer database to only show customers who have:
Assume the 'customers' table schema as:
customer_id | subscription_status | region | spend | user_count | sector |
---|---|---|---|---|---|
101 | Active | North America | 15000 | 3 | Private |
202 | Inactive | Europe | 5000 | 8 | Private |
303 | Active | Asia | 12000 | 6 | Government |
404 | Active | North America | 9500 | 4 | Public |
505 | Active | Europe | 11000 | 7 | Private |
Write a PostgreSQL query to solve this.
This query filters the customers table for customers with active subscription status, located in North America or Europe, with more than $10,000 spend or more than 5 users, and not in the Government sector. Here, SQL's clause filters rows based on conditions and , and are used to combine these conditions.
To practice a related problem on DataLemur's free online SQL code editor, attempt this SQL interview question asked by Facebook:
Unique indexes help ensure that there are no duplicate key values in a table, maintaining data integrity. They enforce uniqueness whenever keys are added or changed within the index.
Non-unique indexes on the other hand, are used to improve query performance by maintaining a sorted order of frequently used data values, but they do not enforce constraints on the associated table.
Unique Indexes are blazing fast. Non unique indexes can improve query performance, but they are often slower because of their non-unique nature.
CrowdStrike, a cybersecurity technology firm, collects data on numerous threats it identifies each day. As part of its threat intelligence database, each row signifies a separate threat, where an associated numerical designates the threat's severity.
For each day, your task is to calculate the average of all threats identified. This will assist in gaining an understanding of the average seriousness of threats identified each day.
threat_id | date_identified | threat_level |
---|---|---|
1 | 09/01/2022 | 7 |
2 | 09/01/2022 | 6 |
3 | 09/02/2022 | 8 |
4 | 09/03/2022 | 9 |
5 | 09/03/2022 | 5 |
6 | 09/03/2022 | 8 |
7 | 09/04/2022 | 7 |
date_identified | avg_threat_level |
---|---|
09/01/2022 | 6.50 |
09/02/2022 | 8.00 |
09/03/2022 | 7.33 |
09/04/2022 | 7.00 |
This solution functions by utilizing the AVG and GROUP BY functions. GROUP BY is used to group the threat levels by each individual day (). The AVG function then calculates the average threat level for each distinct day within the dataset.
To practice a similar problem on DataLemur's free interactive SQL code editor, try this Meta SQL interview question:
NULLs are NOT the same as zero or blank spaces in SQL. NULLs are used to represent a missing value or the abscence of a value, whereas zero and blank space are legitimate values.
It's important to handle NULLs carefully, because they can mess up your analysis very easily. For example, if you compare a NULL value using the = operator, the result will always be NULL (because just like Drake, nothing be dared compared to NULL). That's why many data analysis in SQL start with removing NULLs using the function.
CrowdStrike maintains a database of information about their customers. The following table shows an example data from the table:
customer_id | first_name | last_name | email_domain |
---|---|---|---|
8945 | Maria | Miley | yahoo.com |
5455 | John | Stewart | gmail.com |
1166 | Robert | Jones | hotmail.com |
3685 | Patricia | Brown | crowdstrike.com |
7981 | James | Lopez | crowdstrike.com |
As part of a promotion targeted at employees of certain companies, CrowdStrike wishes to filter out customer records who have their registered as .
Can you write a PostgreSQL query to fetch information of customers whose email is registered with CrowdStrike?
The PostgreSQL query to solve this will use the keyword to filter the results:
This query checks the for each record in the table, and only includes those that match the string . The result will be a table containing only customers whose email domain is .
customer_id | first_name | last_name | email_domain |
---|---|---|---|
3685 | Patricia | Brown | crowdstrike.com |
7981 | James | Lopez | crowdstrike.com |
These are the customers whose emails are registered with the domain name 'crowdstrike.com'.
To solve a similar SQL problem on DataLemur's free interactive coding environment, solve this SQL interview question asked by Facebook:
Joins in SQL allow you to combine data from different tables based on a shared key or set of keys.
Four JOIN types are available in SQL. For an example of each one, say you had sales data exported from CrowdStrike's Salesforce CRM stored in a PostgreSQL database, and had access to two tables: and .
You are a Data Analyst in CrowdStrike and your task is to analyze the customer database. You want to find out what software products are being used by each customer and also how much they use it; you also want to link this data with detailed information about the customer, like their location and their contact information.
For this purpose, you have two tables. The first table, , contains data related to the customers - specifically, , , , and . The second table, , stores data related to which customer uses which software product, including columns for (linked to the in the table), , and .
customer_id | name | country | city | |
---|---|---|---|---|
111 | John Doe | johndoe@example.com | USA | Los Angeles |
112 | Jane Smith | janesmith@example.com | Canada | Vancouver |
113 | Alois Alzheimer | alois@example.com | Germany | Berlin |
114 | Kain Tapper | kaintapper@example.com | Finland | Helsinki |
115 | Gregor Mendel | gregor@example.com | Czech Republic | Brno |
usage_id | customer_id | software_product | hours_used |
---|---|---|---|
1 | 111 | Falcon Pro | 200 |
2 | 112 | Falcon Pro | 150 |
3 | 112 | Falcon Enterprise | 300 |
4 | 113 | Falcon Enterprise | 400 |
5 | 114 | Falcon Complete | 350 |
6 | 115 | Falcon Pro | 100 |
7 | 115 | Falcon Complete | 250 |
Write the SQL query in PostgreSQL:
This SQL JOIN will link the table with the table based on the that is common in both tables. The output will provide detailed information about each customer and the usage of their software product, which gives valuable insights for marketing and customer service purposes.
To solve a related problem on DataLemur's free online SQL coding environment, attempt this Meta SQL interview question:
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the CrowdStrike SQL interview is to solve as many practice SQL interview questions as you can! Beyond just solving the earlier CrowdStrike SQL interview questions, you should also solve the 200+ SQL exercises on DataLemur which come from companies like Netflix, Google, and Amazon.
Each interview question has multiple hints, fully explained answers along with a discussion board to see how others solved it and most importantly, there is an online SQL coding environment so you can right in the browser run your query and have it graded.
To prep for the CrowdStrike SQL interview it is also wise to solve SQL problems from other tech companies like:
In case your SQL foundations are weak, forget about going right into solving questions – improve your SQL foundations with this SQL tutorial for Data Analytics.
This tutorial covers topics including handling NULLs in SQL and filtering data with WHERE – both of these pop up often during CrowdStrike interviews.
For the CrowdStrike Data Science Interview, beyond writing SQL queries, the other types of questions to prepare for are:
Unsure if Data Science vs. Cybersecurity is right for you? Read all about it and see which career path is best for you.
The best way to prepare for CrowdStrike Data Science interviews is by reading Ace the Data Science Interview. The book's got: