
Duplicate Job Listings: LinkedIn SQL Interview Question

This is the same question as problem #8 in the SQL Chapter of Ace the Data Science Interview!

Assume you're given a table containing job postings from various companies on the LinkedIn platform. Write a query to retrieve the count of companies that have posted duplicate job listings.

Definition:

  • Duplicate job listings are defined as two job listings within the same company that share identical titles and descriptions.

Table:

Column Name | Type
job_id      | integer
company_id  | integer
title       | string
description | string

Example Input:

job_id | company_id | title            | description
248    | 827        | Business Analyst | Business analyst evaluates past and current business data with the primary goal of improving decision-making processes within organizations.
149    | 845        | Business Analyst | Business analyst evaluates past and current business data with the primary goal of improving decision-making processes within organizations.
945    | 345        | Data Analyst     | Data analyst reviews data to identify key insights into a business's customers and ways the data can be used to solve problems.
164    | 345        | Data Analyst     | Data analyst reviews data to identify key insights into a business's customers and ways the data can be used to solve problems.
172    | 244        | Data Engineer    | Data engineer works in a variety of settings to build systems that collect, manage, and convert raw data into usable information for data scientists and business analysts to interpret.

Example Output:

duplicate_companies
1

Explanation:

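Only one company, ID 345, posted duplicate job listings: job IDs 945 and 164 have identical titles and descriptions. Job IDs 248 and 149 also share a title and description, but they belong to different companies (827 and 845), so they do not count as duplicates.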
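
One possible way to check this reasoning is to load the example rows into an in-memory SQLite database and group by company, title, and description. This is a minimal sketch, not the official solution: the table name `job_listings` is an assumption (the problem statement above does not name the table), and the actual platform may use a different SQL dialect than SQLite.

```python
import sqlite3

# Descriptions copied from the example input above.
BA = ("Business analyst evaluates past and current business data with the "
      "primary goal of improving decision-making processes within organizations.")
DA = ("Data analyst reviews data to identify key insights into a business's "
      "customers and ways the data can be used to solve problems.")
DE = ("Data engineer works in a variety of settings to build systems that "
      "collect, manage, and convert raw data into usable information for data "
      "scientists and business analysts to interpret.")

conn = sqlite3.connect(":memory:")
# Assumption: `job_listings` is a hypothetical table name chosen for this sketch.
conn.execute(
    "CREATE TABLE job_listings ("
    "job_id INTEGER, company_id INTEGER, title TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO job_listings VALUES (?, ?, ?, ?)",
    [
        (248, 827, "Business Analyst", BA),
        (149, 845, "Business Analyst", BA),
        (945, 345, "Data Analyst", DA),
        (164, 345, "Data Analyst", DA),
        (172, 244, "Data Engineer", DE),
    ],
)

# Inner query: one row per (company, title, description) group that has
# more than one listing. Outer query: count each such company only once,
# even if it has several distinct duplicated listings.
QUERY = """
SELECT COUNT(DISTINCT company_id) AS duplicate_companies
FROM (
    SELECT company_id
    FROM job_listings
    GROUP BY company_id, title, description
    HAVING COUNT(*) > 1
) AS duplicate_groups
"""

duplicate_companies = conn.execute(QUERY).fetchone()[0]
print(duplicate_companies)  # 1 for the example input
```

Grouping on `company_id` together with `title` and `description` is what keeps the two Business Analyst listings from different companies out of the count, and `COUNT(DISTINCT company_id)` avoids counting a company twice if it has duplicated more than one listing.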

The dataset you are querying against may have different input & output - this is just an example!
