Back to questions

Data Conference Attendees FAANG Python Interview Question

Data Conference Attendees

FAANG Python Interview Question

You’re at a Data Science conference with an unknown number of attendees. The attendees have a wide range of titles, such as Data Scientist, Data Analyst, ML Product Manager, ML Engineer, Founder, CTO, and others.

During the event, you ask a sample of n attendees the following question:

"How many other people here have you met with the same title as you?"

Each response is recorded in the array, where is the response of the i-th person you surveyed. Assume that each person you surveyed has encountered every other person at the conference with the same title.

Given the array , return the minimum possible total number of attendees at the conference.

Examples:

Example 1:

Input:
Output:
Explanation:

The first person (perhaps a Data Scientist) reports encountering 2 other people with the same title.
The second and third persons each report encountering 1 other person with their title, which suggests these two respondents share the same title.
Therefore, there must be 3 people with the title of the first person and 2 people with the title shared by the second and third persons, for a minimum total of 5 attendees.

You could assume that both people who answered 1 are talking about different people you didn’t ask, making your answer 7, but this would not give the minimum attendance.

Example 2:

Input:
Output:
Explanation:

The first two respondents each report 0 others with their title, implying they have unique titles.
The third person reports encountering 1 other person with their title, which suggests a minimum of 2 people with this title.
Thus, the minimum total number of attendees at the conference is 4: two with unique titles and two sharing the same title.

Solution

Intuition

If the question didn’t ask for the minimum number of people at the conference, the solution would be much simpler. We could assume that each time someone mentioned a group size, we hadn’t already counted those people. For example, if two people both said they saw “3 others,” we would assume they were each talking about completely separate groups of 4, including themselves. In that case, we’d just add everyone up and be done.

However, since we’re asked for the minimum, we can make an assumption: if multiple people give the same answer, they’re likely talking about the same group of people. This allows us to minimize the total number of attendees by grouping together responses where possible. So instead of counting each answer separately, we’ll group answers in batches, assuming that each batch represents a group of people.

Approach

We’ll count how often each answer appears in the list. Then, for each unique answer , we’ll calculate the minimum number of batches needed to represent all responses that said “k others.” Each batch represents a group of people (the respondent plus the people they mentioned).

To count the batches, we’ll use for each unique answer . This ensures that even if a batch isn’t “full,” we still count it, since each batch needs to be counted as a group of people.

Here’s the code:

If we want to avoid pre-calculating the frequency, we can instead check each answer as we encounter it:

For each new answer, initialize its batch size to zero.
As we go through the list, if the count for a particular answer reaches a multiple of , we add a new batch to the result.
We then increment the count for that answer each time it appears.

Sourced from

FAANG

Difficulty

Medium

Input

Python

Output