At LinkedIn, maintaining the integrity of people's professional network is of utmost importance, and as such, there is a model in place which identifies spammy connection requests.
Your team has spent the last two months prototyping a model which you hope performs better because it incorporates a new 3rd-party dataset/API which helps flag IP addresses and browser fingerprints as risky.
The vendor built its dataset/API by helping thousands of other companies with their spam and fraud issues, and has aggregated data about bad actors across the internet to create a near-global internet blacklist. However, using this 3rd-party IP address/browser fingerprinting data would cost $2M per year if LinkedIn puts the model into production.
How would you determine if it's worth buying access to this 3rd-party solution?
The core of the problem is figuring out how much your model would have to improve the detection of spammy connection requests to justify the $2M-per-year price tag for using the 3rd-party API/dataset.
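One way to make that break-even framing concrete is a small back-of-the-envelope calculation. The sketch below is only an illustration: the function names, the idea of a dollar value per blocked spam request, and the optional false-positive cost are assumptions introduced here, not figures or methods from LinkedIn. Only the $2M annual cost comes from the question. Any numbers you plug in would have to come from the interviewer or from your own stated estimates.

```python
def break_even_extra_blocks(annual_cost: float, value_per_blocked_spam: float) -> float:
    """Extra spam requests per year the new model must block to cover its cost.

    value_per_blocked_spam is a hypothetical dollar value of stopping one
    spammy connection request (an assumption, not a known LinkedIn number).
    """
    if value_per_blocked_spam <= 0:
        raise ValueError("value_per_blocked_spam must be positive")
    return annual_cost / value_per_blocked_spam


def net_annual_value(
    annual_cost: float,
    spam_requests_per_year: float,
    recall_old: float,
    recall_new: float,
    value_per_blocked_spam: float,
    extra_false_positives_per_year: float = 0.0,
    cost_per_false_positive: float = 0.0,
) -> float:
    """Benefit of the recall lift, minus false-positive cost, minus the API fee.

    All inputs except annual_cost are placeholders to be estimated.
    """
    # Additional spam caught thanks to the improvement in recall.
    extra_blocked = spam_requests_per_year * (recall_new - recall_old)
    benefit = extra_blocked * value_per_blocked_spam
    # Legitimate requests wrongly flagged also carry a cost.
    false_positive_cost = extra_false_positives_per_year * cost_per_false_positive
    return benefit - false_positive_cost - annual_cost


if __name__ == "__main__":
    ANNUAL_COST = 2_000_000  # from the question
    # Every value below is a made-up placeholder for illustration only.
    print(break_even_extra_blocks(ANNUAL_COST, value_per_blocked_spam=0.5))
    print(net_annual_value(ANNUAL_COST, 100_000_000, 0.80, 0.85, 0.5))
```

A positive result from `net_annual_value` would suggest the purchase pays for itself under the chosen assumptions; the useful part of the exercise is seeing which inputs you would need to ask about.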
Before you interview with LinkedIn, it would be great to understand their business model, and some key numbers for their business, so that you could better connect technical decisions to business impact.
Of course, knowing the business model alone isn't enough to understand the scale and significance of LinkedIn's spammy connection request problem, or how it connects to the business's bottom line.
What are some other details you'd want to know?
Remember, for an open-ended question like this one, an interview is meant to be collaborative!
Comment in the discussion section how you'd approach this problem!