Mean, Median, Mode

You are given a list of numbers representing how many emails each Microsoft Outlook user has in their inbox. Before the Product Management team can work on features related to inbox zero or bulk-deleting email, they simply want to know what the mean, median, and mode are for the number of emails.

Output the mean, median, and mode (in this order). Round the mean to the closest integer and assume that there are no ties for the mode.

Column Name | Type |
---|---|
user_id | integer |
email_count | integer |

Example input:

user_id | email_count |
---|---|
123 | 100 |
234 | 200 |
345 | 300 |
456 | 200 |
567 | 200 |

Example output:

mean | median | mode |
---|---|---|
200 | 200 | 200 |

The mean is 200: the total number of emails, 1,000 (100 + 200 + 300 + 200 + 200), divided by 5 users.

The mode is 200 because this email count occurs 3 times, more often than any other value.

The median is also 200: when the dataset is ordered by email count (100, 200, 200, 200, 300), it is the middle value, separating the lower half from the higher half.

In a real interview, it's best to start with what's easiest, so that we can show some forward progress and get partial credit. Thus, let's start with the mean.

In case you forgot Stat 101, we will provide the definitions of each below.

**Mean** is the sum of a collection of numbers divided by the count of numbers in the collection. To calculate it, simply use the `AVG()` function, and then round the result to 0 decimal places with `ROUND()`.
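As a minimal sketch of the mean step, the query below builds the example rows inline so it can run on its own. The CTE name `inbox_sample` is a hypothetical stand-in for the real table, which is not named here.

```sql
-- inbox_sample is a hypothetical name holding the example input rows.
WITH inbox_sample (user_id, email_count) AS (
  VALUES (123, 100), (234, 200), (345, 300), (456, 200), (567, 200)
)
-- AVG over an integer column returns a numeric; ROUND with no second
-- argument rounds it to 0 decimal places.
SELECT ROUND(AVG(email_count)) AS mean
FROM inbox_sample;
```

For the example input this gives 1,000 / 5 = 200.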

**Mode** is the number in a dataset that occurs most frequently. Fortunately, we can use the `MODE()` function, which requires the additional `WITHIN GROUP` clause to specify the ordered group of values the mode is taken from.
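The mode step might look like the sketch below. It again uses the hypothetical `inbox_sample` CTE in place of the real table, and relies on PostgreSQL's ordered-set aggregate `MODE()`.

```sql
-- inbox_sample is a hypothetical name holding the example input rows.
WITH inbox_sample (user_id, email_count) AS (
  VALUES (123, 100), (234, 200), (345, 300), (456, 200), (567, 200)
)
-- MODE() is an ordered-set aggregate, so the column it works on is
-- supplied through WITHIN GROUP (ORDER BY ...).
SELECT MODE() WITHIN GROUP (ORDER BY email_count) AS mode
FROM inbox_sample;
```

Since 200 appears three times in the example input, it is the value returned.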

**Median** is the value separating the first half from the second half of a dataset. There isn't a dedicated median function, but we can use a percentile function, which outputs the value at a specified percentile.

Thus, to get the median, let's use `PERCENTILE_CONT(0.5)`. It is important for the dataset to be ordered by the count of emails.
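A sketch of the median step, assuming the percentile function in question is PostgreSQL's `PERCENTILE_CONT` and using the hypothetical `inbox_sample` CTE for the example rows:

```sql
-- inbox_sample is a hypothetical name holding the example input rows.
WITH inbox_sample (user_id, email_count) AS (
  VALUES (123, 100), (234, 200), (345, 300), (456, 200), (567, 200)
)
-- 0.5 asks for the 50th percentile; the ORDER BY inside WITHIN GROUP
-- supplies the ordering the percentile is computed over.
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY email_count) AS median
FROM inbox_sample;
```

With an even number of rows, `PERCENTILE_CONT` interpolates between the two middle values, whereas `PERCENTILE_DISC` returns a value that actually occurs in the data.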

See how it's used in this StackOverflow question.
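Putting the three pieces together, one possible full query is sketched below. The CTE `inbox_sample` is a hypothetical stand-in for the real table, and the columns are listed in the order the question asks for.

```sql
-- inbox_sample is a hypothetical name holding the example input rows;
-- in practice the FROM clause would point at the actual table.
WITH inbox_sample (user_id, email_count) AS (
  VALUES (123, 100), (234, 200), (345, 300), (456, 200), (567, 200)
)
SELECT
  ROUND(AVG(email_count))                                 AS mean,
  PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY email_count) AS median,
  MODE() WITHIN GROUP (ORDER BY email_count)               AS mode
FROM inbox_sample;
```

All three are aggregates over the whole table, so no `GROUP BY` is needed and the query returns a single row.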

PostgreSQL 14