Back to questions
Sometimes, payment transactions are repeated by accident; it could be due to user error, API failure or a retry error that causes a credit card to be charged twice.
Using the transactions table, identify any payments made at the same merchant with the same credit card for the same amount within 10 minutes of each other. Count such repeated payments.
Assumptions:
Column Name | Type |
---|---|
transaction_id | integer |
merchant_id | integer |
credit_card_id | integer |
amount | integer |
transaction_timestamp | datetime |
transaction_id | merchant_id | credit_card_id | amount | transaction_timestamp |
---|---|---|---|---|
1 | 101 | 1 | 100 | 09/25/2022 12:00:00 |
2 | 101 | 1 | 100 | 09/25/2022 12:08:00 |
3 | 101 | 1 | 100 | 09/25/2022 12:28:00 |
4 | 102 | 2 | 300 | 09/25/2022 12:00:00 |
6 | 102 | 2 | 400 | 09/25/2022 14:00:00 |
payment_count |
---|
1 |
Within 10 minutes after Transaction 1, Transaction 2 is conducted at Merchant 1 using the same credit card for the same amount. This is the only instance of repeated payment in the given sample data.
Since Transaction 3 is completed after Transactions 2 and 1, each of which occurs after 20 and 28 minutes, respectively hence it does not meet the repeated payments' conditions. Whereas, Transactions 4 and 6 have different amounts.
The dataset you are querying against may have different input & output - this is just an example!
Here are the steps to solve this question:
Step 1
For each transaction present in the table, we will obtain the time of the most recent identical transaction. With the , it is feasible. This function accesses the specified field’s values from the previous rows.
Run this query and we will explain more below.
Let's interpret the function:
Showing 4 transactions worth $100 performed at Merchant ID 101 with credit card ID 1:
transaction_id | merchant_id | credit_card_id | amount | transaction_timestamp | previous_transaction |
---|---|---|---|---|---|
1 | 101 | 1 | 100 | 09/25/2022 12:00:00 | |
2 | 101 | 1 | 100 | 09/25/2022 12:08:00 | 09/25/2022 12:00:00 |
3 | 101 | 1 | 100 | 09/25/2022 12:17:00 | 09/25/2022 12:28:00 |
5 | 101 | 1 | 100 | 09/25/2022 12:27:00 | 09/25/2022 13:17:00 |
Can you follow the pattern of the records in the ?
Step 2
Next, we should evaluate the difference in time between two consecutive identical transactions. We can simply subtract the values from the values as we now have the previous transaction time for each completed payment.
The following statement can be easily incorporated into the clause of the previous query.
transaction_id | merchant_id | credit_card_id | amount | transaction_timestamp | previous_transaction | time_difference |
---|---|---|---|---|---|---|
1 | 101 | 1 | 100 | 09/25/2022 12:00:00 | ||
2 | 101 | 1 | 100 | 09/25/2022 12:08:00 | 09/25/2022 12:00:00 | "minutes":8 |
3 | 101 | 1 | 100 | 09/25/2022 12:28:00 | 09/25/2022 12:08:00 | "minutes":20 |
5 | 101 | 1 | 100 | 09/25/2022 13:37:00 | 09/25/2022 12:28:00 | "hours":1,"minutes":9 |
The field makes it clear that there is an 8-minute lag between the first and second transactions. We will now convert this field into minutes format so that we can easily filter them.
One of the best methods to achieve that is to use the function.
This is how the outcome looks:
merchant_id | credit_card_id | amount | transaction_timestamp | minute_difference |
---|---|---|---|---|
101 | 1 | 100 | 09/25/2022 12:00:00 | |
101 | 1 | 100 | 09/25/2022 12:08:00 | 8 |
101 | 1 | 100 | 09/25/2022 12:28:00 | 20 |
101 | 1 | 100 | 09/25/2022 13:37:00 | 69 |
Step 3
The last thing we need to do is to gather all the identical transactions which occurred within a 10-minute window.
To do that, we must first convert the query into a common table expression (CTE). Then, we will filter the records using the clause for transactions a 10-minute or lesser window.
Additionally, the function allows us to count the filtered records.
Output for merchant ID 1:
payment_count |
---|
1 |
Now, we will eliminate the unnecessary columns from the clause since they were only there for exploration purposes.
That's it! We have our final query :)
PostgreSQL 14