Aquí hay una solución basada en subconsultas anidadas. Primero, agregué algunas filas para detectar algunos casos más. La transacción 10, por ejemplo, no debe ser cancelada por la transacción 12, porque la transacción 11 se encuentra en el medio.
> select * from transactions order by date_time;
+----+---------+------+---------------------+--------+
| id | account | type | date_time | amount |
+----+---------+------+---------------------+--------+
| 1 | 1 | R | 2012-01-01 10:01:00 | 1000 |
| 2 | 3 | R | 2012-01-02 12:53:10 | 1500 |
| 3 | 3 | A | 2012-01-03 13:10:01 | -1500 |
| 4 | 2 | R | 2012-01-03 17:56:00 | 2000 |
| 5 | 1 | R | 2012-01-04 12:30:01 | 1000 |
| 6 | 2 | A | 2012-01-04 13:23:01 | -2000 |
| 7 | 3 | R | 2012-01-04 15:13:10 | 3000 |
| 8 | 3 | R | 2012-01-05 12:12:00 | 1250 |
| 9 | 3 | A | 2012-01-06 17:24:01 | -1250 |
| 10 | 3 | R | 2012-01-07 00:00:00 | 1250 |
| 11 | 3 | R | 2012-01-07 05:00:00 | 4000 |
| 12 | 3 | A | 2012-01-08 00:00:00 | -1250 |
| 14 | 2 | R | 2012-01-09 00:00:00 | 2000 |
| 13 | 3 | A | 2012-01-10 00:00:00 | -1500 |
| 15 | 2 | A | 2012-01-11 04:00:00 | -2000 |
| 16 | 2 | R | 2012-01-12 00:00:00 | 5000 |
+----+---------+------+---------------------+--------+
16 rows in set (0.00 sec)
Primero, cree una consulta para obtener, para cada transacción, "la fecha de la transacción más reciente antes de esa en la misma cuenta":
SELECT t2.*,
MAX(t1.date_time) AS prev_date
FROM transactions t1
JOIN transactions t2
ON (t1.account = t2.account
AND t2.date_time > t1.date_time)
GROUP BY t2.account,t2.date_time
ORDER BY t2.date_time;
+----+---------+------+---------------------+--------+---------------------+
| id | account | type | date_time | amount | prev_date |
+----+---------+------+---------------------+--------+---------------------+
| 3 | 3 | A | 2012-01-03 13:10:01 | -1500 | 2012-01-02 12:53:10 |
| 5 | 1 | R | 2012-01-04 12:30:01 | 1000 | 2012-01-01 10:01:00 |
| 6 | 2 | A | 2012-01-04 13:23:01 | -2000 | 2012-01-03 17:56:00 |
| 7 | 3 | R | 2012-01-04 15:13:10 | 3000 | 2012-01-03 13:10:01 |
| 8 | 3 | R | 2012-01-05 12:12:00 | 1250 | 2012-01-04 15:13:10 |
| 9 | 3 | A | 2012-01-06 17:24:01 | -1250 | 2012-01-05 12:12:00 |
| 10 | 3 | R | 2012-01-07 00:00:00 | 1250 | 2012-01-06 17:24:01 |
| 11 | 3 | R | 2012-01-07 05:00:00 | 4000 | 2012-01-07 00:00:00 |
| 12 | 3 | A | 2012-01-08 00:00:00 | -1250 | 2012-01-07 05:00:00 |
| 14 | 2 | R | 2012-01-09 00:00:00 | 2000 | 2012-01-04 13:23:01 |
| 13 | 3 | A | 2012-01-10 00:00:00 | -1500 | 2012-01-08 00:00:00 |
| 15 | 2 | A | 2012-01-11 04:00:00 | -2000 | 2012-01-09 00:00:00 |
| 16 | 2 | R | 2012-01-12 00:00:00 | 5000 | 2012-01-11 04:00:00 |
+----+---------+------+---------------------+--------+---------------------+
13 rows in set (0.00 sec)
Úselo como una subconsulta para obtener cada transacción y su predecesora en la misma fila. Use algún filtro para extraer las transacciones que nos interesan, es decir, transacciones 'A' cuyos predecesores son transacciones 'R' que cancelan exactamente -
SELECT
t3.*,transactions.*
FROM
transactions
JOIN
(SELECT t2.*,
MAX(t1.date_time) AS prev_date
FROM transactions t1
JOIN transactions t2
ON (t1.account = t2.account
AND t2.date_time > t1.date_time)
GROUP BY t2.account,t2.date_time) t3
ON t3.account = transactions.account
AND t3.prev_date = transactions.date_time
AND t3.type='A'
AND transactions.type='R'
AND t3.amount + transactions.amount = 0
ORDER BY t3.date_time;
+----+---------+------+---------------------+--------+---------------------+----+---------+------+---------------------+--------+
| id | account | type | date_time | amount | prev_date | id | account | type | date_time | amount |
+----+---------+------+---------------------+--------+---------------------+----+---------+------+---------------------+--------+
| 3 | 3 | A | 2012-01-03 13:10:01 | -1500 | 2012-01-02 12:53:10 | 2 | 3 | R | 2012-01-02 12:53:10 | 1500 |
| 6 | 2 | A | 2012-01-04 13:23:01 | -2000 | 2012-01-03 17:56:00 | 4 | 2 | R | 2012-01-03 17:56:00 | 2000 |
| 9 | 3 | A | 2012-01-06 17:24:01 | -1250 | 2012-01-05 12:12:00 | 8 | 3 | R | 2012-01-05 12:12:00 | 1250 |
| 15 | 2 | A | 2012-01-11 04:00:00 | -2000 | 2012-01-09 00:00:00 | 14 | 2 | R | 2012-01-09 00:00:00 | 2000 |
+----+---------+------+---------------------+--------+---------------------+----+---------+------+---------------------+--------+
4 rows in set (0.00 sec)
A partir del resultado anterior, es evidente que casi llegamos:hemos identificado las transacciones no deseadas. Usando LEFT JOIN
podemos filtrarlos de todo el conjunto de transacciones:
SELECT
transactions.*
FROM
transactions
LEFT JOIN
(SELECT
transactions.id
FROM
transactions
JOIN
(SELECT t2.*,
MAX(t1.date_time) AS prev_date
FROM transactions t1
JOIN transactions t2
ON (t1.account = t2.account
AND t2.date_time > t1.date_time)
GROUP BY t2.account,t2.date_time) t3
ON t3.account = transactions.account
AND t3.prev_date = transactions.date_time
AND t3.type='A'
AND transactions.type='R'
AND t3.amount + transactions.amount = 0) t4
USING(id)
WHERE t4.id IS NULL
AND transactions.type = 'R'
ORDER BY transactions.date_time;
+----+---------+------+---------------------+--------+
| id | account | type | date_time | amount |
+----+---------+------+---------------------+--------+
| 1 | 1 | R | 2012-01-01 10:01:00 | 1000 |
| 5 | 1 | R | 2012-01-04 12:30:01 | 1000 |
| 7 | 3 | R | 2012-01-04 15:13:10 | 3000 |
| 10 | 3 | R | 2012-01-07 00:00:00 | 1250 |
| 11 | 3 | R | 2012-01-07 05:00:00 | 4000 |
| 16 | 2 | R | 2012-01-12 00:00:00 | 5000 |
+----+---------+------+---------------------+--------+