I'm trying to implement a simple event table in MySQL, that I want other processes to poll for new information. The design is pretty simple:
- I hava a table event with an AUTO_INCREMENT id column and then additional columns describing the event
- Via triggers on other tables, rows are inserted
- Other processes poll this table and by using the ID column knows where in the "event stream" they are and select accordingly for new information.
[2] is always performed inside a transaction.
The problem we are seeing, and I should have foreseen this, is that this really doesn't work when writes are done within transactions. The ID of the next row in the event table is decided within the transaction, but there is no guarantee that from an external point of view that events are written in ascending order. This means that you can not simply select for the next highest ID, because there still may be rows with a lower ID not yet committed.
Example:
Transaction 1
start transaction;
insert into event (...) values (...);
commit;
Transaction 2
start transaction;
insert into event (...) values (...);
commit;
Consider when transaction 1 is first to insert into event, and then transaction 2. Transaction 1's event will be given 1 as value for its auto incrementing column (assuming table is empty, first writes), whereas transaction 2's will be given 2. If then transaction 2 is first to commit, for external processes polling this table they will just see the event having id 2, process it and incorrectly assume that from now on they should select for rows having id > 2. The event from transaction 1 is effectively lost.
Is there a clever way to work around this? In practice, I'd like to be able to poll for committed rows in ascending order. Unfortunately, I can not expect the IDs in the table to be an unbroken sequence; transactions may be rolled back effectively creating gaps of "unused" ids.
I'm stumped. Include a timestamp with each event, and instead always select for rows having a timestamp greater than what we expect the longest running transactions to have, and then ignore events that were already processed?
Edit
I'm adding more information to try to explain the problem I want to solve. A common requirement is that whenever table X is updated, we want to notify external processes. A simple design would then be: add trigger to table X, write row to event table, have external process poll this table and act accordingly.
In a perfect world, I'd like "event" to instead be pushed to a topic and not involve the database at all, but since writes to table X are done all over the place it would be very error prone to get right; having a trigger that writes a row solved that problem very nicely. In fact, if the polling of the event table was simple to get right we could just add a simple bridge solution that polls and pushes to a topic.
Edit 2
A possible solution expanding on previous edit would be to have a single process polling the event table selecting rows in ascending order by id. Information about events are pushed to a topic, and then deleted from the table. This way we are not dependent on rows getting stricter higher ids as they are written.