Currently if FireFly receives a batch with any malformed contents (such as a mismatched key due to #1175), it writes the batch to the database but none of the messages or data.
return nil, false, nil // This is not retryable. skip this batch
However, the saved batch has no marking to indicate that the contents were invalid. Therefore every time the aggregator rewinds to this batch, it will stall with "message not yet available".
l.Debugf("Message '%s' in batch '%s' is not yet available", msgEntry.ID, manifest.ID)
When batch contents are skipped in this fashion, FireFly should actually mark the batch somehow, and when the aggregator attempts to process a message from that batch, it should simply mark the pin dispatched and move on. This is different from rejecting the message, because the message will never even be inserted into the database - but there's no point in blocking the pin forever, since the message will never be valid.
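The proposed behavior can be sketched roughly as follows. This is a minimal illustration only, not FireFly's actual code: the `BatchState` marker, the `PinAction` values, and the `decidePin` function are all hypothetical names invented here, and the real aggregator works against the database rather than in-memory values.

```go
package main

import "fmt"

// BatchState is a hypothetical marker stored alongside a persisted batch.
type BatchState int

const (
	BatchValid           BatchState = iota // contents were persisted normally
	BatchContentsSkipped                   // contents were invalid and never persisted
)

// Batch is a simplified stand-in for the persisted batch record.
type Batch struct {
	ID    string
	State BatchState
}

// PinAction is a hypothetical result of the aggregator's decision for one pin.
type PinAction int

const (
	PinWait    PinAction = iota // message may still arrive, keep the pin blocked
	PinProcess                  // message is available, process it normally
	PinSkip                     // message will never arrive, mark the pin dispatched
)

func (a PinAction) String() string {
	switch a {
	case PinWait:
		return "wait"
	case PinProcess:
		return "process"
	default:
		return "skip"
	}
}

// decidePin checks the batch marker before falling back to the
// "message not yet available" path, so a flagged batch never blocks a pin.
func decidePin(batch Batch, messageStored bool) PinAction {
	switch {
	case batch.State == BatchContentsSkipped:
		return PinSkip
	case !messageStored:
		return PinWait
	default:
		return PinProcess
	}
}

func main() {
	flagged := Batch{ID: "batch-1", State: BatchContentsSkipped}
	pending := Batch{ID: "batch-2", State: BatchValid}
	fmt.Println(flagged.ID, decidePin(flagged, false))
	fmt.Println(pending.ID, decidePin(pending, false))
	fmt.Println(pending.ID, decidePin(pending, true))
}
```

The key design point is that the marker is checked first: without it, a flagged batch is indistinguishable from a batch whose messages simply have not arrived yet.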
awrichar changed the title from "Batches with invalid contents should be flagged an skipped during aggregation" to "Batches with invalid contents should be flagged and skipped during aggregation" on Apr 12, 2023
One follow-on observation to note - we are performing certain batch validation checks only in the handler for shared storage downloads.
Ideally the exact same checks would be performed on a batch 1) before sending it, and 2) immediately after downloading it from either DX or shared storage. Anything we can do to unify those validation paths would help, because even if we fix the root issue noted above, there's a secondary problem in that the sender and the receivers are in different states (sender thinks the batch was ok).
Code referenced in the description above:
firefly/internal/events/persist_batch.go, line 45 in 057a7af
firefly/internal/events/aggregator.go, line 467 in 057a7af