Duplicated Report IDs #535
Comments
I checked the code and noticed that this should not happen because there is a duplication check. When I now try to move one of the processed mails from the archive back to the inbox, the check works. I enabled debug logging and am trying to reproduce the issue. |
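For context, a minimal sketch of what such a duplicate check against Elasticsearch could look like. The index pattern, the `report_id` field name, and the elasticsearch-py 8.x call style are assumptions for illustration, not parsedmarc's actual code:

```python
# Sketch of a duplicate-report-ID lookup before saving a parsed report.
# Index pattern and field name are assumptions, not taken from parsedmarc.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def report_already_indexed(report_id: str, index: str = "dmarc_aggregate*") -> bool:
    """Return True if any document with this report ID is already indexed."""
    result = es.count(index=index, query={"term": {"report_id": report_id}})
    return result["count"] > 0
```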
OK, I already reproduced it. I cleared my index and moved both mails into the inbox.
|
I observe the same thing. But when I open the XML report from Google, it has three "record" elements corresponding to my three rows in Elasticsearch. |
Correct. Each report can have multiple records, and each of those rows will be a separate event in Elastic/Splunk/CSVs/etc. So multiple entries with the same report ID are normal. |
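To make this concrete, a small sketch that reads one aggregate report and shows how a single report ID maps to several `<record>` elements, each of which becomes its own output row. The element names follow the RFC 7489 aggregate report schema; the file name is only an example:

```python
# One aggregate report (one report ID) can contain several <record> elements;
# each record is emitted as a separate row/event downstream.
import xml.etree.ElementTree as ET

tree = ET.parse("google.com!example.org!1700000000!1700086400.xml")
root = tree.getroot()  # <feedback> element

report_id = root.findtext("report_metadata/report_id")
records = root.findall("record")

print(f"report_id={report_id} has {len(records)} record(s)")
for record in records:
    source_ip = record.findtext("row/source_ip")
    count = record.findtext("row/count")
    print(f"  source_ip={source_ip} count={count}")  # one output row per record
```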
@seanthegeek my issue is not about multiple entries within one XML file like @EwenBara is reporting.
Please reread my initial issue report. Can you please reopen my issue? EDIT: As mentioned in my previous comment, this looks like a missing deduplication check for the report ID. |
I did some tests and I can reproduce the issue. It happens when the two reports are parsed in the same batch. To reproduce:
To confirm, I ran the same test with |
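This in-batch behaviour is consistent with what an index-based duplicate check would miss. A rough sketch of the failure mode, reusing the `report_already_indexed()` helper sketched above together with a hypothetical `index_report()` save step; this is an illustration, not parsedmarc's actual code:

```python
# When two copies of the same report are parsed in one batch, the lookup for
# the second copy can run before the first copy is searchable.
def process_batch(batch: list[dict]) -> None:
    for report in batch:
        report_id = report["report_metadata"]["report_id"]
        # For the second copy this can still return False: the first copy was
        # written moments ago and, with Elasticsearch's default refresh
        # interval (~1 s), is not yet visible to searches.
        if not report_already_indexed(report_id):
            index_report(report)
```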
@fabm3n My apologies. I just published a release to try and solve this problem. It will keep track of up to 1 million report IDs seen in an hour and ignore duplicates. 8.16.0...8.16.1#diff-a1dcb2664f7e405007ed531c6c33eb4432f86a2fe4f4782a4763c95811f5754f Let me know if this solves the problem. |
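A minimal sketch of the idea described in that release, an in-memory, time-limited cache of seen report IDs. Using the expiringdict package and the helper names here is an assumption for illustration; the "Skipping duplicate report ID" message is the one quoted later in this thread:

```python
# Bounded cache of seen report IDs: up to 1 million entries, kept for 1 hour.
import logging

from expiringdict import ExpiringDict

seen_report_ids = ExpiringDict(max_len=1_000_000, max_age_seconds=3600)

def process_report(report: dict) -> None:
    report_id = report["report_metadata"]["report_id"]
    if report_id in seen_report_ids:
        logging.debug("Skipping duplicate report ID %s", report_id)
        return
    seen_report_ids[report_id] = True
    index_report(report)  # hypothetical save step
```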
This didn't fix my issue with the same report ID. For testing I moved all of my reports into the inbox again (176 reports). When I search the Elasticsearch indices I find the ID twice: This is the debug log: The debug message "Skipping duplicate report ID" is missing, so the deduplication does not work. |
I just started using parsedmarc and got the same DMARC Report from Google twice:
As pointed out in this Reddit post, this is a TTL issue: https://www.reddit.com/r/DMARC/comments/1bafpk5/getting_multiple_identical_reports_from_google/
For me, the behaviour of parsedmarc is also wrong, because both reports with the same report ID have been added to the Elasticsearch database:
I expected the report ID to be unique, so there should be only one document in the database.
The best approach might be to overwrite the existing document with the last processed DMARC report.
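One possible way to implement that overwrite behaviour is to derive the Elasticsearch document `_id` from the report ID (plus the record position, since one report yields one document per record), so re-indexing the same report replaces the existing documents instead of adding duplicates. The index name and the elasticsearch-py 8.x call style are assumptions for illustration:

```python
# Deterministic _id per report record: re-processing the same report
# overwrites the existing documents rather than creating duplicates.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def save_report(report_id: str, records: list[dict], index: str = "dmarc_aggregate") -> None:
    for position, record in enumerate(records):
        es.index(index=index, id=f"{report_id}_{position}", document=record)
```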