Having an issue with duplicate events.
My setup is as follows:
- An Azure AKS cluster, with Fluentd running in the pods.
- 16 cores / 64GB memory
- Data volume is large: 600+ million records per day.
- Data is ingested over a TCP connection.
- Output to Azure EventHub
When we initially started to receive logs, I thought the issue was network related. For every log event, we were receiving 1-240 duplicate records. Our source sender was traversing a firewall, and I suspected TCP resets could be causing the duplication. To mitigate this, I connected the source and destination directly through VNET peering.
Even with a direct connection through VNET peering, I am still seeing around 20% log duplication.
Is there anything in the Fluentd configuration itself that could cause log duplication?
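For context, one Fluentd behavior I suspect: output plugins deliver at-least-once, so if a flush to Event Hub times out or errors after the broker has already accepted the batch, the whole buffer chunk is retried and every event in it is re-sent. A sketch of the kind of `<match>` buffer/retry settings I mean (the output `@type` and tag are placeholders for whatever Event Hub plugin is in use; the `<buffer>` parameters are standard Fluentd v1 options):

```
<match app.logs>
  @type my_eventhub_output        # placeholder for the actual Event Hub output plugin
  <buffer>
    @type file
    chunk_limit_size 4m           # large chunks = more events re-sent per failed flush
    flush_interval 5s
    retry_wait 1s                 # backoff before re-sending a chunk
    retry_max_times 10            # each retry of a partially-delivered chunk can duplicate events
    retry_forever false
  </buffer>
</match>
```

If flush timeouts or retries show up in the Fluentd logs at roughly the same rate as the duplication, that would point at retried chunks rather than the network path.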