Duplicate logs on ingestion


I'm having an issue with duplicate events.
My setup is as follows:

  • An Azure AKS cluster, with the pods running Fluentd.
  • 16 cores / 64GB memory
  • Data is large, around 600+ million records per day.
  • Logs arrive over a TCP connection.
  • Output to Azure EventHub
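For reference, here is a minimal sketch of what the relevant Fluentd config looks like (the port, tag, parser, and output plugin name are placeholders/assumptions, not my exact config):

```
# Assumed in_tcp source for the incoming log stream
<source>
  @type tcp
  port 5170
  bind 0.0.0.0
  tag raw.events
  <parse>
    @type none
  </parse>
</source>

<match raw.events>
  # Azure Event Hub output plugin goes here.
  # Note: Fluentd's delivery semantics are at-least-once, so a retried
  # buffer chunk that was partially delivered can re-send records that
  # already made it to the destination.
  <buffer>
    flush_interval 5s
    retry_max_times 5
  </buffer>
</match>
```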

When we initially started receiving logs, I thought the issue was network related: for every log, we were receiving anywhere from 1 to 240 duplicate records. Our source sender was traversing a firewall, and I suspected TCP resets were causing the duplication. To mitigate this, I connected the source and destination directly through VNET peering.
Even with a direct connection through VNET peering, I am still seeing roughly 20% duplication of logs.
Is there anything in the Fluentd configuration that could cause log duplication?