Apologies in advance for my general lack of knowledge. I’m fairly new to the observability world, and I’d like to know whether FluentD would be a good fit for my use case. Any feedback or guidance would be appreciated.
My company has tons of varying log types that are mostly stored in S3. They range from common cloud logs such as AWS CloudTrail and VPC flow logs, to common Linux logs such as /var/log/secure, to specific application logs, Kubernetes logs, etc.
Part of my team’s responsibility is to take these logs and move them into our data lake: Google BigQuery.
So far, my team has been writing one Dataflow template per log type… so we have a specific Dataflow template for CloudTrail, one for VPC flow logs, etc. The Dataflow jobs basically read objects out of a bucket and stream them into BigQuery. The reason we keep having to create new templates is to accommodate the format of each log type… CloudTrail’s JSON needs to be parsed out, whereas VPC flow logs are flat files, etc.
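To illustrate the format difference, here is a minimal Python sketch of the two kinds of parsing involved. It is not the actual Dataflow template code; the function names are hypothetical, and it assumes CloudTrail files with a top-level "Records" array and VPC flow logs in the default version 2 field order.

```python
import json

# Default VPC flow log (version 2) fields, in order. Custom formats
# would need a different list (assumption: the default format is used).
VPC_FLOW_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_cloudtrail(text):
    """Return the list of event dicts from one CloudTrail log file."""
    return json.loads(text).get("Records", [])


def parse_vpc_flow(text):
    """Return one dict per space-delimited VPC flow log line."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and the header line that S3-delivered files carry.
        if not parts or parts[0] == "version":
            continue
        rows.append(dict(zip(VPC_FLOW_FIELDS, parts)))
    return rows
```

Every new log type currently means another function like these, plus the template around it.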
What I’d like is a solution that doesn’t force me to create a new parser for each log type out there… it sort of feels like we’re reinventing the wheel. I’m sure we’ll have to create unique parsers for stuff like application logs, and hopefully that’s not too difficult either.
Another thing that I’d like to have is some kind of transforming ability so that the schema is more consistent between log types, so that it’s easier to correlate data between different log sources.
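As a rough idea of the kind of transformation I mean, here is a minimal Python sketch that maps a CloudTrail event and a VPC flow log record onto the same handful of columns. The target column names (log_source, event_time, src_ip, action) are made up for illustration and are not our real BigQuery schema; the input field names are the ones AWS uses in those two log types.

```python
from datetime import datetime, timezone


def normalize_cloudtrail(event):
    """Map a parsed CloudTrail event dict onto the common columns."""
    return {
        "log_source": "cloudtrail",
        "event_time": event.get("eventTime"),  # already ISO 8601 UTC
        "src_ip": event.get("sourceIPAddress"),
        "action": event.get("eventName"),
    }


def normalize_vpc_flow(row):
    """Map a parsed VPC flow log record (dict of strings) onto the common columns."""
    # "start" is a Unix timestamp in seconds; convert it to ISO 8601 UTC
    # so it lines up with CloudTrail's eventTime format.
    ts = datetime.fromtimestamp(int(row["start"]), tz=timezone.utc)
    return {
        "log_source": "vpc_flow",
        "event_time": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "src_ip": row.get("srcaddr"),
        "action": row.get("action"),
    }
```

With both sources sharing event_time and src_ip, correlating them becomes a plain join or union instead of per-source special cases.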
We also stream a lot of data into our data lake, 20 petabytes per year and growing.
I know that there are plugins for things like S3 and BigQuery… I’d just like to get some advice on whether what I’m trying to do is feasible. We have dozens of different S3 buckets with multiple different log source types… all outputting to tables in BigQuery that have different schemas depending on the log source. I’d like to hear from folks who have experience with FluentD whether or not it makes sense to use it to help streamline this.
Thanks in advance for any advice you can offer.