Prometheus exporter metrics' labels missing #48

Open
innogrey opened this issue May 18, 2022 · 4 comments

Comments

@innogrey

The /flowdata metrics do not seem very reliable. The direction label always reads "Incoming", regardless of the "- segment: remoteaddress" configuration and the actual state (by the way, the documentation is a bit vague on this one). protoname is always empty, and so is remotecountry, again regardless of the "- segment: geolocation" configuration, with a database provided. Furthermore, correct me if I'm wrong, but shouldn't /flowdata metrics be periodically flushed or cleared? There's no point in scraping old flow data again and again, and without clearing, the number of records to scrape quickly grows quite large, especially if we keep track of ports.
It's also quite possible that none of the above are actual bugs and that I missed some important piece of documentation; in that case, please point me in the right direction.

@debugloop
Collaborator

Hi,

thanks for trying out flowpipeline! Regarding your first two issues, have you checked whether the actual annotation is done to your liking? I.e., are the missing fields populated in the output of the json segment, for instance? Anyhow, I'll be rechecking its implementation.

Regarding the issue of exporter cache clearing: the Prometheus exporter guidelines say that one should not worry about clearing. This is of course not 100% applicable in our case, where label cardinality can explode very easily. We thought about that when we implemented the segment, but in the scenarios where we envisioned the prometheus segment being used, the stream is already either very tightly filtered, from a very specific interface, or not meant to be running permanently. Generally, I'd say Prometheus is not a good match for large-scale, high-cardinality flow keeping. What do you think would be an appropriate way of clearing the exporter cache? Tracking individual counter update times and clearing every x minutes without activity?

Regarding the vague documentation of the remoteaddress segment, there is not only https://github.com/bwNetFlow/flowpipeline/blob/master/CONFIGURATION.md#remoteaddress but also the more detailed https://pkg.go.dev/github.com/bwNetFlow/flowpipeline/segments/modify/remoteaddress linked from CONFIGURATION.md.

@innogrey
Author

Thank you for your explanation and help. After a bit more trial and error, I have come to some conclusions that you may find useful:

  • /flowdata protoname remains empty regardless of configuration, although printflowdump shows it perfectly well, so I think it gets lost somewhere in the metrics generator,
  • the problem with the incorrect direction is on my side; the netflow implementation on OpenBSD, which I use for testing, does not return this information (or rather always sets it to 0),
  • remotecountry remains empty; also, I can't find a way of providing an ASN database, so AS labels are always 'unset',
  • besides the above, it would be more convenient if IPv6 addresses were represented in hexadecimal notation.

When it comes to clearing the cache: for small and medium-sized projects, where Kafka is overkill, a solution based solely on Prometheus isn't as bad an idea as it might seem, especially if you substitute VictoriaMetrics for Prometheus proper. The savings in system resources appear to be significant. The simplest clearing strategy I can think of would use fixed time intervals aligned with the Prometheus scrape interval, so that metrics which have already been scraped get flushed. Tracking the counter update times is an option as well.

@debugloop
Collaborator

Thanks for getting back on this.

  • I will look into that, as well as the caching. I think I am looking at a thorough rewrite here, so if you have any ideas for additional or better metrics, now would be a good time :) As for caching, I don't think one can tell which metrics have already been scraped, even if one were willing to replicate the scrape timings in the flowpipeline config... I'll be looking at tracking the update times.
  • Yeah, different exporters do wildly different things... I think we could have a segment for setting the direction, just as we do for the remote address with its cidr mode?
  • remotecountry is working for me, but it depends on the remote address being set. I only just learned that there are ASN databases and am a bit embarrassed. The segment that sets ASN and routing-related fields is bgp, which requires a BGP session with a router, ideally the exporting router.
  • IPv6 is being formatted by our printer segments. If you mean the json output, that format is required for it to be decoded again. If you mean that prometheus is labeling with badly formatted IPs, that's of course an issue that I'll address as part of the first point. As I've said, we've tried to avoid having addresses in time series.

@innogrey
Author

innogrey commented Jun 1, 2022

  • Okay then, as I can't think of any clever changes to the metrics at the moment, the caching matter is sorted out :)
  • As for flow direction, that's exactly the kind of solution I was thinking about. And, just a loose idea here: aggregating flow records into bidirectional flows is worth considering at this point, although I have no idea how this could be achieved or whether it is worth the effort (or even possible). In theory, matching each 5-tuple with its counterpart in reversed order could do the trick, but I definitely lack the knowledge in this area to propose anything with any certainty.
  • I got the country codes working as well. I haven't experimented with bgp at all yet, but yeah, putting AS stuff in bgp segment makes perfect sense.
  • What I meant about IPv6 actually only applies to the Prometheus metrics. Here's an example: flow_bits{dst_addr="[254 128 0 0 0 0 0 0 126 211 10 255 254 24 202 147]",dst_port="135",ipversion="IPv6",peer="",protoname="",remoteas="unset",remotecountry="US",src_addr="[254 128 0 0 0 0 0 0 126 194 198 255 254 18 20 81]",src_port="48679"} 18432. That's sixteen 8-bit blocks, not very practical :) printflowdump returns addresses in hexadecimal notation, as expected.
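For the address labels in the last point, the output looks like what Go prints when a raw byte slice is formatted with its default verb. That is a guess at the cause, not something confirmed from the segment's code. Under that assumption, converting the bytes to `net.IP` before building the label gives the usual notation; `rawLabel` and `ipLabel` below are hypothetical helpers using the dst_addr bytes from the example.

```go
package main

import (
	"fmt"
	"net"
)

// rawLabel formats the address bytes with the default verb, which yields a
// bracketed list of decimal numbers.
func rawLabel(addr []byte) string {
	return fmt.Sprint(addr)
}

// ipLabel interprets the same bytes as an IP address and uses its textual
// representation instead.
func ipLabel(addr []byte) string {
	return net.IP(addr).String()
}

func main() {
	dst := []byte{254, 128, 0, 0, 0, 0, 0, 0, 126, 211, 10, 255, 254, 24, 202, 147}
	fmt.Println(rawLabel(dst)) // [254 128 0 0 0 0 0 0 126 211 10 255 254 24 202 147]
	fmt.Println(ipLabel(dst))  // fe80::7ed3:aff:fe18:ca93
}
```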
