percentage bandwidth calculation by thold plugin is wrong #713

Open
gj00354347 opened this issue Dec 20, 2024 · 11 comments

@gj00354347

Describe the bug
Hi Team,

We have a threshold for interface traffic in Cacti 1.2.28 with thold plugin version 1.2.0. One legacy device has a small bandwidth of about 1000 Mbps. When the thold plugin calculates the bandwidth utilisation, it does so incorrectly. The strange thing is that the device has 5 interfaces, but the wrong calculation happens on only one of them, and only for inbound traffic, not for outbound traffic on the same interface.

Below are a few important details:

thold - 1.2.0
cacti - 1.2.28
rpn expression - |ds:traffic_in|,|query_ifSpeed|,1000,,/,800,

(Yes, ifSpeed, because it's a legacy device, and 1000 as suggested by the device manager.)

See the screenshot: in the graph legend the value is calculated correctly. Even in thold, the outbound traffic percentage is right, given that the bandwidth is 1000 Mbps.

image

The inbound traffic is 5 M, so it should be 5 %, not 194,631, but that is the value we are getting. The outbound value in the thold calculation is right.
So it is very strange that the inbound calculation is wrong while the outbound one is right. Every now and then it breaches the threshold and creates a false positive and a corresponding false alert.

@xmacan
Member

xmacan commented Dec 20, 2024

Thold 1.2 is very old (released 2019). You should update to 1.8.x.
1.8.x needs Cacti 1.2.25, so you need an older release (maybe 1.6) or to update Cacti too.

@gj00354347
Author

@xmacan Thanks for the update. In fact my Cacti is on 1.2.28, so we will update the thold plugin.

We will upgrade the thold plugin after the Christmas and New Year change freeze and then come back to you.

Maybe you can keep this in backlog or awaiting-feedback status.

Best Regards,
Gopal

@arno-st

arno-st commented Dec 23, 2024

Are you sure about your RPN expression?
|ds:traffic_in|,|query_ifSpeed|,1000,,/,800,

You take the DS, then ifSpeed, but do nothing with it; then 1000, then an empty comma; then you divide by 1000, then take 800 and do nothing with it.
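
To make that stack walk concrete, here is a minimal Python sketch of a comma-separated RPN evaluator. It is not thold's actual parser: skipping empty tokens is an assumption, `eval_rpn` is a hypothetical helper, and the numbers in `SAMPLE` are made-up values. Under those assumptions, the expression as posted leaves three values on the stack instead of a single percentage.

```python
import operator

# Binary operators supported by this sketch.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expression, variables):
    """Evaluate a comma-separated RPN string and return the final stack."""
    stack = []
    for raw in expression.split(","):
        token = raw.strip()
        if token == "":
            continue  # assumption: empty tokens (",," or a trailing ",") are skipped
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        elif token in variables:
            stack.append(float(variables[token]))
        else:
            stack.append(float(token))
    return stack


# Hypothetical sample values, not taken from the affected device.
SAMPLE = {"|ds:traffic_in|": 625000.0, "|query_ifSpeed|": 1000000.0}

# Expression as originally posted: the "*" operators are missing.
print(eval_rpn("|ds:traffic_in|,|query_ifSpeed|,1000,,/,800,", SAMPLE))
# prints [625000.0, 1000.0, 800.0]
```

A well-formed expression should reduce to exactly one value, so checking `len(stack) == 1` is a quick sanity test before pasting an RPN into a threshold.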

@TheWitness
Member

That RPN is for sure wrong.

@gj00354347
Author

gj00354347 commented Dec 23, 2024

@TheWitness Actually, the double comma you are seeing is a typo, sorry for that. The same RPN calculates the outbound traffic percentage correctly, as you can see in the screenshot I attached previously.

The correct RPN is |ds:traffic_in|,|query_ifSpeed|,1000,*,/,800,*

@TheWitness
Member

TheWitness commented Dec 23, 2024

Yeah, that looks better. BTW, you can wrap that comment in three leading and trailing backticks to show it as raw monospace.

@TheWitness
Member

TheWitness commented Dec 23, 2024

|ds:traffic_in|,|query_ifSpeed|,1000,*,/,800,*
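
Read as infix, this expression is traffic_in / (ifSpeed * 1000) * 800. Below is a small Python sketch of the same arithmetic. It assumes that traffic_in is a rate in bytes per second (so 800 = 8 bits times 100 percent) and that ifSpeed times 1000 gives the link speed in bits per second. Both are assumptions about this particular setup, and the function name and sample numbers are hypothetical.

```python
def utilisation_percent(traffic_in, if_speed, speed_scale=1000):
    """Infix equivalent of |ds:traffic_in|,|query_ifSpeed|,1000,*,/,800,*

    traffic_in  -- assumed to be bytes per second
    if_speed    -- raw ifSpeed value as reported by the device
    speed_scale -- multiplier applied to ifSpeed (1000 in the posted RPN)
    """
    link_bits_per_second = if_speed * speed_scale
    # 800 = 8 (bytes -> bits) * 100 (fraction -> percent)
    return traffic_in / link_bits_per_second * 800


# Hypothetical sample: 1,125,000 bytes/s on a link whose scaled speed is 1e9 bit/s.
percent = utilisation_percent(traffic_in=1_125_000, if_speed=1_000_000)
print(round(percent, 3))  # about 0.9 percent
```

If the raw ifSpeed differs between two interfaces of the same physical speed, this formula returns very different percentages for the same traffic, so comparing the ifSpeed value each threshold actually receives is one thing worth checking.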

@bernisys

Picking up this issue, as we still face this problem.

First of all, I don't think that thold 1.2 is the bad guy, as it is working absolutely fine for other interfaces. I could imagine there is something wrong inside the database. Remember, we did run a few DB cleanups, and this might have shifted some IDs internally. What do you think, @TheWitness?

From the threshold edit page I can see that the "out" part is calculated correctly, but the "in" part is way off:

Image

It should actually show something around 0.9%.

Any ideas where in the DB we could dig to track down this issue?

@xmacan
Member

xmacan commented Apr 29, 2025

Have you tried adding the device again? Don't delete the old one, just add it again.

@bernisys

bernisys commented May 8, 2025

Hi @xmacan, no, I believe we didn't do that yet, and we usually take this only as a last-resort "solution".
The problem is that on several devices we have alerting and reporting, which then need to be carried over to the new integration once it works. But the people doing the operations .. well .. let's say they don't pay too much attention to such mundane details ...
Therefore I usually try to repair things in place, because this raises fewer questions months after the activity. ;)

However, I can try to get our operations team to play around with that option. Usually things start working again after re-adding. But to be honest, I'd rather have stuff keep working and not suddenly break somewhere in between. :)

@bernisys

After adding a copy of the device and the threshold this is getting even weirder ...

Check this out .. in the threshold overview list the copy shows ~1000x higher values than the original and goes instantly into alerting.
Image

Same in the config pages:
Image
Image

BTW, I cannot filter properly: if I put part of the device name into the search field, only the CPU and the space on /var are listed .. totally strange.

We're still on thold 1.2.0. I tried upgrading it a while ago but had to roll back because nothing worked afterwards.
We're currently waiting for a RedHat uplift to a newer major version. I will try updating thold some time afterwards to see if the problem might be related to the PHP version. But I'd say the version we are using (which is 7.3) is new enough to cover the requirements.
