Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EasySNMPTimeoutError if not first session #146

Open
Guens opened this issue Apr 15, 2022 · 15 comments
Open

EasySNMPTimeoutError if not first session #146

Guens opened this issue Apr 15, 2022 · 15 comments
Milestone

Comments

@Guens
Copy link

Guens commented Apr 15, 2022

I have a very strange behavior (bug ?).
If I run this simple script

import easysnmp

snmp_opts = {
    "version": 3,
    "security_level": "auth_with_privacy", 
    "auth_protocol": "MD5",  
    "privacy_protocol": "DES",

    "auth_password": "xxx",  
    "privacy_password": "xxx",  
    "security_username": "xxx", 
}

oid = [".1.3.6.1.2.1.1.3.0", ]


s2 = easysnmp.Session(hostname="10.224.113.113", **snmp_opts)
print(s2.get(oid))

s1 = easysnmp.Session(hostname="10.224.115.121", **snmp_opts)
print(s1.get(oid))

It works

[<SNMPVariable value='25134890' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
[<SNMPVariable value='24988592' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]

But If I run s1 before s2

...
s1 = easysnmp.Session(hostname="10.224.115.121", **snmp_opts)
print(s1.get(oid))

s2 = easysnmp.Session(hostname="10.224.113.113", **snmp_opts)
print(s2.get(oid))

the second session fails :

[<SNMPVariable value='25003713' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
Traceback (most recent call last):
  File "/home/pg218277/.config/JetBrains/PyCharmCE2021.3/scratches/bug_easysnmp.py", line 27, in <module>
    print(s2.get(oid))
  File "/home/pg218277/git/equtils/venv/lib64/python3.6/site-packages/easysnmp/session.py", line 315, in get
    interface.get(self, varlist)
easysnmp.exceptions.EasySNMPTimeoutError: timed out while connecting to remote host

It happens with

  • net-snmp 5.81 and easysnmp 0.2.5
  • net-snmp 5.91 and easysnmp latest master

I'm very annoyed because I didn't find any workaround.
Can somebody help me ?

@kamakazikamikaze
Copy link
Collaborator

Interesting issue. What happens if you create the sessions and then loop ten times? Do they all fail?

For example:

s1 = easysnmp.Session(hostname="10.224.115.121", **snmp_opts)
s2 = easysnmp.Session(hostname="10.224.113.113", **snmp_opts)

for i in range(10):
    try:
        print(s1.get(oid))
    except Exception as e:
        print('S1 error on iteration {}: {}'.format(i, e))
    try:
        print(s2.get(oid))
    except Exception as e:
        print('S2 error on iteration {}: {}'.format(i, e))

@Guens
Copy link
Author

Guens commented Apr 16, 2022

Thanks Kent for your help.

I tried your script and the second session always fails too :

[<SNMPVariable value='35292381' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 0: timed out while connecting to remote host
[<SNMPVariable value='35292582' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 1: timed out while connecting to remote host
[<SNMPVariable value='35292782' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 2: timed out while connecting to remote host
[<SNMPVariable value='35292983' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 3: timed out while connecting to remote host
[<SNMPVariable value='35293183' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 4: timed out while connecting to remote host
[<SNMPVariable value='35293384' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 5: timed out while connecting to remote host
[<SNMPVariable value='35293584' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 6: timed out while connecting to remote host
[<SNMPVariable value='35293784' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 7: timed out while connecting to remote host
[<SNMPVariable value='35293985' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 8: timed out while connecting to remote host
[<SNMPVariable value='35294185' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 9: timed out while connecting to remote host

If I try to invert S2 an S1 :

s1 = easysnmp.Session(hostname="10.224.115.121", **snmp_opts)
s2 = easysnmp.Session(hostname="10.224.113.113", **snmp_opts)

for i in range(10):
    try:
        print(s2.get(oid))
    except Exception as e:
        print('S2 error on iteration {}: {}'.format(i, e))
    try:
        print(s1.get(oid))
    except Exception as e:
        print('S1 error on iteration {}: {}'.format(i, e))

the first S2 succeeds, but all other S2 sessions fail after the first S1 :

[<SNMPVariable value='35464538' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
[<SNMPVariable value='35318239' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 1: timed out while connecting to remote host
[<SNMPVariable value='35318439' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 2: timed out while connecting to remote host
[<SNMPVariable value='35318640' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 3: timed out while connecting to remote host
[<SNMPVariable value='35318840' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 4: timed out while connecting to remote host
[<SNMPVariable value='35319041' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 5: timed out while connecting to remote host
[<SNMPVariable value='35319241' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 6: timed out while connecting to remote host
[<SNMPVariable value='35319442' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 7: timed out while connecting to remote host
[<SNMPVariable value='35319642' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 8: timed out while connecting to remote host
[<SNMPVariable value='35319842' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]
S2 error on iteration 9: timed out while connecting to remote host
[<SNMPVariable value='35320043' (oid='sysUpTimeInstance', oid_index='', snmp_type='TICKS')>]

Il also have the same result if sessions are created inside the loop :

for i in range(10):
    try:
        s2 = easysnmp.Session(hostname="10.224.113.113", **snmp_opts)
        print(s2.get(oid))
    except Exception as e:
        print('S2 error on iteration {}: {}'.format(i, e))
    try:
        s1 = easysnmp.Session(hostname="10.224.115.121", **snmp_opts)
        print(s1.get(oid))
    except Exception as e:
        print('S1 error on iteration {}: {}'.format(i, e))

What's might be happening ?

@kamakazikamikaze
Copy link
Collaborator

I'm honestly not completely certain, but my best guess is that something internal may not be parsing the data properly and causes an (inaccurate) exception to be made.

Can you run Wireshark and see if the packets leave your station and a response comes back? If there is, then we'll need to get down into the C code to figure out where it's erroring out.

@Guens
Copy link
Author

Guens commented Apr 16, 2022

I follow your advice with wireshark.

When it's OK with

s2 = easysnmp.Session(hostname="10.224.113.113", **snmp_opts)
print(s2.get(oid))

s1 = easysnmp.Session(hostname="10.224.115.121", **snmp_opts)
print(s1.get(oid))

Hers is the wireshark capture

$ tshark -r ok.pcapng
    1 0.000000000  188.3.3.204 → 10.224.113.113 SNMP 106 get-request
    2 0.001631285 10.224.113.113 → 188.3.3.204  SNMP 162 report 1.3.6.1.6.3.15.1.1.4.0
    3 0.001699119  188.3.3.204 → 10.224.113.113 SNMP 186 encryptedPDU: privKey Unknown
    4 0.003590253 10.224.113.113 → 188.3.3.204  SNMP 194 encryptedPDU: privKey Unknown
    5 0.022075249  188.3.3.204 → 10.224.115.121 SNMP 106 get-request
    6 0.023651344 10.224.115.121 → 188.3.3.204  SNMP 162 report 1.3.6.1.6.3.15.1.1.4.0
    7 0.023723488  188.3.3.204 → 10.224.115.121 SNMP 186 encryptedPDU: privKey Unknown
    8 0.025355497 10.224.115.121 → 188.3.3.204  SNMP 174 report 1.3.6.1.6.3.15.1.1.2.0
    9 0.025402634  188.3.3.204 → 10.224.115.121 SNMP 186 encryptedPDU: privKey Unknown
   10 0.027161327 10.224.115.121 → 188.3.3.204  SNMP 194 encryptedPDU: privKey Unknown

And when it fails with

s1 = easysnmp.Session(hostname="10.224.115.121", **snmp_opts)
print(s1.get(oid))

s2 = easysnmp.Session(hostname="10.224.113.113", **snmp_opts)
print(s2.get(oid))

Here is the capture

$ tshark -r ko.pcapng
    1 0.000000000  188.3.3.204 → 10.224.115.121 SNMP 106 get-request
    2 0.001586401 10.224.115.121 → 188.3.3.204  SNMP 162 report 1.3.6.1.6.3.15.1.1.4.0
    3 0.001660955  188.3.3.204 → 10.224.115.121 SNMP 186 encryptedPDU: privKey Unknown
    4 0.003383640 10.224.115.121 → 188.3.3.204  SNMP 194 encryptedPDU: privKey Unknown
    5 0.021496701  188.3.3.204 → 10.224.113.113 SNMP 106 get-request
    6 0.023240485 10.224.113.113 → 188.3.3.204  SNMP 162 report 1.3.6.1.6.3.15.1.1.4.0
    7 0.023307782  188.3.3.204 → 10.224.113.113 SNMP 186 encryptedPDU: privKey Unknown
    8 0.025056734 10.224.113.113 → 188.3.3.204  SNMP 174 report 1.3.6.1.6.3.15.1.1.2.0

It seems something is missing, isn't it ?

@Guens
Copy link
Author

Guens commented Apr 18, 2022

Single sessions s1 or s2 contains only the following packets :

$ tshark -r s1.pcapng
    1 0.000000000  188.3.3.204 → 10.224.115.121 SNMP 106 get-request
    2 0.001558097 10.224.115.121 → 188.3.3.204  SNMP 162 report 1.3.6.1.6.3.15.1.1.4.0
    3 0.002624055  188.3.3.204 → 10.224.115.121 SNMP 186 encryptedPDU: privKey Unknown
    4 0.004359447 10.224.115.121 → 188.3.3.204  SNMP 194 encryptedPDU: privKey Unknown
$ tshark -r s2.pcapng
    1 0.000000000  188.3.3.204 → 10.224.113.113 SNMP 106 get-request
    2 0.001603048 10.224.113.113 → 188.3.3.204  SNMP 162 report 1.3.6.1.6.3.15.1.1.4.0
    3 0.007255996  188.3.3.204 → 10.224.113.113 SNMP 186 encryptedPDU: privKey Unknown
    4 0.009015309 10.224.113.113 → 188.3.3.204  SNMP 194 encryptedPDU: privKey Unknown

I don't understand why chaining s1 and s2, or s2 and s1, generates such packet 8 (and may be 7 ?) :

    7 0.023723488  188.3.3.204 → 10.224.115.121 SNMP 186 encryptedPDU: privKey Unknown
    8 0.025355497 10.224.115.121 → 188.3.3.204  SNMP 174 report 1.3.6.1.6.3.15.1.1.2.0

or

    7 0.023307782  188.3.3.204 → 10.224.113.113 SNMP 186 encryptedPDU: privKey Unknown
    8 0.025056734 10.224.113.113 → 188.3.3.204  SNMP 174 report 1.3.6.1.6.3.15.1.1.2.0

FYI same result with the new 0.2.6a1.

@kamakazikamikaze
Copy link
Collaborator

I will try to replicate your scenario this week by querying two different hosts with SNMPv3. If that doesn't work, I'll see about creating a Docker image with debug symbols that will run scripts and output the full traceback of lines that get executed.

@amamitzsch
Copy link

I' m seeing the same issue on our system. Is there any way to properly close the "old" session before creating a new one ? If I understood it right, easysnmp uses the interface of netsnmp, so I wonder if there anything left what may cause it. I tried to solve the issue by selecting different local_ports for the sessions, but no luck.

@kamakazikamikaze
Copy link
Collaborator

Looking into #156, it could be that there's something being cached by the library that gets improperly used between the devices. By chance, are the EngineID values returned by each of the devices the same?

@Guens
Copy link
Author

Guens commented Jul 17, 2022

Good shot Ken !

Problem occurs when two devices have the same EngineID.

Let's look at this example :

oid = [".1.3.6.1.6.3.10.2.1.1.0", ]  # SNMP-FRAMEWORK-MIB::snmpEngineID.0

s1 = easysnmp.Session(hostname="10.224.115.121", **snmp_opts)
s2 = easysnmp.Session(hostname="10.224.113.113", **snmp_opts)
s3 = easysnmp.Session(hostname="10.224.117.121", **snmp_opts)

print(f"s1 = {s1.get(oid)}")
print(f"s2 = {s2.get(oid)}")
print(f"s3 = {s3.get(oid)}")

We can see s1 and s3 have the same EngineId, but not s2 :

s1 = [<SNMPVariable value='~DQ (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]
s2 = [<SNMPVariable value='Ybb (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]
s3 = [<SNMPVariable value='~DQ (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]

But if s3 and s1 are swaped

print(f"s3 = {s3.get(oid)}")
print(f"s2 = {s2.get(oid)}")
print(f"s1 = {s1.get(oid)}")

s1 query fails :

s3 = [<SNMPVariable value='~DQ (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]
s2 = [<SNMPVariable value='Ybb (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]
Traceback (most recent call last):
  File "/home/pg218277/.config/JetBrains/PyCharmCE2022.1/scratches/bug_easysnmp.py", line 32, in <module>
    print(f"s1 = {s1.get(oid)}")
  File "/home/pg218277/git/equtils/venv/lib64/python3.6/site-packages/easysnmp/session.py", line 298, in get
    interface.get(self, varlist)
easysnmp.exceptions.EasySNMPTimeoutError: timed out while connecting to remote host

@kamakazikamikaze
Copy link
Collaborator

Problem occurs when two devices have the same EngineID.

Let's look at this example :

oid = [".1.3.6.1.6.3.10.2.1.1.0", ]  # SNMP-FRAMEWORK-MIB::snmpEngineID.0

s1 = easysnmp.Session(hostname="10.224.115.121", **snmp_opts)
s2 = easysnmp.Session(hostname="10.224.113.113", **snmp_opts)
s3 = easysnmp.Session(hostname="10.224.117.121", **snmp_opts)

print(f"s1 = {s1.get(oid)}")
print(f"s2 = {s2.get(oid)}")
print(f"s3 = {s3.get(oid)}")

We can see s1 and s3 have the same EngineId, but not s2 :

s1 = [<SNMPVariable value='~DQ (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]
s2 = [<SNMPVariable value='Ybb (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]
s3 = [<SNMPVariable value='~DQ (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]

But if s3 and s1 are swaped

print(f"s3 = {s3.get(oid)}")
print(f"s2 = {s2.get(oid)}")
print(f"s1 = {s1.get(oid)}")

s1 query fails :

s3 = [<SNMPVariable value='~DQ (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]
s2 = [<SNMPVariable value='Ybb (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]
Traceback (most recent call last):
  File "/home/pg218277/.config/JetBrains/PyCharmCE2022.1/scratches/bug_easysnmp.py", line 32, in <module>
    print(f"s1 = {s1.get(oid)}")
  File "/home/pg218277/git/equtils/venv/lib64/python3.6/site-packages/easysnmp/session.py", line 298, in get
    interface.get(self, varlist)
easysnmp.exceptions.EasySNMPTimeoutError: timed out while connecting to remote host

Ok, I've not been able to troubleshoot this yet, but two things come to mind:

  1. Every SNMP agent must have a unique Engine ID. It is unclear if this infers globally unique or per-device unique.
  2. Your captures from .113 and .121 report an OID I've not come across yet. 10.224.115.121 → 188.3.3.204 SNMP 174 report 1.3.6.1.6.3.15.1.1.2.0. OID reference. This leads me to believe that either Net-SNMP is using something cached because the Engine IDs match, or our instructions when we pass off to Net-SNMP are causing it to use something that is cached.

Are you able to change the Engine ID of the two devices so that they're unique from each other? If you can and it clears everything up, then my next steps will include adding testcases for when two engines run with the same ID and then diving back into our calls to Net-SNMP.

@kamakazikamikaze
Copy link
Collaborator

RFC3411 section 3.1.1.1 states that the EngineID is unique across entities within an administrative domain. Your devices may be violating this if they're on the same network. It's odd that they both have the same ID, especially if each device uses something internal to auto-generated an ID. Are these the same host?

@Guens
Copy link
Author

Guens commented Jul 24, 2022

These 2 devices are 2 different hosts. They both have exactly the same software and configuration.
They only differ with their network addresses, of course.

I tried to restart one of them, and it got a different Engine ID :

oid = [".1.3.6.1.6.3.10.2.1.1.0", ]  # SNMP-FRAMEWORK-MIB::snmpEngineID.0

s1 = easysnmp.Session(hostname="10.224.115.121", **snmp_opts)
s3 = easysnmp.Session(hostname="10.224.117.121", **snmp_opts)

print(f"s1 = {s1.get(oid)}")
print(f"s3 = {s3.get(oid)}")
s1 = [<SNMPVariable value='a{b (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]
s3 = [<SNMPVariable value='~DQ (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]

And now if I swap queries :

print(f"s3 = {s3.get(oid)}")
print(f"s1 = {s1.get(oid)}")

Now it works :

s3 = [<SNMPVariable value='~DQ (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]
s1 = [<SNMPVariable value='a{b (contains binary)' (oid='snmpEngineID', oid_index='0', snmp_type='OCTETSTR')>]

Before that, this test failed.

It seems the problem is really due to the same Engine Id situation.

FYI these 2 hosts run the same net-snmp 7.5 agent, without specify an Engine Id configuration (default conf I suppose). I don't know how 2 hosts may have same Engine Id. But it happens...

@kamakazikamikaze
Copy link
Collaborator

Well I'm glad to see the mystery cleared up. However, I'm of the opinion that we should look into determining how and why EasySNMP/Net-SNMP behaves this way for this specific scenario and fix it within the library. It may take some time but I do believe that EasySNMP shouldn't care if two different devices have the same Engine ID. So for the time being I'm going to leave this issue open.

@kamakazikamikaze kamakazikamikaze added this to the 0.2.8 milestone Jul 24, 2022
@Guens
Copy link
Author

Guens commented Jul 25, 2022

I agree with you : EasySNMP shouldn't care if two different devices have the same Engine ID.

IMHO 2 "entities", with 2 different IP addresses, should be seen as 2 SNMP different devices, regardless of EngineId. But I don't know if it's RFC compliant.

@kamakazikamikaze
Copy link
Collaborator

I agree with you : EasySNMP shouldn't care if two different devices have the same Engine ID.

IMHO 2 "entities", with 2 different IP addresses, should be seen as 2 SNMP different devices, regardless of EngineId. But I don't know if it's RFC compliant.

I agree. It would make sense to make the ID a pair of the target IP and Engine ID. However, given that some devices can have multiple IPs (routers and layer 3 switches for example), the Engine ID ensures that a device is not counted twice because it acts as the domain-specific unique identifier.

We may be at the mercy of the implementation of Net-SNMP since it handles the packet send/receive and caches the v3 information. Discovering the v3 user information being cached was a fluke since there was just one relevant result from Google that pointed me to StackOverflow. I would never have found this otherwise because the high-level Net-SNMP documentation didn't cover it. If this behavior is due to something deep within Net-SNMP and considered "the right way of handling it" by their team, you may be forced to ensure each device has a unique Engine ID to avoid the problem from recurring.

I'll try and take some time to investigate it while I also work with @mmattb to resolve long-standing issues with the codebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants