
Data getting lost with sniff() and a callback function? #1789

Closed
sixmillonisntenough opened this issue Jan 14, 2019 · 4 comments

Comments

@sixmillonisntenough

This was originally a response to another issue that was closed for "no response". The problem certainly still exists: I'm hitting it routinely with lots of traffic (and thus lots of callbacks).

Re: I'm having the same issue with payload loss. Details:

With sniff() I am capturing all TCP packets matching a filter ("ip and tcp") and passing each matched packet to a callback function. The callback converts the packet payload to a string, runs a printable check against each character, and appends the printable characters to a string buffer. Since my protocol is all ASCII, this works fine; however, I am losing some packets. At some point data goes missing, and the next step in the protocol appears appended to the end of a prior, incomplete protocol message, missing its newline and command separator. The client is definitely sending the missing data.
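
The callback logic I describe is roughly this (a simplified stand-in: extract_printable and packet_callback are hypothetical names, and in the real prn= callback the payload would come from the packet, e.g. bytes(pkt[TCP].payload) in Scapy):

```python
import string

PRINTABLE = set(string.printable)

def extract_printable(payload: bytes) -> str:
    """Keep only printable ASCII characters from a packet payload."""
    return "".join(
        ch for ch in payload.decode("ascii", errors="replace")
        if ch in PRINTABLE
    )

buffer = []

def packet_callback(payload: bytes) -> None:
    # Stand-in: in the real code, `payload` is extracted from the
    # sniffed packet inside the prn= callback.
    buffer.append(extract_printable(payload))
```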

As for packet fragmentation: I have verified that the MF flag is NOT set on any of these packets; in fact, DF is set on all of them.

With tcpdump I see all the data just fine, but the sniff() callback (with the simple "ip and tcp" filter and my printable-character check) doesn't aggregate all of it all of the time; sometimes it works fine, other times it seems to miss entire packets.

I have a very, VERY high amount of network traffic and a single thread calling sniff() and the callback function. Is there any known problem with traffic load like that and, if so, is there a way to alleviate it and get everything processed? Is there any chance packets could be dropped?

I'm calling sniff() like this FWIW:
sniff(iface="enp2s1", prn=packetCallback, filter="ip and tcp", store=0)

Thanks in advance!

@guedou
Member

guedou commented Jan 15, 2019

Scapy is not able to cope with a high volume of network traffic and, unfortunately, likely never will be. We optimize the core now (#642) and then (#1735), but the improvements are limited overall.

You can try to make things slightly faster by:

  • implement your own capture logic (see Stop sniff() asynchronously #989 (comment))
  • sniff packets using the pcap Python module (set conf.use_pcap = True after installing the module)
  • use a more specific BPF filter
  • read raw data off the network and use threads to dissect it
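
For example, a more specific BPF filter lets the kernel drop irrelevant packets before Scapy ever sees them. A hypothetical helper (build_bpf is not a Scapy function, just an illustration of standard BPF filter syntax):

```python
def build_bpf(src_host: str, dst_port: int) -> str:
    """Build a BPF filter matching only TCP traffic from one source
    host to one destination port (standard libpcap filter syntax)."""
    return f"tcp and src host {src_host} and dst port {dst_port}"

# Passed to sniff() as, e.g.:
#   sniff(iface="enp2s1", prn=cb, filter=build_bpf("10.0.0.5", 4000), store=0)
```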

@sixmillonisntenough
Author

Thanks. I was thinking I would have to do that. I am not familiar, however, with how the sniff() function works across multiple concurrent threads. For instance, if I constructed, say, four different BPF filters that match mutually exclusive sets of packets, could I have four different threads each running sniff() with one of these filters?
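
In other words, something like this (a sketch of what I have in mind; capture is a stand-in for whatever actually runs sniff(iface=..., filter=bpf_filter, prn=callback, store=0) in each thread, and the portrange split is just an example of mutually exclusive filters):

```python
import threading

FILTERS = [
    "tcp and dst portrange 1-16383",
    "tcp and dst portrange 16384-32767",
    "tcp and dst portrange 32768-49151",
    "tcp and dst portrange 49152-65535",
]

def capture(bpf_filter: str, results: list) -> None:
    # Stand-in: the real version would block inside
    #   sniff(iface="enp2s1", filter=bpf_filter, prn=callback, store=0)
    results.append(bpf_filter)

def run_captures(filters):
    results = []
    threads = [threading.Thread(target=capture, args=(f, results))
               for f in filters]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```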

Also, what is the performance advantage (and are there any programmatic issues?) of the pcap module versus what Scapy currently uses? I was under the impression that the filter used pcap to begin with; I guess I am not very familiar with how it all works.

Thanks

@guedou
Member

guedou commented Jan 15, 2019 via email

@sixmillonisntenough
Author

Okay, so: read the raw packets (can I apply a filter here to reduce the amount of stuff to dissect? I just need TCP traffic from a certain source IP to a certain destination port), put them into a thread-safe queue (e.g. the Queue class in Py 2), and pop them from the queue with a pool of threads dedicated to that?
For example, the box has 16 cores; have four threads get()'ing from the queue while the raw reader grabs the traffic and puts it into that queue.
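
i.e. roughly this pattern (stdlib sketch of the reader/worker split; dissect work here is a stand-in for the actual Scapy dissection, and I'm showing Python 3's queue module rather than Py 2's Queue):

```python
import queue
import threading

SENTINEL = None

def reader(raw_packets, q):
    # Stand-in for the raw capture loop: push raw packet bytes as fast
    # as possible; the caller sends one sentinel per worker afterwards.
    for pkt in raw_packets:
        q.put(pkt)

def worker(q, out):
    # Each worker pops raw packets and does the (slow) dissection.
    while True:
        pkt = q.get()
        if pkt is SENTINEL:
            break
        out.append(len(pkt))  # stand-in for real dissection work

def run_pool(raw_packets, n_workers=4):
    q = queue.Queue()
    out = []
    workers = [threading.Thread(target=worker, args=(q, out))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    reader(raw_packets, q)
    for _ in workers:
        q.put(SENTINEL)  # shut the workers down
    for w in workers:
        w.join()
    return out
```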

Thanks
