-
Notifications
You must be signed in to change notification settings - Fork 271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Review relative imports #501
Comments
Regarding # 1 (init pulling in the universe)To test without imports in
ResultsMake sure
Time with imports (first run):
Time with imports (
Time without the imports (first run): real 0m0.195s
user 0m0.000s
sys 0m0.015s Time without imports (
I should probably run them a 100 times and then avg, but i did it a few times and the results were ~ the same. IMO - There is a little speed up, that might be significant for someone processing pcaps all day, everyday. |
@kbandla agreed the changes (depending on what we decide to do) might break existing scripts, so we'll have to proceed with caution. So, for me, 2) and 3) are the most annoying.. so perhaps we can leave 1) to avoid script breakage. Although as part of this process we should definitely review exactly what
@obormot you know what this is doing? |
Yeah |
Ah, okay.. so that's another good reason to leave |
So just to add some context, while I don't know the most kosher way of dealing with relative imports, I've tried applying the following trick to several modules:
This allows running module-level unit tests individually, like
However, tests that depend on the other module initializations (discussed above) do fail in this case.
Perhaps this could be fixed by adding something like |
I think the If we moved to static mapping dictionaries between protocol numbers and their handler classes, there would be no need for the module initialization. This would involve hard-coding the mappings, but I think may lead to more easily understandable code. Once python2 is dropped (#551), we can have a much cleaner implementation of how all of this works. I've made a quick first pass in https://github.com/kbandla/dpkt/commits/feature/absolute_imports. If Combining these two things would mean we could have a completely empty One area where speeding up dpkt init is as part of use in a serverless environment (like an AWS Lambda). |
Looping in (@obormot @kbandla ) This is one of those areas that was setup before I became involved, and I never really understood the current setup. Also, I have to admit, my knowledge/experience with 'meta programming' is basically nothing. The Packet(_MetaPacket) code in dpkt.py kinda blows my mind. It's obviously super interesting/powerful, but I would have to stare at it for a LONG time to really understand what it's doing.... @crocogorical what's your understanding of Python Meta Programming? Do you feel like you have a good understanding/knowledge of the Packet() and _MetaPacket() classes? Also, just as a logistical concern, lets get a few more short term items knocked off the list (I'll do the 1.9.5 release).. and put this ticket/issue perhaps as a goal for 2.0 release. |
I've been looking at the MetaPacket setup, and it took me a while to understand it, but I think I'm there. There are some improvements that we can make for efficiency too. My understanding is that when a class is initialized (the class, not an instance of it, so during import of dpkt), say Ethernet, we follow the inheritance chain to Packet, which then does some stuff with the _MetaPacket. This becomes confusing because _MetaPacket is never initialized. _MetaPacket receives 3x arguments: clsname ( During the execution of I had a play with trying to extend this further, with some success, but the With _MetaPacket, we produce (in my opinion), a slightly confusing hierarchy, as _MetaPacket is never actually initialized. Instead the inheritance chain is from the arguments that Packet passes, which "Temp" is the named class. _MetaPacket is a class factory. This results in a method resolution chain of: >>> from dpkt.ethernet import Ethernet
>>> Ethernet.mro()
[<class 'dpkt.ethernet.Ethernet'>, <class 'dpkt.dpkt.Packet'>, <class 'dpkt.dpkt.Temp'>, <class 'object'>] We should probably change Temp to something like BasePacket just for readability. I have a theory that _MetaPacket could probably be incorporated into Packet, but haven't explored that further. |
The current dpkt package seems oddly setup IMO. Issues I see/struggle with...
The __init__.py pulls in the universe, I like the simplicity of having an empty __init__.py that does nothiing
Note: After folks have reviewed/talked about it looks like for a couple reasons we'll just leave the
__init__.py
as is.The dpkt.py file thats in dpkt/dpkt/dpkt.py means there's a naming conflict when trying to import the dpkt package
The code is full of relative imports which means you can't just run a file/test with
$ python <blah.py>
Would be nice if @kbandla, @obormot and I could maybe meet up and discuss how to go forward here, this feels like a fairly big issue IMO but we'd need to talk/agree/team up to make the changes.
Resources:
The text was updated successfully, but these errors were encountered: