Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New MachOStructure DSL #454

Merged
merged 15 commits into from
Apr 21, 2022
Merged

New MachOStructure DSL #454

merged 15 commits into from
Apr 21, 2022

Conversation

apainintheneck
Copy link
Contributor

@apainintheneck apainintheneck commented Mar 31, 2022

Here's my first attempt at a new dsl for the MachOStructure class. At the moment, it seems to be functional but not nearly as performant as the previous hardcoded version. That seems to be related to all logic stuffed in the constructor to deal with all of the edge cases needed to initialize derived classes. So I think the next step will be to move some of that out of the constructor and into the #field definition method. Of course, it's Ruby so that work won't happen at compile time but I'm hoping it will help speed it up a little.

Other than that there is also the possibility of generating the #serialize, #to_s and #to_h methods as well which I haven't explored yet. Let me know what you think of it so far.

Resolves #70

@woodruffw
Copy link
Member

Thanks! I'll try and get to this tomorrow, or over the weekend.

@woodruffw woodruffw self-requested a review March 31, 2022 22:58
@apainintheneck
Copy link
Contributor Author

I managed to pull essentially all of the logic out of the constructor and moved it into the #field method and a bunch of helpers that are used to define new methods. This makes it easier to hide meta programming methods away and the performance now seems comparable to the master branch. Also, each variable reader method now evaluates lazily which is kind of necessary since everything was moved out of the constructor.

@apainintheneck apainintheneck marked this pull request as ready for review April 4, 2022 01:41
@apainintheneck
Copy link
Contributor Author

apainintheneck commented Apr 4, 2022

One thing I'm not really sure about though is how to go about adding tests for the new DSL since most of the new methods are used to just generate methods.

Oh and I almost forgot about trying to generate the #to_s, #to_h and #serialize methods. That and dsl documentation is probably also necessary at some point. I'm gonna wait on writing that until you give me the go ahead with the dsl syntax.

From the looks of it I think that #to_h would be kind of hard but might be possible and #serialize isn't really worth it because each 3 out of 4 are different internally. Have you given any thought to those two?

@apainintheneck apainintheneck marked this pull request as draft April 4, 2022 05:59
@woodruffw
Copy link
Member

From the looks of it I think that #to_h would be kind of hard but might be possible and #serialize isn't really worth it because each 3 out of 4 are different internally. Have you given any thought to those two?

I think we can skip these for now -- I agree that #serialize isn't worth it given the different implementations, and #to_h will require more thought.

Copy link
Member

@woodruffw woodruffw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall DSL looks good to me; great work!

I'll do a more in-depth review tonight + add some thoughts about testing the DSL specifically.

lib/macho/headers.rb Outdated Show resolved Hide resolved
@apainintheneck apainintheneck marked this pull request as ready for review April 8, 2022 19:38
@apainintheneck
Copy link
Contributor Author

Okay, I changed the name of the string types to what you suggested and I like how that turned out. I also added documentation and tests (though that was a few commits ago). I think overall this is in pretty good shape. Let me know if there is anything else I need to do before this gets merged in.

I've had a lot of fun working on this and I've learned a lot so far. This has been a great experience. 😄

@woodruffw
Copy link
Member

Okay, I changed the name of the string types to what you suggested and I like how that turned out. I also added documentation and tests (though that was a few commits ago). I think overall this is in pretty good shape. Let me know if there is anything else I need to do before this gets merged in.

Fantastic! This is looking really great; I'll do a final review pass tonight or tomorrow.

I'm also glad to hear it's been a great experience! Your help keeping ruby-macho maintained is greatly appreciated 🙂

@apainintheneck apainintheneck force-pushed the new-dsl branch 5 times, most recently from ec4a187 to 5b29cfd Compare April 16, 2022 08:30
I've sketched out a basic syntax for the dsl.

Syntax:
field :(field name), :(field type)
field :(field name), :custom, :fmt => 'format string', :size => 'binary size'
field :(field name), :(field type), :mask => 'binary mask'
Added the :lcstr and :view field types and then
updated the field and init functions to work
with these new types.
Added fields to all derived class of MachOStructure
and removed unnecessary #initialize methods.

Added a work around to the FatArch64 class
because the dsl as currently constructed doesn't
permit changing the sizes of fields that already
were declared. That should be added in the future.
Started work on the load_commands.rb file.
Last one that was changed was SegmentCommand64
and I'm working top to bottom.

Also added another variable to the #field
method that allows for additional unpacking
called :unpack that takes another format string.
This is necessary because many structures have
a 64 bit version which is only slightly different.

Also, removed the custom field because it seems
unnecessary now.

Now #format and #bytesize use lazy evaluation to
facilitate the updating of fields.
Now all fields that don't have explicit endianness
have a trailing equals sign that can be used by
the Utils#special_format() method to specify it.
- Finished updating load commands in load_commands.rb
- Moved Fields module inside the MachOStructure class.
- Added special initialization for TwolevelHintsTable
  and ToolEntries classes.
- Updated references to old SIZEOF and FORMAT constants to match
  new self.format and self.bytesize methods.
All of the logic in the constructor except for one
runtime check has been moved to the self.field() method.
This in turn calls one of a bunch of def_reader() methods
that act like attr_reader but lazily define variables
internally.

I also made all of the class methods private that are
used by the dsl.
Now instead of having types that dictated endianness
like :uint32_net endianness is just an option passed
to the field method.

Ex. field name, :uint64, :endian => :big

I also moved the utils.rb file to the top of the
includes in the macho.rb file to access
Utils#specialize_format in the #field method.

I also changed the :bin_string field type to
:binary to make it more succinct.
Added tests for all field types and options.

Also, fixed bug related to calculating
MachoStructure#min_args.
Also, updated name of binary and char string types.
Now the default :string is not null padded and
the :padding option is used to specify what type
of padding is used.
@woodruffw
Copy link
Member

Sorry, fell off my stack again. Reviewing this evening.

Copy link
Member

@woodruffw woodruffw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@woodruffw woodruffw merged commit e899a77 into Homebrew:master Apr 21, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 22, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New DSL for MachOStructure
2 participants