feat(crit): add SearchPattern method on MemoryReader #163
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #163      +/-   ##
==========================================
+ Coverage   49.38%   50.06%   +0.67%
==========================================
  Files          22       22
  Lines        2359     2403      +44
==========================================
+ Hits         1165     1203      +38
- Misses       1058     1062       +4
- Partials      136      138       +2
That would kind of match the behaviour of grep. That is also what I would expect when searching for a pattern.
Maybe not as the default, but as an additional option, similar to grep's context option, where you can say how many lines to show before and after the match. A whole page on the terminal sounds like a lot, so letting the user select how many bytes to show sounds like a useful option to me.
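To make the grep-like context idea concrete, here is a minimal Go sketch. It assumes the memory to search is already available as a byte slice; the `Match` type and the `searchWithContext` function are hypothetical names for illustration and are not part of the library.

```go
package main

import "regexp"

// Match is a hypothetical result type: where the pattern was found
// and the bytes surrounding it.
type Match struct {
	Offset  int    // byte offset of the match start within buf
	Context []byte // the match plus up to `context` bytes on each side
}

// searchWithContext returns every regular-expression match in buf together
// with up to `context` bytes before and after it, similar in spirit to
// grep's context option. The window is clamped to the bounds of buf.
func searchWithContext(buf []byte, pattern string, context int) ([]Match, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	if context < 0 {
		context = 0
	}
	var matches []Match
	for _, loc := range re.FindAllIndex(buf, -1) {
		start := loc[0] - context
		if start < 0 {
			start = 0
		}
		end := loc[1] + context
		if end > len(buf) {
			end = len(buf)
		}
		matches = append(matches, Match{Offset: loc[0], Context: buf[start:end]})
	}
	return matches, nil
}
```

With a context of 0 only the matched bytes are returned, so the option could default to 0 and be raised by the user when more surrounding data is wanted.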
I have updated the cc: @adrianreber, @rst0git, @snprajwal
This commit adds a new method `SearchPattern` to `MemoryReader` to search for patterns inside the process memory pages. This method accepts regular expressions for flexible pattern matching and a context (the number of bytes to include before and after each pattern match).
Signed-off-by: Kouame Behouba Manasse <behouba@gmail.com>
// Read the entire pages content
buffer, err := mr.GetMemPages(startAddr, endAddr)
if err != nil {
@behouba What would the performance of the search functionality be if we have a container checkpoint with a pages file larger than 100 GB and checkpointctl is running on a laptop with 16 GB of memory?
I guess this laptop will run out of memory because the function will try to read the entire 100GB at once.
Thank you @rst0git for pointing this out. I will work on an optimization alternative.
> I guess this laptop will run out of memory because the function will try to read the entire 100GB at once.

I'm not sure. It looks like `getPage()` reads one page at a time. However, if we have a checkpoint with a 100 GB pages file (≈ 26,214,400 pages of 4 KiB), the search functionality might be very slow.
It might be useful to create a draft PR for checkpointctl so that we can evaluate the performance and optimise it.
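One way to keep memory use bounded regardless of the pages file size is to scan it in fixed-size chunks instead of loading it at once. The Go sketch below is only an illustration under stated assumptions: the data is exposed as an `io.Reader`, and `searchChunked` and `searchBytesChunked` are hypothetical names, not functions of the library. A tail of `overlap` bytes is carried from one chunk to the next so that matches straddling a chunk boundary are still found; matches longer than `overlap` may be truncated or missed.

```go
package main

import (
	"bytes"
	"errors"
	"io"
	"regexp"
)

// searchChunked scans r in chunks of chunkSize bytes and returns the absolute
// offsets of the matches. Only chunkSize+overlap bytes are held in memory.
func searchChunked(r io.Reader, re *regexp.Regexp, chunkSize, overlap int) ([]int64, error) {
	if chunkSize <= 0 || overlap < 0 || overlap >= chunkSize {
		return nil, errors.New("need chunkSize > overlap >= 0")
	}
	var (
		offsets []int64
		base    int64 // absolute offset of window[0]
		lastEnd int64 // absolute end of the last reported match
	)
	window := make([]byte, 0, chunkSize+overlap)
	chunk := make([]byte, chunkSize)
	for {
		n, err := io.ReadFull(r, chunk)
		final := err == io.EOF || err == io.ErrUnexpectedEOF
		if err != nil && !final {
			return nil, err
		}
		window = append(window, chunk[:n]...)
		// Matches starting in the last `overlap` bytes are deferred to the
		// next window, where they are seen in full (unless this is the end).
		limit := len(window) - overlap
		if final {
			limit = len(window)
		}
		for _, loc := range re.FindAllIndex(window, -1) {
			start := base + int64(loc[0])
			if loc[0] >= limit || start < lastEnd {
				continue
			}
			offsets = append(offsets, start)
			lastEnd = base + int64(loc[1])
		}
		if final {
			return offsets, nil
		}
		// Carry the tail over into the next window.
		base += int64(len(window) - overlap)
		copy(window, window[len(window)-overlap:])
		window = window[:overlap]
	}
}

// searchBytesChunked is a convenience wrapper for in-memory data.
func searchBytesChunked(data []byte, pattern string, chunkSize, overlap int) ([]int64, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	return searchChunked(bytes.NewReader(data), re, chunkSize, overlap)
}
```

Bounding memory this way does not by itself make a scan of roughly 26 million pages fast, so it would still be worth measuring on a large checkpoint as suggested above.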
@behouba dropping a note to check if you're still working on this :)
Hi @snprajwal! Yes, I am still working on it :)
I am currently working on a feature that lets users search for a pattern within process memory pages, as suggested by @rst0git last year. It could be used to extend `checkpointctl memparse` with a search capability.

Currently, the method `SearchPattern(pattern string) (uint64, error)` takes the pattern to search for as an argument and returns the address of the first match, or an error if the pattern is not found. I am not sure about the implementation yet, and I'm considering alternative approaches:

- Instead of providing just the address of the first match, should we return the entire memory pages where the pattern is found?
- Should we return all occurrences of the pattern rather than only the first match?

@rst0git, @adrianreber, @snprajwal PTAL.
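To compare the two alternatives, here is a rough Go sketch that covers both: a `limit` argument selects between the first match only (`1`) and all matches (`-1`), and each hit carries the whole page that contains the start of the match. The `PageHit` type, the `findPages` function, and the flat `buf` input are hypothetical and only meant to support the discussion, not a proposal for the final signature.

```go
package main

import (
	"errors"
	"regexp"
)

// PageHit is a hypothetical result type describing one match.
type PageHit struct {
	Offset    int    // byte offset of the match start within buf
	PageIndex int    // index of the page containing the match start
	Page      []byte // content of that page (the last page may be short)
}

// findPages returns at most `limit` matches (all matches if limit < 0),
// each with the page in which the match starts.
func findPages(buf []byte, pattern string, pageSize, limit int) ([]PageHit, error) {
	if pageSize <= 0 {
		return nil, errors.New("pageSize must be positive")
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var hits []PageHit
	for _, loc := range re.FindAllIndex(buf, limit) {
		idx := loc[0] / pageSize
		start := idx * pageSize
		end := start + pageSize
		if end > len(buf) {
			end = len(buf)
		}
		hits = append(hits, PageHit{Offset: loc[0], PageIndex: idx, Page: buf[start:end]})
	}
	return hits, nil
}
```

In this sketch an empty result means the pattern was not found, so the caller decides whether that is an error. A match that crosses a page boundary is attributed to the page where it starts.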