feat(crit): add SearchPattern method on MemoryReader #163

Open: wants to merge 2 commits into base: master

Conversation

behouba
Contributor

@behouba behouba commented Mar 6, 2024

I am currently working on a feature to allow users to search for a pattern within process memory pages, as suggested by @rst0git last year. This could be used to extend checkpointctl memparse with search capability.

Currently, the method `SearchPattern(pattern string) (uint64, error)` takes the pattern to search for and returns the address of the first match, or an error if the pattern is not found. I am not sure about the implementation yet, and I am considering alternative approaches:

  • Instead of providing just the address of the first match, should we consider returning the entire memory pages where the pattern is found?

  • Another consideration is whether to return not just the first match but all occurrences of the pattern.

@rst0git, @adrianreber , @snprajwal PTAL.


codecov bot commented Mar 6, 2024

Codecov Report

Attention: Patch coverage is 86.36364%, with 6 lines in your changes missing coverage. Please review.

Project coverage is 50.06%. Comparing base (972bd31) to head (2a6f42a).

Files Patch % Lines
crit/mempages.go 86.36% 4 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #163      +/-   ##
==========================================
+ Coverage   49.38%   50.06%   +0.67%     
==========================================
  Files          22       22              
  Lines        2359     2403      +44     
==========================================
+ Hits         1165     1203      +38     
- Misses       1058     1062       +4     
- Partials      136      138       +2     


@adrianreber
Member

> Another consideration is whether to return not just the first match but all occurrences of the pattern.

That would kind of match the behaviour of grep. That is also what I would expect when searching for a pattern.

> Instead of providing just the address of the first match, should we consider returning the entire memory pages where the pattern is found?

Maybe not by default, but as an additional option, similar to grep's context option, which lets you say how many lines to show before and after the match. A whole page on the terminal sounds like a lot, so letting the user select how many bytes to show sounds like a useful option to me.

@behouba force-pushed the search-memory-pages branch 3 times, most recently from 2ae58cb to bdf1795 (March 11, 2024 22:25)
@rst0git mentioned this pull request on Mar 14, 2024
@behouba force-pushed the search-memory-pages branch 5 times, most recently from 3fac1c7 to 4fc6dbc (March 17, 2024 04:32)
@behouba
Contributor Author

behouba commented Mar 17, 2024

I have updated the SearchPattern method. Now, the function takes both the pattern and a context (specified by the number of bytes before and after the pattern). I opted for byte counts instead of line counts, considering the content of pages-*.img files, which doesn't neatly correspond to lines.
Also, the function now returns a slice of strings representing all occurrences of the pattern found in memory.

cc: @adrianreber , @rst0git , @snprajwal

@behouba behouba marked this pull request as ready for review March 17, 2024 22:55
This commit adds a new method `SearchPattern` to `MemoryReader` to search
for patterns inside the process memory pages. This method accepts regular
expressions for flexible pattern matching and a context (number of bytes
before and after the pattern match).

Signed-off-by: Kouame Behouba Manasse <behouba@gmail.com>
Signed-off-by: Kouame Behouba Manasse <behouba@gmail.com>

// Read the entire pages content
buffer, err := mr.GetMemPages(startAddr, endAddr)
if err != nil {
Member


@behouba What would the performance of the search functionality be if a container checkpoint has a pages file larger than 100 GB and checkpointctl is running on a laptop with 16 GB of memory?

Contributor Author


I guess this laptop will run out of memory because the function will try to read the entire 100GB at once.
Thank you @rst0git for pointing this out. I will work on a more optimized alternative.

Member

@rst0git rst0git Apr 1, 2024


> I guess this laptop will run out of memory because the function will try to read the entire 100GB at once.

I'm not sure. It looks like `getPage()` reads one page at a time. However, if we have a checkpoint with a 100 GB pages file (≈ 26,214,400 pages, assuming 4 KiB pages), the search functionality might be very slow.

It might be useful to create a draft PR for checkpointctl so that we can evaluate the performance and optimise it.

@snprajwal
Member

@behouba dropping a note to check if you're still working on this :)

@behouba
Contributor Author

behouba commented Jul 15, 2024

> @behouba dropping a note to check if you're still working on this :)

Hi @snprajwal! Yes, I am still working on it :)
I will soon open a draft PR on checkpointctl so we can assess the potential performance issues with the current implementation.
