-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New API function mps_pool_walk #34
Conversation
baa0826
to
23c1f38
Compare
78a905a
to
f48c215
Compare
a6e93c9
to
37b0ee3
Compare
e0a717d
to
d542200
Compare
Please tell me if there is anything I can do to help. |
Since this pull request is (partly) aimed at solving your problem, maybe you can have a look and check that it would have done so! Also, if you'd like to review the changes, that would be helpful — in particular, is the documentation for the new feature clear and easy to follow? |
Thank you for the updates! I have looked through the suggested changes (mostly focusing on documentation), and it looks good to me. The documentation is clear and easy to follow. The only thing that might be worth pointing out in the documentation is that if the area scanning function modifies any references, it should scan them after they have been modified. If I understand the semantics of the summaries correctly, it would not break the MPS if the area scanning function scanned some references before and after they are updated, just cause some additional scanning in future collections. One detail that might be relevant to document is whether or not it is possible to allocate memory from the MPS from the area scanning function. I assume the answer is "no", since the MPS is not reentrant in general. If that is the case, the documentation is fine as it is. If it is possible in some circumstances, it would be useful to note that there (if this would be possible, I could probably avoid an extra heap traversal in some cases. I assume other use-cases would benefit as well). I think this PR is enough to implement what I need in terms of updating the heap. With it, I will be able to find and update all references in the entire program, so I think it would cover almost any use case I might come up with as well. Allocating memory in the area scan function would probably help in some situations, but I am able to work around that limitation, so this PR does not need to be extended for that. |
Good point — I will emphasize that in the docs.
That's right — it's not possible to call back into the MPS from the scanning function. This is mentioned in the Cautions section of the Object formats chapter, but there ought to be something similar in the Scanning chapter since the constraints are not identical.
This is interesting — can you describe the use case? (For future reference.) |
This would be relevant when a field has been added to a type, so that I need to grow objects of that type. Currently, I can solve this by: 1: walking the heap to find all instances of the object and recording them somewhere. 2: allocating copies of these objects (after walking the heap). 3: walking the heap again, updating all references to the copied objects. If it was possible to allocate memory while walking the heap, I could probably merge it into one heap walk. Each time There are still some details that I need to work out here to get all edge cases correct, for example when I have a linked structure of objects that all need to be copied. The details for this depends on whether or not allocations done during the heap walk will be walked or not. Regardless of the semantics, I think it would be doable. But don't worry too much about this now. The three step process is probably fast enough (updating a running program does not happen too often, so if it takes an extra 100 ms it is fine), and I can probably (ab)use the allocation point protocol to get a buffer I can allocate some objects from without reentering the MPS as a starting point for further optimization/experimentation if I need more speed. |
This will allow us to reuse the scanning protocol with an arbitrary area scanning function (replacing traceFormatScan) in order to implement formatted object walking without an extra segment method. Don't insist on scanning only grey segments: we want to be able to reuse the scan protocol for walking, when the segments are black.
d542200
to
c05079f
Compare
c05079f
to
1dd501e
Compare
For the record, "your problem" refers to a detailed mail thread starting at https://info.ravenbrook.com/mail/2020/08/20/21-01-34/0.txt |
As part of work on #110 Ravenbrook is assessing the risk to Configura of all changes since 1.115 . This is a risky change because it touches the tracer. It's an important change because it could replace the custom "transforms" code deployed at Configura as part of resolving #110. Therefore we'd like to put this through formal review, even though it's already merged. Executing proc.review.entry.
The mail thread leading to this work, discovered by
|
job004090 is a source for the review, because we must ensure we don't regress. |
[Actually written by @rptb1 with @thejayps ] Executing proc.review.plan:
|
[Actually written by @rptb1 with @thejayps ]
Possible PI here to add more detail to proc.review.plan for size estimation. |
[Actually written by @rptb1 with @thejayps ] More possible PI: Discussing how we paired on planning here, we could perhaps have a short guide on how to use MPS procedure documents effectively, so that PIs take hold. Repeatedly re-reading the procedure and all of its bullet points is key -- treating the procedure as a checklist of what's been done more than set of steps. |
I read the mail thread with a focus on requirements. |
On 2023-03-01 21:08, UNAA008 wrote:
I read the mail thread with a focus on requirements.
Here's what I wrote
Review_mps_pool_walk_2023-03.PDF
<https://github.com/Ravenbrook/mps/files/10865310/Review_mps_pool_walk_2023-03.PDF>
Apologies that something has disabled the hyperlinks in this rendering.
You could attach the HTML.
|
GITHUB won't allow HTML attachments but this ZIP file contains the HTML and CSS. |
Executing proc.review.kickoff
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent most of the review time examining test code to see what requirements were explicitly checked.
AMCSSTH.C
Checks that "walking works" while other threads continue to allocate.
I don't think walked objects are modified during walking.
Tests 104.C and 97.C are better documented and appear to address req.walk.all and req.walk.modify.
It isn't clear if req.walk.examine or req.walk.perf are tested.
I can't see any tests for serialisation.
In the documentation I noticed that 43.5 References contains a broken hyperlink.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 minor defects found: details are in in-line comments.
:c:data:`closure` is an arbitrary pointer that will be passed to | ||
:c:data:`scan_area`. | ||
|
||
The scanning function is called multiple times with disjoint areas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m: clarity - although the documentation is clear in explaining what the scanner must do, it's less clear what the scanner might be expected or permitted to do in common applications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#33 title mentions "common applications" and this is a reference to that. This is referring to the manual's documentation of mps_pool_walk. Something like "This function can be used to update strawberries to raspberries everywhere in the heap." This might be a general comment on the manual in fact, and lead to some rules for how to write the manual. Quite cheap to add a few sentences. Writing out a detailed use case might be something for the Scheme example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Executing proc.review.check
- C. In https://info.ravenbrook.com/mail/2020/08/31/07-39-12/0/, we should consider what it would take to walk objects without parking the arena. It could be a test of multiple traces.
- C. In https://info.ravenbrook.com/mail/2020/08/23/16-19-00/0/, we should study Storm's reloading and see if it could help Clasp or meet Clasp's loading requirements.
- C. In https://info.ravenbrook.com/mail/2020/08/25/07-06-48/0/ there is an implied requirement to remove all protection in the MPS. I think we provided a "portmortem" type of call to do this for Configura. The next collection will have to scan the world, but maybe that's acceptable sometimes.
- m. design.mps.walk.sol.walk.all says it only visits live objects. How does it determine that? Does it mean reachable? rule.generic.clear
- m.
Line 124 in 1dd501e
* .assume.parked: The root walker must be invoked with a parked - m.
Line 124 in 1dd501e
* .assume.parked: The root walker must be invoked with a parked - Checking incomplete but stopped at 16:13 for tea and logging.
|
||
_`.case.serialize`: A language runtime that offers serialization and | ||
deserialization of the heap will need to walk all formatted objects in | ||
order to identify references to globals (during serialization) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. "globals" not defined or linked. rule.generic.clear.
------------ | ||
|
||
_`.req.walk.all`: It must be possible for the client program to visit | ||
all automatically managed formatted objects using a callback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. Why "using a callback"? Do we mean "apply a visitor function"? rule.generic.clear
_`.req.walk.all`: It must be possible for the client program to visit | ||
all automatically managed formatted objects using a callback. | ||
|
||
_`.req.walk.assume-format`: The callback should not need to switch on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. What does it mean to "switch on the format"? rule.generic.clear
_`.req.walk.modify`: It must be possible for the callback to modify | ||
the references in the objects. | ||
|
||
_`.req.walk.overhead`: The overhead of calling the callback should be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. What does this mean and why? rule.generic.clear
* white set means that the MPS_FIX1 test will always fail and | ||
* _mps_fix2 will never be called. */ | ||
res = TraceCreate(&trace, arena, TraceStartWhyWALK); | ||
/* Fail if no trace available. Unlikely due to .assume.parked. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. It's more than unlikely. Could add NOTREACHED to this case. rule.code.minimal
if (res != ResOK) | ||
return res; | ||
trace->white = ZoneSetEMPTY; | ||
trace->state = TraceFLIPPED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
M. There's an assumption here about how traces work and that you can force flip a trace safely by updating a field and not going through TraceFlip
. The assumption is not documented or linked to e.g. traceFlip, or the design of the tracer. Could go badly wrong. rule.code.assume.
arena->flippedTraces = TraceSetAdd(arena->flippedTraces, trace); | ||
ts = TraceSetSingle(trace); | ||
|
||
ScanStateInit(&ss, ts, arena, RankEXACT, trace->white); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. Why RankEXACT
? Probably has no effect, but if so, we should say so. rule.generic.clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raise: Created issue #195
/* TraceScanFormat -- scan a formatted area of memory for references | ||
* | ||
* This is a wrapper for format scanning functions, which should not | ||
* otherwise be called directly from within the MPS. This function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. Why should they not be called? rule.generic.clear
|
||
/* Synthesize a flipped trace with an empty white set. The empty | ||
* white set means that the MPS_FIX1 test will always fail and | ||
* _mps_fix2 will never be called. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. Is this dependency documented at both ends? rule.code.deps.
Executing proc.review.log
|
Executing proc.review.plan
|
Executing proc.review.kickoff |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1M 1m 2c 1q:
M: numbered tests eg 104.c and 97.c which were changed in this merge are not documented in the manual or source code index. How does a developer know that they might need to change an arbitrarily named piece of test source when in order to test new functionality? This introduces a risk that tests become out of date and fail to find defects. It also presents a danger in that reviewers can't easily identify whether the appropriate test files have been modified when changes are reviewed.
m: without considering the numbered tests, test coverage for pool_walk is introduced in amcss, amcssth and sncss. Why were these chosen? There may be defects for other pools and configurations that don't test pool_walk, even though the intention for the pool_walk functionality is to make it as far as possible pool-agnostic. If this decision was documented, I haven't yet found the documentation.
comment: although it appears the intention is for mps_pool_walk to replace mps_arena_formatted_objects_walk, which is described as deprecated in the merge, test coverage for the latter is preserved where it exists.
@@ -343,11 +343,10 @@ _`.interface.tags.alloc`: Two functions to extend the existing | |||
``mps_alloc()`` (request.???.??? proposes to remove the varargs) | |||
|
|||
``void (*mps_objects_step_t)(mps_addr_t addr, size_t size, mps_fmt_t format, mps_pool_t pool, void *tag_data, void *p)`` | |||
``void mps_pool_walk(mps_arena_t arena, mps_pool_t pool, mps_objects_step_t step, void *p)`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
q: why was mps_pool_walk no longer included in this document? (although a single reference to mps_pool_walk does appear in this document in master)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the diffs might be misleading in some way. Needs investigation. How could this be deleted before it existed?
Visit all :term:`formatted objects` in a :term:`pool`. The pool | ||
must be :term:`automatically managed <automatic memory | ||
management>`. The pool's :term:`arena` must be in the | ||
:term:`parked state`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment: this document does not explain that mps_pool_walk replaces the deprecated function mps_arena_formatted_objects_walk (although the description of the latter in deprecated.rst advises users to use mps_pool_walk instead)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Executing proc.review.check
- Start time 14:09.
- End time 14:58
- 11 minor, 3 comments
:c:data:`closure` is an arbitrary pointer that will be passed to | ||
:c:data:`scan_area`. | ||
|
||
The scanning function is called multiple times with disjoint areas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#33 title mentions "common applications" and this is a reference to that. This is referring to the manual's documentation of mps_pool_walk. Something like "This function can be used to update strawberries to raspberries everywhere in the heap." This might be a general comment on the manual in fact, and lead to some rules for how to write the manual. Quite cheap to add a few sentences. Writing out a detailed use case might be something for the Scheme example.
AVERT(ScanState, ss); | ||
AVER(refIO != NULL); | ||
|
||
NOTREACHED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C. Because walkNoFix isn't reached, the pools aren't doing any real garbage collecting. We're only calling their scanners. So mps_pool_walk could be used to try out multiple traces incrementally.
/* walkNoFix -- third-stage fix function for poolWalk. | ||
* | ||
* The second-stage fix is not called via poolWalk; so this is not | ||
* called either. The NOTREACHED checks that this is the case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. The statement "is not called" needs to link to why that is (currently design.mps.walk.sol.walk.maint), in the design, and the other code that ensures it. rule.code.dep
} | ||
|
||
|
||
/* poolWalkScan -- format scanner for poolWalk */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. Explain why this exists. (To unpack and pass the closure to the area scanner.) rule.generic.clear, rule.code.justified
@@ -402,6 +402,9 @@ typedef struct ScanStateStruct { | |||
Sig sig; /* <design/sig> */ | |||
struct mps_ss_s ss_s; /* .ss <http://bash.org/?400459> */ | |||
Arena arena; /* owning arena */ | |||
mps_fmt_scan_t formatScan; /* callback for scanning formatted objects */ | |||
mps_area_scan_t areaScan; /* ditto via the area scanning interface */ | |||
void *areaScanClosure; /* closure argument for areaScan */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C. Rather than adding fields to the ScanStateStruct that are only used in Pool Walk, why not extend ScanStateStruct locally, like rootsStepClosureStruct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. In fact, why is Pool Walk different from Roots Walk, when it's basically using the same trick? Why don't they share more code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. The comment about improvement in Root Walking seems to have been ignored. "there's no direct support for creating a trace without also condemning part of the heap. (@@@@ This looks like a useful candidate for inclusion in the future)"
|
||
AVER(totalReturn != NULL); | ||
AVERT(Seg, seg); | ||
AVERT(ScanState, ss); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. If we expect loSegScan to only be called during a Pool Walk, then it should assert that it's in a Pool Walk. rule.code.minimal.
p = BufferLimit(buffer); | ||
continue; | ||
} | ||
/* since we skip over the buffered area we are always */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. This logic is duplicated several times. Grep for the string above. Copy-paste coding?
@@ -86,6 +86,24 @@ static void test_stepper(mps_addr_t object, mps_fmt_t fmt, mps_pool_t pool, | |||
} | |||
|
|||
|
|||
/* area_scan -- area scanning function for mps_pool_walk */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. Duplicate of code in amcss.c. Move both to fmtdy,c?
@@ -312,15 +312,18 @@ called via the generic function ``SegBlacken()``. | |||
``typedef Res (*SegScanMethod)(Bool *totalReturn, Seg seg, ScanState ss)`` | |||
|
|||
_`.method.scan`: The ``scan`` method scans all the grey objects on the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m. But the restriction that the objects are grey was lifted elsewhere in this branch, and is that is necessary for the implementation.
_`.method.walk.deprecated`: The ``walk`` method is deprecated along | ||
with the public functions ``mps_arena_formatted_objects_walk()`` and | ||
``mps_amc_apply()`` and will be removed along with them in a future | ||
release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C. This might be premature. Noting here because this statement and other deprecations in this branch will need to change depending.
Executing proc.review.log
NM. Similarly with the stress and coverage tests in code, though they are a bit more proximate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird extra review.
@@ -343,11 +343,10 @@ _`.interface.tags.alloc`: Two functions to extend the existing | |||
``mps_alloc()`` (request.???.??? proposes to remove the varargs) | |||
|
|||
``void (*mps_objects_step_t)(mps_addr_t addr, size_t size, mps_fmt_t format, mps_pool_t pool, void *tag_data, void *p)`` | |||
``void mps_pool_walk(mps_arena_t arena, mps_pool_t pool, mps_objects_step_t step, void *p)`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the diffs might be misleading in some way. Needs investigation. How could this be deleted before it existed?
Executing proc.review.brainstorm
|
Fixes #33 (Walk interface is not suitable for common applications)