New API function mps_pool_walk #34

Merged
merged 5 commits into from
Jan 17, 2022

gareth-rees
Member

Fixes #33 (Walk interface is not suitable for common applications)

@gareth-rees gareth-rees force-pushed the branch/2020-08-31/walk branch 8 times, most recently from baa0826 to 23c1f38 Compare August 31, 2020 16:44
@gareth-rees gareth-rees marked this pull request as ready for review August 31, 2020 17:40
code/trace.c (outdated review thread, resolved)
@gareth-rees gareth-rees force-pushed the branch/2020-08-31/walk branch 3 times, most recently from 78a905a to f48c215 Compare September 7, 2020 17:58
@gareth-rees gareth-rees force-pushed the branch/2020-08-31/walk branch 2 times, most recently from a6e93c9 to 37b0ee3 Compare September 26, 2020 16:45
@gareth-rees gareth-rees force-pushed the branch/2020-08-31/walk branch 2 times, most recently from e0a717d to d542200 Compare September 4, 2021 15:59
@fstromback
Contributor

Please tell me if there is anything I can do to help.

code/amcss.c (review thread, resolved)
@gareth-rees
Member Author

> Please tell me if there is anything I can do to help.

Since this pull request is (partly) aimed at solving your problem, maybe you can have a look and check that it would have done so! Also, if you'd like to review the changes, that would be helpful — in particular, is the documentation for the new feature clear and easy to follow?

@fstromback
Contributor

Thank you for the updates!

I have looked through the suggested changes (mostly focusing on documentation), and it looks good to me. The documentation is clear and easy to follow. The only thing that might be worth pointing out in the documentation is that if the area scanning function modifies any references, it should scan them after they have been modified. If I understand the semantics of the summaries correctly, it would not break the MPS if the area scanning function scanned some references before and after they are updated, just cause some additional scanning in future collections.

One detail that might be relevant to document is whether or not it is possible to allocate memory from the MPS from the area scanning function. I assume the answer is "no", since the MPS is not reentrant in general. If that is the case, the documentation is fine as it is. If it is possible in some circumstances, it would be useful to note that there (if this would be possible, I could probably avoid an extra heap traversal in some cases. I assume other use-cases would benefit as well).

I think this PR is enough to implement what I need in terms of updating the heap. With it, I will be able to find and update all references in the entire program, so I think it would cover almost any use case I might come up with as well. Allocating memory in the area scan function would probably help in some situations, but I am able to work around that limitation, so this PR does not need to be extended for that.
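
A minimal sketch of the point about modifying references, using stand-in types rather than the MPS headers. The names `ref_t`, `replace_closure_s` and `replace_area_scan` are hypothetical, and a real area scanning function would also receive a scan state and fix each reference inside `MPS_SCAN_BEGIN` / `MPS_SCAN_END`. The sketch only illustrates the ordering: update the reference first, then treat the updated value as the one that has been scanned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference; this is NOT an MPS type. */
typedef void *ref_t;

typedef struct replace_closure_s {
    ref_t old_ref;      /* reference to look for */
    ref_t new_ref;      /* reference to store in its place */
    size_t seen;        /* references visited, counted after any update */
    size_t replaced;    /* references rewritten */
} replace_closure_s;

/* Visit every reference in [base, limit). Each slot is updated first and
 * only then counted as scanned, so the value considered "scanned" is the
 * value that is actually left in memory. */
static int replace_area_scan(void *base, void *limit, void *closure)
{
    replace_closure_s *c = closure;
    ref_t *p = base;
    ref_t *end = limit;

    for (; p < end; ++p) {
        if (*p == c->old_ref) {
            *p = c->new_ref;
            ++c->replaced;
        }
        /* A real scanner would fix the (updated) reference here. */
        ++c->seen;
    }
    return 0;
}
```

The closure carries both the replacement mapping and the counters, so the scanner itself needs no global state.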

@gareth-rees
Member Author

> The only thing that might be worth pointing out in the documentation is that if the area scanning function modifies any references, it should scan them after they have been modified.

Good point — I will emphasize that in the docs.

> One detail that might be relevant to document is whether or not it is possible to allocate memory from the MPS from the area scanning function. I assume the answer is "no", since the MPS is not reentrant in general. If that is the case, the documentation is fine as it is. If it is possible in some circumstances, it would be useful to note that there (if this would be possible, I could probably avoid an extra heap traversal in some cases. I assume other use-cases would benefit as well).

That's right — it's not possible to call back into the MPS from the scanning function. This is mentioned in the Cautions section of the Object formats chapter, but there ought to be something similar in the Scanning chapter since the constraints are not identical.

> Allocating memory in the area scan function would probably help in some situations

This is interesting — can you describe the use case? (For future reference.)

@fstromback
Contributor

fstromback commented Sep 18, 2021

> > Allocating memory in the area scan function would probably help in some situations
>
> This is interesting — can you describe the use case? (For future reference.)

This would be relevant when a field has been added to a type, so that I need to grow objects of that type. Currently, I can solve this by:

  1. walking the heap to find all instances of the object and recording them somewhere;
  2. allocating copies of these objects (after walking the heap);
  3. walking the heap again, updating all references to the copied objects.

If it was possible to allocate memory while walking the heap, I could probably merge it into one heap walk. Each time fix would be called, I examine the object referred to by the pointer; if it is an object I need to update, I make a larger copy with the required changes and replace the original with a forwarding object. When I encounter a pointer to a forwarding object, I can then replace the pointers for remaining instances quite easily (this process is quite similar to what I understand happens in the amc pool during a GC). Note: it might not be possible to use forwarding objects in this manner, as it might confuse the MPS later on, but I can make something that does not look like a forwarding object to the MPS, or use a separate data structure for the same thing.

There are still some details that I need to work out here to get all edge cases correct, for example when I have a linked structure of objects that all need to be copied. The details for this depend on whether or not allocations done during the heap walk will be walked or not. Regardless of the semantics, I think it would be doable.

But don't worry too much about this now. The three step process is probably fast enough (updating a running program does not happen too often, so if it takes an extra 100 ms it is fine), and I can probably (ab)use the allocation point protocol to get a buffer I can allocate some objects from without reentering the MPS as a starting point for further optimization/experimentation if I need more speed.
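
The three-step process described in this comment can be sketched as follows. This is a toy model under stated assumptions, not code against the MPS: the "heap" is a plain array of pointers, `obj_s`, `update_s`, `collect`, `copy_grown` and `redirect` are hypothetical names, and `malloc` stands in for allocation done outside the walk. Each object has a single outgoing reference to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { MAX_OBJS = 16 };

typedef struct obj_s {
    int type;               /* hypothetical type tag */
    size_t size;            /* hypothetical size, in fields */
    struct obj_s *ref;      /* single outgoing reference, may be NULL */
} obj_s;

typedef struct update_s {
    int old_type;               /* type whose instances must grow */
    obj_s *old_objs[MAX_OBJS];  /* step 1: instances found */
    obj_s *new_objs[MAX_OBJS];  /* step 2: their replacements */
    size_t count;
} update_s;

/* Step 1: walk the heap and record instances. No allocation happens here. */
static void collect(obj_s **heap, size_t n, update_s *u)
{
    for (size_t i = 0; i < n; ++i)
        if (heap[i]->type == u->old_type && u->count < MAX_OBJS)
            u->old_objs[u->count++] = heap[i];
}

/* Step 2: after the walk has finished, allocate the grown copies. */
static int copy_grown(update_s *u, size_t extra)
{
    for (size_t i = 0; i < u->count; ++i) {
        obj_s *copy = malloc(sizeof *copy);
        if (copy == NULL)
            return -1;
        *copy = *u->old_objs[i];
        copy->size += extra;
        u->new_objs[i] = copy;
    }
    return 0;
}

/* Map a reference to its replacement, or return it unchanged. */
static obj_s *lookup(const update_s *u, obj_s *ref)
{
    for (size_t i = 0; i < u->count; ++i)
        if (u->old_objs[i] == ref)
            return u->new_objs[i];
    return ref;
}

/* Step 3: walk again, redirecting references in old and new objects. */
static void redirect(obj_s **heap, size_t n, const update_s *u)
{
    for (size_t i = 0; i < n; ++i)
        heap[i]->ref = lookup(u, heap[i]->ref);
    for (size_t i = 0; i < u->count; ++i)
        u->new_objs[i]->ref = lookup(u, u->new_objs[i]->ref);
}
```

The second loop in `redirect` covers the linked-structure edge case mentioned above: a copy made in step 2 may itself still point at an object that was replaced.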

This will allow us to reuse the scanning protocol with an arbitrary area
scanning function (replacing traceFormatScan) in order to implement
formatted object walking without an extra segment method.

Don't insist on scanning only grey segments: we want to be able to
reuse the scan protocol for walking, when the segments are black.
@Ravenbot Ravenbot merged commit 79df693 into master Jan 17, 2022
@rptb1
Member

rptb1 commented Feb 14, 2023

> Since this pull request is (partly) aimed at solving your problem

For the record, "your problem" refers to a detailed mail thread starting at https://info.ravenbrook.com/mail/2020/08/20/21-01-34/0.txt

@rptb1
Member

rptb1 commented Feb 14, 2023

As part of work on #110, Ravenbrook is assessing the risk to Configura of all changes since 1.115. This is a risky change because it touches the tracer. It's an important change because it could replace the custom "transforms" code deployed at Configura as part of resolving #110.

Therefore we'd like to put this through formal review, even though it's already merged.

Executing proc.review.entry.

  1. Start time 18:37.
  2. proc.review.entry.criteria: I'm applying entry.universal, entry.design, entry.impl.
  3. entry.universal.author: I've asked @gareth-rees to agree to the review. I'm going to assume agreement for now since he previously requested review on 2020-09-01.
  4. entry.universal.source-available: Source documents include a detailed mail discussion (see below) and the Configura custom transforms design and implementation. All available. Reading the mail thread will be part of proc.review.plan.homework.
  5. entry.universal.source-reviewed: Transforms were reviewed and have been in deployment for years without any failures reported. Other sources are unreviewed.
  6. entry.design.rfc: This branch introduces design.mps.walk and changes other designs. I believe these have had informal review, partly from @fstromback , and I'll include them in proc.review.plan.homework.
  7. Entry passed.
  8. End time 18:52.
  9. Entry took 15 minutes.

The mail thread leading to this work, discovered by p4 grep, is:

  1. https://info.ravenbrook.com/mail/2020/08/20/21-01-34/0.txt
  2. https://info.ravenbrook.com/mail/2020/08/23/12-48-45/0.txt
  3. https://info.ravenbrook.com/mail/2020/08/23/13-46-40/0.txt
  4. https://info.ravenbrook.com/mail/2020/08/23/16-19-00/0.txt
  5. https://info.ravenbrook.com/mail/2020/08/25/07-06-48/0.txt
  6. https://info.ravenbrook.com/mail/2020/08/25/15-26-47/0.txt
  7. https://info.ravenbrook.com/mail/2020/08/26/16-43-11/0.txt
  8. https://info.ravenbrook.com/mail/2020/08/26/21-48-50/0.txt
  9. https://info.ravenbrook.com/mail/2020/08/30/15-01-23/0.txt
  10. https://info.ravenbrook.com/mail/2020/08/30/19-17-03/0.txt
  11. https://info.ravenbrook.com/mail/2020/08/31/07-05-28/0.txt
  12. https://info.ravenbrook.com/mail/2020/08/31/07-39-12/0.txt
  13. https://info.ravenbrook.com/mail/2020/08/31/16-31-32/0.txt
  14. https://info.ravenbrook.com/mail/2020/09/03/13-02-25/0.txt
  15. https://info.ravenbrook.com/mail/2020/09/03/19-49-47/0.txt
  16. https://info.ravenbrook.com/mail/2020/09/03/21-12-11/0.txt
  17. https://info.ravenbrook.com/mail/2020/09/04/07-46-39/0.txt

@rptb1
Member

rptb1 commented Feb 15, 2023

job004090 is a source for the review, because we must ensure we don't regress.

@thejayps thejayps added the pending Something needs doing, even if closed. label Feb 20, 2023
@thejayps
Contributor

thejayps commented Feb 21, 2023

[Actually written by @rptb1 with @thejayps ]

Executing proc.review.plan:

  1. Started 09:23.
  2. Pairing this procedure with @thejayps so that he can observe how I do it. This is quite a complicated change so there are things to discuss. These notes will be more detailed and discursive than usual.
  3. Eyeballing the diffs, we can see that this is a high-risk change (touches trace.c and pools).
  4. Size of change is +1000 -300 ish. Mostly code change. Design doc change is just a copy of the issue plus references updated. The manual has some things moved around and about 100 lines of new doc. About 200 lines of new test case code. About 100 lines of misc. That leaves about 600 lines of code, so 1 h @ 10 loc/min.
  5. Thinking about roles and tactics:
  6. proc.review.role.check.correctness is most important. Part of that will be to check against the older transforms code (which we know to work).
  7. proc.review.role.check.consistency with documentation is very important.
  8. proc.review.role.check.source with the original issues: there's @fstromback's mail thread but, most importantly, against Configura's requirements.
  9. That raises some extra proc.review.plan.homework: refreshing our understanding of Configura's requirements. I'm pretty sure Configura would be OK with us documenting their requirements for this in public. We should ask.
  10. Thinking about skills and knowledge, roughly: @rptb1 for correctness, @thejayps for consistency, @UNAA008 on sources. To be clear, we should all do the homework, but @UNAA008's checking role will be to check between that stuff and the change. Tracking down transforms and Configura requirements is probably homework for @rptb1 and @thejayps: check Configura current usage with them and source code.
  11. proc.review.plan.tactics: Using 10 loc/minute rule: Code checking is about 1 h @ 10 loc/min for code correctness. The rest is all smaller. It's feasible to review this in one session. We might spill over.
  12. proc.review.plan.schedule: We discussed scheduling in a meeting yesterday. Not this week. Maybe Thursday 2023-03-02 13:00?
  13. proc.review.plan.source: See New API function mps_pool_walk #34 (comment) plus Walk interface is not suitable for common applications #33 plus documents @rptb1 digs up from Transforms during homework.
  14. proc.review.plan.homework: We all need to read the requirements: mail thread, issue, whatever is implied by existing Configura usage. I don't think there's anything external to learn about.
  15. @rptb1 will write an invitation mail later today.
  16. Planning finished 10:01.
  17. Planning took 30 mins.
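
The size estimate in the plan above is simple arithmetic, sketched here with hypothetical helper names (`code_lines`, `checking_minutes`). It only reproduces the 10 loc/min rule of thumb quoted in the plan; the second planning round later in this thread notes that estimating by lines of code did not work well for this change.

```c
#include <assert.h>

/* Lines needing correctness checking: additions minus documentation,
 * test code and miscellaneous changes. */
static int code_lines(int added, int doc, int test, int misc)
{
    return added - doc - test - misc;
}

/* Checking time in minutes at a given rate, rounded up. */
static int checking_minutes(int loc, int loc_per_minute)
{
    return (loc + loc_per_minute - 1) / loc_per_minute;
}
```

With the figures from the plan, `code_lines(1000, 100, 200, 100)` gives the 600 lines quoted and `checking_minutes(600, 10)` gives the one hour estimated.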

@thejayps
Contributor

thejayps commented Feb 21, 2023

[Actually written by @rptb1 with @thejayps ]

  1. Size of change is +1000 -300 ish. Mostly code change. Design doc change is just a copy of the issue plus references updated. The manual has some things moved around and about 100 lines of new doc. About 200 lines of new test case code. About 100 lines of misc. That leaves about 600 lines of code, so 1 h @ 10 loc/min.

Possible PI here to add more detail to proc.review.plan for size estimation.

@thejayps
Contributor

thejayps commented Feb 21, 2023

[Actually written by @rptb1 with @thejayps ]

More possible PI: Discussing how we paired on planning here, we could perhaps have a short guide on how to use MPS procedure documents effectively, so that PIs take hold. Repeatedly re-reading the procedure and all of its bullet points is key -- treating the procedure as a checklist of what's been done more than as a set of steps.

@UNAA008
Contributor

UNAA008 commented Mar 1, 2023

I read the mail thread with a focus on requirements.
Here's what I wrote
Review_mps_pool_walk_2023-03.PDF
Apologies that something has disabled the hyperlinks in this rendering.

@Ravenbot
Member

Ravenbot commented Mar 2, 2023 via email

@UNAA008
Contributor

UNAA008 commented Mar 2, 2023

RMPW_2023-03.ZIP

GITHUB won't allow HTML attachments but this ZIP file contains the HTML and CSS.

@rptb1
Member

rptb1 commented Mar 2, 2023

Executing proc.review.kickoff

  1. Start time 15:03.
  2. Logging meeting set for 16:15.
  3. Kickoff took 9 mins.

Contributor

@UNAA008 UNAA008 left a comment

I spent most of the review time examining test code to see what requirements were explicitly checked.
AMCSSTH.C
Checks that "walking works" while other threads continue to allocate.
I don't think walked objects are modified during walking.
Tests 104.C and 97.C are better documented and appear to address req.walk.all and req.walk.modify.

It isn't clear if req.walk.examine or req.walk.perf are tested.
I can't see any tests for serialisation.

In the documentation I noticed that 43.5 References contains a broken hyperlink.

Contributor

@thejayps thejayps left a comment

2 minor defects found: details are in in-line comments.

code/walk.c (review thread, resolved)
:c:data:`closure` is an arbitrary pointer that will be passed to
:c:data:`scan_area`.

The scanning function is called multiple times with disjoint areas
Contributor

m: clarity - although the documentation is clear in explaining what the scanner must do, it's less clear what the scanner might be expected or permitted to do in common applications.

Member

#33 title mentions "common applications" and this is a reference to that. This is referring to the manual's documentation of mps_pool_walk. Something like "This function can be used to update strawberries to raspberries everywhere in the heap." This might be a general comment on the manual in fact, and lead to some rules for how to write the manual. Quite cheap to add a few sentences. Writing out a detailed use case might be something for the Scheme example.

Member

@rptb1 rptb1 left a comment

Executing proc.review.check

  1. C. In https://info.ravenbrook.com/mail/2020/08/31/07-39-12/0/, we should consider what it would take to walk objects without parking the arena. It could be a test of multiple traces.
  2. C. In https://info.ravenbrook.com/mail/2020/08/23/16-19-00/0/, we should study Storm's reloading and see if it could help Clasp or meet Clasp's loading requirements.
  3. C. In https://info.ravenbrook.com/mail/2020/08/25/07-06-48/0/ there is an implied requirement to remove all protection in the MPS. I think we provided a "postmortem" type of call to do this for Configura. The next collection will have to scan the world, but maybe that's acceptable sometimes.
  4. m. design.mps.walk.sol.walk.all says it only visits live objects. How does it determine that? Does it mean reachable? rule.generic.clear
  5. m.

    mps/code/walk.c

    Line 124 in 1dd501e

    * .assume.parked: The root walker must be invoked with a parked
    impl.c.walk.assume.parked does not explain why the root walker must be invoked with a parked arena. rule.generic.clear
  6. m.

    mps/code/walk.c

    Line 124 in 1dd501e

    * .assume.parked: The root walker must be invoked with a parked
    impl.c.walk.assume.parked is assumed by the pool walker but only talks about the roots walker. rule.generic.self
  7. Checking incomplete but stopped at 16:13 for tea and logging.

design/walk.txt (review thread, resolved)

_`.case.serialize`: A language runtime that offers serialization and
deserialization of the heap will need to walk all formatted objects in
order to identify references to globals (during serialization) and
Member

m. "globals" not defined or linked. rule.generic.clear.

------------

_`.req.walk.all`: It must be possible for the client program to visit
all automatically managed formatted objects using a callback.
Member

m. Why "using a callback"? Do we mean "apply a visitor function"? rule.generic.clear

_`.req.walk.all`: It must be possible for the client program to visit
all automatically managed formatted objects using a callback.

_`.req.walk.assume-format`: The callback should not need to switch on
Member

m. What does it mean to "switch on the format"? rule.generic.clear

_`.req.walk.modify`: It must be possible for the callback to modify
the references in the objects.

_`.req.walk.overhead`: The overhead of calling the callback should be
Member

m. What does this mean and why? rule.generic.clear

* white set means that the MPS_FIX1 test will always fail and
* _mps_fix2 will never be called. */
res = TraceCreate(&trace, arena, TraceStartWhyWALK);
/* Fail if no trace available. Unlikely due to .assume.parked. */
Member

m. It's more than unlikely. Could add NOTREACHED to this case. rule.code.minimal

if (res != ResOK)
return res;
trace->white = ZoneSetEMPTY;
trace->state = TraceFLIPPED;
Member

@rptb1 rptb1 Mar 2, 2023

M. There's an assumption here about how traces work and that you can force flip a trace safely by updating a field and not going through TraceFlip. The assumption is not documented or linked to e.g. traceFlip, or the design of the tracer. Could go badly wrong. rule.code.assume.

arena->flippedTraces = TraceSetAdd(arena->flippedTraces, trace);
ts = TraceSetSingle(trace);

ScanStateInit(&ss, ts, arena, RankEXACT, trace->white);
Member

m. Why RankEXACT? Probably has no effect, but if so, we should say so. rule.generic.clear

Member

Raise: Created issue #195

/* TraceScanFormat -- scan a formatted area of memory for references
*
* This is a wrapper for format scanning functions, which should not
* otherwise be called directly from within the MPS. This function
Member

m. Why should they not be called? rule.generic.clear


/* Synthesize a flipped trace with an empty white set. The empty
* white set means that the MPS_FIX1 test will always fail and
* _mps_fix2 will never be called. */
Member

m. Is this dependency documented at both ends? rule.code.deps.
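
The dependency questioned here can be illustrated with a small model of a zone-set test. This is a sketch loosely modelled on the idea behind the first-stage fix, not the `MPS_FIX1` macro itself: `zone_set_t`, `ZONE_SHIFT`, `zone_of` and `fix1` are hypothetical, and the real zone layout may differ. It shows why an empty white set makes the first-stage test fail for every address, so the second stage is never reached.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

typedef uintptr_t zone_set_t;   /* hypothetical: one bit per zone */

enum { ZONE_SHIFT = 20 };       /* assumed zone size of 2^20 bytes */
#define ZONE_BITS (sizeof(zone_set_t) * CHAR_BIT)

/* Zone of an address: the address bits just above the zone shift. */
static unsigned zone_of(uintptr_t addr)
{
    return (unsigned)((addr >> ZONE_SHIFT) & (ZONE_BITS - 1));
}

/* First-stage test: might this address lie in a white zone? */
static int fix1(zone_set_t white, uintptr_t addr)
{
    return (white & ((zone_set_t)1 << zone_of(addr))) != 0;
}
```

With `white` equal to zero the bitwise AND is zero whatever the address, which is the property the synthesized trace relies on.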

@rptb1
Member

rptb1 commented Mar 2, 2023

Executing proc.review.log

  1. Start time 16:15.
  2. Point of order. None of us have completed checking. @UNAA008 thinks about 80%, @thejayps thinks about 60%, @rptb1 thinks about 15% but of the most important code. We'll log what we have, and schedule more time for review checking.
  3. @UNAA008 would like to check in a pair to help with navigation.
  4. Findings: @UNAA008 1 major, 1 minor, 2 questions. @thejayps 0 major, 2 minor. @rptb1 2 major, 11 minor, 4 comments.
  5. @UNAA008 M. Not all requirements are tested. Not clear what rule that breaks.
  6. @rptb1 Nm. Requirements are not linked to tests and tests are not documented in the design.
  7. @rptb1 Nm. Test code is not linked to the requirements tested.
  8. @rptb1 C. Pool walking is using the GC mechanisms without doing any GC. This could be explained a lot more clearly in the design.
  9. @rptb1 Not only does the design talk about live objects, it's not clear how the non-GC does anything about it.
  10. @rptb1 m The design doesn't mention that the client code is called on padding objects. An anti-requirement?
  11. @thejayps NI: Comments on the first line of a paragraph hide the paragraph. Would be better to comment at the end of a paragraph.
  12. @UNAA008 NI: The review issue classification is poorly presented for reference during checking.
  13. @UNAA008 Nm: There's a relationship between use cases and requirements and none of those are traceable to an origin in the documentation. @rptb1 There is a link in the references section but it's not very clear or indeed cited from the requirements.
  14. @rptb1 NM: We are missing a high-level requirements document.
  15. End time 17:15.
  16. Logging took 1h.
  17. Deferring brainstorm due to pervasive fatigue.
  18. How did it happen that I underestimated the checking time?

@rptb1
Member

rptb1 commented Mar 6, 2023

Executing proc.review.plan

  1. Start time 15:44.
  2. This is a second round of planning because we ran out of time last week. See New API function mps_pool_walk #34 (comment).
  3. @thejayps requests time to do a "walkthrough" as a pair to discuss what makes this code suitable/unsuitable for Configura's requirements and how to assess its correctness. We discussed whether that was a part of the review process. Decided to schedule that separately.
  4. Otherwise we think the same proc.review.plan.tactics apply.
  5. The checking time was off because of the complexity/subtlety of this change, so estimating by lines of code didn't work too well. We think we're about half-way through in elapsed time, so another 2-3h should do it. That's within the bounds of proc.review.plan.time.
  6. We'll attack this on Thursday afternoon in the big review time slot.
  7. Same roles as before more-or-less.
  8. We could consider the walkthrough mentioned above proc.review.plan.homework.
  9. @UNAA008 is invited, but optionally. We think his role is mostly fulfilled and he's kinda busy.
  10. End time 15:58. Planning took 14 minutes.

@rptb1
Member

rptb1 commented Mar 9, 2023

Executing proc.review.kickoff

  1. Start time 13:58
  2. @rptb1 focus on correctness, @thejayps focus on consistency, particularly between manual, design, code, test cases.
  3. Logging meeting at 15:00.
  4. End time 14:08.
  5. Kickoff took 10 minutes.

Contributor

@thejayps thejayps left a comment

1M 1m 2c 1q:
M: numbered tests, e.g. 104.c and 97.c, which were changed in this merge, are not documented in the manual or source code index. How does a developer know that they might need to change an arbitrarily named piece of test source in order to test new functionality? This introduces a risk that tests become out of date and fail to find defects. It also presents a danger in that reviewers can't easily identify whether the appropriate test files have been modified when changes are reviewed.

m: without considering the numbered tests, test coverage for pool_walk is introduced in amcss, amcssth and sncss. Why were these chosen? There may be defects for other pools and configurations that don't test pool_walk, even though the intention for the pool_walk functionality is to make it as far as possible pool-agnostic. If this decision was documented, I haven't yet found the documentation.

comment: although it appears the intention is for mps_pool_walk to replace mps_arena_formatted_objects_walk, which is described as deprecated in the merge, test coverage for the latter is preserved where it exists.

@@ -343,11 +343,10 @@ _`.interface.tags.alloc`: Two functions to extend the existing
``mps_alloc()`` (request.???.??? proposes to remove the varargs)

``void (*mps_objects_step_t)(mps_addr_t addr, size_t size, mps_fmt_t format, mps_pool_t pool, void *tag_data, void *p)``
``void mps_pool_walk(mps_arena_t arena, mps_pool_t pool, mps_objects_step_t step, void *p)``
Contributor

q: why was mps_pool_walk no longer included in this document? (although a single reference to mps_pool_walk does appear in this document in master)

Member

I think the diffs might be misleading in some way. Needs investigation. How could this be deleted before it existed?

Visit all :term:`formatted objects` in a :term:`pool`. The pool
must be :term:`automatically managed <automatic memory
management>`. The pool's :term:`arena` must be in the
:term:`parked state`.
Contributor

comment: this document does not explain that mps_pool_walk replaces the deprecated function mps_arena_formatted_objects_walk (although the description of the latter in deprecated.rst advises users to use mps_pool_walk instead)

Member

@rptb1 rptb1 left a comment

Executing proc.review.check

  1. Start time 14:09.
  2. End time 14:58
  3. 11 minor, 3 comments

AVERT(ScanState, ss);
AVER(refIO != NULL);

NOTREACHED;
Member

C. Because walkNoFix isn't reached, the pools aren't doing any real garbage collecting. We're only calling their scanners. So mps_pool_walk could be used to try out multiple traces incrementally.

/* walkNoFix -- third-stage fix function for poolWalk.
*
* The second-stage fix is not called via poolWalk; so this is not
* called either. The NOTREACHED checks that this is the case.
Member

m. The statement "is not called" needs to link to why that is (currently design.mps.walk.sol.walk.maint), in the design, and the other code that ensures it. rule.code.dep

}


/* poolWalkScan -- format scanner for poolWalk */
Member

@rptb1 rptb1 Mar 9, 2023

m. Explain why this exists. (To unpack and pass the closure to the area scanner.) rule.generic.clear, rule.code.justified
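
The role described in this comment (unpacking the closure and passing it to the area scanner) is an adapter between two callback signatures. The following is a rough sketch under stated assumptions, not the code in walk.c: `scan_state_s`, `walk_scan` and `count_bytes` are hypothetical, and the scan state is passed as `void *` to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

typedef int res_t;

/* Area scanner: takes an extra closure argument. */
typedef res_t (*area_scan_t)(void *state, void *base, void *limit,
                             void *closure);
/* Format scanner: same arguments, but no closure. */
typedef res_t (*format_scan_t)(void *state, void *base, void *limit);

/* Hypothetical scan state carrying the area scanner and its closure. */
typedef struct scan_state_s {
    area_scan_t area_scan;
    void *area_scan_closure;
} scan_state_s;

/* Adapter with the format-scanner signature: it unpacks the closure
 * from the scan state and forwards the call to the area scanner. */
static res_t walk_scan(void *state, void *base, void *limit)
{
    scan_state_s *ss = state;
    return ss->area_scan(state, base, limit, ss->area_scan_closure);
}

/* Example area scanner: adds up the bytes it was asked to visit. */
static res_t count_bytes(void *state, void *base, void *limit, void *closure)
{
    size_t *total = closure;
    (void)state;
    *total += (size_t)((char *)limit - (char *)base);
    return 0;
}
```

Code that only knows the three-argument signature can call `walk_scan` without being aware that a closure exists.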

@@ -402,6 +402,9 @@ typedef struct ScanStateStruct {
Sig sig; /* <design/sig> */
struct mps_ss_s ss_s; /* .ss <http://bash.org/?400459> */
Arena arena; /* owning arena */
mps_fmt_scan_t formatScan; /* callback for scanning formatted objects */
mps_area_scan_t areaScan; /* ditto via the area scanning interface */
void *areaScanClosure; /* closure argument for areaScan */
Member

C. Rather than adding fields to the ScanStateStruct that are only used in Pool Walk, why not extend ScanStateStruct locally, like rootsStepClosureStruct?

Member

m. In fact, why is Pool Walk different from Roots Walk, when it's basically using the same trick? Why don't they share more code?

Member

m. The comment about improvement in Root Walking seems to have been ignored. "there's no direct support for creating a trace without also condemning part of the heap. (@@@@ This looks like a useful candidate for inclusion in the future)"
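
The alternative raised in this thread, extending the scan state locally rather than adding fields to the shared structure, is the usual C embedding pattern. The sketch below is an assumption about what such a design could look like; it is not how rootsStepClosureStruct is actually defined. `scan_state_s`, `walk_state_s`, `walk_state_of`, `walk_dispatch` and `note_area` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the generic scan state shared by all scanning code. */
typedef struct scan_state_s {
    int rank;
} scan_state_s;

/* Walk-specific state wraps the generic state instead of extending it. */
typedef struct walk_state_s {
    scan_state_s ss;    /* embedded generic state */
    int (*area_scan)(scan_state_s *ss, void *base, void *limit,
                     void *closure);
    void *closure;
} walk_state_s;

/* Recover the enclosing structure from a pointer to the embedded member. */
static walk_state_s *walk_state_of(scan_state_s *ss)
{
    return (walk_state_s *)((char *)ss - offsetof(walk_state_s, ss));
}

/* Called with only the generic state; finds the walk-specific fields. */
static int walk_dispatch(scan_state_s *ss, void *base, void *limit)
{
    walk_state_s *ws = walk_state_of(ss);
    return ws->area_scan(ss, base, limit, ws->closure);
}

/* Example area scanner: counts how many times it was called. */
static int note_area(scan_state_s *ss, void *base, void *limit, void *closure)
{
    (void)ss; (void)base; (void)limit;
    ++*(int *)closure;
    return 0;
}
```

The trade-off is that every function receiving the generic pointer must know when it is safe to recover the wrapper, which is the kind of assumption the review asks to have documented.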


AVER(totalReturn != NULL);
AVERT(Seg, seg);
AVERT(ScanState, ss);
Member

m. If we expect loSegScan to only be called during a Pool Walk, then it should assert that it's in a Pool Walk. rule.code.minimal.

p = BufferLimit(buffer);
continue;
}
/* since we skip over the buffered area we are always */

m. This logic is duplicated several times. Grep for the string above. Copy-paste coding?

@@ -86,6 +86,24 @@ static void test_stepper(mps_addr_t object, mps_fmt_t fmt, mps_pool_t pool,
}


/* area_scan -- area scanning function for mps_pool_walk */

m. Duplicate of code in amcss.c. Move both to fmtdy.c?

@@ -312,15 +312,18 @@ called via the generic function ``SegBlacken()``.
``typedef Res (*SegScanMethod)(Bool *totalReturn, Seg seg, ScanState ss)``

_`.method.scan`: The ``scan`` method scans all the grey objects on the

m. But the restriction that the objects are grey was lifted elsewhere in this branch, and that is necessary for the implementation.

_`.method.walk.deprecated`: The ``walk`` method is deprecated along
with the public functions ``mps_arena_formatted_objects_walk()`` and
``mps_amc_apply()`` and will be removed along with them in a future
release.

C. This might be premature. Noting here because this statement and other deprecations in this branch will need to change depending.

rptb1 commented Mar 9, 2023

Executing proc.review.log

  1. Start time 15:07.
  2. Counting issues: @thejayps 1M, 1m, 2C, 1Q. @rptb1 11m, 4C.
  3. @rptb1 Nm. Pool Walk relies on this test in https://github.com/Ravenbrook/mps/blob/branch/2020-08-31/walk/code/poolams.c#L1348-L1352 to scan all objects. Is this safe? What are the colour invariants?
  4. Nm. Also, the above code is very unclear. Needs a comment.
  5. @thejayps NC. We can conclude from our review and discussions with Configura that the Pool Walk change is only a small risk to Configura: they can continue to use what they have as long as we merge Transforms.
  6. We'll do brainstorm at 16:05.
  7. End time 15:55.
  8. Logging took 48 minutes.

How does a developer know that they might need to change an arbitrarily named piece of test source in order to test new functionality?

NM. Similarly with the stress and coverage tests in code, though they are a bit more proximate.

@rptb1 rptb1 left a comment

Weird extra review.

@@ -343,11 +343,10 @@ _`.interface.tags.alloc`: Two functions to extend the existing
``mps_alloc()`` (request.???.??? proposes to remove the varargs)

``void (*mps_objects_step_t)(mps_addr_t addr, size_t size, mps_fmt_t format, mps_pool_t pool, void *tag_data, void *p)``
``void mps_pool_walk(mps_arena_t arena, mps_pool_t pool, mps_objects_step_t step, void *p)``

I think the diffs might be misleading in some way. Needs investigation. How could this be deleted before it existed?

rptb1 commented Mar 9, 2023

Executing proc.review.brainstorm

  1. Start time 16:05.
  2. Thinking about New API function mps_pool_walk #34 (review) : This came about because RIT of the Harlequin QA "group" was not part of the MM group as such and was a kind of outside black-box tester, not subject to MM group processes or coding standards or management. There isn't any clear process at the moment for adding or updating test cases, and there should be. That makes it hard to make an improvement to add references. The design should really reference (or the other way around) the tests that test it. We could add that as a rule of designs. That would fix the problem in future, and we should probably go through all the existing tests ensuring they are connected to design.
  3. The MMQA test suite driver needs comprehending and documenting.
  4. Thinking about New API function mps_pool_walk #34 (comment) : We can make it a rule that you're not allowed to delete consistency checks (for some definition). Now just because it's a rule doesn't mean it can't happen, but when it does, it will require a great deal of justification. It might be a good idea to say that they must be disabled/commented out with an explanation, at least for a long time, possibly forever, rather than deleted. What might have happened here is that the check was inconvenient and so removed, rather than thinking about what it enforces and what else depends on it. That's a big no-no.
  5. The problem uncovered by this issue is that a Trace has been used in a way that the Tracer isn't designed for. A bit of a hack basically, though not especially dirty. That might provoke problems with assumptions in the code and it might constrain the Tracer in future, reducing the flexibility of the system. What colour are walked objects? The Tracer believes they have a colour. This important question is not answered by the design. This might have been prevented by having a better design document for the tracer, since the current one is a mess. Prevented by insisting on better design documents, which we already do. So maybe this is just technical debt.
  6. End time 16:28. Brainstorm took 23 mins.

Labels
pending Something needs doing, even if closed.
Development

Successfully merging this pull request may close these issues.

Walk interface is not suitable for common applications
6 participants