Use EventLogFormat.h from upstream (GHC) #27
Comments
I agree that it would be great if we could just use the upstream version. It was a bit painful to update the header file in #29. I'm not sure what the Eden or Mercury people think, though. As for supporting older versions of GHC, I feel we can drop support for pre-7 GHCs and support only the latest three released versions, following the established convention for Haskell packages.
I'm afraid I disagree with the idea of ghc-events being tied to a specific GHC. The eventlog framework in GHC is designed to be extensible, so that old tools can parse new versions of the format and new tools can understand old log files. It would also be inconvenient if I had to rebuild ghc-events (or ThreadScope) every time I switched GHC. |
ghc-events was specifically designed to be forwards and backwards compatible as far as possible. Otherwise, you have to build a specific ThreadScope binary that works with the version of GHC you're using, you need multiple ThreadScope binaries lying around, it becomes hard (or impossible) to work with old eventlog files, and so on. I can't tell you how many times I've found that I'm really glad we did it this way: it's really hard to build ThreadScope, but I can "apt-get install" it on any Linux system and it works with whatever GHC version I'm using. The level of version mismatch might lead to some missing features or degraded capabilities, but it works. Possibly we can drop support for 6.12, but the main argument for doing so would be that it's hard to test it. So long as we can test old versions I think we should continue to support them. (This argues against keeping the Eden support, but I think the responsibility for maintaining that lies with @jberthold, and it's not a big deal for us to keep it.) |
Oh, and BTW testing support for old versions really just means keeping around old eventlog files, we don't necessarily need access to a working GHC. |
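To make the compatibility argument concrete, here is a minimal sketch in C of how a tool can step over events it does not understand, as long as the log declares each event type's payload size up front. It assumes a deliberately simplified stream (a 2-byte big-endian event ID followed by a fixed-size payload); the real eventlog layout carries more fields, such as timestamps and variable-length events. `EventTypeTable`, `tool_understands`, and `walk_events` are hypothetical names for illustration, not part of GHC or ghc-events.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_EVENT_TYPES 256     /* hypothetical bound for this sketch */
#define SIZE_UNDECLARED (-1)    /* marks an ID the log header never declared */

/* Payload sizes as declared by one particular log, indexed by event ID. */
typedef struct {
    int declared_size[MAX_EVENT_TYPES];
} EventTypeTable;

static void table_init(EventTypeTable *t) {
    for (size_t i = 0; i < MAX_EVENT_TYPES; i++)
        t->declared_size[i] = SIZE_UNDECLARED;
}

static void table_declare(EventTypeTable *t, uint16_t id, int size) {
    assert(id < MAX_EVENT_TYPES && size >= 0);
    t->declared_size[id] = size;
}

/* Hypothetical: the IDs this build of the tool knows how to decode. */
static int tool_understands(uint16_t id) {
    return id <= 2;
}

/* Walks the simplified stream. Returns the number of events the tool could
 * decode and stores the number it stepped over in *skipped. Returns -1 if
 * the stream uses an undeclared ID or is truncated. */
static int walk_events(const EventTypeTable *t, const uint8_t *buf,
                       size_t len, int *skipped) {
    size_t pos = 0;
    int decoded = 0;
    *skipped = 0;
    while (pos + 2 <= len) {
        uint16_t id = (uint16_t)((buf[pos] << 8) | buf[pos + 1]);
        pos += 2;
        if (id >= MAX_EVENT_TYPES || t->declared_size[id] == SIZE_UNDECLARED)
            return -1;
        size_t size = (size_t)t->declared_size[id];
        if (pos + size > len)
            return -1;
        if (tool_understands(id))
            decoded++;          /* a real tool would decode the payload here */
        else
            (*skipped)++;       /* unknown to this tool: skip by declared size */
        pos += size;
    }
    return decoded;
}
```

An older tool fed a newer log loses the events it does not know, which matches the "missing features or degraded capabilities, but it works" behaviour described above.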
Thanks @simonmar for making the point about compatibility with older formats. Dropping it means dropping a feature. I don't think it is hard to keep a single header file full of numbers in sync with GHC versions (sometimes there is a lot more to do to keep GHC-internal code in sync...). The design was chosen for a reason, it should not be thrown overboard for minor reasons. For the 6.12.* versions, I do think it would be acceptable to drop support (and bump the major version). This might simplify the code going forward. However, that simplification might not be a big win either so I don't see a pressing need to slash down working code. I cannot speak for Mercury but I believe the event definitions are useful. Would be great to have some documentation about them. |
So how do we reconcile this: @maoe writes
and this @simonmar writes
Which is the authoritative document for defining events? Is it the Further, the header file has this to say:
It sounds to me like the only reliable way to do so is to never remove old events. As an aside, for me Stack has taken over managing my GHC version, my ThreadScope version along with everything else, so I never If GHC doesn't want to define Eden and Mercury events for some reason, we could always define those in local header files, while still including the GHC events from GHC itself. |
Unfortunately there's a cost to backwards compatibility, and in this case it's the extra work in maintaining the library that works with multiple versions of the eventlog format.
The one in GHC is authoritative for that version of GHC, whereas the one in
So this is why we have the
Yes, that would be another way to do it. However, it's not as good:
So I argue that the |
FWIW I'm on @simonmar's side on this one. |
I see two extreme camps here
And both have their shortcomings, namely:
I think there is a sweet spot between the two.
This way we can just take the authoritative definitions from GHC while keeping the very nice backward/forward compatibility. Also GHC doesn't need to know about Eden/Mercury. Obviously we need to revive the deprecated events in GHC and convince GHC devs to follow this convention, though. |
If that would be easier from your perspective it's OK to have the source of truth be the GHC sources. Just one thing: we have to keep the Eden and Mercury events together with the GHC events, to prevent us accidentally reusing event IDs that are taken by Eden and Mercury events. We'd have to carefully document the reason that this code is in GHC, and how to test it. |
@simonmar Why is this necessary? Currently GHC doesn't maintain Eden/Mercury events and just has comments like this:
/* Range 60 - 80 is used by eden for parallel tracing
 * see http://www.mathematik.uni-marburg.de/~eden/
 */
/* Range 100 - 139 is reserved for Mercury. */
My idea doesn't change these comments at all. What I'm proposing here is that we
I don't see the reason(s) why we have to keep Eden/Mercury events in GHC's header file. Could you elaborate? |
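As an illustration of what those two comments encode, the reserved ranges can be written down as data and queried. This is only a sketch: the two ranges are taken from the comments quoted above, while `ReservedRange`, `reserved_ranges`, and `reserved_owner` are hypothetical names that exist in neither GHC nor ghc-events, and any other reserved ranges in the real header are not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One block of event IDs set aside for an external project. */
typedef struct {
    uint16_t first;
    uint16_t last;      /* inclusive */
    const char *owner;
} ReservedRange;

/* Ranges copied from the header comments quoted in this thread. */
static const ReservedRange reserved_ranges[] = {
    {  60,  80, "Eden"    },
    { 100, 139, "Mercury" },
};

/* Returns the project that reserved this ID, or NULL if the ID falls
 * outside both ranges listed above. */
static const char *reserved_owner(uint16_t id) {
    size_t n = sizeof reserved_ranges / sizeof reserved_ranges[0];
    for (size_t i = 0; i < n; i++) {
        const ReservedRange *r = &reserved_ranges[i];
        assert(r->first <= r->last);
        if (id >= r->first && id <= r->last)
            return r->owner;
    }
    return NULL;
}
```

A lookup like this only needs the range boundaries, not the individual Eden or Mercury event definitions.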
If I understand your proposal correctly, the "new workflow" is to never remove definitions from the GHC header file, by way of
That makes perfect sense to me. Given the event numbers are reserved once and then fixed, no holes should result (and it is just a few numbers after all).
Again, if I understand your proposal correctly, this question only comes up because you wish to copy the file from GHC to this library, because this library will always have to have the full definition. Therefore I would argue that it is not helpful to split the file in this project. Unrecoverable mistakes might result if somebody is not aware of the full picture and uses a reserved number in a GHC version that reaches a release. It is great that this common resource has been maintained for 8 years now, and that things are in a working state. It is great that you guys take responsibility for this library, so please do not take it the wrong way if I argue some points differently from what you propose. |
@jberthold Sure, I didn't propose to delete the comments about the Eden/Mercury/Perf events. We definitely should keep them to avoid event conflicts.
What I didn't get is this point. We don't need to list every Eden/Mercury/whatever event in GHC's header file to avoid conflicts. We just need to comment the event ID regions that are reserved for external projects, and this is, I believe, what GHC devs have been doing so far. When a new region needs to be reserved for a project, a developer from that project needs to talk with GHC devs and reserve a region by adding a comment line in GHC's header, not the one in ghc-events. The full picture is always kept in GHC. So in short, I don't see how factoring out the external events into a separate file leads to event conflicts.
I'm not sure what @mboes thinks but here is my reasoning: The motivation is really about "housekeeping". Yes, I agree that the house is in fair shape but that's at the cost of maintenance. While I integrated the new heap profiling events (#29) from GHC 8.2, I thought the workflow was more cumbersome and error-prone than it should be. I had to manually resolve conflicts. The source of conflicts is not only the external events but also the events that are deprecated. Probably it's okay if the header file in question changes very infrequently. This was the case in the past but there was a change in 8.2 and I expect more in the near future as I'm planning to extend the heap profiling support and I guess @bgamari may or may not extend the framework to support the DWARF based statistical profiling. Does this answer the question? |
On 06/03/2017 05:09 PM, Mitsutoshi Aoe wrote:
> Sure, I didn't propose to
> delete the comments about the Eden/Mercury/Perf events. We definitely
> should keep them to avoid event conflicts.

Well, then I don't see what is wrong with the current way.

> Unrecoverable mistakes ...
> [...] We just need to comment the event ID regions that are
> reserved for external projects and this is, I believe, how GHC devs have
> been doing so far. [...] reserve a
> region by adding a comment line in the GHC's header, not the one in
> ghc-events.
> The full picture is always kept in GHC.

I think I understand what you mean, but the full picture is in
`ghc-events`. There is no guarantee that it will remain complete in GHC
so nobody should be encouraged to rely on the GHC version of the file.
(which is the essence of my concern; "just" a risk, not a mistake)

> So in short, I don't see how factoring out the external events into a
> separate file leads to event conflicts.
> The motivation is really about "housekeeping". Yes, I agree that the
> house is in fair shape but that's at the cost of maintenance.

Well, of course there is this one thing: _my_ GHC fork has a version of
the file which is more similar to the `ghc-events` version.

> I had to manually resolve conflicts. The source of conflicts is not only
> the external events but also the events that are deprecated.

Agree, it had diverged in comments and deprecations (FWIW the deprecated
23 and 24 are in use in the parallel RTS tracing, I have different code
in the `#if 0`), and the thread states deserve commenting.

> Does this answer the question?

Well, partly...
Sure: In the end, I may as well just `#include` a separate header file
for Eden events in my GHC version as well (the same file as in
`ghc-events`); not much more work.
I am not convinced this change would be a big win, though.
|
@maoe yup, sounds good to me. |
Ok, if we block out the reserved parts of the range with comments in the GHC source, then that's fine. I was just concerned that we might re-use an Eden or a Mercury event ID in GHC by mistake. |
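The concern about accidentally re-using a reserved ID could also be guarded against mechanically rather than by comments alone. The following is a sketch under assumptions, not something GHC's header does: the ranges come from the comments quoted earlier in this thread, and `IN_RANGE`, `IS_RESERVED_ID`, and `EVENT_EXAMPLE_NEW` (with its arbitrary number, not checked against real GHC event IDs) are made up for illustration. It relies on C11 `static_assert`.

```c
#include <assert.h>   /* static_assert (C11) */

/* Inclusive range test usable in constant expressions. */
#define IN_RANGE(id, lo, hi) ((id) >= (lo) && (id) <= (hi))

/* Ranges taken from the header comments: 60-80 Eden, 100-139 Mercury. */
#define IS_RESERVED_ID(id) \
    (IN_RANGE((id), 60, 80) || IN_RANGE((id), 100, 139))

/* Hypothetical new event definition; name and number are illustrative. */
#define EVENT_EXAMPLE_NEW 250

/* Compilation fails here if the new ID lands in a reserved range. */
static_assert(!IS_RESERVED_ID(EVENT_EXAMPLE_NEW),
              "EVENT_EXAMPLE_NEW collides with a range reserved for Eden or Mercury");
```

With a check like this next to each new definition, changing `EVENT_EXAMPLE_NEW` to, say, 70 would stop the build instead of silently shipping a conflicting ID.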
@jberthold Sorry I still don't get this part. What is the risk that increases in the new workflow? I think people should use the latest GHC's header (assuming GHC follows the new workflow) and nobody should use We can put a big caution message in the README encouraging people who consider adding new events to consult GHC devs.
I guess this is because you use the
Yup. I think that's a great idea. |
GHC now uses a python script to generate the header files: https://gitlab.haskell.org/ghc/ghc/-/blob/master/rts/gen_event_types.py |
The project currently has its own copy of `EventLogFormat.h` that ships with GHC. This strikes me as an unfortunate situation: if the upstream version of that file as found in GHC is to be considered the authoritative definition of what events are supported, then we shouldn't be forking our own copy.
Now, the trouble with switching to the upstream version is that it is a lot smaller. Many long ago deprecated events have been removed entirely. The Mercury events are not defined. Neither are the Eden events. But at least this way we know for sure that GHC and ghc-events agree about event definitions because they share the exact same interface file.
Further, this looks like a good opportunity to remove support for events that are no longer supported. Should we really still be aiming for compatibility all the way to 6.12.* along with the funny quirks we encounter in those compilers? Does Mercury really still use ghc-events? Is Eden still an active project?
Another point is that if we use whatever `EventLogFormat.h` we find in the currently installed GHC, then the resulting `ghc-events` binary is tuned for that installed GHC version and only that GHC version. But that's a feature. Some of the complexity in the existing code base stems from having a single binary support eventlogs produced by any version of GHC, past, present, and future. That makes sense in a world before project environments, which both cabal-install and Stack now support. But with cabal-install and Stack projects, it makes sense to have a project-local ghc-events that is guaranteed to work well with the eventlogs created as part of that project and the GHC version used in that project.