-
Notifications
You must be signed in to change notification settings - Fork 259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Plugin System #144
Plugin System #144
Conversation
I accept that a 5-10x slowdown is pretty significant, but I'm also very hesitant to add arbitrary conversions, as if I start, I'll have a hard time refusing to add new ones. It is my belief that fast-export helps more users by providing generic mechanisms instead of specialized solutions. Why not investigate a Python plugin-based system which could replace the existing filter (which could provide the current functionality using a plugin which runs shell commands)? I don't get the reason for the second patch. Why would you need it as branches and branch names are preserved during conversion, what is lost without it? Please explain. |
Fair enough - a plugin based system does indeed sound like the proper solution for the dos2unix. Regarding the branch name in commit - current branches are maintained yes, but mercurial has a concept of "what branch was this commit created on", which git doesn't have a corresponding piece of information. I.e., In git you can commit on an untracked head, which you cannot do in mercurial. In my case, the branch names of merged-long-ago branches of my repository had much more meaningful information than the commits themselves, and preserving this information was important to me. |
@jhenkens : I see, I didn't know that hg tracked that meta data, interesting. Why not use git notes for this (we already do that to preserve the hg hash with the --hg-hash flag)? That way it could easily be available to tools, and we don't need to mess with the commit subject (which may be problematic if it overruns the displayed width in various tools). Such a design would require us to decide if we want a new name space ( |
I think that both of my improvements would do well as plugins, as said initially for dos2unix, actually.
Having the branch name as part of the commit message was what we wanted for our use - to have it immediately available in git show and whatnot.
But I could also see people who would like it in a git note.
I’ll see if I can figure something out for plugins. Any thoughts?
…Sent from my mobile
_________________
Johan Henkens
On Sep 21, 2018, at 7:58 AM, Frej Drejhammar ***@***.***> wrote:
@jhenkens : I see, I didn't know that hg tracked that meta data, interesting.
Why not use git notes for this (we already do that to preserve the hg hash with the --hg-hash flag)? That way it could easily be available to tools, and we don't need to mess with the commit subject (which may be problematic if it overruns the displayed width in various tools). Such a design would require us to decide if we want a new name space (hg-branch?) or if we should stay in the hg-name space and change the data format to a list of <key> = <value> fields (we would still have to provide a way to generate the legacy format).
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
A generic plugin mechanism that can be used for both massaging commit messages and files sounds very tempting. Ideally plugins should should not require external dependencies (check if there is something already in mercurial we could use) and allow us to migrate to Python 3 when/if Mercurial ends support for Python 2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it!
There are some issues:
- Please rebase this patch against master as there is no need to bring the rejected previous patch into the history (and as a bonus this patch becomes much easier to review).
- Some of the added files lack a newline at the end.
- I want a section describing how to use and create new plugins in the README.
- According to PEP 394 the shebang should say
#!/usr/bin/env python2
to get python2 in the plugins (do we need it at all, these are not stand-alone scripts?).
But overall this looks good and I see no reason for not merging an updated version as this much improves the flexibility of fast-export.
hg-fast-export.py
Outdated
|
||
for plugin in plugins: | ||
for filter in plugin['commit_message_filters']: | ||
desc = filter(desc,branch,author) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we in the future discover that we need to send more than these three values to the plugins,we'll have to update the signature for all plugins. Perhaps we should give it a {desc: <>, branch: <>, author: <> }
dictionary instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh. This idea is great. Leave the signature fixed.
filter_ret = filter_proc.poll() | ||
if filter_ret: | ||
raise subprocess.CalledProcessError(filter_ret, filter_cmd) | ||
return d |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no newline
b6ffe62
to
09b9450
Compare
I've updated the MR, but left it as two commits for the time being. Once provisioinally approved I'll squash into one with a cleaner message. Let me know what you think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work on the documentation! The only feature i think is missing is the ability to use plugins which are located outside the fast-export tree (see the comment on the documentation).
To merge this I would like to see three commits:
-
One adding the plugin mechanisms.
-
One replacing the current shell-script-filter with the plugin. It would be very nice if the current
--filter-contents
command line flag is translated to a plugin call so we don't break existing scripts. (When this lands I plan to convert the--fe
and--e
to plugins too) -
One commit adding the
branch_name_in_commit
plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nitpicks and a request about splitting out dos2unix I really should have made for the previous version of this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Som parts of this commit should go into the previous mechanisms commit.
I think I've split it out as requested, and made the necessary changes again. Sorry about the back and forth. |
Don't worry about it, I could have noticed those things earlier. Each time I ask for changes, I worry about you getting fed up and abandoning the patch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I found some more things to complain about :) I could just fix them up myself, but as you have been patient so far and I want you listed in the history as the author, I'll ask for another revision. If you want me to do them anyway, just tell me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get crashes when I try to use a contents shell filter
95fb8fb
to
29d9d78
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now it passes my tests too.
I did a git diff
between this version and the previous one. I discovered that some of the files have DOS line endings, which they shouldn't have. Another thing I noticed when running git diff
on the command line (compared to the gihub diff viewer) is that most changes you do to the readmes add trailing whitespace (which I don't like).
It seems like we never get this done :)
Teaches me for even attempting it on WSL. 😉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like a heel to nit pick about the trailing whitespace.
Check this article on how to avoid committing traling whitespace or
configure git with color.diff=auto
, color.diff.whitespace="red reverse"
and check that core.whitespace
includes blank-at-eol
to make traling whitespace stand out.
hg-fast-export.py
Outdated
parents = commit_data['parents'] | ||
author = commit_data['author'] | ||
desc = commit_data['desc'] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trailing whitespace
@@ -0,0 +1,10 @@ | |||
## Branch Name in Commit Message | |||
|
|||
Mercurial has a much stronger notion of branches than Git, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trailing whitespace
## Branch Name in Commit Message | ||
|
||
Mercurial has a much stronger notion of branches than Git, | ||
and some parties may not wish to lose the branch information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here too
prepend or append the branch name from the mercurial | ||
commit into the commit message in Git. | ||
|
||
To use the plugin, add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line too
commit into the commit message in Git. | ||
|
||
To use the plugin, add | ||
`--plugin branch_name_in_commit=(start|end)`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this one too
plugins/dos2unix/README.md
Outdated
repository. | ||
|
||
To use the plugin, add | ||
`--plugin dos2unix`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here
|
||
This plugin uses shell scripts in order to perform filtering of files. | ||
If your preferred scripting is done via shell, this tool is for you. | ||
Be noted, though, that this method can cause an order of magnitude slow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here
Be noted, though, that this method can cause an order of magnitude slow | ||
down. For small repositories, this wont be an issue. | ||
|
||
To use the plugin, add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here
down. For small repositories, this wont be an issue. | ||
|
||
To use the plugin, add | ||
`--plugin shell_filter_file_contents=path/to/shell/script.sh`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here
|
||
To use the plugin, add | ||
`--plugin shell_filter_file_contents=path/to/shell/script.sh`. | ||
The filter script is supplied to the plugin option after the plugin name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here
Now it looks good. Thank you for your contribution and patience. |
Throwing these out there for discussion, before I do any work to potentially isolate them if desired.
After seeing the changes to run a shell script, I tried doing it on a repo with a few hundred thousand commits, to run dos2unix on the files, and it resulted in a 5-10x slowdown, which was unnacceptable. To counter this, I perform a simple string replacement on non-binary files via python, with nearly zero slowdown.
Second, mercurial stores information historically in the branch name. For some, this information would be bad to lose, so I added a flag to either prepend or append the branch name to the commit message during the conversion.
Let me know what you think of these features.