Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plugin System #144

Merged
merged 5 commits into from
Dec 6, 2018
Merged

Plugin System #144

merged 5 commits into from
Dec 6, 2018

Conversation

jhenkens
Copy link
Contributor

Throwing these out there for discussion, before I do any work to potentially isolate them if desired.

After seeing the changes to run a shell script, I tried doing it on a repo with a few hundred thousand commits, to run dos2unix on the files, and it resulted in a 5-10x slowdown, which was unnacceptable. To counter this, I perform a simple string replacement on non-binary files via python, with nearly zero slowdown.

Second, mercurial stores information historically in the branch name. For some, this information would be bad to lose, so I added a flag to either prepend or append the branch name to the commit message during the conversion.

Let me know what you think of these features.

@frej frej added feature-request This is a feature request contribution-required Maintainer has no time and/or no plans to work on this, interested parties should submit a patch patch-exists A patch exists, but it requires revision before it will be accepted labels Sep 17, 2018
@frej
Copy link
Owner

frej commented Sep 17, 2018

I accept that a 5-10x slowdown is pretty significant, but I'm also very hesitant to add arbitrary conversions, as if I start, I'll have a hard time refusing to add new ones. It is my belief that fast-export helps more users by providing generic mechanisms instead of specialized solutions. Why not investigate a Python plugin-based system which could replace the existing filter (which could provide the current functionality using a plugin which runs shell commands)?

I don't get the reason for the second patch. Why would you need it as branches and branch names are preserved during conversion, what is lost without it? Please explain.

@jhenkens
Copy link
Contributor Author

Fair enough - a plugin based system does indeed sound like the proper solution for the dos2unix.

Regarding the branch name in commit - current branches are maintained yes, but mercurial has a concept of "what branch was this commit created on", which git doesn't have a corresponding piece of information. I.e., In git you can commit on an untracked head, which you cannot do in mercurial.

In my case, the branch names of merged-long-ago branches of my repository had much more meaningful information than the commits themselves, and preserving this information was important to me.

@frej
Copy link
Owner

frej commented Sep 21, 2018

@jhenkens : I see, I didn't know that hg tracked that meta data, interesting.

Why not use git notes for this (we already do that to preserve the hg hash with the --hg-hash flag)? That way it could easily be available to tools, and we don't need to mess with the commit subject (which may be problematic if it overruns the displayed width in various tools). Such a design would require us to decide if we want a new name space (hg-branch?) or if we should stay in the hg-name space and change the data format to a list of <key> = <value> fields (we would still have to provide a way to generate the legacy format).

@jhenkens
Copy link
Contributor Author

jhenkens commented Sep 21, 2018 via email

@frej
Copy link
Owner

frej commented Sep 21, 2018

A generic plugin mechanism that can be used for both massaging commit messages and files sounds very tempting.

Ideally plugins should should not require external dependencies (check if there is something already in mercurial we could use) and allow us to migrate to Python 3 when/if Mercurial ends support for Python 2.

Copy link
Owner

@frej frej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it!

There are some issues:

  • Please rebase this patch against master as there is no need to bring the rejected previous patch into the history (and as a bonus this patch becomes much easier to review).
  • Some of the added files lack a newline at the end.
  • I want a section describing how to use and create new plugins in the README.
  • According to PEP 394 the shebang should say #!/usr/bin/env python2 to get python2 in the plugins (do we need it at all, these are not stand-alone scripts?).

But overall this looks good and I see no reason for not merging an updated version as this much improves the flexibility of fast-export.


for plugin in plugins:
for filter in plugin['commit_message_filters']:
desc = filter(desc,branch,author)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we in the future discover that we need to send more than these three values to the plugins,we'll have to update the signature for all plugins. Perhaps we should give it a {desc: <>, branch: <>, author: <> } dictionary instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh. This idea is great. Leave the signature fixed.

filter_ret = filter_proc.poll()
if filter_ret:
raise subprocess.CalledProcessError(filter_ret, filter_cmd)
return d
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no newline

@jhenkens jhenkens force-pushed the master branch 4 times, most recently from b6ffe62 to 09b9450 Compare October 29, 2018 22:43
@jhenkens
Copy link
Contributor Author

I've updated the MR, but left it as two commits for the time being. Once provisioinally approved I'll squash into one with a cleaner message.

Let me know what you think.

@jhenkens jhenkens changed the title WIP: Dos2Unix and branch name WIP: Plugin System Oct 29, 2018
Copy link
Owner

@frej frej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on the documentation! The only feature i think is missing is the ability to use plugins which are located outside the fast-export tree (see the comment on the documentation).

To merge this I would like to see three commits:

  • One adding the plugin mechanisms.

  • One replacing the current shell-script-filter with the plugin. It would be very nice if the current --filter-contents command line flag is translated to a plugin call so we don't break existing scripts. (When this lands I plan to convert the --fe and --e to plugins too)

  • One commit adding the branch_name_in_commit plugin.

README.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
hg-fast-export.py Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
plugins/branch_name_in_commit/__init__.py Outdated Show resolved Hide resolved
@jhenkens jhenkens changed the title WIP: Plugin System Plugin System Nov 8, 2018
Copy link
Owner

@frej frej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nitpicks and a request about splitting out dos2unix I really should have made for the previous version of this patch.

README.md Outdated Show resolved Hide resolved
pluginloader/__init__.py Outdated Show resolved Hide resolved
plugins/dos2unix/__init__.py Outdated Show resolved Hide resolved
plugins/dos2unix/README.md Show resolved Hide resolved
Copy link
Owner

@frej frej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Som parts of this commit should go into the previous mechanisms commit.

hg-fast-export.py Outdated Show resolved Hide resolved
hg-fast-export.py Show resolved Hide resolved
plugins/branch_name_in_commit/__init__.py Outdated Show resolved Hide resolved
@jhenkens
Copy link
Contributor Author

I think I've split it out as requested, and made the necessary changes again. Sorry about the back and forth.

@frej
Copy link
Owner

frej commented Dec 1, 2018

Don't worry about it, I could have noticed those things earlier. Each time I ask for changes, I worry about you getting fed up and abandoning the patch.

Copy link
Owner

@frej frej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I found some more things to complain about :) I could just fix them up myself, but as you have been patient so far and I want you listed in the history as the author, I'll ask for another revision. If you want me to do them anyway, just tell me.

.gitattributes Outdated Show resolved Hide resolved
.gitignore Outdated Show resolved Hide resolved
hg-fast-export.py Outdated Show resolved Hide resolved
hg-fast-export.sh Outdated Show resolved Hide resolved
Copy link
Owner

@frej frej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get crashes when I try to use a contents shell filter

plugins/shell_filter_file_contents/__init__.py Outdated Show resolved Hide resolved
@jhenkens jhenkens force-pushed the master branch 2 times, most recently from 95fb8fb to 29d9d78 Compare December 2, 2018 18:12
Copy link
Owner

@frej frej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it passes my tests too.

I did a git diff between this version and the previous one. I discovered that some of the files have DOS line endings, which they shouldn't have. Another thing I noticed when running git diff on the command line (compared to the gihub diff viewer) is that most changes you do to the readmes add trailing whitespace (which I don't like).

It seems like we never get this done :)

plugins/shell_filter_file_contents/README.md Outdated Show resolved Hide resolved
plugins/shell_filter_file_contents/__init__.py Outdated Show resolved Hide resolved
@jhenkens
Copy link
Contributor Author

jhenkens commented Dec 5, 2018

Teaches me for even attempting it on WSL. 😉
I've gone ahead and removed the trailing whitespace and done yet another pass of dos2unix, from a proper unix box for sanity.

Copy link
Owner

@frej frej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like a heel to nit pick about the trailing whitespace.

Check this article on how to avoid committing traling whitespace or
configure git with color.diff=auto, color.diff.whitespace="red reverse" and check that core.whitespace includes blank-at-eol to make traling whitespace stand out.

parents = commit_data['parents']
author = commit_data['author']
desc = commit_data['desc']

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trailing whitespace

@@ -0,0 +1,10 @@
## Branch Name in Commit Message

Mercurial has a much stronger notion of branches than Git,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trailing whitespace

## Branch Name in Commit Message

Mercurial has a much stronger notion of branches than Git,
and some parties may not wish to lose the branch information
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here too

prepend or append the branch name from the mercurial
commit into the commit message in Git.

To use the plugin, add
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line too

commit into the commit message in Git.

To use the plugin, add
`--plugin branch_name_in_commit=(start|end)`.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one too

repository.

To use the plugin, add
`--plugin dos2unix`.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here


This plugin uses shell scripts in order to perform filtering of files.
If your preferred scripting is done via shell, this tool is for you.
Be noted, though, that this method can cause an order of magnitude slow
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here

Be noted, though, that this method can cause an order of magnitude slow
down. For small repositories, this wont be an issue.

To use the plugin, add
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here

down. For small repositories, this wont be an issue.

To use the plugin, add
`--plugin shell_filter_file_contents=path/to/shell/script.sh`.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here


To use the plugin, add
`--plugin shell_filter_file_contents=path/to/shell/script.sh`.
The filter script is supplied to the plugin option after the plugin name,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here

@frej
Copy link
Owner

frej commented Dec 6, 2018

Now it looks good. Thank you for your contribution and patience.

@frej frej merged commit cadcfcb into frej:master Dec 6, 2018
frej added a commit that referenced this pull request Dec 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution-required Maintainer has no time and/or no plans to work on this, interested parties should submit a patch feature-request This is a feature request patch-exists A patch exists, but it requires revision before it will be accepted
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants