Represents the author of a change
The authors mapping between an origin and a destination
Use the default author for all the submits in the destination. Note that some destinations might choose to ignore this author and use the current user running the tool (In other words they don't allow impersonation).
authoring_class authoring.overwrite(default)
Parameter | Description |
---|---|
default | string The default author for commits in the destination |
Create an authoring object that will overwrite any origin author with noreply@foobar.com mail.
authoring.overwrite("Foo Bar <noreply@foobar.com>")
Create a new author from a string with the form 'name <foo@bar.com>'
author new_author(author_string)
Parameter | Description |
---|---|
author_string | string A string representation of the author with the form 'name <foo@bar.com>' |
new_author('Foo Bar <foobar@myorg.com>')
Use the origin author as the author in the destination, no whitelisting.
authoring_class authoring.pass_thru(default)
Parameter | Description |
---|---|
default | string The default author for commits in the destination. This is used in squash mode workflows or if author cannot be determined. |
authoring.pass_thru(default = "Foo Bar <noreply@foobar.com>")
Use the origin author as the author in the destination only if the author is whitelisted. Otherwise, use the default author.
authoring_class authoring.whitelisted(default, whitelist)
Parameter | Description |
---|---|
default | string The default author for commits in the destination. This is used in squash mode workflows or when users are not whitelisted. |
whitelist | sequence of string List of whitelisted authors in the origin. The authors must be unique. |
authoring.whitelisted(
default = "Foo Bar <noreply@foobar.com>",
whitelist = [
"someuser@myorg.com",
"other@myorg.com",
"another@myorg.com",
],
)
Some repositories are not based on email but use LDAPs/usernames. This is also supported since it is up to the origin how to check whether two authors are the same.
authoring.whitelisted(
default = "Foo Bar <noreply@foobar.com>",
whitelist = [
"someuser",
"other",
"another",
],
)
A console that can be used in skylark transformations to print info, warning or error messages.
Core transformations for the change metadata
Generate a message that includes a constant prefix text and a list of changes included in the squash change.
transformation metadata.squash_notes(prefix='Copybara import of the project:\n\n', max=100, compact=True, show_ref=True, show_author=True, oldest_first=False)
Parameter | Description |
---|---|
prefix | string A prefix to be printed before the list of commits. |
max | integer Max number of commits to include in the message. For the rest a comment like (and x more) will be included. By default 100 commits are included. |
compact | boolean If compact is set, each change will be shown in just one line |
show_ref | boolean If each change reference should be present in the notes |
show_author | boolean If each change author should be present in the notes |
oldest_first | boolean If set to true, the list shows the oldest changes first. Otherwise it shows the changes in descending order. |
'Squash notes' default is to print one line per change with information about the author
metadata.squash_notes("Changes for Project Foo:\n")
This transform will generate changes like:
Changes for Project Foo:
- 1234abcde second commit description by Foo Bar <foo@bar.com>
- a4321bcde first commit description by Foo Bar <foo@bar.com>
metadata.squash_notes("Changes for Project Foo:\n",
oldest_first = True,
show_author = False,
)
This transform will generate changes like:
Changes for Project Foo:
- a4321bcde first commit description
- 1234abcde second commit description
metadata.squash_notes(
prefix = 'Changes for Project Foo:',
compact = False
)
This transform will generate changes like:
Changes for Project Foo:
--
2 by Foo Baz <foo@baz.com>:
second commit
Extended text
--
1 by Foo Bar <foo@bar.com>:
first commit
Extended text
For a given change, store a copy of the author as a label with the name ORIGINAL_AUTHOR.
transformation metadata.save_author(label='ORIGINAL_AUTHOR')
Parameter | Description |
---|---|
label | string The label to use for storing the author |
Certain labels are present in the internal metadata but are not exposed in the message by default. This transformation finds a label in the internal metadata and exposes it in the message. If the label is already present in the message, it will be updated to use the new name and separator.
transformation metadata.expose_label(name, new_name=label, separator="=", ignore_label_not_found=True)
Parameter | Description |
---|---|
name | string The label to search |
new_name | string The name to use in the message |
separator | string The separator to use when adding the label to the message |
ignore_label_not_found | boolean If a label is not found, ignore the error and continue. |
Expose a hidden label called 'REVIEW_URL':
metadata.expose_label('REVIEW_URL')
This would add it as REVIEW_URL=the_value.
Expose a hidden label called 'REVIEW_URL' as GIT_REVIEW_URL:
metadata.expose_label('REVIEW_URL', 'GIT_REVIEW_URL')
This would add it as GIT_REVIEW_URL=the_value.
Expose the label with a custom separator
metadata.expose_label('REVIEW_URL', separator = ': ')
This would add it as REVIEW_URL: the_value.
For a given change, restore the author present in the ORIGINAL_AUTHOR label as the author of the change.
transformation metadata.restore_author(label='ORIGINAL_AUTHOR')
Parameter | Description |
---|---|
label | string The label to use for restoring the author |
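As a minimal sketch, metadata.save_author and metadata.restore_author might be paired across two workflows that share the same label name. The list names `export_transformations` and `import_transformations` are hypothetical and only illustrate where each call would go.

```starlark
# Hypothetical transformation list for the workflow that must keep the
# author of the change as a label.
export_transformations = [
    # Stores a copy of the change author in the ORIGINAL_AUTHOR label.
    metadata.save_author(label = "ORIGINAL_AUTHOR"),
]

# Hypothetical transformation list for the workflow in the opposite direction.
import_transformations = [
    # Reads the ORIGINAL_AUTHOR label and sets it as the author of the change.
    metadata.restore_author(label = "ORIGINAL_AUTHOR"),
]
```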
Adds a header line to the commit message. Any variable present in the message in the form of ${LABEL_NAME} will be replaced by the corresponding label in the message. Note that this requires that the label is already in the message or in any of the changes being imported. The label in the message takes priority over the ones in the list of original messages of changes imported.
transformation metadata.add_header(text, ignore_label_not_found=False)
Parameter | Description |
---|---|
text | string The header text to include in the message. For example '[Import of foo ${LABEL}]'. This would construct a message resolving ${LABEL} to the corresponding label. |
ignore_label_not_found | boolean If a label used in the template is not found, ignore the error and don't add the header. By default it will stop the migration and fail. |
Adds a header to any message
metadata.add_header("COPYBARA CHANGE")
Messages like:
A change
Example description for
documentation
Will be transformed into:
COPYBARA CHANGE
A change
Example description for
documentation
Adds a header to messages that contain a label. Otherwise it skips the message manipulation.
metadata.add_header("COPYBARA CHANGE FOR ${GIT_URL}",
ignore_label_not_found = True,
)
Messages like:
A change
Example description for
documentation
GIT_URL=http://foo.com/1234
Will be transformed into:
COPYBARA CHANGE FOR http://foo.com/1234
A change
Example description for
documentation
GIT_URL=http://foo.com/1234
But any change without that label will not be transformed.
Removes part of the change message using a regex
transformation metadata.scrubber(regex, replacement='')
Parameter | Description |
---|---|
regex | string Any text matching the regex will be removed. Note that the regex runs in multiline mode. |
replacement | string Text replacement for the matching substrings. References to regex group numbers can be used in the form of $1, $2, etc. |
When change messages are in the following format:
Public change description
This is a public description for a commit
CONFIDENTIAL:
This fixes internal project foo-bar
Using the following transformation:
metadata.scrubber('(^|\n)CONFIDENTIAL:(.|\n)*')
Will remove the confidential part, leaving the message as:
Public change description
This is a public description for a commit
The previous example is prone to leaking confidential information, since a developer could easily forget to include the CONFIDENTIAL label. A different approach is to scrub everything by default except what is explicitly allowed. For example, the following scrubber would remove anything not enclosed in <public></public> tags:
metadata.scrubber('^(?:\n|.)*<public>((?:\n|.)*)</public>(?:\n|.)*$', replacement = '$1')
So a message like:
this
is
very confidential<public>but this is public
very public
</public>
and this is a secret too
would be transformed into:
but this is public
very public
Verifies that a RegEx matches (or does not match) the change message. Does not transform anything, but will stop the workflow if it fails.
transformation metadata.verify_match(regex, verify_no_match=False)
Parameter | Description |
---|---|
regex | string The regex pattern to verify. The re2j pattern will be applied in multiline mode, i.e. '^' refers to the beginning of a file and '$' to its end. |
verify_no_match | boolean If true, the transformation will verify that the RegEx does not match. |
Check that the change message contains a text enclosed in <public></public>:
metadata.verify_match("<public>(.|\n)*</public>")
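The inverse check can be sketched with verify_no_match. This assumes the CONFIDENTIAL: marker from the scrubber example above is the text that must never reach the destination.

```starlark
# Stop the workflow if the change message still contains the marker.
metadata.verify_match("CONFIDENTIAL:", verify_no_match = True)
```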
Allows updating links to references in commit messages to match the destination's format. Note that this will only consider the 5000 latest commits.
referenceMigrator metadata.map_references(before, after, regex_groups={}, additional_import_labels=[])
Parameter | Description |
---|---|
before | string Template for origin references in the change message. Use a '${reference}' token to capture the actual references. E.g. if the origin uses links like 'http://internalReviews.com/1234', the template would be 'http://internalReviews.com/${reference}', with a 'before_ref' regex of '[0-9]+' in regex_groups. |
after | string Format for references in the destination. Use the token '${reference}' to represent the destination reference. E.g. 'http://changes(${reference})'. |
regex_groups | dict Regexes for the ${reference} token's content. Requires one 'before_ref' entry matching the ${reference} token's content on the before side. Optionally accepts one 'after_ref' used for validation. |
additional_import_labels | sequence of string Meant to be used when migrating from another tool: by default, Copybara will only recognize the labels defined in the workflow's endpoints. The tool will use these additional labels to find labels created by other invocations and tools. |
Finds links to commits in change messages, searches destination to find the equivalent reference in destination. Then replaces matches of 'before' with 'after', replacing the subgroup matched with the destination reference. Assume a message like 'Fixes bug introduced in origin/abcdef', where the origin change 'abcdef' was migrated as '123456' to the destination.
metadata.map_references(
before = "origin/${reference}",
after = "destination/${reference}",
regex_groups = {
"before_ref": "[0-9a-f]+",
"after_ref": "[0-9]+",
},
),
This would be translated into 'Fixes bug introduced in destination/123456', provided that a change with the proper label was found - the message remains unchanged otherwise.
Core functionality for creating migrations, and basic transformations.
Glob returns a list of every file in the workdir that matches at least one pattern in include and does not match any of the patterns in exclude.
glob glob(include, exclude=[])
Parameter | Description |
---|---|
include | sequence of string The list of glob patterns to include |
exclude | sequence of string The list of glob patterns to exclude |
Include all the files under a folder except for internal folder files:
glob(["foo/**"], exclude = ["foo/internal/**"])
Globs can have multiple inclusive rules:
glob(["foo/**", "bar/**", "baz/**.java"])
This will include all files inside foo and bar folders and Java files inside baz folder.
Globs can have multiple exclusive rules:
glob(["foo/**"], exclude = ["foo/internal/**", "foo/confidential/**" ])
Include all the files of foo except the ones in internal and confidential folders.
Copybara uses Java globbing, which is very similar to Bash globbing. This means that recursive globbing for a filename is a bit more tricky:
glob(["BUILD", "**/BUILD"])
This is the correct way of matching all BUILD files recursively, including the one in the root. **/BUILD would only match BUILD files in subdirectories.
While two globs can be used for matching two directories, there is a more compact approach:
glob(["{java,javatests}/**"])
This matches any file in java and javatests folders.
Given a list of transformations, returns the list of transformations equivalent to undoing all the transformations
sequence core.reverse(transformations)
Parameter | Description |
---|---|
transformations | sequence of transformation The transformations to reverse |
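A minimal sketch of how core.reverse might be used to derive the transformations of a reverse workflow from the forward ones. The variable names are hypothetical, and both transformations are assumed to be automatically reversible.

```starlark
# Hypothetical forward transformations for an export workflow.
export_transformations = [
    core.move("foo/bar_internal", "bar"),
    core.replace(before = "internal", after = "external"),
]

# The list of transformations equivalent to undoing the ones above.
import_transformations = core.reverse(export_transformations)
```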
Defines a migration pipeline which can be invoked via the Copybara command.
core.workflow(name, origin, destination, authoring, transformations=[], origin_files=glob(['**']), destination_files=glob(['**']), mode="SQUASH", reversible_check=True for 'CHANGE_REQUEST' mode. False otherwise, ask_for_confirmation=False)
Parameter | Description |
---|---|
name | string The name of the workflow. |
origin | origin Where to read from the code to be migrated, before applying the transformations. This is usually a VCS like Git, but can also be a local folder or even a pending change in a code review system like Gerrit. |
destination | destination Where to write to the code being migrated, after applying the transformations. This is usually a VCS like Git, but can also be a local folder or even a pending change in a code review system like Gerrit. |
authoring | authoring_class The author mapping configuration from origin to destination. |
transformations | sequence The transformations to be run for this workflow. They will run in sequence. |
origin_files | glob A glob relative to the workdir that will be read from the origin during the import. For example glob(["**.java"]), all java files, recursively, which excludes all other file types. |
destination_files | glob A glob relative to the root of the destination repository that matches files that are part of the migration. Files NOT matching this glob will never be removed, even if the file does not exist in the source. For example glob(['**'], exclude = ['**/BUILD']) keeps all BUILD files in destination when the origin does not have any BUILD files. You can also use this to limit the migration to a subdirectory of the destination, e.g. glob(['java/src/**'], exclude = ['**/BUILD']) to only affect non-BUILD files in java/src. |
mode | string Workflow mode. Currently we support three modes: 'SQUASH', 'ITERATIVE' and 'CHANGE_REQUEST'. |
reversible_check | boolean Indicates if the tool should try to reverse all the transformations at the end to check that they are reversible. |
ask_for_confirmation | boolean Indicates that the tool should show the diff and require user's confirmation before making a change in the destination. |
Command line flags:
Name | Type | Description |
---|---|---|
--change_request_parent | string | Commit revision to be used as parent when importing a commit using CHANGE_REQUEST workflow mode. This shouldn't be needed in general, as Copybara is able to detect the parent commit message. |
--last-rev | string | Last revision that was migrated to the destination |
--iterative-limit-changes | int | Import just a number of changes instead of all the pending ones |
--ignore-noop | boolean | Only warn about operations/transforms that didn't have any effect. For example: A transform that didn't modify any file, non-existent origin directories, etc. |
--squash-skip-history | boolean | Avoid exposing the history of changes that are being migrated. This is useful when we want to migrate a new repository but we don't want to expose all the change history to metadata.squash_notes. |
--iterative-all-changes | boolean | By default Copybara will only try to migrate changes that could affect the destination, ignoring changes that only affect excluded files in origin_files. This flag disables that behavior and runs for all the changes. |
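Putting the parameters above together, a workflow might look like the following sketch. The repository URLs, the workflow name and the folder names are hypothetical. The origin, destination and authoring objects use only the functions documented in this reference.

```starlark
core.workflow(
    name = "default",
    # Hypothetical repository URLs, used only for illustration.
    origin = git.origin(
        url = "https://github.com/some_org/internal_project",
        ref = "master",
    ),
    destination = git.destination(
        url = "https://github.com/some_org/public_project",
        push = "master",
    ),
    authoring = authoring.pass_thru(default = "Foo Bar <noreply@foobar.com>"),
    # Read everything from the origin except the (hypothetical) internal folder.
    origin_files = glob(["**"], exclude = ["internal/**"]),
    # BUILD files do not match this glob, so they are never removed
    # from the destination.
    destination_files = glob(["**"], exclude = ["**/BUILD"]),
    # The transformations run in sequence.
    transformations = [
        core.replace(before = "internal", after = "external"),
    ],
)
```

Assuming the config is saved as copy.bara.sky, this workflow would presumably be run as copybara copy.bara.sky default, optionally with flags such as --last-rev.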
Moves files between directories and renames files
move core.move(before, after, paths=glob(["**"]), overwrite=False)
Parameter | Description |
---|---|
before | string The name of the file or directory before moving. If this is the empty string and 'after' is a directory, then all files in the workdir will be moved to the sub directory specified by 'after', maintaining the directory tree. |
after | string The name of the file or directory after moving. If this is the empty string and 'before' is a directory, then all files in 'before' will be moved to the repo root, maintaining the directory tree inside 'before'. |
paths | glob A glob expression relative to 'before' if it represents a directory. Only files matching the expression will be moved. For example, glob(["**.java"]), matches all java files recursively inside 'before' folder. Defaults to match all the files recursively. |
overwrite | boolean Overwrite destination files if they already exist. Note that this makes the transformation non-reversible, since there is no way to know if the file was overwritten or not in the reverse workflow. |
Move all the files in a directory to another directory:
core.move("foo/bar_internal", "bar")
In this example, foo/bar_internal/one will be moved to bar/one.
Move all the files in the checkout dir into a directory called foo:
core.move("", "foo")
In this example, one and two/bar will be moved to foo/one and foo/two/bar.
Move the contents of a folder to the checkout root directory:
core.move("foo", "")
In this example, foo/bar would be moved to bar.
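The paths parameter can restrict which files are moved. A minimal sketch, assuming a folder foo that mixes Java files with other files:

```starlark
# Move only the Java files found recursively under foo into bar.
# Files in foo that do not match the glob are left where they are.
core.move("foo", "bar", paths = glob(["**.java"]))
```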
Replace a text with another text using optional regex groups. This transformation can be automatically reversed.
replace core.replace(before, after, regex_groups={}, paths=glob(["**"]), first_only=False, multiline=False, repeated_groups=False)
Parameter | Description |
---|---|
before | string The text before the transformation. Can contain references to regex groups. For example "foo${x}text". If '$' literal character needs to be matched, '
after | string The text after the transformation. It can also contain references to regex groups, like the 'before' field. |
regex_groups | dict A set of named regexes that can be used to match part of the replaced text. For example {"x": "[A-Za-z]+"} |
paths | glob A glob expression relative to the workdir representing the files to apply the transformation. For example, glob(["**.java"]), matches all java files recursively. Defaults to match all the files recursively. |
first_only | boolean If true, only replaces the first instance rather than all. In single line mode, replaces the first instance on each line. In multiline mode, replaces the first instance in each file. |
multiline | boolean Whether to replace text that spans more than one line. |
repeated_groups | boolean Allows a group to be used multiple times. For example foo${repeated}/${repeated}. Note that this mechanism doesn't use backtracking. In other words, the group instances are treated as different groups in regex construction and then a validation is done after that. |
Replaces the text "internal" with "external" in all java files
core.replace(
before = "internal",
after = "external",
paths = glob(["**.java"]),
)
In this example we map some urls from the internal to the external version in all the files of the project.
core.replace(
before = "https://some_internal/url/${pkg}.html",
after = "https://example.com/${pkg}.html",
regex_groups = {
"pkg": ".*",
},
)
So a url like https://some_internal/url/foo/bar.html will be transformed to https://example.com/foo/bar.html.
This example removes blocks of text/code that are confidential and thus shouldn't be exported to a public repository.
core.replace(
before = "${x}",
after = "",
multiline = True,
regex_groups = {
"x": "(?m)^.*BEGIN-INTERNAL[\\w\\W]*?END-INTERNAL.*$\\n",
},
)
This replace would transform a text file like:
This is
public
// BEGIN-INTERNAL
confidential
information
// END-INTERNAL
more public code
// BEGIN-INTERNAL
more confidential
information
// END-INTERNAL
Into:
This is
public
more public code
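The first_only parameter can be combined with regex groups. The following is a sketch only: the internal_ and external_ prefixes and the group name are hypothetical.

```starlark
core.replace(
    before = "internal_${name}",
    after = "external_${name}",
    regex_groups = {
        "name": "[a-z_]+",
    },
    paths = glob(["**.java"]),
    # multiline is left at its default (False), so only the first
    # instance on each line is replaced.
    first_only = True,
)
```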
Verifies that a RegEx matches (or does not match) the specified files. Does not transform anything, but will stop the workflow if it fails.
verifyMatch core.verify_match(regex, paths=glob(["**"]), verify_no_match=False)
Parameter | Description |
---|---|
regex | string The regex pattern to verify. To satisfy the validation, there has to be at least one match (or no matches if verify_no_match is set) in each of the files included in paths. The re2j pattern will be applied in multiline mode, i.e. '^' refers to the beginning of a file and '$' to its end. |
paths | glob A glob expression relative to the workdir representing the files to apply the transformation. For example, glob(["**.java"]), matches all java files recursively. Defaults to match all the files recursively. |
verify_no_match | boolean If true, the transformation will verify that the RegEx does not match. |
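A minimal sketch of both directions of the check. The "Copyright" text is a hypothetical requirement, and BEGIN-INTERNAL is the marker used in the core.replace example.

```starlark
# Fail unless every Java file contains at least one match.
core.verify_match(
    regex = "Copyright",
    paths = glob(["**.java"]),
)

# Fail if any file still contains the internal marker.
core.verify_match(
    regex = "BEGIN-INTERNAL",
    verify_no_match = True,
)
```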
Groups some transformations in a transformation that can contain a particular, manually-specified reversal, where the forward version and reversed version of the transform are represented as lists of transforms. This is useful if a transformation does not automatically reverse, or if the automatic reversal does not work for some reason.
If reversal is not provided, the transform will try to compute the reverse of the transformations list.
transformation core.transform(transformations, reversal=The reverse of 'transformations', ignore_noop=False)
Parameter | Description |
---|---|
transformations | sequence of transformation The list of transformations to run as a result of running this transformation. |
reversal | sequence of transformation The list of transformations to run as a result of running this transformation in reverse. |
ignore_noop | boolean If a noop error happens in the group of transformations (both forward and reverse), it will be ignored. In general this is a bad idea and prevents Copybara from detecting important transformation errors. |
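As a sketch of a manually specified reversal: core.move with overwrite set is described above as non-reversible, so an explicit reverse could be supplied. The folder names are hypothetical.

```starlark
core.transform(
    # Forward direction: overwriting existing files makes this move
    # non-reversible on its own.
    transformations = [
        core.move("foo", "bar", overwrite = True),
    ],
    # Manually specified transformations for the reverse direction.
    reversal = [
        core.move("bar", "foo"),
    ],
)
```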
Module for dealing with local filesystem folders
A folder destination is a destination that puts the output in a folder
destination folder.destination()
Command line flags:
Name | Type | Description |
---|---|---|
--folder-dir | string | Local directory to put the output of the transformation |
A folder origin is an origin that uses a folder as input
folderOrigin folder.origin(materialize_outside_symlinks=False)
Parameter | Description |
---|---|
materialize_outside_symlinks | boolean By default folder.origin will refuse any symlink in the migration folder that is an absolute symlink or that refers to a file outside of the folder. If this flag is set, it will materialize those symlinks as regular files in the checkout directory. |
Command line flags:
Name | Type | Description |
---|---|---|
--folder-origin-author | string | Author of the change being migrated from folder.origin() |
--folder-origin-message | string | Message of the change being migrated from folder.origin() |
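A minimal sketch of a workflow that reads from one folder and writes to another. The workflow name is hypothetical. How the input folder is supplied is not covered in this section, so the sketch does not show it. The output directory would be selected with --folder-dir.

```starlark
core.workflow(
    name = "folder_to_folder",
    # Materialize absolute or outside symlinks as regular files
    # instead of refusing them.
    origin = folder.origin(materialize_outside_symlinks = True),
    destination = folder.destination(),
    authoring = authoring.overwrite("Foo Bar <noreply@foobar.com>"),
)
```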
Set of functions to define Git origins and destinations.
Command line flags:
Name | Type | Description |
---|---|---|
--git-repo-storage | string | Location of the storage path for git repositories |
Defines a standard Git origin. For Git specific origins use: github_origin or gerrit_origin.
All the origins in this module accept several string formats as reference (when Copybara is called in the form of copybara config workflow reference):
- Branch name: For example master
- An arbitrary reference: refs/changes/20/50820/1
- A SHA-1: Note that currently it has to be reachable from the default refspec
- A Git repository URL and reference: http://github.com/foo master
- A GitHub pull request URL: https://github.com/some_project/pull/1784
So for example, Copybara can be invoked for a git.origin in the CLI as:
copybara copy.bara.sky my_workflow https://github.com/some_project/pull/1784
This will use the pull request as the origin URL and reference.
gitOrigin git.origin(url, ref=None, submodules='NO', include_branch_commit_logs=False)
Parameter | Description |
---|---|
url | string Indicates the URL of the git repository |
ref | string Represents the default reference that will be used for reading the revision from the git repository. For example: 'master' |
submodules | string Download submodules. Valid values: NO, YES, RECURSIVE. |
include_branch_commit_logs | boolean Whether to include raw logs of branch commits in the migrated change message. This setting only affects merge commits. |
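A minimal sketch of a git.origin definition using the parameters above. The repository URL is hypothetical.

```starlark
git.origin(
    # Hypothetical repository URL.
    url = "https://github.com/some_org/some_project",
    # Default reference, used when none is given on the command line.
    ref = "master",
    # Also download submodules, recursively.
    submodules = "RECURSIVE",
)
```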
Mirror git references between repositories
git.mirror(name, origin, destination, refspecs=['refs/heads/*'], prune=False)
Parameter | Description |
---|---|
name | string Migration name |
origin | string Indicates the URL of the origin git repository |
destination | string Indicates the URL of the destination git repository |
refspecs | sequence of string Represents a list of git refspecs to mirror between origin and destination. For example 'refs/heads/*:refs/remotes/origin/*' will mirror any reference inside refs/heads to refs/remotes/origin. |
prune | boolean Remove remote refs that don't have an origin counterpart |
Command line flags:
Name | Type | Description |
---|---|---|
--git-mirror-force | boolean | Force push even if it is not fast-forward |
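A minimal sketch of a mirror migration. Both repository URLs and the migration name are hypothetical.

```starlark
git.mirror(
    name = "mirror",
    # Hypothetical repository URLs.
    origin = "https://github.com/some_org/some_project",
    destination = "https://github.com/some_org/some_project_mirror",
    # Mirror every branch under refs/heads to the same ref name.
    refspecs = ["refs/heads/*"],
    # Remove refs in the destination that no longer exist in the origin.
    prune = True,
)
```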
Defines a Git origin for Gerrit reviews.
gitOrigin git.gerrit_origin(url, ref=None, submodules='NO')
Parameter | Description |
---|---|
url | string Indicates the URL of the git repository |
ref | string DEPRECATED. Use git.origin for submitted branches. |
submodules | string Download submodules. Valid values: NO, YES, RECURSIVE. |
Defines a Git origin of type GitHub.
gitOrigin git.github_origin(url, ref=None, submodules='NO')
Parameter | Description |
---|---|
url | string Indicates the URL of the git repository |
ref | string Represents the default reference that will be used for reading the revision from the git repository. For example: 'master' |
submodules | string Download submodules. Valid values: NO, YES, RECURSIVE. |
Creates a commit in a git repository using the transformed worktree.
Given that Copybara doesn't ask for user/password in the console when doing the push to remote repos, you have to use ssh protocol, have the credentials cached or use a credential manager.
gitDestination git.destination(url, push=master, fetch=push reference, skip_push=False)
Parameter | Description |
---|---|
url | string Indicates the URL to push to as well as the URL from which to get the parent commit |
push | string Reference to use for pushing the change, for example 'master' |
fetch | string Indicates the ref from which to get the parent commit |
skip_push | boolean If set, copybara will not actually push the result to the destination. This is meant for testing workflows and dry runs. |
Command line flags:
Name | Type | Description |
---|---|---|
--git-committer-name | string | If set, overrides the committer name for the generated commits in git destination. |
--git-committer-email | string | If set, overrides the committer e-mail for the generated commits in git destination. |
--git-destination-url | string | If set, overrides the git destination URL. |
--git-destination-fetch | string | If set, overrides the git destination fetch reference. |
--git-destination-push | string | If set, overrides the git destination push reference. |
--git-destination-path | string | If set, the tool will use this directory for the local repository. Note that the directory will be deleted each time Copybara is run. |
--git-destination-skip-push | boolean | If set, the tool will not push to the remote destination |
--git-destination-last-rev-first-parent | boolean | Use git --first-parent flag when looking for last-rev in previous commits |
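A minimal sketch of a git.destination that fetches the parent commit from one reference and pushes to another. The URL and the copybara_import branch name are hypothetical.

```starlark
git.destination(
    # Hypothetical repository URL.
    url = "https://github.com/some_org/public_project",
    # Reference from which to get the parent commit.
    fetch = "master",
    # Reference to push the generated commit to.
    push = "copybara_import",
    # Dry run: create the commit locally but do not push it.
    skip_push = True,
)
```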
Creates a change in Gerrit using the transformed worktree. If this is used in iterative mode, then each commit pushed in a single Copybara invocation will have the correct commit parent. The reviews generated can then be easily done in the correct order without rebasing.
gerritDestination git.gerrit_destination(url, fetch, push_to_refs_for='')
Parameter | Description |
---|---|
url | string Indicates the URL to push to as well as the URL from which to get the parent commit |
fetch | string Indicates the ref from which to get the parent commit |
push_to_refs_for | string Review branch to push the change to, for example setting this to 'feature_x' causes the destination to push to 'refs/for/feature_x'. It defaults to 'fetch' value. |
Command line flags:
Name | Type | Description |
---|---|---|
--git-committer-name | string | If set, overrides the committer name for the generated commits in git destination. |
--git-committer-email | string | If set, overrides the committer e-mail for the generated commits in git destination. |
--git-destination-url | string | If set, overrides the git destination URL. |
--git-destination-fetch | string | If set, overrides the git destination fetch reference. |
--git-destination-push | string | If set, overrides the git destination push reference. |
--git-destination-path | string | If set, the tool will use this directory for the local repository. Note that the directory will be deleted each time Copybara is run. |
--git-destination-skip-push | boolean | If set, the tool will not push to the remote destination |
--git-destination-last-rev-first-parent | boolean | Use git --first-parent flag when looking for last-rev in previous commits |
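A minimal sketch of a Gerrit destination. The URL is hypothetical, and feature_x is the branch name used in the parameter description above.

```starlark
git.gerrit_destination(
    # Hypothetical Gerrit repository URL.
    url = "https://gerrit.example.com/some_project",
    # Reference from which to get the parent commit.
    fetch = "master",
    # The change is pushed to refs/for/feature_x.
    push_to_refs_for = "feature_x",
)
```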
Module for applying patches.
A transformation that applies the given patch files. If a path does not exist in a patch, it will be ignored.
patchTransformation patch.apply(patches=[], excluded_patch_paths=[], series=None)
Parameter | Description |
---|---|
patches | sequence of string The list of patch files to apply, relative to the current config file. The files will be applied relative to the checkout dir and the leading path component will be stripped (-p1). |
excluded_patch_paths | sequence of string The list of paths to exclude from each of the patches. Each of the paths will be excluded from all the patches. Note that these are not workdir paths, but paths relative to the patch itself. |
series | string The config file that contains a list of patches to apply. The series file contains names of the patch files one per line. The names of the patch files are relative to the series config file. The files will be applied relative to the checkout dir and the leading path component will be stripped (-p1). |
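A minimal sketch of patch.apply. The patch file names and the excluded path are hypothetical.

```starlark
patch.apply(
    # Patch files, relative to the current config file.
    patches = [
        "patches/fix_build.patch",
        "patches/rename_package.patch",
    ],
    # Path inside the patches (not a workdir path) that is skipped
    # in every patch.
    excluded_patch_paths = ["internal/BUILD"],
)
```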