git:
- add: <absolute path in git repository>
to: <absolute path inside image>
owner: <owner>
group: <group>
includePaths:
- <path or glob relative to path in add>
excludePaths:
- <path or glob relative to path in add>
stageDependencies:
install:
- <path or glob relative to path in add>
beforeSetup:
- <path or glob relative to path in add>
setup:
- <path or glob relative to path in add>
git:
- url: <git repo url>
branch: <branch name>
commit: <commit>
tag: <tag>
add: <absolute path in git repository>
to: <absolute path inside image>
owner: <owner>
group: <group>
includePaths:
- <path or glob relative to path in add>
excludePaths:
- <path or glob relative to path in add>
stageDependencies:
install:
- <path or glob relative to path in add>
beforeSetup:
- <path or glob relative to path in add>
setup:
- <path or glob relative to path in add>
What is Git mapping?
Git mapping defines a file or a directory in the Git repository that should be added to the image at the particular path. The repository may be a local one, hosted in the directory that contains the config, or a remote one. In the latter case, the configuration of the git mapping includes a repository address and version (branch, tag, or commit hash).
werf adds files from the repository to the image either by fully transferring them via git archive or by applying patches between commits. Full transfer is used to add files for the first time. Subsequent builds apply patches to reflect changes in a Git repository. Refer to the More details: git_archive… section to learn more about the algorithm behind fully transferring and applying patches.
The configuration of git mappings supports file filtering. You can use a set of git mappings to create virtually any file structure in the image. Also, you can specify the owner and the group for the files in the git mapping configuration (so no need to run chown
).
werf supports Git submodules. If it detects that the files specified in the git mapping configuration are present in the submodules, it will act accordingly to change the files in the submodules correctly.
All submodules in a project are bound to a specific commit. That way, all the collaborators get the same content. werf does not initialize or update the submodules. Instead, it merely uses these bound commits.
Below is an example of a git mapping configuration. In it, source files are added from a local repository (/src
is the source and /app
is the destination directory), while remote phantomjs source files are imported and saved in /src/phantomjs
:
git:
- add: /src
to: /app
- url: https://github.com/ariya/phantomjs
add: /
to: /src/phantomjs
Syntax of a git mapping
The git mapping configuration for a local repository has the following parameters:
add
— path to a directory or a file whose contents must be copied into the image. The path must be specified relative to the repository root and is absolute (i.e., starts with/
). This parameter is optional; by default, the contents of the entire repository are transferred, that is, an emptyadd
is equivalent toadd: /
;to
— the path in the image to copy the contents specified withadd
to;owner
— the name or uid of the owner of the files to be copied;group
— the name or gid of the owner’s group;excludePaths
— a set of masks to exclude files or directories during recursive copying. Paths in masks must be specified relative to add;includePaths
— a set of masks to include files or directories during recursive copying. Paths in masks must be specified relative to add;stageDependencies
— a set of masks to monitor for changes that trigger rebuilds of the user stages. This is reviewed in detail in the Running assembly instructions reference.
The git mapping configuration for a remote repository has some additional parameters:
url
— the address of the remote repository;branch
,tag
,commit
— the name of a branch, tag, or a commit hash to use. If these parameters are omitted, the master branch will be used.
By default, the use of the
branch
directive is not allowed by giterminism (read more about it here)
Using git mappings
Copying directories
The add
parameter defines a source path in the repository. Then all files in this directory are recursively retrieved and added to the image at the to
path. If the parameter is not set, werf will use the default path ( /
). In other words, the entire repository will be copied. For example:
git:
- add: /
to: /app
The following basic git mapping configuration adds all repository contents to the /app
directory in the image.
You can specify multiple git mappings:
git:
- add: /src
to: /app/src
- add: /assets
to: /static
Note, however, that the git mapping parameter doesn’t specify the directory to transfer (like cp -r /src /app
). Instead, the add
parameter specifies the contents of a directory to be recursively transferred from the repository. That is, to copy the contents of the /assets
directory to the /app/assets
directory, you have to specify the assets keyword twice in the configuration or use the includePaths
filter. For example:
git:
- add: /assets
to: /app/assets
or
git:
- add: /
to: /app
includePaths: assets
Changing the owner
The git mapping configuration provides the owner
and group
parameters. These are the names or numerical IDs of the owner and group (userid, groupid) common to all files and directories transferred to the image.
git:
- add: /src/index.php
to: /app/index.php
owner: www-data
If the group
parameter is omitted, then the group is set to the primary group of the user.
If the owner
or group
value is a string, then the specified user or group must exist in the system by the moment the transfer of files is complete. They must be added in advance if necessary (e.g., at the beforeInstall stage), otherwise, the build will end with an error.
git:
- add: /src/index.php
to: /app/index.php
owner: wwwdata
Using filters
werf uses the includePaths
and excludePaths
parameters to process the file list. These parameters contain a set of paths or masks to include and exclude files and directories to/from the list of files to transfer to the image. The excludePaths
filter works as follows: masks are applied to each file found in the add
path. If there is at least one match, the file is ignored; if no matches are found, the file gets added to the image. includePaths
works the opposite way: if there is at least one match, the file gets added to the image.
The git mapping configuration can include both filters. In this case, the file will be added to the image if the path matches any of includePaths
masks and does not match any excludePaths
masks.
For example:
git:
- add: /src
to: /app
includePaths:
- '**/*.php'
- '**/*.js'
excludePaths:
- '**/*-dev.*'
- '**/*-test.*'
In the above example, .php
and .js
files from /src
are added to the image, excluding files with the -dev.
or -test.
suffixes in the filename.
The step involving the addition of a
**/*
template is here for convenience: the most common use case of a git mapping with filters is to configure recursive copying for the directory. The addition of**/*
allows you to specify the directory name only; thus, its entire contents would match the filter
Masks support the following wildcards:
*
— matches any file. This pattern includes.
and excludes/
;**
— matches directories recursively or files expansively;?
— matches exactly one character. It is equivalent to /.{1}/ in regexp;[set]
— matches any character within the set. It behaves exactly like character sets in regexp, including the set negation ([^a-z]);\
— escapes the next metacharacter.
Masks that start with *
or **
should be escaped with quotation marks in the werf.yaml
file:
"*.rb"
— with double quotation marks'**/*'
— with single quotation marks
Filter examples:
add: /src
to: /app
includePaths:
# match all php files residing directly in /src
- '*.php'
# match recursively all php files in /src
# (also matches *.php because '.' is included in **)
- '**/*.php'
# match all files in /src/module1 recursively
# an example of the implicit addition of **/*
- module1
You can use the includePaths
filter to copy a single file without renaming it:
git:
- add: /src
to: /app
includePaths: index.php
Overlapping of target paths
Those who prefer to add multiple git mappings, have to remember that overlapping paths defined in to
may result in an inability to add files to the image. For example:
git:
- add: /src
to: /app
- add: /assets
to: /app/assets
When processing a config, werf calculates possible overlaps among all git mappings related to includePaths
and excludePaths
filters. If an overlap is found, werf tries to resolve the conflict by adding excludePaths
into the git mapping implicitly. In all other cases, the build will end with an error. However, the implicit excludePaths
filter may have undesirable side effects, so it is best to avoid conflicts caused by overlapping paths between the configured git mappings.
Here is an example of implicit excludePaths
:
git:
- add: /src
to: /app
excludePaths: # werf add this filter to resolve a conflict
- assets # between paths /src/assets and /assets
- add: /assets
to: /app/assets
Working with remote repositories
werf can use remote repositories as file sources. For this, you have to specify the repository address via the url
parameter in the git mapping configuration. werf supports https
and git+ssh
protocols.
https
The syntax for the https
protocol is as follows:
git:
- url: https://[USERNAME[:PASSWORD]@]repo_host/repo_path[.git/]
You may have to enter your credentials to access the repository over https
.
Here is an example of using GitLab CI variables for retrieving a login and password:
git:
- url: https://{{ env "CI_REGISTRY_USER" }}:{{ env "CI_JOB_TOKEN" }}@registry.gitlab.company.name/common/helper-utils.git
In the above example, we use the env method from the sprig library for accessing the environment variables.
git, ssh
werf supports accessing the repository via the git
protocol. This protocol is usually secured with SSH: this feature is used by GitHub, Bitbucket, GitLab, Gogs, Gitolite, etc. The typical repository address will look as follows:
git:
- url: git@gitlab.company.name:project_group/project.git
A good understanding of how werf searches for access keys is required to use the remote repositories over SSH (read more below).
Working with SSH keys
The ssh-agent provides keys for SSH connections. It is a daemon operating via a file socket. The path to the socket is stored in the SSH_AUTH_SOCK
environment variable. werf mounts this file socket into all assembly containers and sets the SSH_AUTH_SOCK
environment variable, i.e., connection to remote Git repositories is established using keys registered in the running ssh-agent.
werf applies the following algorithm for using the ssh-agent:
- If werf is started with the
--ssh-key
flag (there might be multiple flags):- A temporary ssh-agent starts and uses the defined keys; it is used for all Git operations with remote repositories.
- The already running ssh-agent is ignored in this case.
- No
--ssh-key
flag(s) is specified and ssh-agent is running:- werf uses the
SSH_AUTH_SOCK
environment variable; keys that are added to this agent are used for Git operations.
- werf uses the
- No
--ssh-key
flag(s) is specified and ssh-agent is not running:- If the
~/.ssh/id_rsa
file exists, werf runs the temporary ssh-agent with the key from the~/.ssh/id_rsa
file.
- If the
- If none of the previous options is applicable, the ssh-agent no SSH keys will be used for operations on external Git repositories. Building an image with the remote repositories defined in the git mapping will fail.
More details: gitArchive, gitCache, gitLatestPatch
Let us review the process of adding files to the final image in more detail. As it was stated earlier, the docker image contains multiple layers. To understand what layers werf create, let’s examine building actions triggered by three sample commits: 1
, 2
, and 3
:
- Commit 1. All files are added to a single layer depending on the git mappings configuration. This is done with using the git archive command. The resulting layer corresponds to the gitArchive stage.
- Commit 2. Another layer is added. In it, files are modified by applying a patch. This layer corresponds to the gitLatestPatch stage.
- Commit 3. Files have been added already, and werf applies patches in the gitLatestPatch stage layer.
The build sequence for these commits may be represented as follows:
gitArchive | gitLatestPatch | |
---|---|---|
Commit No. 1 is made, build at 10:00 | files as in commit No. 1 | - |
Commit No. 2 is made, build at 10:05 | files as in commit No. 1 | files as in commit No. 2 |
Commit No. 3 is made, build at 10:15 | files as in commit No. 1 | files as in commit No. 3 |
With time, the number of commits grows, and the size of the patch between commit No. 1 and the current one may grow quite large. This will increase the size of the last layer and the overall size of the stages even more. To prevent the uncontrolled growth of the latest layer, werf provides the additional intermediary stage called gitCache. When gitLatestPatch diff becomes excessively large, much of its diff is merged with the gitCache diff, thus reducing the gitLatestPatch stage size.
git stages and rebasing
Each git stage stores service labels containing SHA commits that this stage was built up on.
werf will use them for creating patches when assembling the next git stage (in a nutshell, it is a git diff COMMIT_FROM_PREVIOUS_GIT_STAGE LATEST_COMMIT
for each described git mapping).
So, if some stage has a saved commit that is not in the Git repository (e.g., after rebasing), werf will rebuild that stage at the next build using the latest commit.
Git worktree
For the Stapel builder to work properly, werf needs the full Git history of the project in order to work efficiently. So by default, werf fetches the history for the current Git project when needed. This means that werf can automatically convert the shallow-clone repository to a full clone and download an updated list of branches and tags from the origin during image cleanup.
The default behavior is defined using the following settings:
gitWorktree:
forceShallowClone: false
allowUnshallow: true
allowFetchOriginBranchesAndTags: true
For example, here is how you can disable automatic unshallow of the Git working directory:
gitWorktree:
forceShallowClone: true
allowUnshallow: false