shell:
  beforeInstall:
  - <bash command>
  install:
  - <bash command>
  beforeSetup:
  - <bash command>
  setup:
  - <bash command>
  cacheVersion: <arbitrary string>
  beforeInstallCacheVersion: <arbitrary string>
  installCacheVersion: <arbitrary string>
  beforeSetupCacheVersion: <arbitrary string>
  setupCacheVersion: <arbitrary string>
ansible:
  beforeInstall:
  - <task>
  install:
  - <task>
  beforeSetup:
  - <task>
  setup:
  - <task>
  cacheVersion: <arbitrary string>
  beforeInstallCacheVersion: <arbitrary string>
  installCacheVersion: <arbitrary string>
  beforeSetupCacheVersion: <arbitrary string>
  setupCacheVersion: <arbitrary string>

Running assembly instructions with git

What are user stages?

A user stage is a stage whose assembly instructions come from the config. Currently, there are two kinds of assembly instructions: shell and ansible. Werf defines 4 user stages and executes them in this order: beforeInstall, install, beforeSetup and setup. The assembly instructions of one stage are executed to create one docker layer.

Motivation behind stages

Opinionated build structure

The user stages pattern is based on an analysis of build instructions from real applications. It turns out that grouping assembly instructions into 4 user stages is enough for most applications. Grouping instructions decreases layer sizes and speeds up image building.

Framework for a build process

The user stages pattern defines a structure for the build process and thus sets boundaries for a developer. This is a significant speed-up over unstructured instructions in a Dockerfile because the developer knows what kind of instructions belong in each stage.

Run assembly instruction on git changes

User stage execution can depend on changes to files in a git repository. Werf supports local and remote git repositories. A user stage can depend on changes in several repositories. Different changes in one repository can cause a rebuild of different user stages.

More build tools: shell, ansible, …

Shell is a familiar and well-known build tool. Ansible is a newer tool and takes some time to learn.

If you need a prototype as soon as possible, shell is enough: it works like a RUN instruction in a Dockerfile. Ansible configuration is declarative and mostly idempotent, which is good for long-term maintenance.

Stage execution is isolated in code, so implementing support for another tool is not so difficult.

Usage of user stages

Werf provides 4 user stages where assembly instructions can be defined. Werf does not restrict the assembly instructions: you can use anything you would put in a RUN instruction in a Dockerfile. However, the grouping of assembly instructions comes from experience with real-life applications. The vast majority of application builds need these actions:

  • install system packages
  • install system dependencies
  • install application dependencies
  • set up system applications
  • set up the application

What is the best strategy to execute them? The first thought is to execute them one by one to cache interim results. The second is not to mix instructions for these actions, because they depend on different files. The user stages pattern suggests this strategy:

  • use the beforeInstall user stage to install system packages
  • use the install user stage to install system dependencies and application dependencies
  • use the beforeSetup user stage to set up system settings and install an application
  • use the setup user stage to set up application settings
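
As a rough illustration, this strategy might map onto a werf.yaml like the sketch below. The image name, base image, packages and npm commands are hypothetical and only stand in for a Node.js-style application with a build script; the directive names are the ones described in this document.

```yaml
image: app                    # hypothetical image name
from: ubuntu:18.04            # hypothetical base image
git:
- add: /                      # assumption: application sources live in the repository root
  to: /app
shell:
  beforeInstall:
  # system packages: rarely change, slow to install
  - apt-get update
  - apt-get install -y curl nodejs npm
  install:
  # application dependencies, resolved from a manifest file
  - cd /app && npm install
  beforeSetup:
  # build the application from its sources
  - cd /app && npm run build
  setup:
  # cheap settings that may change on every commit
  - echo "1.0.0" > /app/version.txt
```

Each group of commands ends up in its own layer, so a change to the setup commands would not be expected to repeat the slow package installation.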

beforeInstall

A stage that executes instructions before the application is installed. This stage is for system applications that rarely change but are time-consuming to install. Long-lived system settings can also be made here, such as setting the locale, setting the time zone, adding groups and users, etc. Installing large language distributions and build tools, such as PHP and composer or java and gradle, is a good candidate for this stage.

In practice, these components rarely change, and the beforeInstall stage caches them for an extended period.

install

A stage to install an application. This stage is for installing application dependencies and applying some standard settings.

Instructions on this stage have access to the application source code, so application dependencies can be installed with build tools (like composer, gradle, npm, etc.) that require a manifest file (e.g., pom.xml, Gruntfile). The best practice is to make this stage dependent on changes in that manifest file.

beforeSetup

This stage prepares the application before its settings are applied. Every kind of compilation can be done here: creating jars, creating executable files and dynamic libraries, creating web assets, uglification and encryption. This stage is often made dependent on changes in the source code.

setup

This stage sets up application settings. The usual actions here are copying some profiles into /etc, copying configuration files into well-known locations, and creating a file with the application version. These actions should not be time-consuming, as they are executed on every commit.

Custom strategy

Again, there are no limitations on assembly instructions. The previous definitions of user stages are just suggestions that arise from experience with real applications. You can even use only one user stage, or define your own strategy for grouping assembly instructions, and still benefit from caching and git dependencies.

Syntax

There are two top-level builder directives for assembly instructions: shell and ansible. These builder directives are mutually exclusive: you can build an image with either shell assembly instructions or ansible assembly instructions.

Each builder directive has four directives that define assembly instructions for each user stage:

  • beforeInstall
  • install
  • beforeSetup
  • setup

Builder directives also contain cacheVersion directives, which are user-defined parts of user stage signatures. See the "Dependency on CacheVersion values" section for details.

Shell

Syntax for user stages with shell assembly instructions:

shell:
  beforeInstall:
  - <bash_command 1>
  - <bash_command 2>
  ...
  - <bash_command N>
  install:
  - <bash command>
  ...
  beforeSetup:
  - <bash command>
  ...
  setup:
  - <bash command>
  ...
  cacheVersion: <version>
  beforeInstallCacheVersion: <version>
  installCacheVersion: <version>
  beforeSetupCacheVersion: <version>
  setupCacheVersion: <version>

Shell assembly instructions are arrays of bash commands for user stages. Commands for one stage are executed like a single RUN instruction in a Dockerfile, so werf creates one layer per user stage.

Werf provides a distribution-agnostic bash binary, so the base image does not need its own bash. Commands for one stage are joined with && and then encoded as base64. The user stage assembly container decodes them and then executes the joined commands. For example, consider a beforeInstall stage with apt-get update and apt-get install commands:

beforeInstall:
- apt-get update
- apt-get install -y build-essential g++ libcurl4

Werf performs user stage commands as follows:

  • generates a temporary script on the host machine:

      #!/.werf/stapel/embedded/bin/bash -e
    
      apt-get update
      apt-get install -y build-essential g++ libcurl4
    
  • mounts it into the corresponding user stage assembly container as /.werf/shell/script.sh, and
  • runs the script.

The bash binary is stored in a stapel volume. Details about the concept can be found in this blog post [RU] (the dappdeps mentioned there has been renamed to stapel, but the principle is the same).

Ansible

Syntax for user stages with ansible assembly instructions:

ansible:
  beforeInstall:
  - <ansible task 1>
  - <ansible task 2>
  ...
  - <ansible task N>
  install:
  - <ansible task>
  ...
  beforeSetup:
  - <ansible task>
  ...
  setup:
  - <ansible task>
  ...
  cacheVersion: <version>
  beforeInstallCacheVersion: <version>
  installCacheVersion: <version>
  beforeSetupCacheVersion: <version>
  setupCacheVersion: <version>

Ansible config and stage playbook

Ansible assembly instructions for a user stage are a set of ansible tasks. To run these tasks with the ansible-playbook command, werf mounts this directory structure into the user stage assembly container:

/.werf/ansible-workdir
├── ansible.cfg
├── hosts
└── playbook.yml

ansible.cfg contains settings for ansible:

  • use local transport
  • werf stdout_callback for better logging
  • turn on force_color
  • use sudo for privilege escalation (no need to use become in tasks)

hosts is an inventory file that contains only localhost. It also contains some ansible_* settings, e.g., the path to python in stapel.

playbook.yml is a playbook with all tasks from one user stage. For example, take a werf.yaml with an install stage like this:

ansible:
  install:
  - debug: msg='Start install'
  - file: path=/etc mode=0777
  - copy:
      src: /bin/sh
      dest: /bin/sh.orig
  - apk:
      name: curl
      update_cache: yes
  ...

werf produces this playbook.yml for install stage:

- hosts: all
  gather_facts: 'no'
  tasks:
  - debug: msg='Start install'  \
  - file: path=/etc mode=0777   |
  - copy:                       > these lines are copied from werf.yaml
      src: /bin/sh              |
      dest: /bin/sh.orig        |
  - apk:                        |
      name: curl                |
      update_cache: yes         /
  ...

Werf plays the user stage playbook in the user stage assembly container with the ansible-playbook command:

$ export ANSIBLE_CONFIG="/.werf/ansible-workdir/ansible.cfg"
$ ansible-playbook /.werf/ansible-workdir/playbook.yml

The ansible and python binaries and libraries are stored in a stapel volume. Details about the concept can be found in this blog post [RU] (the dappdeps mentioned there has been renamed to stapel, but the principle is the same).

Supported modules

One of the ideas behind werf is idempotent builds: if nothing has changed, werf should create the same image. This is accomplished by calculating signatures for stages. Ansible has non-idempotent modules, which give different results when executed twice, so werf cannot correctly calculate a signature to rebuild stages. For now, werf supports only a fixed list of modules.

A werf config that uses a module not on this list produces an error and stops the build. Feel free to report an issue if some module should be enabled.

Copy files

The preferred way of copying files into an image is git mappings. Werf cannot track changes to files referenced in the copy module. For now, the only way to copy an external file into an image is to use the go-templating method .Files.Get. This method returns the file content as a string, so the content of the file becomes a part of the user stage signature, and file changes lead to a user stage rebuild.

For example, copy nginx.conf into an image:

ansible:
  install:
  - copy:
      content: |
{{ .Files.Get "/conf/etc/nginx.conf" | indent 8 }}
      dest: /etc/nginx/nginx.conf

Werf renders that snippet as a go template and then transforms it into this playbook.yml:

- hosts: all
  gather_facts: 'no'
  tasks:
  - copy:
      content: |
        http {
          sendfile on;
          tcp_nopush on;
          tcp_nodelay on;
          ...

Jinja templating

Ansible supports Jinja templating of playbooks. However, Go templates and Jinja templates have the same delimiters: {{ and }}. Jinja templates must be escaped to work. There are two options: escape only {{ or escape the whole Jinja expression.

For example, you have this ansible task:

- copy:
    src: {{item}}
    dest: /etc/nginx
  with_files:
  - /app/conf/etc/nginx.conf
  - /app/conf/etc/server.conf

So, Jinja expression {{item}} should be escaped:

# escape {{ only
src: {{"{{"}} item }}

or

# escape the whole expression
src: {{`{{item}}`}}
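
Put together, a complete task with an escaped Jinja expression might look like the sketch below. The debug task and the item values are hypothetical and only illustrate the escaping. Go templating turns the backtick-quoted string into a literal Jinja expression, which Ansible then evaluates for every item.

```yaml
ansible:
  install:
  # After Go templating, the backtick-quoted string below becomes
  # a plain Jinja expression that refers to the loop variable item.
  - debug:
      msg: "{{`{{ item }}`}}"
    with_items:          # hypothetical values
    - first
    - second
```

Note that the comments avoid the template delimiters on purpose: Go templating is applied to the whole file, comments included.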

Ansible problems

  • Live stdout is implemented only for the raw and command modules. Other modules display stdout and stderr content after execution.
  • Excess logging into stderr may hang ansible task execution (issue #784).
  • The apt module hangs the build process on particular debian and ubuntu versions. This affects derived images as well (issue #645).

User stages dependencies

One of the werf features is the ability to define dependencies for a stage rebuild. As described in the stages reference, stages are built one by one, and each stage has a calculated stage signature. Signatures have various dependencies. When the dependencies change, the stage signature changes, and werf rebuilds this stage and all following stages.

These dependencies can be used to control when user stages are rebuilt. User stage signatures, and therefore user stage rebuilds, depend on:

  • changes in assembly instructions
  • changes of cacheVersion directives
  • git repository changes
  • changes in files imported from artifacts

The first three dependencies are described below.

Dependency on assembly instructions changes

The user stage signature depends on the rendered text of the assembly instructions. Changes in the assembly instructions for a user stage lead to a rebuild of this stage. For example, suppose you use the following shell assembly instructions:

shell:
  beforeInstall:
  - echo "Commands on the Before Install stage"
  install:
  - echo "Commands on the Install stage"
  beforeSetup:
  - echo "Commands on the Before Setup stage"
  setup:
  - echo "Commands on the Setup stage"

The first build of this image executes all four user stages. There is no git mapping in this config, so subsequent builds never execute the assembly instructions, because the user stage signatures have not changed and the build cache remains valid.

Now change the assembly instructions for the install user stage:

shell:
  beforeInstall:
  - echo "Commands on the Before Install stage"
  install:
  - echo "Commands on the Install stage"
  - echo "Installing ..."
  beforeSetup:
  - echo "Commands on the Before Setup stage"
  setup:
  - echo "Commands on the Setup stage"

Now werf build executes the install assembly instructions and the instructions from the following stages.

Go-templating and environment variables can change assembly instructions and lead to unforeseen rebuilds. For example:

shell:
  beforeInstall:
  - echo "Commands on the Before Install stage for {{ env "CI_COMMIT_SHA" }}"
  install:
  - echo "Commands on the Install stage"
  ...

The first build renders the beforeInstall command into:

echo "Commands on the Before Install stage for 0a8463e2ed7e7f1aa015f55a8e8730752206311b"

The build for the next commit renders the beforeInstall command into:

echo "Commands on the Before Install stage for 36e907f8b6a639bd99b4ea812dae7a290e84df27"

Because of CI_COMMIT_SHA, the text of the assembly instructions changes on every commit, so this configuration rebuilds the beforeInstall user stage on every commit.

Dependency on git repo changes

As stated in the git mapping reference, there are gitArchive and gitLatestPatch stages. gitArchive is executed after the beforeInstall user stage, and gitLatestPatch is executed after the setup user stage if the local git repository has changes. So, to execute assembly instructions with the latest version of the source code, you can rebuild gitArchive with a special commit or rebuild beforeInstall (by changing cacheVersion or the instructions for the beforeInstall stage).

The install, beforeSetup and setup user stages also depend on git repository changes. A git patch is applied at the beginning of the user stage so that assembly instructions are executed with the latest version of the source code.

During the image build process, the source code is updated within only one stage; subsequent stages are based on this stage and use the updated files. The first build adds sources at the gitArchive stage. Any other build updates sources at gitCache, gitLatestPatch or one of the following user stages: install, beforeSetup and setup.

This stage is shown in the Calculation signature phase. [Figure: git files actualized on specific stage]

A user stage dependency on git repository changes is defined with the git.stageDependencies parameter. The syntax is:

git:
- ...
  stageDependencies:
    install:
    - <mask 1>
    ...
    - <mask N>
    beforeSetup:
    - <mask>
    ...
    setup:
    - <mask>

The git.stageDependencies parameter has 3 keys: install, beforeSetup and setup. Each key defines an array of masks for one user stage. A user stage is rebuilt if the git repository has changes in files that match one of the masks defined for that user stage.

For each user stage, werf creates a list of matched files and calculates a checksum over the attributes and content of each file. This checksum is a part of the stage signature, so the signature changes with every relevant change in the repository: new attributes for a file, a change in a file's content, a new matched file, a deleted matched file, etc.

git.stageDependencies masks work together with git.includePaths and git.excludePaths masks. Werf considers only files that match both the includePaths filter and the stageDependencies masks. Likewise, werf considers only files that match the stageDependencies masks and do not match the excludePaths filter.
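
A minimal sketch of how these filters might combine, assuming a hypothetical repository layout with sources under src and documentation under src/docs:

```yaml
git:
- add: /
  to: /app
  includePaths:
  - src                  # hypothetical: only files under src are considered
  excludePaths:
  - src/docs             # hypothetical: documentation is filtered out
  stageDependencies:
    beforeSetup:
    - src/**/*.js        # hypothetical mask for JavaScript sources
```

Under these assumptions, a change to a file such as src/docs/example.js would not be expected to trigger a beforeSetup rebuild: the file matches the stageDependencies mask, but it is removed by the excludePaths filter first.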

stageDependencies masks work like includePaths and excludePaths filters. Masks are matched against file paths and may contain the following glob patterns:

  • * — matches any file. This pattern includes . and excludes /
  • ** — matches directories recursively or files expansively
  • ? — matches any one character. Equivalent to /.{1}/ in regexp
  • [set] — matches any one character in the set. Behaves exactly like character sets in regexp, including set negation ([^a-z])
  • \ — escapes the next metacharacter

A mask that starts with * is treated as an alias by the YAML parser, so a mask with * or ** at the beginning should be quoted:

# * at the beginning of mask, so use double quotes
- "*.rb"
# single quotes also work
- '**/*'
# no star at the beginning, no quoting needed
- src/**/*.js

Werf uses checksums to determine whether files have changed in the git repository. For each user stage and each mask, the following algorithm is applied:

  • werf creates a list of all files from the add path and applies the excludePaths and includePaths filters;
  • each file path from the list is compared to the mask using glob patterns;
  • if the mask matches a directory, then the directory content is matched recursively;
  • werf calculates a checksum of the attributes and content of all matched files.

These checksums are calculated at the beginning of the build process, before any stage container is run.

Example:

---
image: app
git:
- add: /src
  to: /app
  stageDependencies:
    beforeSetup:
    - "*"
shell:
  install:
  - echo "install stage"
  beforeSetup:
  - echo "beforeSetup stage"
  setup:
  - echo "setup stage"

This werf.yaml has a git mapping configuration that transfers the /src content from the local git repository into the /app directory in the image. During the first build, files are cached in the gitArchive stage and the assembly instructions for install, beforeSetup and setup are executed. Subsequent builds of commits that only have changes outside /src do not execute assembly instructions. If a commit has changes inside the /src directory, the checksums of matched files change, and werf applies the git patch and rebuilds all existing stages starting from beforeSetup: beforeSetup and setup. The patch is applied at the beforeSetup stage itself.
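
A variation on this example ties the install stage to a manifest file, as the install stage description recommends. The file name package.json and the src mask are hypothetical and assume a Node.js-style project layout:

```yaml
image: app
git:
- add: /
  to: /app
  stageDependencies:
    install:
    - package.json       # hypothetical manifest: dependency changes rebuild install
    beforeSetup:
    - src/**/*           # hypothetical source tree: code changes rebuild beforeSetup
shell:
  install:
  - echo "install stage"
  beforeSetup:
  - echo "beforeSetup stage"
  setup:
  - echo "setup stage"
```

Under these assumptions, a commit that touches only files under src would be expected to rebuild beforeSetup and setup, while a commit that changes package.json would rebuild install and every stage after it.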

Dependency on CacheVersion values

There are situations when a user wants to rebuild all user stages or one of them. This can be accomplished by changing the cacheVersion or <user stage name>CacheVersion values.

The signature of the install user stage depends on the value of the installCacheVersion parameter. To rebuild the install user stage (and subsequent stages), change the value of the installCacheVersion parameter.

Note that the cacheVersion and beforeInstallCacheVersion directives have the same effect. When either value is changed, the beforeInstall stage and subsequent stages are rebuilt.

Example: common image for multiple applications

You can define an image with common packages in a separate werf.yaml. The cacheVersion value can be used to rebuild this image to refresh package versions.

image: ~
from: ubuntu:latest
shell:
  beforeInstallCacheVersion: 2
  beforeInstall:
  - apt update
  - apt install ...

This image can be used as a base image for multiple applications if images from hub.docker.com don't suit your needs.

External dependency example

CacheVersion directives can be used with go templates to define a user stage dependency on files that are not in the git tree.

image: ~
from: ubuntu:latest
shell:
  installCacheVersion: {{.Files.Get "some-library-latest.tar.gz" | sha256sum}}
  install:
  - tar zxf some-library-latest.tar.gz
  - <build application>

A build script can be used to download the some-library-latest.tar.gz archive and then execute the werf build command. If the file has changed, werf rebuilds the install user stage and subsequent stages.