We divide the build process of the images described in the werf configuration file into stages, with clear functions and purposes. Each stage corresponds to an intermediate image, like layers in Docker. The final image corresponds to the last built stage for a particular git state and the werf configuration file.

Stages are steps in the build process. A stage is built from a logically grouped set of config instructions, taking the assembly conditions and rules into account. Each stage corresponds to a single Docker image and is saved in the storage.

You can think of the stages as an application's build cache; strictly speaking, however, they are not a cache but part of the build context.

Stage conveyor

A stage conveyor is an ordered sequence of conditions and rules for carrying out stages. werf uses different stage conveyors to assemble various types of images depending on their configuration.

The user only needs to write a correct configuration: werf performs the rest of the work with stages.

On every build, werf calculates a unique identifier for each stage, called the stage digest.

If a stage has no stage dependencies, it is skipped, and the stage conveyor is accordingly reduced by one stage. This means that the stage conveyor can be reduced to just a few stages or even to the single from stage.
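The skipping rule above can be sketched as follows. This is only an illustration under assumptions: the Stage class, the reduce_conveyor function, and the sample dependency strings are hypothetical and are not part of werf.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    name: str
    # Data that affects the stage digest; an empty list means the stage
    # has no stage dependencies (hypothetical representation).
    dependencies: list = field(default_factory=list)


def reduce_conveyor(stages):
    """Return the stages that remain in the conveyor, preserving order."""
    return [stage for stage in stages if stage.dependencies]


conveyor = [
    Stage("from", ["from: alpine:3.19"]),
    Stage("beforeInstall"),
    Stage("install", ["apk add --no-cache curl"]),
    Stage("setup"),
]
remaining = [stage.name for stage in reduce_conveyor(conveyor)]
print(remaining)  # ['from', 'install']
```

A configuration that defines nothing but a base image would leave only the from stage in this model.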

Stage digest

The stage digest is used for tagging a stage in the storage (the digest is part of the image tag). werf does not build stages that already exist in the storage (similar to caching in Docker, yet more complex).
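The "build only what is missing" behaviour can be sketched as follows. The storage is modelled as a plain set of digests and build is an arbitrary callback; both are assumptions made for illustration and do not reflect how werf talks to a real storage.

```python
def build_missing_stages(digests, storage, build):
    """Build only the stages whose digest is absent from the storage.

    digests: ordered stage digests calculated for the current build.
    storage: set of digests already saved (stand-in for a real storage).
    build:   callable invoked for each stage that has to be built.
    Returns the list of digests that were actually built.
    """
    built = []
    for digest in digests:
        if digest in storage:
            continue  # the stage already exists: reuse it
        build(digest)
        storage.add(digest)
        built.append(digest)
    return built


storage = {"aaa", "bbb"}
build_log = []
first_run = build_missing_stages(["aaa", "bbb", "ccc"], storage, build_log.append)
second_run = build_missing_stages(["aaa", "bbb", "ccc"], storage, build_log.append)
```

In this model the first run builds only the stage "ccc", and the second run builds nothing because every digest is already stored.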

The stage digest is calculated as the checksum of:

  • checksum of stage dependencies;
  • previous stage digest;
  • git commit ID related to the previous stage (if the previous stage is git-related).

The stage digest represents the content of the stage and depends on the git history that led to this content.
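The chaining described above can be sketched as follows. SHA-256, the length-prefixed encoding, and the function names are assumptions chosen for illustration; the text does not say which hash function or encoding werf actually uses.

```python
import hashlib


def checksum(*parts):
    """Checksum of an ordered sequence of strings (SHA-256 is an assumption)."""
    h = hashlib.sha256()
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each part so that ("ab", "c") and ("a", "bc") differ.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def stage_digest(dependencies_checksum, previous_digest="", previous_commit=""):
    """Combine the three inputs listed above into one digest."""
    return checksum(dependencies_checksum, previous_digest, previous_commit)


from_digest = stage_digest(checksum("from: alpine:3.19"))
install_v1 = stage_digest(checksum("apk add curl"), from_digest)
install_v2 = stage_digest(checksum("apk add curl git"), from_digest)
# The setup instructions are identical, but the previous digest differs.
setup_v1 = stage_digest(checksum("make"), install_v1)
setup_v2 = stage_digest(checksum("make"), install_v2)
```

Because every digest includes the previous one, a change in an early stage changes the digests of all later stages in this model, even when their own dependencies are untouched.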

Stage dependencies

A stage dependency is a piece of data that affects the stage digest. A stage dependency may be represented by:

  • some file from a git repo with its contents;
  • instructions to build stage defined in the werf.yaml;
  • the arbitrary string specified by the user in the werf.yaml;
  • and so on.

Most stage dependencies are specified in the werf.yaml; others are determined at runtime.

The tables below list the stage dependencies of a Dockerfile image, a Stapel image, and a Stapel artifact. Each row describes the dependencies of a certain stage. The left column contains a short description of the dependencies; the right column lists the related werf.yaml directives and references for more information.
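How such pieces of data could be folded into one checksum is sketched below. The function name, its parameters, the use of SHA-256, and the use of fnmatch for glob matching are all assumptions for illustration; werf's real matching and hashing rules may differ.

```python
import hashlib
from fnmatch import fnmatch


def dependencies_checksum(instructions, cache_version, files, patterns):
    """Checksum over build instructions, a user string, and tracked files.

    instructions:  list of build commands for the stage.
    cache_version: arbitrary user-supplied string.
    files:         mapping of repository path -> file content (bytes).
    patterns:      globs selecting the files that affect this stage.
    """
    h = hashlib.sha256()
    for line in instructions:
        h.update(line.encode("utf-8") + b"\0")
    h.update(cache_version.encode("utf-8") + b"\0")
    for path in sorted(files):  # sorted for a stable result
        if any(fnmatch(path, pattern) for pattern in patterns):
            h.update(path.encode("utf-8") + b"\0")
            h.update(hashlib.sha256(files[path]).digest())
    return h.hexdigest()


files = {"package.json": b"{}", "README.md": b"docs"}
base = dependencies_checksum(["npm ci"], "1", files, ["package.json"])
readme_changed = dependencies_checksum(
    ["npm ci"], "1", {**files, "README.md": b"new docs"}, ["package.json"]
)
package_changed = dependencies_checksum(
    ["npm ci"], "1", {**files, "package.json": b'{"a": 1}'}, ["package.json"]
)
version_bumped = dependencies_checksum(["npm ci"], "2", files, ["package.json"])
```

In this sketch, editing README.md leaves the checksum unchanged because no pattern selects it, while editing package.json or bumping the arbitrary string produces a new checksum.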

stage dockerfile

target dockerfile instructions
hashsum of files related to ADD and COPY dockerfile instructions
args used in target dockerfile instructions
addHost
image: <image name... || ~>
dockerfile: <relative path>
context: <relative path>
target: <docker stage name>
args:
  <build arg name>: <value>
addHost:
- <host:ip>

stage from

from
or from image stages-digest
or from artifact stages-digest
actual digest from registry (if fromLatest: true)
fromCacheVersion
mounts
from: <image[:<tag>]>
fromLatest: <bool>
fromCacheVersion: <arbitrary string>
fromImage: <image name>
fromArtifact: <artifact name>
mount:
- from: build_dir
  to: <absolute or relative path>
- from: tmp_dir
  to: <absolute path>
- fromPath: <absolute or relative path>
  to: <absolute path>

stage beforeInstall

beforeInstall bash commands or ansible tasks
cacheVersion
beforeInstallCacheVersion
shell:
  beforeInstall:
  - <bash command>
  cacheVersion: <arbitrary string>
  beforeInstallCacheVersion: <arbitrary string>

or

ansible:
  beforeInstall:
  - <task>
  cacheVersion: <arbitrary string>
  beforeInstallCacheVersion: <arbitrary string>

stage importsBeforeInstall

imports before install
import:
- artifact: <artifact name>
  before: install
  add: <absolute path>
  to: <absolute path>
  owner: <owner>
  group: <group>
  includePaths:
  - <relative path or glob>
  excludePaths:
  - <relative path or glob>

stage gitArchive

git mappings
git:
- add: <absolute path>
  to: <absolute path>
  owner: <owner>
  group: <group>
  includePaths:
  - <relative path or glob>
  excludePaths:
  - <relative path or glob>
- url: <git repo url>
  branch: <branch name>
  commit: <commit>
  tag: <tag>
  add: <absolute path>
  to: <absolute path>
  owner: <owner>
  group: <group>
  includePaths:
  - <relative path or glob>
  excludePaths:
  - <relative path or glob>

stage install

install bash commands or ansible tasks
installCacheVersion
hashsum of git files matched by the install stageDependencies
git:
- stageDependencies:
    install:
    - <relative path or glob>

shell:
  install:
  - <bash command>
  installCacheVersion: <arbitrary string>

or

ansible:
  install:
  - <task>
  installCacheVersion: <arbitrary string>

stage importsAfterInstall

imports after install
import:
- artifact: <artifact name>
  after: install
  add: <absolute path>
  to: <absolute path>
  owner: <owner>
  group: <group>
  includePaths:
  - <relative path or glob>
  excludePaths:
  - <relative path or glob>

stage beforeSetup

beforeSetup bash commands or ansible tasks
beforeSetupCacheVersion
hashsum of git files matched by the beforeSetup stageDependencies
git:
- stageDependencies:
    beforeSetup:
    - <relative path or glob>

shell:
  beforeSetup:
  - <bash command>
  beforeSetupCacheVersion: <arbitrary string>

or

ansible:
  beforeSetup:
  - <task>
  beforeSetupCacheVersion: <arbitrary string>

stage importsBeforeSetup

imports before setup
import:
- artifact: <artifact name>
  before: setup
  add: <absolute path>
  to: <absolute path>
  owner: <owner>
  group: <group>
  includePaths:
  - <relative path or glob>
  excludePaths:
  - <relative path or glob>

stage setup

setup bash commands or ansible tasks
setupCacheVersion
hashsum of git files matched by the setup stageDependencies
git:
- stageDependencies:
    setup:
    - <relative path or glob>

shell:
  setup:
  - <bash command>
  setupCacheVersion: <arbitrary string>

or

ansible:
  setup:
  - <task>
  setupCacheVersion: <arbitrary string>

stage gitCache

size of the git diff between the last used commit and the current one

stage importsAfterSetup

imports after setup
import:
- artifact: <artifact name>
  after: setup
  add: <absolute path>
  to: <absolute path>
  owner: <owner>
  group: <group>
  includePaths:
  - <relative path or glob>
  excludePaths:
  - <relative path or glob>

stage gitLatestPatch

presence of git diff changes between the last used commit and the current one

stage dockerInstructions

docker instructions
docker:
  VOLUME:
  - <volume>
  EXPOSE:
  - <expose>
  ENV:
    <env name>: <env value>
  LABEL:
    <label name>: <label value>
  ENTRYPOINT: <entrypoint>
  CMD: <cmd>
  WORKDIR: <workdir>
  USER: <user>
  HEALTHCHECK: <healthcheck>

Storage

The storage contains the stages of the project. Stages can be stored in a Docker registry or locally on the host machine.

Most werf commands use stages. Such commands require specifying the location of the storage using the --repo key or the WERF_REPO environment variable.

There are 2 types of storage:

  1. Local storage. Uses the local Docker server to store stages as Docker images.
  2. Remote storage. Uses a Docker registry to store images. Remote storage is selected by the --repo=DOCKER_REPO_DOMAIN parameter, for example --repo=registry.mycompany.com/web/frontend/stages. NOTE: Each project should specify a unique Docker repo domain that is used only by this project.
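The choice between the two storage types can be sketched as below. The resolve_storage function is hypothetical, and the precedence it applies (an explicit --repo value wins over WERF_REPO, and local storage is used when neither is set) is an assumption, not documented werf behaviour.

```python
import os


def resolve_storage(repo_flag=None, env=None):
    """Return ("remote", address) or ("local", None).

    repo_flag: value of the --repo option, if given.
    env:       environment mapping; defaults to the process environment.
    Assumption: the flag takes precedence over the WERF_REPO variable.
    """
    env = os.environ if env is None else env
    repo = repo_flag or env.get("WERF_REPO")
    if repo:
        return ("remote", repo)
    return ("local", None)


from_flag = resolve_storage("registry.mycompany.com/web/frontend/stages", env={})
from_env = resolve_storage(env={"WERF_REPO": "registry.mycompany.com/web/frontend/stages"})
local = resolve_storage(env={})
```

Passing env explicitly keeps the sketch independent of the variables set on the machine where it runs.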

Stages are named differently depending on local or remote storage used.

When a Docker registry is used as the storage for the project, there is also a cache of local Docker images on each host where werf is running. This cache is cleared by werf itself and can also be freely removed by other tools (such as docker rmi).

Note that all werf commands that require access to stages must use the same storage. Therefore, when using local storage, all werf commands must be run from the same host. When using remote storage, it does not matter which host werf is run from, as long as the same storage is used for all of these calls (this applies to commands such as build, converge, cleanup, etc.).

Using a Docker registry as the storage is recommended; werf uses this mode with CI/CD systems by default.

Host requirements to use remote storage:

  • Connection to docker registry.
  • Connection to the Kubernetes cluster (used to synchronize multiple build/publish/deploy processes running from different machines, see more info below).

Stage naming

Stages in the local storage are named using the following schema: PROJECT_NAME:STAGE_DIGEST-TIMESTAMP_MILLISEC. For example:

myproject                   9f3a82975136d66d04ebcb9ce90b14428077099417b6c170e2ef2fef-1589786063772   274bd7e41dd9        16 seconds ago      65.4MB
myproject                   7a29ff1ba40e2f601d1f9ead88214d4429835c43a0efd440e052e068-1589786061907   e455d998a06e        18 seconds ago      65.4MB
myproject                   878f70c2034f41558e2e13f9d4e7d3c6127cdbee515812a44fef61b6-1589786056879   771f2c139561        23 seconds ago      65.4MB
myproject                   5e4cb0dcd255ac2963ec0905df3c8c8a9be64bbdfa57467aabeaeb91-1589786050923   699770c600e6        29 seconds ago      65.4MB
myproject                   14df0fe44a98f492b7b085055f6bc82ffc7a4fb55cd97d30331f0a93-1589786048987   54d5e60e052e        31 seconds ago      64.2MB

Stages in the remote storage are named using the following schema: DOCKER_REPO_ADDRESS:STAGE_DIGEST-TIMESTAMP_MILLISEC. For example:

localhost:5000/myproject-stages                 d4bf3e71015d1e757a8481536eeabda98f51f1891d68b539cc50753a-1589714365467   7c834f0ff026        20 hours ago        66.7MB
localhost:5000/myproject-stages                 e6073b8f03231e122fa3b7d3294ff69a5060c332c4395e7d0b3231e3-1589714362300   2fc39536332d        20 hours ago        66.7MB
localhost:5000/myproject-stages                 20dcf519ff499da126ada17dbc1e09f98dd1d9aecb85a7fd917ccc96-1589714359522   f9815cec0867        20 hours ago        65.4MB
localhost:5000/myproject-stages                 1dbdae9cc1c9d5d8d3721e32be5ed5542199def38ff6e28270581cdc-1589714352200   6a37070d1b46        20 hours ago        65.4MB
localhost:5000/myproject-stages                 f88cb5a1c353a8aed65d7ad797859b39d357b49a802a671d881bd3b6-1589714347985   5295f82d8796        20 hours ago        65.4MB
localhost:5000/myproject-stages                 796e905d0cc975e718b3f8b3ea0199ea4d52668ecc12c4dbf85a136d-1589714344546   a02ec3540da5        20 hours ago        64.2MB

In these schemas:

  • PROJECT_NAME: the project name.
  • STAGE_DIGEST: the stage digest (see the Stage digest section above).
  • TIMESTAMP_MILLISEC: a timestamp generated when the stage is saved, after it has been built. The timestamp is guaranteed to be unique within the given storage.
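The naming schemas above can be composed and parsed with a few lines of code. The StageName type and both helper functions are hypothetical; only the NAME:STAGE_DIGEST-TIMESTAMP_MILLISEC layout comes from the text.

```python
import time
from typing import NamedTuple


class StageName(NamedTuple):
    repo: str          # PROJECT_NAME or DOCKER_REPO_ADDRESS
    digest: str        # STAGE_DIGEST
    timestamp_ms: int  # TIMESTAMP_MILLISEC


def format_stage_name(repo, digest, timestamp_ms=None):
    """Compose a stage name; defaults to the current time in milliseconds."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return f"{repo}:{digest}-{timestamp_ms}"


def parse_stage_name(name):
    """Split a stage name into its three components."""
    # The repository may itself contain ":" (for example "localhost:5000/..."),
    # so split on the last colon only.
    repo, tag = name.rsplit(":", 1)
    digest, timestamp = tag.rsplit("-", 1)
    return StageName(repo, digest, int(timestamp))


remote = parse_stage_name(
    "localhost:5000/myproject-stages:"
    "d4bf3e71015d1e757a8481536eeabda98f51f1891d68b539cc50753a-1589714365467"
)
local = parse_stage_name(
    "myproject:9f3a82975136d66d04ebcb9ce90b14428077099417b6c170e2ef2fef-1589786063772"
)
```

Splitting from the right keeps the parser correct for remote addresses that include a port number.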