When running, werf saves data both to the container registry and on the host.
In the container registry, werf deletes outdated images based on the cleanup policies, while keeping any image that Kubernetes still uses. On the host, the data fall into two categories: the cache (temporary data generated by werf that are no longer needed) and local Docker stages (werf creates them when it is used without a container registry).
There are dedicated commands in werf for cleaning up the container registry and the host. In the case of the container registry, werf only cleans up data related to the specific project, while the host cleanup covers all projects at once.
| | Single project | All projects |
|---|---|---|
| Cleaning up outdated data in the container registry | werf cleanup --repo REPO | - |
| Complete cleanup of the container registry | werf purge --repo REPO | - |
| Cleaning up outdated data on the host | werf host cleanup --project-name PROJECT * | werf host cleanup |
| Complete cleanup of the host | werf host purge --project-name PROJECT * | werf host purge |

Here, * indicates partial functionality.
It is worth noting that:
- You can safely clean up the outdated data at any time, manually or automatically, with no risk of losing critical data that are used in production.
- Moreover, werf can automatically clean up the outdated data on the host as part of any werf command’s regular operation.
- The complete cleanup is not meant to be run automatically, because it can delete data used in production. It should be run manually by someone who understands the consequences.
Cleaning up the container registry
Cleaning up outdated data
The werf cleanup command is designed to run on a schedule. werf performs the (safe) cleanup according to the specified cleanup policies.
The algorithm automatically selects the images to delete. It consists of the following steps:
- Pulling the necessary data from the container registry.
- Preparing a list of images to keep. werf leaves intact:
  - Images that Kubernetes uses;
  - Images that meet the criteria of the user-defined policies when scanning the Git history;
  - New images that were built within the predefined time frame (you can set it via the --keep-stages-built-within-last-n-hours option; it is set to 2 hours by default);
  - Images related to the images selected in previous steps.
- Deleting all the remaining images.
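The selection steps above can be illustrated with a minimal Python sketch. It is not werf's implementation: the Image type, its parents field, and the function name are hypothetical, and "related images" is assumed here to mean the images that a kept image depends on.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Image:
    name: str
    built_at: datetime
    parents: tuple = ()  # hypothetical: names of images this image depends on


def images_to_delete(all_images, used_in_kubernetes, kept_by_policies,
                     now, keep_built_within_hours=2):
    """Return the names of images that no keep rule protects."""
    by_name = {img.name: img for img in all_images}

    # Images used in Kubernetes and images matched by the user-defined policies.
    keep = set(used_in_kubernetes) | set(kept_by_policies)

    # Images built within the recent time frame (2 hours by default).
    cutoff = now - timedelta(hours=keep_built_within_hours)
    keep |= {img.name for img in all_images if img.built_at >= cutoff}

    # Images related to the ones already kept (assumed: their dependencies).
    stack = list(keep)
    while stack:
        img = by_name.get(stack.pop())
        if img is None:
            continue
        for parent in img.parents:
            if parent not in keep:
                keep.add(parent)
                stack.append(parent)

    # Everything else is deleted.
    return sorted(name for name in by_name if name not in keep)
```

In this sketch an old image survives if it is referenced from Kubernetes, matched by a policy, or required by another kept image. Everything else that falls outside the recent-build window is returned for deletion.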
Images in Kubernetes
werf connects to all Kubernetes clusters described in all configuration contexts of kubectl. It then collects image names for the following object types: pod, deployment, replicaset, statefulset, daemonset, job, cronjob, replicationcontroller.
The user can configure werf’s behavior using the following parameters (and related environment variables):
- --kube-config, --kube-config-base64 specify the kubectl configuration (by default, the user-defined configuration at ~/.kube/config is used);
- --kube-context scans a specific context;
- --scan-context-namespace-only scans only the namespace linked to a specific context (by default, all namespaces are scanned);
- --without-kube disables Kubernetes scanning.
As long as any object in a Kubernetes cluster uses an image, werf will not delete that image from the container registry during the cleanup under any circumstances.
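To make the scanning step more concrete, here is a hedged Python sketch of how image names could be collected from the object types listed above. It works on plain dictionaries shaped like Kubernetes manifests and is not werf's code: the function names are hypothetical, and including initContainers is an assumption of this sketch.

```python
# Kinds that embed a pod template at spec.template.spec.
TEMPLATED_KINDS = {
    "deployment", "replicaset", "statefulset",
    "daemonset", "job", "replicationcontroller",
}


def pod_spec_of(obj):
    """Return the pod spec embedded in a manifest, or {} for other kinds."""
    kind = obj.get("kind", "").lower()
    spec = obj.get("spec", {})
    if kind == "pod":
        return spec
    if kind == "cronjob":
        # CronJob nests the pod template inside a job template.
        job_spec = spec.get("jobTemplate", {}).get("spec", {})
        return job_spec.get("template", {}).get("spec", {})
    if kind in TEMPLATED_KINDS:
        return spec.get("template", {}).get("spec", {})
    return {}


def collect_images(objects):
    """Collect the image names referenced by the given manifests."""
    images = set()
    for obj in objects:
        pod_spec = pod_spec_of(obj)
        for key in ("initContainers", "containers"):
            for container in pod_spec.get(key, []):
                if "image" in container:
                    images.add(container["image"])
    return images
```

Any image name returned by such a scan would go into the keep list, so it is never a candidate for deletion.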
Scanning the git history
werf’s cleanup algorithm uses the fact that the container registry keeps the information about the commits on which the build is based (it does not matter if an image was added to the container registry or some changes were made to it). For each build, werf saves the information about the commit, stage digest, and the image name to the registry (for each image defined in werf.yaml).
Once the new commit triggers the build process, werf adds the information that the stage digest corresponds to a specific commit to the registry (even if the resulting image does not change).
This approach ensures the unbreakable bond between the stage digest and the git history. Also, it makes it possible to effectively clean up outdated images based on the git state and selected policies. The algorithm scans the git history and selects relevant images.
Information about commits is the only source of truth for the algorithm, so images lacking such information will be deleted.
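The relationship between commits and stage digests can be pictured with a small in-memory Python model. This is only a sketch under assumptions: the CommitIndex class and its methods are hypothetical and say nothing about how werf actually stores this metadata in the registry.

```python
from collections import defaultdict


class CommitIndex:
    """Hypothetical model of the commit metadata kept next to the images."""

    def __init__(self):
        # (image name, stage digest) -> set of commits that produced it
        self._commits = defaultdict(set)

    def record_build(self, image_name, stage_digest, commit):
        # Recorded on every build, even if the resulting image did not change.
        self._commits[(image_name, stage_digest)].add(commit)

    def commits_for(self, image_name, stage_digest):
        return set(self._commits.get((image_name, stage_digest), ()))

    def without_git_info(self, stored):
        """Return the stored (image name, stage digest) pairs with no commit record."""
        return [pair for pair in stored if not self.commits_for(*pair)]
```

In this model, the pairs returned by without_git_info correspond to images that lack commit information and are therefore removed by the cleanup.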
User-defined policies
The user can specify images that will not be deleted during a cleanup using the so-called keepPolicies cleanup policies. If no such configuration is provided in werf.yaml, werf will use the default policy set.
It is worth noting that the algorithm scans the local state of the git repository. Therefore, it is essential to keep all git branches and git tags up-to-date. By default, werf performs synchronization automatically (you can change its behavior using the gitWorktree.allowFetchOriginBranchesAndTags directive in werf.yaml).
Aspects of cleaning up the images that are being built
During the cleanup, werf applies user-defined policies to the set of images for each image defined in werf.yaml. The cleanup must respect all the images in use. On the other hand, the set of images based on the Git repository’s main branch may not cover all the suitable images (for example, images may be added to/deleted from some feature branch).
werf adds the name of the image being built to the container registry to avoid deleting images and stages in use. Such an approach frees werf from the strict set of image names defined in werf.yaml and forces it to take into account all the images ever used.
The werf managed-images ls|add|rm family of commands allows the user to edit the so-called managed images set and explicitly delete images that are no longer needed and can be removed entirely.
Complete cleanup
The werf purge command deletes all images from the container registry, regardless of whether they are being used in the Kubernetes cluster.
This command runs within the specific project and requires access to the project’s Git repository that contains werf.yaml.
Cleaning up the host
Cleaning up outdated data
The werf host cleanup command cleans up old, unused, outdated data and reduces the cache size for all projects on the host. It uses the space occupied and user settings as a guide.
The algorithm automatically decides which data to delete. It consists of the following steps:
- Evaluating the space used on the volume where the local docker server data are located;
- If the space used exceeds the threshold, werf calculates the amount of space that needs to be freed to get the percent of occupied space back below the threshold (with an extra 5%). By default, the threshold is 70% of the volume space, and you can configure it using the relevant parameter.
- Next, the algorithm deletes the least recently used (LRU) data until the percentage of occupied space drops back below the threshold, including the extra 5% reserve.
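The arithmetic behind these steps can be sketched in Python as follows. This is a simplified illustration, not werf's code: the function names and the (name, size, last-used timestamp) item layout are hypothetical, and the 70% threshold and 5% reserve are the defaults described above.

```python
def bytes_to_free(total_bytes, used_bytes, threshold_pct=70, margin_pct=5):
    """Return how many bytes must be freed, or 0 if usage is within the threshold."""
    if used_bytes * 100 <= total_bytes * threshold_pct:
        return 0
    # Aim below the threshold by the extra margin (65% with the defaults).
    target_used = total_bytes * (threshold_pct - margin_pct) // 100
    return used_bytes - target_used


def select_lru(items, need_bytes):
    """Pick least recently used items until enough space would be freed.

    items: iterable of (name, size_bytes, last_used_timestamp) tuples.
    """
    selected, freed = [], 0
    for name, size, _last_used in sorted(items, key=lambda item: item[2]):
        if freed >= need_bytes:
            break
        selected.append(name)
        freed += size
    return selected
```

For example, on a 1000-byte volume with 800 bytes used, this sketch asks for 150 bytes to be freed, which would bring usage down to 65%.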
What data can be deleted:
- Git archives in the local werf cache: ~/.werf/local_cache/git_archives.
- Git patches in the local werf cache: ~/.werf/local_cache/git_patches.
- Git repositories in the local werf cache: ~/.werf/local_cache/git_repos.
- Git worktrees in the local werf cache: ~/.werf/local_cache/git_worktrees.
- All docker images that were built by werf v1.2 and exist on the local docker server.
- Docker images that were built by werf v1.1 and are stored in --stages-storage=REPO.
  - The algorithm cannot delete images created by werf v1.1 and stored in --stages-storage=:local since this is the primary storage that keeps stages that can be used in production and other environments.
Note that the algorithm of the werf host cleanup command separately processes the volume where the local werf cache is stored (~/.werf/local_cache) and the volume where the local docker server data are stored (usually at /var/lib/docker). If werf cannot find the directory where the data of the local docker server are stored, you can specify the appropriate path explicitly via the --docker-server-storage-path=/var/lib/docker parameter (or via the WERF_DOCKER_SERVER_STORAGE_PATH environment variable).
By default, werf can automatically clean up the outdated host data as part of any werf command’s regular operation, so you do not need to invoke werf host cleanup manually or via cron. However, you can disable auto-cleaning of outdated host data using the --disable-auto-host-cleanup parameter (or the respective WERF_DISABLE_AUTO_HOST_CLEANUP environment variable). In this case, we recommend adding the werf host cleanup command to the list of cron jobs, e.g., as follows:
# /etc/cron.d/werf-host-cleanup
SHELL=/bin/bash
*/30 * * * * gitlab-runner source ~/.profile ; source $(trdl use werf 1.2 stable) ; werf host cleanup
By default, without additional parameters, the werf host cleanup command cleans up the data of all projects on the host. If invoked with the --project-name PROJECT parameter, the command can only clean up images on the local docker server. In this mode, the support for the command is partial.
Complete cleanup
The werf host purge command has two running modes: it can delete all the data of a single project or clean up all projects en masse.
CAUTION! By default, if no additional parameters are specified, werf host purge completely destroys all traces of werf on the host: images, stages, cache, and other data (service folders, temporary files) for all projects. This command provides the maximum level of cleaning.
If the --project-name PROJECT parameter is set, the command will delete images present on the local docker server related to the PROJECT. In this mode, the command is partially functional: werf will not delete images on the local docker server associated with the remote image storage in the container registry (e.g., local images left from running werf converge --repo REPO). You can use the werf host cleanup command (that cleans up all the outdated host data) to clean up these images.