This repository contains a set of single-purpose stress modules and the Kubernetes/Argo manifests used to run them in-cluster.
It assumes you already have a Kubernetes cluster. This README focuses on how this repository is organized, how the workloads are meant to be used, and how the CPU test was built from the ground up so the rest of the modules are easier to understand.
- cpu/, memory/, io/, network/: one stress module per directory. Each module has a shell entrypoint, a Dockerfile, and a simple Kubernetes Job manifest for in-cluster smoke testing.
- argo-workflows/templates/: reusable Argo WorkflowTemplate objects, one per module.
- argo-workflows/workflows/: runnable Argo Workflow objects that invoke the templates with concrete parameters and cooldown steps.
- argo-workflows/rbac-argo.yaml: the stress-sa service account and RBAC used by the workflows in the stress namespace.
- argo-workflows/argo-install-master-only/: Kustomize overlay that places argo-server and workflow-controller on the control-plane node.
- quickpizza/, k6/: example application and load-generation assets used in some scenarios.
- run-indexer/: watcher that records completed workflow pod runs.
- extract-measurements/: post-processing utilities for measurement extraction.
All stress modules follow the same build path:
- Start with a shell script that runs the actual stress tool and exposes the knobs we want through environment variables.
- Package that script into a small container image.
- Add a simple Kubernetes Job manifest to smoke-test the container inside the cluster before involving Argo.
- Create a WorkflowTemplate that exposes the important runtime knobs to Argo.
- Create one or more Workflow objects that call the template with concrete values and sequencing logic.
The CPU module is documented in detail below. Memory, IO, and Network follow the same structure, just with different scripts and parameters.
All stress workloads run in the stress namespace. The Argo control plane is installed in the argo namespace.
Create the namespaces if they do not exist:
kubectl create namespace stress
kubectl create namespace argo
Label every worker node that is allowed to run stress containers with role=stress:
kubectl label node <worker-node> role=stress
kubectl get nodes -L role
Labeled nodes are the ones on which the stress tests run. The control-plane node should not carry this label, because no stress load should run on it.
The standalone Jobs in cpu/, memory/, io/, and network/ select role=stress. The Argo workflows also require target-node to match the node hostname.
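Because both the standalone Jobs and the Argo workflows depend on the role=stress label, it can help to list the labeled nodes before submitting anything. The helper below is a minimal sketch and not part of the repository: the function name stress_nodes is hypothetical, and it assumes the default table layout of kubectl get nodes -L role, where the label value is printed as the last column.

```shell
#!/bin/sh
# Hypothetical helper: print the names of nodes whose "role" label is "stress".
# Reads the table printed by `kubectl get nodes -L role` on stdin.
stress_nodes() {
  # Skip the header row. The -L column comes last; when a node has no
  # "role" label that column is empty, so the last field is something else.
  awk 'NR > 1 && $NF == "stress" { print $1 }'
}

# Example (requires cluster access):
#   kubectl get nodes -L role | stress_nodes
```

Any hostname printed by this helper is a valid candidate for the target-node parameter.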
The IO tests mount a host path and expect it to be group-owned by the same numeric GID used inside the images.
sudo groupadd -g 20001 stresscontainers
sudo mkdir -p /var/lib/module-stresser/
sudo chgrp -R 20001 /var/lib/module-stresser/
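sudo chmod 2775 /var/lib/module-stresser/
Create the shared random-data file used by the IO workloads. As written, the pipeline has no size limit and keeps writing until it is interrupted or the disk fills, so stop it once the file is large enough: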
openssl enc -aes-256-ctr -pass pass:seed -nosalt \
</dev/zero | dd of=/var/lib/module-stresser/fio_rand.dat bs=4M oflag=direct status=progress
sudo chgrp 20001 /var/lib/module-stresser/fio_rand.dat
sudo chmod 664 /var/lib/module-stresser/fio_rand.dat
ls -l /var/lib/module-stresser/fio_rand.dat
xxd -l 64 /var/lib/module-stresser/fio_rand.dat
The workflows in argo-workflows/workflows/ use serviceAccountName: stress-sa, so apply the repo RBAC first:
kubectl apply -f argo-workflows/rbac-argo.yaml
kubectl get serviceaccount,role,rolebinding -n stress
This repo keeps the Argo control pods on the control-plane node by using the overlay in argo-workflows/argo-install-master-only/:
kubectl apply -k argo-workflows/argo-install-master-only
kubectl get pods -n argo -o wide
That overlay patches argo-server and workflow-controller with:
- nodeSelector: node-role.kubernetes.io/control-plane: ""
- a matching NoSchedule toleration
After applying the overlay, taint the control-plane node so regular stress pods do not land there:
kubectl taint nodes <control-plane-node> node-role.kubernetes.io/control-plane=:NoSchedule --overwrite
kubectl describe node <control-plane-node>
The point of this setup is:
- Argo control pods can still run on the control-plane node because the overlay adds the toleration.
- Stress workloads do not tolerate that taint, so they stay on worker nodes.
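One way to confirm the effect of the taint is to list the node each pod in the stress namespace landed on and flag anything scheduled on the control-plane node. This is a minimal sketch: pods_on_node is a hypothetical helper, and it expects two whitespace-separated columns (pod name, node name) on stdin.

```shell
#!/bin/sh
# Hypothetical helper: print the pods that run on the given node.
# Input: "<pod-name> <node-name>" lines on stdin.
pods_on_node() {
  awk -v node="$1" '$2 == node { print $1 }'
}

# Example (requires cluster access). An empty result means no stress pod
# was scheduled on the control-plane node:
#   kubectl get pods -n stress --no-headers \
#     -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName \
#     | pods_on_node <control-plane-node>
```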
If you want the UI:
kubectl -n argo port-forward svc/argo-server 2746:2746
Apply whichever templates you need:
argo template create argo-workflows/templates/cpu-stress-template.yaml
Then submit a workflow, overriding target-node to a real worker hostname:
argo submit -n stress argo-workflows/workflows/cpu-stress.yaml -p target-node=<worker-node>
The committed workflows under argo-workflows/workflows/ are examples of concrete stress campaigns. They differ mostly in the parameters they pass into the shared templates.
The CPU module is the clearest example of how the repository was assembled.
The base implementation lives in cpu/cpu_stress.sh. It wraps stress-ng and exposes knobs such as:
- WORKERS
- LOAD
- TIMEOUT
- METHOD
- CPUSET
- METRICS_BRIEF
- PERF
- EXTRA_ARGS
Run the script directly to validate the raw stress logic before containerizing it:
WORKERS=2 \
LOAD=75 \
TIMEOUT=30s \
METHOD=float64 \
CPUSET=0-1 \
./cpu/cpu_stress.sh
This step is useful when you want to debug the test itself without involving Docker or Kubernetes. It requires stress-ng to be available on the machine where you run it.
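To make the environment-variable pattern concrete, the following is a minimal sketch of how such a wrapper could translate its knobs into a stress-ng command line. It is not the contents of cpu/cpu_stress.sh: the function name, the default values, and the exact mapping (for example CPUSET to --taskset, and 1 as the enabling value for METRICS_BRIEF and PERF) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: build a stress-ng command line from environment knobs.
# It only prints the command; it does not run stress-ng.
build_cpu_stress_cmd() {
  # Required core: worker count, per-worker load in percent, run time.
  set -- stress-ng \
    --cpu "${WORKERS:-1}" \
    --cpu-load "${LOAD:-100}" \
    --timeout "${TIMEOUT:-60s}"

  # Optional knobs are appended only when they are set.
  [ -n "${METHOD:-}" ] && set -- "$@" --cpu-method "$METHOD"
  [ -n "${CPUSET:-}" ] && set -- "$@" --taskset "$CPUSET"
  [ "${METRICS_BRIEF:-0}" = "1" ] && set -- "$@" --metrics-brief
  [ "${PERF:-0}" = "1" ] && set -- "$@" --perf
  # EXTRA_ARGS is split on whitespace on purpose so several flags fit in it.
  # shellcheck disable=SC2086
  [ -n "${EXTRA_ARGS:-}" ] && set -- "$@" $EXTRA_ARGS

  printf '%s\n' "$*"
}
```

With the values from the example above, the sketch prints stress-ng --cpu 2 --cpu-load 75 --timeout 30s --cpu-method float64 --taskset 0-1. Printing the command instead of executing it keeps the mapping easy to inspect before a real run.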
The image is defined in cpu/Dockerfile. It installs stress-ng, copies the script, and runs it as a non-root user.
Build the image:
docker build -t cpu-stress:dev ./cpu
Run the container locally with the same kind of knobs as the script:
docker run --rm \
-e WORKERS=2 \
-e LOAD=75 \
-e TIMEOUT=30s \
-e METHOD=float64 \
-e CPUSET=0-1 \
cpu-stress:dev
This confirms the container entrypoint matches the raw script behavior. If you want to use your own image inside the cluster, push it to a registry reachable by the cluster and update the image reference in the Job and Argo manifests.
cpu/deployment.yaml is the simple in-cluster validation step for the CPU container. Despite the filename, it defines a Kubernetes Job.
Apply it:
kubectl apply -f cpu/deployment.yaml
kubectl logs -n stress job/cpu-stress-2w-75pct-1m -f
That Job keeps the setup small on purpose:
- it targets nodes labeled role=stress
- it passes the container knobs as environment variables
- it sets CPU requests and limits so the run is reproducible
This is the point where you confirm the image works correctly in the cluster before turning it into an Argo workflow.
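For orientation, a Job of the kind described above could look like the manifest rendered by this sketch. It is not a copy of cpu/deployment.yaml: the function name render_cpu_job, the image argument, the container name, and the resource figures are assumptions. Only the Job name, the stress namespace, the role=stress selector, and the environment knobs come from this README.

```shell
#!/bin/sh
# Hypothetical sketch: print a smoke-test Job manifest for the CPU image.
# Usage: render_cpu_job <image> | kubectl apply -f -
render_cpu_job() {
  image="$1"
  cat <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: cpu-stress-2w-75pct-1m
  namespace: stress
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      nodeSelector:
        role: stress
      containers:
        - name: cpu-stress
          image: ${image}
          env:
            - name: WORKERS
              value: "2"
            - name: LOAD
              value: "75"
            - name: TIMEOUT
              value: "1m"
          resources:
            # Assumed figures: equal requests and limits keep runs comparable.
            requests:
              cpu: "2"
              memory: 256Mi
            limits:
              cpu: "2"
              memory: 256Mi
EOF
}
```

Rendering the manifest from a function makes it easy to swap in an image pushed to your own registry without editing the file by hand.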
The reusable Argo template lives in argo-workflows/templates/cpu-stress-template.yaml.
Apply it:
argo template create argo-workflows/templates/cpu-stress-template.yaml
At this layer, the repo exposes the knobs used in recurring workflow runs:
- image
- workers
- load
- timeout
- cpuset
- resource requests and limits
- display
The template also adds the Kubernetes scheduling details that do not belong in the container itself, such as:
- nodeSelector: role=stress
- kubernetes.io/hostname: {{workflow.parameters.target-node}}
- the resource patch for the main container
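The scheduling details listed above could be expressed roughly as in the following sketch. It is a trimmed-down illustration and not the contents of cpu-stress-template.yaml: the function name, the template name cpu-stress, the cpu-limit parameter, and the use of podSpecPatch for the resource patch are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch: print a trimmed-down WorkflowTemplate for the CPU module.
# Usage: render_cpu_template | kubectl apply -n stress -f -
render_cpu_template() {
  cat <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: cpu-stress
spec:
  templates:
    - name: cpu-stress
      inputs:
        parameters:
          - name: image
          - name: workers
          - name: load
          - name: timeout
          - name: cpu-limit   # assumed name for the resource knob
      # Scheduling lives in the template, not in the container image.
      nodeSelector:
        role: stress
        kubernetes.io/hostname: "{{workflow.parameters.target-node}}"
      # Resource patch applied to the main container at run time.
      podSpecPatch: |
        containers:
          - name: main
            resources:
              requests:
                cpu: "{{inputs.parameters.cpu-limit}}"
              limits:
                cpu: "{{inputs.parameters.cpu-limit}}"
      container:
        image: "{{inputs.parameters.image}}"
        env:
          - name: WORKERS
            value: "{{inputs.parameters.workers}}"
          - name: LOAD
            value: "{{inputs.parameters.load}}"
          - name: TIMEOUT
            value: "{{inputs.parameters.timeout}}"
EOF
}
```

The heredoc is quoted so the shell leaves the Argo placeholders untouched.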
The final runnable workflow lives in argo-workflows/workflows/cpu-stress.yaml.
Submit it:
argo submit -n stress argo-workflows/workflows/cpu-stress.yaml
Inspect the run:
argo list -n stress
kubectl get workflows -n stress
This workflow reuses the cpu-stress template multiple times, varying the load and inserting cooldown periods between runs. That is the last layer in the stack: the script defines the test, the image packages it, the Job validates it in-cluster, the template makes it reusable, and the workflow turns it into a repeatable experiment.
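The idea of calling one template several times with cooldowns in between can be sketched as follows. This is an illustration and not the committed cpu-stress.yaml: the function name, the step names, the load values, the 60s cooldown, and the use of a suspend template for the cooldown are assumptions, as is the WorkflowTemplate name cpu-stress.

```shell
#!/bin/sh
# Hypothetical sketch: print a Workflow that runs two load levels with a
# cooldown step between them.
# Usage: render_cpu_campaign <worker-node> | kubectl create -n stress -f -
render_cpu_campaign() {
  node="$1"
  cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: cpu-stress-
spec:
  serviceAccountName: stress-sa
  entrypoint: campaign
  arguments:
    parameters:
      - name: target-node
        value: ${node}
  templates:
    - name: campaign
      steps:
        # Only load is shown; other template parameters would be passed the same way.
        - - name: load-50
            templateRef: {name: cpu-stress, template: cpu-stress}
            arguments:
              parameters:
                - {name: load, value: "50"}
        - - name: cooldown
            template: cooldown
        - - name: load-75
            templateRef: {name: cpu-stress, template: cpu-stress}
            arguments:
              parameters:
                - {name: load, value: "75"}
    - name: cooldown
      # A suspend template with a duration resumes on its own.
      suspend:
        duration: "60s"
EOF
}
```

Each outer list item under steps runs after the previous one finishes, which is what gives the sequence its run, pause, run shape.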
The other modules use the same method:
- memory/ -> argo-workflows/templates/memory-stress-template.yaml -> argo-workflows/workflows/memory-stress.yaml
- io/ -> argo-workflows/templates/io-stress-template.yaml -> argo-workflows/workflows/io-stress.yaml
- network/ -> argo-workflows/templates/network-stress-template.yaml -> argo-workflows/workflows/network-stress.yaml
The only real difference is the stress tool and the parameters each script exposes:
- memory uses stress-ng --memrate
- IO uses fio templates and the shared host path
- network uses iperf3
Once you understand the CPU path, the rest of the repository follows the same pattern.