Run Chaos experiments with Charmed Litmus
In this how-to guide we will conduct a simple Chaos Experiment that simulates pod deletion to check whether the System Under Test can recover from such a fault. To do this, we will bootstrap Chaos Infrastructure onto a Kubernetes cluster, define a Resilience Probe, and then create and run the experiment.
Pre-requisites:
The Charmed Chaos Engineering platform deployed on your Kubernetes cluster (see the Getting started tutorial)
kubectl
1. Deploy System Under Test (SUT)
In this guide we will use the self-signed-certificates charm as a SUT.
Create a Juju model for the self-signed-certificates charm:
juju add-model certs
Deploy the charm:
juju deploy self-signed-certificates
Monitor the status of the deployment:
juju status --relations --watch 1s
The deployment is ready when the self-signed-certificates charm is in the active/idle state.
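For scripted setups, the same readiness check can be done without watching the output by hand. The following is a minimal sketch, not part of the charm's tooling: the helper names unit_ready and wait_until_ready are hypothetical, and it assumes the default tabular juju status output, where each unit row lists the unit name, workload status and agent status in its first three columns.

```shell
# Succeed when at least one unit of application $1 is active/idle.
# stdin: the default tabular output of `juju status` (assumed column layout).
unit_ready() {
  awk -v app="$1" '
    # Unit rows start with "<app>/<n>"; columns 2 and 3 hold workload and agent status.
    index($1, app "/") == 1 && $2 == "active" && $3 == "idle" { found = 1 }
    END { exit !found }
  '
}

# Poll the certs model every 5 seconds until the unit is ready.
wait_until_ready() {
  until juju status -m certs | unit_ready "$1"; do
    sleep 5
  done
}
```

Calling wait_until_ready self-signed-certificates would then block until the unit reports active/idle.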
2. Bootstrap Chaos Infrastructure
In the Litmus Portal, navigate to the Environments tab and click the + New Environment button.
In the pop-up window fill in the name of the environment and select the environment type.
In this guide we will create an environment of type Production and call it getting-started:

Confirm your choices by clicking the Save button.
To bootstrap a Chaos Infrastructure onto a Kubernetes cluster, select the newly created environment and then click the + Enable Chaos button.
First, provide a name for the infrastructure. In this guide we will use self-signed-certificates-test:

Next, choose the Infrastructure type, specify the Kubernetes namespace to deploy the Infrastructure to and define a Service Account responsible for managing the Infrastructure.
In this guide we will deploy the namespace-specific Chaos Infrastructure alongside the SUT (note that the Installation Location is the same as the name of the Juju model we deployed self-signed-certificates to):

Finally, follow the instructions from points 2 and 3 of the Kubernetes Setup Instructions:

After applying the manifests, click the Done button.
Deploying Chaos Infrastructure should take approximately 3-5 minutes. A successful deployment is indicated by the Infrastructure status changing to CONNECTED:

3. Define a Resilience Probe
In the Litmus Portal, navigate to the Resilience Probes menu and click the + New Probe button.
Select the probe of type Command and configure it using the values below:
Name: pod-up-probe
Timeout: 10s
Interval: 1s
Attempt: 1
Command: kubectl -n certs get pods | grep self-signed-certificates | grep Running | wc -l
Type: int
Comparison Criteria: >
Value: 0
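With these values the probe passes when the command prints an integer greater than 0, that is, when at least one self-signed-certificates pod is Running. A rough local equivalent of that check, useful for trying the command before saving the probe, might look like the sketch below. The function names count_running and probe_passes are hypothetical, and the sketch only mirrors the comparison; it does not reproduce how Litmus evaluates the probe internally.

```shell
# Count the SUT pods reported as Running (same pipeline as the probe command).
# stdin: the output of `kubectl -n certs get pods`.
count_running() {
  grep self-signed-certificates | grep Running | wc -l
}

# Apply the probe comparison: Type int, Comparison Criteria ">", Value 0.
probe_passes() {
  # Strip whitespace that some `wc` implementations add around the number.
  count=$(count_running | tr -d '[:space:]')
  [ "$count" -gt 0 ]
}
```

For example, kubectl -n certs get pods | probe_passes && echo "probe would pass" prints the message only while the pod is up.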
A correctly configured probe should look like so:

4. Create a Chaos Experiment
In the Litmus Portal, navigate to the Chaos Experiments menu and click the + New Experiment button.
Name the test and select a Chaos Infrastructure to use:

Start building the experiment using Blank Canvas.
In the Experiment Builder, click the Add button and add the pod-delete fault:

Configure the fault with the following values:
App Kind: statefulset
App Namespace: certs
App Label: app.kubernetes.io/name=self-signed-certificates
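Before saving the fault, it can help to confirm that these three values actually select the SUT. The sketch below wraps a single kubectl query in a hypothetical helper named show_fault_targets; the NAMESPACE and SELECTOR variables simply repeat the App Namespace and App Label from above.

```shell
NAMESPACE="certs"
SELECTOR="app.kubernetes.io/name=self-signed-certificates"

# List the StatefulSets and pods matched by the fault's namespace and label.
show_fault_targets() {
  kubectl -n "$NAMESPACE" get statefulset,pods -l "$SELECTOR"
}
```

If show_fault_targets lists no StatefulSet or no pod, the fault would have nothing to delete, so the values are worth rechecking.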

In the Probes tab, select the previously created pod-up-probe and confirm your choice by clicking the Add to Fault button.
Select the End of Test (EOT) probe execution mode and apply the changes.
At this point your experiment should look like this:

Save your changes by clicking the Save button in the top-right corner of the screen.
5. Run a Chaos Experiment
Click the Run button in the top-right corner of the screen to run the Chaos Experiment:

Running the experiment should take approximately 3 minutes.
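While the experiment runs, the effect of the fault can be followed from a terminal. This is an optional sketch, not a step required by Litmus: watch_sut_pods is a hypothetical helper name, and it assumes the SUT pods carry the same label that was used as the App Label of the fault.

```shell
# Stream status changes of the SUT pods in the certs namespace until interrupted.
watch_sut_pods() {
  kubectl -n certs get pods -l app.kubernetes.io/name=self-signed-certificates --watch
}
```

Running watch_sut_pods in a second terminal should show the pod being terminated and then recreated; press Ctrl+C to stop watching.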
When the experiment state changes from RUNNING to COMPLETED, the run is done and the result is presented:
