# Optimal Stopping Point for CI Tests
One of the machine learning explorations within the OpenShift CI Analysis project is predicting the optimal stopping point for CI tests based on their test durations (runtimes) (see this issue for more details). In a previous notebook we showed how to access the TestGrid data, then performed initial data analysis and feature engineering on it. We also calculated the optimal stopping point by identifying the distribution of the `test_duration` values for different CI tests and comparing the distributions of passing and failing tests.
In this notebook, we will determine the optimal stopping point for a given CI test provided as input.
## Import libraries
import os
import gzip
import json
import datetime
import itertools
import scipy # noqa F401
from scipy.stats import (  # noqa F401
    invgauss,
    lognorm,
    pearson3,
    weibull_min,
    triang,
    beta,
    norm,
    weibull_max,
    uniform,
    gamma,
    expon,
)
from ipynb.fs.defs.osp_helper_functions import (
    CephCommunication,
    fit_distribution,
    standardize,
    filter_test_type,
    fetch_all_tests,
    best_distribution,
    optimal_stopping_point,
)
import warnings
warnings.filterwarnings("ignore")
## Ceph
Connection to Ceph for importing the TestGrid data.
## Specify variables
METRIC_NAME = "time_to_fail"
# Specify the path for input grid data
INPUT_DATA_PATH = "../../data/raw/testgrid_258.json.gz"
# Specify the path for output metric data
OUTPUT_DATA_PATH = f"../../../../data/processed/metrics/{METRIC_NAME}"
## CEPH Bucket variables
## Create a .env file on your local with the correct configs
s3_endpoint_url = os.getenv("S3_ENDPOINT")
s3_access_key = os.getenv("S3_ACCESS_KEY")
s3_secret_key = os.getenv("S3_SECRET_KEY")
s3_bucket = os.getenv("S3_BUCKET")
s3_path = os.getenv("S3_PROJECT_KEY", "metrics")
s3_input_data_path = "raw_data"
# Specify whether or not we are running this as a notebook or part of an automation pipeline.
AUTOMATION = os.getenv("IN_AUTOMATION")
## Import data
timestamp = datetime.datetime.today()
if AUTOMATION:
    filename = f"testgrid_{timestamp.day}{timestamp.month}.json"
    cc = CephCommunication(s3_endpoint_url, s3_access_key, s3_secret_key, s3_bucket)
    s3_object = cc.s3_resource.Object(s3_bucket, f"{s3_input_data_path}/{filename}")
    file_content = s3_object.get()["Body"].read().decode("utf-8")
    testgrid_data = json.loads(file_content)
else:
    with gzip.open(INPUT_DATA_PATH, "rb") as read_file:
        testgrid_data = json.load(read_file)
## Fetch all tests
Using the function `fetch_all_tests`, we will fetch all passing and failing tests into two dataframes.
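`fetch_all_tests` is defined in the companion helper notebook. TestGrid stores each test's results as run-length-encoded status values, so a helper like this has to expand those runs before filtering by status code. The sketch below illustrates only that decoding step; the exact field names (`count`, `value`) follow the TestGrid JSON format but are stated here as assumptions, not the helper's actual implementation:

```python
def decode_statuses(encoded):
    """Expand TestGrid run-length-encoded statuses,
    e.g. [{"count": 2, "value": 1}] -> [1, 1]."""
    values = []
    for run in encoded:
        # Each run repeats one status value `count` times
        values.extend([run["value"]] * run["count"])
    return values


# Two passing runs (status 1) followed by one failing run (status 12)
statuses = decode_statuses([{"count": 2, "value": 1}, {"count": 1, "value": 12}])
```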
# Fetch all failing tests, i.e., those with a status code of 12
failures_df = fetch_all_tests(testgrid_data, 12)
failures_df.head()
|    | timestamp | tab | grid | test | test_duration | failure/passing |
|----|-----------|-----|------|------|---------------|-----------------|
| 8 | 2021-08-16 23:03:14 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall | 20.016667 | True |
| 10 | 2021-08-16 00:01:05 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall | 108.233333 | True |
| 22 | 2021-08-16 23:03:14 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | operator.Run multi-stage test e2e-metal-assist... | 13.166667 | True |
| 24 | 2021-08-16 00:01:05 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | operator.Run multi-stage test e2e-metal-assist... | 89.983333 | True |
| 38 | 2021-08-16 00:01:05 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | TestInstall_test_install.start_install_and_wai... | 60.004001 | True |
# Fetch all passing tests, i.e., those with a status code of 1
passing_df = fetch_all_tests(testgrid_data, 1)
passing_df.head()
|    | timestamp | tab | grid | test | test_duration | failure/passing |
|----|-----------|-----|------|------|---------------|-----------------|
| 1 | 2021-08-23 00:01:04 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall | 95.300000 | True |
| 2 | 2021-08-22 08:53:17 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall | 101.800000 | True |
| 3 | 2021-08-20 23:21:32 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall | 134.833333 | True |
| 4 | 2021-08-20 15:57:36 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall | 109.833333 | True |
| 5 | 2021-08-20 06:47:40 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall | 94.800000 | True |
## Filter tests
After collecting the data for all passing and failing tests, we narrow down to one test for which we want to calculate the optimal stopping point. We will use the test `operator.Run multi-stage test e2e-aws-upgrade - e2e-aws-upgrade-openshift-e2e-test container test` and extract the data for it.
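`filter_test_type` also comes from the helper notebook. Assuming the dataframe layout shown above (a `test` column holding the test name), an equivalent filter is a simple pandas equality mask; this is a minimal sketch, not necessarily the helper's exact code, and the sample rows are made up for illustration:

```python
import pandas as pd


def filter_test_type(df: pd.DataFrame, test_name: str) -> pd.DataFrame:
    """Return only the rows whose `test` column matches `test_name` exactly."""
    return df[df["test"] == test_name].reset_index(drop=True)


# Tiny illustration with hypothetical rows
df = pd.DataFrame(
    {
        "test": ["Overall", "some-container-test", "Overall"],
        "test_duration": [20.0, 13.2, 95.3],
    }
)
subset = filter_test_type(df, "Overall")
```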
failures_test = filter_test_type(
    failures_df,
    "operator.Run multi-stage test e2e-aws-upgrade - e2e-aws-upgrade-openshift-e2e-test container test",
)
failures_test.head()
|    | timestamp | tab | grid | test | test_duration | failure/passing |
|----|-----------|-----|------|------|---------------|-----------------|
| 0 | 2021-08-25 12:17:53 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 85.866667 | True |
| 1 | 2021-08-25 10:30:05 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 91.916667 | True |
| 2 | 2021-08-25 04:41:24 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 101.133333 | True |
| 3 | 2021-08-24 20:03:02 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 98.450000 | True |
| 4 | 2021-08-24 04:35:23 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 93.216667 | True |
passing_test = filter_test_type(
    passing_df,
    "operator.Run multi-stage test e2e-aws-upgrade - e2e-aws-upgrade-openshift-e2e-test container test",
)
passing_test.head()
|    | timestamp | tab | grid | test | test_duration | failure/passing |
|----|-----------|-----|------|------|---------------|-----------------|
| 0 | 2021-08-25 13:06:02 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 101.250000 | True |
| 1 | 2021-08-25 07:15:39 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 94.283333 | True |
| 2 | 2021-08-25 06:08:52 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 90.316667 | True |
| 3 | 2021-08-25 02:54:53 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 93.866667 | True |
| 4 | 2021-08-24 22:40:00 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 92.900000 | True |
## Fit Distribution
After extracting the data for one test, we want to find the distribution that best fits its durations before calculating the optimal stopping point. We compute chi-square statistics and p-values to rank the candidate distributions by goodness of fit.
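`fit_distribution` is defined in the helper notebook. As a rough sketch of the underlying idea (not the helper's exact implementation), each candidate `scipy.stats` distribution can be fitted to the durations, the data binned, and observed bin counts compared against the counts the fitted distribution predicts via a chi-square statistic; smaller chi-square means a better fit. The bin count and sample below are illustrative choices:

```python
import numpy as np
import scipy.stats


def chi_square_fit(y, dist_name, bins=11):
    """Fit `dist_name` to y; return (chi_square, p_value) for the fit."""
    dist = getattr(scipy.stats, dist_name)
    params = dist.fit(y)
    # Observed counts per bin
    observed, edges = np.histogram(y, bins=bins)
    # Expected counts from the fitted CDF over the same bin edges
    expected = len(y) * np.diff(dist.cdf(edges, *params))
    chi_sq = np.sum((observed - expected) ** 2 / np.maximum(expected, 1e-9))
    p_value = scipy.stats.chi2.sf(chi_sq, df=bins - 1 - len(params))
    return chi_sq, p_value


rng = np.random.default_rng(0)
sample = rng.normal(100, 10, size=500)  # synthetic "durations"
chi_norm, _ = chi_square_fit(sample, "norm")
chi_expon, _ = chi_square_fit(sample, "expon")
# The normal fit should score a much smaller chi-square than the exponential one
```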
failure_dist, failures_r = fit_distribution(failures_test, "test_duration", 0.99, 0.01)
Distributions listed by Betterment of fit:
............................................
Distribution chi_square and p-value
3 beta (2148.0315961744586, 0.0)
9 pearson3 (2150.964892187448, 0.0)
1 norm (2178.439189095538, 0.0)
8 lognorm (2190.171386750302, 0.0)
6 gamma (2251.5768352345144, 0.0)
0 weibull_min (2335.2881528000057, 0.0)
2 weibull_max (2436.7340969950874, 0.0)
4 invgauss (2581.7529201615253, 0.0)
10 triang (3168.817214371956, 0.0)
5 uniform (5205.7686822999685, 0.0)
7 expon (7308.400793415922, 0.0)
# Identify the best fit distribution from the failing test along with its corresponding distribution parameters
best_dist, parameters_failing = best_distribution(failure_dist, failures_r)
# Identify the distributions for the passing test along with its corresponding distribution parameters
passing_dist, passing_r = fit_distribution(passing_test, "test_duration", 0.99, 0.01)
Distributions listed by Betterment of fit:
............................................
Distribution chi_square and p-value
10 triang (461.9624452114939, 4.799796517458444e-69)
3 beta (619.2886679176573, 2.716009412709153e-100)
2 weibull_max (782.1495727872282, 2.4780499811803997e-133)
5 uniform (800.7205543332755, 3.9377128547833523e-137)
9 pearson3 (902.4903827437414, 4.87937692523532e-158)
0 weibull_min (961.9366558978377, 2.6033191498811774e-170)
6 gamma (1025.0253918219983, 2.234474698949537e-183)
8 lognorm (1063.4355506988115, 2.3726995807065007e-191)
1 norm (1066.204889931689, 6.306900543032179e-192)
4 invgauss (1076.96978515332, 3.650894820526847e-194)
7 expon (2457.1474484587093, 0.0)
passing_r.head()
|    | Distribution | chi_square and p-value |
|----|--------------|------------------------|
| 10 | triang | (461.9624452114939, 4.799796517458444e-69) |
| 3 | beta | (619.2886679176573, 2.716009412709153e-100) |
| 2 | weibull_max | (782.1495727872282, 2.4780499811803997e-133) |
| 5 | uniform | (800.7205543332755, 3.9377128547833523e-137) |
| 9 | pearson3 | (902.4903827437414, 4.87937692523532e-158) |
# Identify the best fit distribution from the passing test
best_distribution(passing_dist, passing_r)
('weibull_min', [12.20521715428722, -9.947987307617899, 10.381897325624372])
After finding the best-fit distribution for the failing tests, we look up the parameters of that same distribution fitted to the passing tests.
# Find the corresponding passing test distribution parameters for the
# best fit distribution identified from the failing test above
parameters_passing = passing_dist[passing_dist["Distribution Names"] == best_dist][
    "Parameters"
].values
parameters_passing = list(itertools.chain(*parameters_passing))
# Standardize the features by removing the mean and scaling to unit variance
y_std_failing, len_y_failing, y_failing = standardize(
    failures_test, "test_duration", 0.99, 0.01
)
# Standardize the features by removing the mean and scaling to unit variance
y_std_passing, len_y_passing, y_passing = standardize(
    passing_test, "test_duration", 0.99, 0.01
)
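`standardize` is another helper from the companion notebook. Judging from its return values and the `0.99`/`0.01` arguments, it appears to trim the durations to an inter-quantile range and z-score them; the sketch below is a guess at that behavior under those assumptions, not the helper's actual code:

```python
import numpy as np
import pandas as pd


def standardize(df, column, upper_q, lower_q):
    """Clip to the [lower_q, upper_q] quantile range, then z-score the values.

    Returns (standardized values, sample count, sorted raw values),
    mirroring the helper's (y_std, len_y, y) return shape.
    """
    y = df[column].to_numpy()
    lo, hi = np.quantile(y, [lower_q, upper_q])
    y = np.sort(y[(y >= lo) & (y <= hi)])  # drop extreme outliers
    y_std = (y - y.mean()) / y.std()
    return y_std, len(y), y


df = pd.DataFrame({"test_duration": [90.0, 95.0, 100.0, 105.0, 110.0]})
y_std, n, y = standardize(df, "test_duration", 0.99, 0.01)
```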
## Optimal Stopping Point Calculation
Let's now find the optimal stopping point for the test by passing in the best distribution name, the failing and passing distributions, and the corresponding distribution parameters.
osp = optimal_stopping_point(
    best_dist,
    y_std_failing,
    y_failing,
    parameters_failing,
    y_std_passing,
    y_passing,
    parameters_passing,
)
# Optimal Stopping Point for `operator.Run multi-stage test e2e-aws-upgrade
# - e2e-aws-upgrade-openshift-e2e-test container test`
osp
104.3979969544608
This tells us that the optimal stopping point for this test is a run length of roughly 104.4 seconds.
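`optimal_stopping_point` encapsulates the comparison between the two fitted distributions. Conceptually (a sketch of one plausible definition, not the helper's exact method), the stopping point can be taken as the duration past the passing-time mode where the fitted failing-duration density first exceeds the passing-duration density; the distribution parameters below are hypothetical, chosen only for illustration:

```python
import numpy as np
from scipy.stats import weibull_min, norm

# Hypothetical fitted distributions (parameters invented for illustration)
failing_dist = weibull_min(1.5, loc=60, scale=80)
passing_dist = norm(loc=95, scale=8)

durations = np.linspace(60, 200, 2000)
fail_pdf = failing_dist.pdf(durations)
pass_pdf = passing_dist.pdf(durations)

# First duration after the passing mode where failure becomes more likely
mode = durations[np.argmax(pass_pdf)]
after_mode = durations > mode
crossing = durations[after_mode][fail_pdf[after_mode] > pass_pdf[after_mode]]
osp_sketch = crossing[0] if crossing.size else None
```

With these made-up parameters the crossing lands a little past the passing mode, which matches the intuition that a run lasting well beyond typical passing durations is more likely headed for failure.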
## Conclusion
In this notebook we were able to:

- Fetch the data for all passing and failing tests
- Filter the data for the test `operator.Run multi-stage test e2e-aws-upgrade - e2e-aws-upgrade-openshift-e2e-test container test`
- Find the best-fit distribution for the test
- Find the optimal stopping point for the test