Optimal Stopping Point for CI Tests#

One of the machine learning explorations within the OpenShift CI Analysis project is predicting the optimal stopping point for CI tests based on their test durations (runtimes) (see this issue for more details). In a previous notebook we showed how to access the TestGrid data and performed initial data analysis as well as feature engineering on it. We also calculated the optimal stopping point by identifying the distribution of the test_duration values for different CI tests and comparing the distributions of passing and failing tests.

In this notebook, we will detect the optimal stopping point for different CI tests taken as inputs.
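As a rough intuition for the approach: once we can describe passing and failing test durations with fitted probability distributions, the stopping point is the duration beyond which a still-running test looks more like a failure than a pass. The toy example below only illustrates that crossover idea on synthetic data with normal distributions; it is not the procedure used later in this notebook.

# Toy illustration (synthetic data, normal distributions only): find the duration
# at which the fitted "failing" density overtakes the fitted "passing" density.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
passing_durations = rng.normal(loc=95, scale=5, size=500)    # hypothetical passing runtimes
failing_durations = rng.normal(loc=110, scale=15, size=500)  # hypothetical failing runtimes

pass_params = norm.fit(passing_durations)
fail_params = norm.fit(failing_durations)

# Scan durations to the right of the typical passing runtime for the crossover
durations = np.linspace(95, 150, 1000)
crossover = durations[norm.pdf(durations, *fail_params) > norm.pdf(durations, *pass_params)][0]
print(f"Toy stopping point: {crossover:.1f}")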

## Import libraries
import os
import gzip
import json
import datetime
import itertools
import scipy  # noqa F401
from scipy.stats import (  # noqa F401
    invgauss,
    lognorm,
    pearson3,
    weibull_min,
    triang,
    beta,
    norm,
    weibull_max,
    uniform,
    gamma,
    expon,
)

from ipynb.fs.defs.osp_helper_functions import (
    CephCommunication,
    fit_distribution,
    standardize,
    filter_test_type,
    fetch_all_tests,
    best_distribution,
    optimal_stopping_point,
)
import warnings

warnings.filterwarnings("ignore")

Ceph#

Connection to Ceph for importing the TestGrid data

## Specify variables
METRIC_NAME = "time_to_fail"

# Specify the path for input grid data
INPUT_DATA_PATH = "../../data/raw/testgrid_258.json.gz"

# Specify the path for output metric data
OUTPUT_DATA_PATH = f"../../../../data/processed/metrics/{METRIC_NAME}"

## CEPH Bucket variables
## Create a .env file on your local with the correct configs
s3_endpoint_url = os.getenv("S3_ENDPOINT")
s3_access_key = os.getenv("S3_ACCESS_KEY")
s3_secret_key = os.getenv("S3_SECRET_KEY")
s3_bucket = os.getenv("S3_BUCKET")
s3_path = os.getenv("S3_PROJECT_KEY", "metrics")
s3_input_data_path = "raw_data"

# Specify whether we are running this as a notebook or as part of an automation pipeline.
AUTOMATION = os.getenv("IN_AUTOMATION")
## Import data
timestamp = datetime.datetime.today()

if AUTOMATION:
    filename = f"testgrid_{timestamp.day}{timestamp.month}.json"
    cc = CephCommunication(s3_endpoint_url, s3_access_key, s3_secret_key, s3_bucket)
    s3_object = cc.s3_resource.Object(s3_bucket, f"{s3_input_data_path}/{filename}")
    file_content = s3_object.get()["Body"].read().decode("utf-8")
    testgrid_data = json.loads(file_content)

else:
    with gzip.open(INPUT_DATA_PATH, "rb") as read_file:
        testgrid_data = json.load(read_file)
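
Before moving on, it can help to sanity-check the load. Assuming the TestGrid JSON has the same layout as in the previous notebook, with dashboard names as its top-level keys, a quick peek could look like this:

# Illustrative sanity check (assumes the top-level keys are dashboard names)
print(f"Number of dashboards loaded: {len(testgrid_data)}")
print("Sample dashboard names:", list(testgrid_data)[:3])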

Fetch all tests#

Using the function fetch_all_tests, we will fetch all passing and failing tests into two dataframes.

# Fetch all failing tests, i.e. those which have a status code of 12
failures_df = fetch_all_tests(testgrid_data, 12)
failures_df.head()
|    | timestamp           | tab                         | grid                                              | test                                              | test_duration | failure/passing |
|----|---------------------|-----------------------------|---------------------------------------------------|---------------------------------------------------|---------------|-----------------|
| 8  | 2021-08-16 23:03:14 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall                                           | 20.016667     | True            |
| 10 | 2021-08-16 00:01:05 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall                                           | 108.233333    | True            |
| 22 | 2021-08-16 23:03:14 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | operator.Run multi-stage test e2e-metal-assist... | 13.166667     | True            |
| 24 | 2021-08-16 00:01:05 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | operator.Run multi-stage test e2e-metal-assist... | 89.983333     | True            |
| 38 | 2021-08-16 00:01:05 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | TestInstall_test_install.start_install_and_wai... | 60.004001     | True            |
# Fetch all passing tests, i.e. those which have a status code of 1
passing_df = fetch_all_tests(testgrid_data, 1)
passing_df.head()
|   | timestamp           | tab                         | grid                                              | test    | test_duration | failure/passing |
|---|---------------------|-----------------------------|---------------------------------------------------|---------|---------------|-----------------|
| 1 | 2021-08-23 00:01:04 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall | 95.300000     | True            |
| 2 | 2021-08-22 08:53:17 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall | 101.800000    | True            |
| 3 | 2021-08-20 23:21:32 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall | 134.833333    | True            |
| 4 | 2021-08-20 15:57:36 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall | 109.833333    | True            |
| 5 | 2021-08-20 06:47:40 | "redhat-assisted-installer" | periodic-ci-openshift-release-master-nightly-4... | Overall | 94.800000     | True            |
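
Since both dataframes carry a test_duration column, a quick way to compare the two populations before any modeling is to look at their summary statistics:

# Compare runtime summary statistics of failing vs. passing tests
print("Failing tests:")
print(failures_df["test_duration"].describe())
print("Passing tests:")
print(passing_df["test_duration"].describe())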

Filter tests#

After collecting the data for all passing and failing tests, we narrow down to a single test for which we want to calculate the optimal stopping point. We will use the test operator.Run multi-stage test e2e-aws-upgrade - e2e-aws-upgrade-openshift-e2e-test container test and extract the data for this test.
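
filter_test_type is defined in the helper notebook; conceptually it narrows a dataframe down to the rows of a single test. A minimal pandas equivalent (illustrative only, not necessarily the helper's exact implementation) would be:

# Illustrative equivalent of filtering the dataframe down to one test name
TEST_NAME = (
    "operator.Run multi-stage test e2e-aws-upgrade"
    " - e2e-aws-upgrade-openshift-e2e-test container test"
)
failures_subset = failures_df[failures_df["test"] == TEST_NAME].reset_index(drop=True)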

failures_test = filter_test_type(
    failures_df,
    "operator.Run multi-stage test e2e-aws-upgrade - e2e-aws-upgrade-openshift-e2e-test container test",
)
failures_test.head()
|   | timestamp           | tab                          | grid                                            | test                                              | test_duration | failure/passing |
|---|---------------------|------------------------------|-------------------------------------------------|---------------------------------------------------|---------------|-----------------|
| 0 | 2021-08-25 12:17:53 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 85.866667     | True            |
| 1 | 2021-08-25 10:30:05 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 91.916667     | True            |
| 2 | 2021-08-25 04:41:24 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 101.133333    | True            |
| 3 | 2021-08-24 20:03:02 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 98.450000     | True            |
| 4 | 2021-08-24 04:35:23 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 93.216667     | True            |
passing_test = filter_test_type(
    passing_df,
    "operator.Run multi-stage test e2e-aws-upgrade - e2e-aws-upgrade-openshift-e2e-test container test",
)
passing_test.head()
|   | timestamp           | tab                          | grid                                            | test                                              | test_duration | failure/passing |
|---|---------------------|------------------------------|-------------------------------------------------|---------------------------------------------------|---------------|-----------------|
| 0 | 2021-08-25 13:06:02 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 101.250000    | True            |
| 1 | 2021-08-25 07:15:39 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 94.283333     | True            |
| 2 | 2021-08-25 06:08:52 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 90.316667     | True            |
| 3 | 2021-08-25 02:54:53 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 93.866667     | True            |
| 4 | 2021-08-24 22:40:00 | "redhat-openshift-informing" | release-openshift-okd-installer-e2e-aws-upgrade | operator.Run multi-stage test e2e-aws-upgrade ... | 92.900000     | True            |

Fit Distribution#

After extracting the data for one test, we want to find the distribution that best fits its test durations, which we then use for the optimal stopping point calculation. For each candidate distribution, we compute the chi-square statistic and p-value and rank the candidates to identify the best fit.
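
As background, a chi-square goodness-of-fit comparison bins the observed durations, derives the expected count in each bin from a fitted candidate distribution, and measures how far the observed counts deviate from the expected ones; smaller chi-square values indicate a better fit. The sketch below illustrates that idea only and is not the fit_distribution helper itself (whose binning and the meaning of the 0.99/0.01 arguments are defined in the helper notebook):

# Minimal chi-square goodness-of-fit sketch (illustrative, not fit_distribution itself)
import numpy as np
from scipy import stats


def chi_square_fit(data, dist, bins=50):
    """Fit `dist` to `data` and return a chi-square statistic and p-value."""
    params = dist.fit(data)
    observed, edges = np.histogram(data, bins=bins)
    # Expected counts per bin from the fitted CDF
    expected = len(data) * np.diff(dist.cdf(edges, *params))
    mask = expected > 0  # skip empty expected bins
    chi_square = np.sum((observed[mask] - expected[mask]) ** 2 / expected[mask])
    p_value = stats.chi2.sf(chi_square, df=mask.sum() - 1 - len(params))
    return chi_square, p_value


# Rank a few candidate distributions for the failing test durations
sample = failures_test["test_duration"].values
for name, dist in [("norm", stats.norm), ("lognorm", stats.lognorm), ("weibull_min", stats.weibull_min)]:
    print(name, chi_square_fit(sample, dist))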

failure_dist, failures_r = fit_distribution(failures_test, "test_duration", 0.99, 0.01)
Distributions listed by Betterment of fit:
............................................
   Distribution     chi_square and p-value
3          beta  (2148.0315961744586, 0.0)
9      pearson3   (2150.964892187448, 0.0)
1          norm   (2178.439189095538, 0.0)
8       lognorm   (2190.171386750302, 0.0)
6         gamma  (2251.5768352345144, 0.0)
0   weibull_min  (2335.2881528000057, 0.0)
2   weibull_max  (2436.7340969950874, 0.0)
4      invgauss  (2581.7529201615253, 0.0)
10       triang   (3168.817214371956, 0.0)
5       uniform  (5205.7686822999685, 0.0)
7         expon   (7308.400793415922, 0.0)
# Identify the best fit distribution from the failing test along with its corresponding distribution parameters
best_dist, parameters_failing = best_distribution(failure_dist, failures_r)
# Identify the distributions for the passing test along with its corresponding distribution parameters
passing_dist, passing_r = fit_distribution(passing_test, "test_duration", 0.99, 0.01)
Distributions listed by Betterment of fit:
............................................
   Distribution                         chi_square and p-value
10       triang     (461.9624452114939, 4.799796517458444e-69)
3          beta    (619.2886679176573, 2.716009412709153e-100)
2   weibull_max   (782.1495727872282, 2.4780499811803997e-133)
5       uniform   (800.7205543332755, 3.9377128547833523e-137)
9      pearson3     (902.4903827437414, 4.87937692523532e-158)
0   weibull_min   (961.9366558978377, 2.6033191498811774e-170)
6         gamma   (1025.0253918219983, 2.234474698949537e-183)
8       lognorm  (1063.4355506988115, 2.3726995807065007e-191)
1          norm    (1066.204889931689, 6.306900543032179e-192)
4      invgauss     (1076.96978515332, 3.650894820526847e-194)
7         expon                      (2457.1474484587093, 0.0)
passing_r.head()
|    | Distribution | chi_square and p-value                       |
|----|--------------|----------------------------------------------|
| 10 | triang       | (461.9624452114939, 4.799796517458444e-69)   |
| 3  | beta         | (619.2886679176573, 2.716009412709153e-100)  |
| 2  | weibull_max  | (782.1495727872282, 2.4780499811803997e-133) |
| 5  | uniform      | (800.7205543332755, 3.9377128547833523e-137) |
| 9  | pearson3     | (902.4903827437414, 4.87937692523532e-158)   |
# Identify the best fit distribution from the passing test
best_distribution(passing_dist, passing_r)
('weibull_min', [12.20521715428722, -9.947987307617899, 10.381897325624372])

After finding the best-fit distribution for the failing tests, we look up the parameters fitted for that same distribution on the passing tests.

# Find the corresponding passing test distribution parameters for the
# best fit distribution identified from the failing test above
parameters_passing = passing_dist[passing_dist["Distribution Names"] == best_dist][
    "Parameters"
].values
parameters_passing = list(itertools.chain(*parameters_passing))
# Standardize the features by removing the mean and scaling to unit variance
y_std_failing, len_y_failing, y_failing = standardize(
    failures_test, "test_duration", 0.99, 0.01
)
# Standardize the features by removing the mean and scaling to unit variance
y_std_passing, len_y_passing, y_passing = standardize(
    passing_test, "test_duration", 0.99, 0.01
)
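
For reference, "removing the mean and scaling to unit variance" is ordinary z-score standardization. The snippet below is a generic illustration of that step only; the standardize helper itself (and the meaning of its 0.99 and 0.01 arguments) is defined in the helper notebook.

# Generic z-score standardization (illustrative, not the standardize helper)
import numpy as np

y = failures_test["test_duration"].values
y_std = (y - np.mean(y)) / np.std(y)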

Optimal Stopping Point Calculation#

Let's now find the optimal stopping point for the test by passing in the name of the best-fit distribution, the standardized and raw failing and passing test durations, and the corresponding distribution parameters.

osp = optimal_stopping_point(
    best_dist,
    y_std_failing,
    y_failing,
    parameters_failing,
    y_std_passing,
    y_passing,
    parameters_passing,
)
# Optimal Stopping Point for `operator.Run multi-stage test e2e-aws-upgrade
# - e2e-aws-upgrade-openshift-e2e-test container test`
osp
104.3979969544608

This tells us that the optimal stopping point for this test is at a test duration of roughly 104.4 seconds.
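
One hypothetical way to act on this value in a CI setting is to flag (or abort) a run of this test once its elapsed duration passes the stopping point, since beyond that point the run is more likely to end in failure than in success. The helper below is purely illustrative and is not part of this project:

# Hypothetical usage of the optimal stopping point as a runtime threshold
def should_stop(elapsed_duration, stopping_point=osp):
    """Return True if a running test has exceeded the optimal stopping point."""
    return elapsed_duration > stopping_point


should_stop(90.0), should_stop(120.0)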

Conclusion#

In this notebook we were able to:

  • Fetch the data for all passing and failing tests

  • Filter the data for the test - operator.Run multi-stage test e2e-aws-upgrade - e2e-aws-upgrade-openshift-e2e-test container test

  • Find the best distribution for the test

  • Find the optimal stopping point for the test