Bug Report
operator-sdk v.1.14.0
scorecard-test v1.14.0
I'm from the DCI team. As part of our daily cert suite, we regularly run operator-sdk scorecard --selector=test=basic-check-spec-test for two operators, simple-demo-operator and testpmd-operator, 10 tests per day.
Normally, basic-check-spec-test should be green for both. However, in 20% of cases it fails with the timeout error "error running tests context deadline exceeded". Increasing the timeout to 300s with --wait-time doesn't help. The failures always hit both operators in a row, which looks like some internal API being unavailable for 10-20 minutes.
Could you please add more detail to the logs about where exactly the timeout happened?
What did you do?
operator-sdk scorecard \
--output json \
--selector=test=basic-check-spec-test \
--kubeconfig {{ kubeconfig_path }} \
--namespace scorecard-testing \
--service-account default \
--config {{ scorecard_config_path }} \
--verbose \
--wait-time 300s \
{{ scorecard_operator_dir }}
What did you expect to see?
The results of basic-check-spec-test should be stable. When a timeout does occur, it would be nice to have logs that identify its cause.
What did you see instead? Under which circumstances?
In 20% of cases the run times out with "error running tests context deadline exceeded" and no detailed logs.
Environment
Operator type:
- name: "testpmd-operator"
  version: "v0.2.9"
  image: "quay.io/rh-nfv-int/testpmd-operator-bundle@sha256:5e28f883faacefa847104ebba1a1a22ee897b7576f0af6b8253c68b5c8f42815"
  index_image: "quay.io/tkrishtop/index-testpmd-operator-bundle:v0.2.9"
- name: "simple-demo-operator"
  version: "v0.0.3"
  image: "quay.io/opdev/simple-demo-operator-bundle@sha256:eff7f86a54ef2a340dbf739ef955ab50397bef70f26147ed999e989cfc116b79"
  index_image: "quay.io/opdev/simple-demo-operator-catalog:v0.0.3"
Kubernetes cluster type:
Happens randomly on the latest stable releases of OCP 4.7, 4.8, 4.9, and 4.10.
$ operator-sdk version
operator-sdk v.1.14.0
scorecard-test v1.14.0
Possible Solution
It would be nice to have more detailed logs that identify the cause of this timeout.
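Until such logs exist, one possible stopgap is to wrap the scorecard call so that cluster state is captured whenever the run fails. The following is only a minimal sketch: the function names run_with_diagnostics and collect_diagnostics and the KUBECTL/NAMESPACE variables are hypothetical, and it assumes the scorecard command exits non-zero when it hits "context deadline exceeded". It shows pod and event state in the test namespace, not the internal location of the timeout.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command and, if it fails, dump pod and event
# state from the scorecard namespace to stderr for later correlation.

NAMESPACE="${NAMESPACE:-scorecard-testing}"  # namespace used in this report
KUBECTL="${KUBECTL:-kubectl}"                # assumption: kubectl (or oc) is on PATH

collect_diagnostics() {
  echo "=== pods in ${NAMESPACE} ==="
  "${KUBECTL}" get pods -n "${NAMESPACE}" -o wide
  echo "=== events in ${NAMESPACE}, oldest first ==="
  "${KUBECTL}" get events -n "${NAMESPACE}" --sort-by=.lastTimestamp
}

run_with_diagnostics() {
  local start rc=0
  start=$(date +%s)
  # Run the wrapped command; keep its exit code without aborting under `set -e`.
  "$@" || rc=$?
  if [ "${rc}" -ne 0 ]; then
    echo "command failed with exit code ${rc} after $(( $(date +%s) - start ))s" >&2
    # Diagnostics are best-effort; never mask the original exit code.
    collect_diagnostics >&2 || true
  fi
  return "${rc}"
}
```

It would be called by prefixing the existing command, for example run_with_diagnostics operator-sdk scorecard --selector=test=basic-check-spec-test ... with the remaining arguments unchanged. The wrapped command's stdout is left untouched, so the JSON output can still be parsed as before.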