Compute Worker - Fix submission duplication during ingestion #2303
ihsaan-ullah wants to merge 6 commits into develop
Please check the function _create_container, where we create the container using the following code:

    container = client.create_container(
        self.container_image,
        name=container_name,
        host_config=host_config,
        detach=False,
        volumes=volumes_host,
        command=command,
        working_dir="/app/program",
        environment=[
            "PYTHONUNBUFFERED=1",
            "http_proxy=" + Settings.COMPETITION_CONTAINER_HTTP_PROXY,
            "https_proxy=" + Settings.COMPETITION_CONTAINER_HTTPS_PROXY,
        ],
        network_disabled=Settings.COMPETITION_CONTAINER_NETWORK_DISABLED,
    )

the line to check is
NOTE: this is not something I introduced, but clarification would be useful.
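To make the arguments in the call above easier to inspect without a Docker daemon, here is a minimal sketch that collects them into a plain dictionary. The `Settings` dataclass, the `build_container_kwargs` helper, and all sample values are hypothetical stand-ins, not the worker's actual code; only the keyword names and the hardcoded `working_dir` come from the quoted snippet.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    # Hypothetical stand-in for the worker's Settings object; values are made up.
    COMPETITION_CONTAINER_HTTP_PROXY: str = ""
    COMPETITION_CONTAINER_HTTPS_PROXY: str = ""
    COMPETITION_CONTAINER_NETWORK_DISABLED: bool = False


def build_container_kwargs(settings, container_name, command, volumes_host, host_config=None):
    """Collect the keyword arguments used in the quoted create_container call."""
    return {
        "name": container_name,
        "host_config": host_config,
        "detach": False,
        "volumes": volumes_host,
        "command": command,
        # In the quoted call the working directory is hardcoded.
        "working_dir": "/app/program",
        "environment": [
            "PYTHONUNBUFFERED=1",
            "http_proxy=" + settings.COMPETITION_CONTAINER_HTTP_PROXY,
            "https_proxy=" + settings.COMPETITION_CONTAINER_HTTPS_PROXY,
        ],
        "network_disabled": settings.COMPETITION_CONTAINER_NETWORK_DISABLED,
    }


# Sample values only; none of these come from the PR.
kwargs = build_container_kwargs(
    Settings(COMPETITION_CONTAINER_HTTP_PROXY="http://proxy.example:3128"),
    container_name="ingestion-demo",
    command="python3 /app/program/ingestion.py",
    volumes_host=["/app/program", "/app/output"],
)
```

With this helper, `working_dir` is the same for every container regardless of the command it runs, which is the kind of detail the question above is about.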
Hello @ihsaan-ullah, I need to check in more depth, but from what I recall this folder is shared between scoring and ingestion.
#2294 is merged, we can rebase this PR.
rebasing
…stion and making submission available during scoring. Also copying submission files to ingestion predictions i.e. /app/input/res to make sure already existing competitions do not break rebased
8168952 to 6a1733d
How can we make sure that all compute workers run this code? Do we have any mechanism to force people to update their compute workers?
For v1.25 we are asking organizers to upgrade their workers. That is indeed a bit fragile.
@Didayolo this PR is ready for testing. One point to check and maybe fix/clarify in the code: I feel that there is an inconsistency in the compute_worker code when we use

    def replace_legacy_metadata_command(
        command, kind, is_scoring, ingestion_only_during_scoring=False
    ):
        vars_to_replace = [
            ("$input", "/app/input_data" if kind == "ingestion" else "/app/input"),
            ("$output", "/app/output"),
            (
                "$program",
                "/app/ingestion_program"
                if ingestion_only_during_scoring and is_scoring
                else "/app/program",
            ),

    ingestion_program_dir = os.path.join(self.root_dir, "ingestion_program")

I have added this point to the meeting agenda to discuss with Obada and others too.
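To see which container paths the quoted mapping produces, here is a minimal, self-contained sketch. Only the three placeholder entries come from the snippet above; the function name `replace_placeholders`, the plain `str.replace` loop, and the sample commands are assumptions, since the rest of the real function is not shown in this thread.

```python
def replace_placeholders(command, kind, is_scoring, ingestion_only_during_scoring=False):
    """Substitute the three placeholders quoted above in a command string."""
    vars_to_replace = [
        ("$input", "/app/input_data" if kind == "ingestion" else "/app/input"),
        ("$output", "/app/output"),
        (
            "$program",
            "/app/ingestion_program"
            if ingestion_only_during_scoring and is_scoring
            else "/app/program",
        ),
    ]
    # Assumption: one plain string replacement per placeholder.
    for placeholder, path in vars_to_replace:
        command = command.replace(placeholder, path)
    return command


# Hypothetical commands, for illustration only.
ingestion_cmd = replace_placeholders(
    "python3 $program/ingestion.py $input $output",
    kind="ingestion",
    is_scoring=False,
)
ingestion_during_scoring_cmd = replace_placeholders(
    "python3 $program/ingestion.py $input $output",
    kind="ingestion",
    is_scoring=True,
    ingestion_only_during_scoring=True,
)
scoring_cmd = replace_placeholders(
    "python3 $program/score.py $input $output",
    kind="scoring",
    is_scoring=True,
)
```

Under these assumptions, an ingestion command resolves `$program` to `/app/program` unless `ingestion_only_during_scoring` and `is_scoring` are both true, in which case it resolves to `/app/ingestion_program`. The host-side directory in the second quoted line is named `ingestion_program` in every case, which may be the mismatch raised above.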
Description
This PR updates the compute worker to avoid duplicating the submission during ingestion and to make the submission available during scoring. It also copies the submission files to the ingestion predictions directory, i.e. /app/input/res, so that existing competitions do not break.
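The backward-compatibility copy described above could look roughly like the following sketch. It is not the PR's implementation: the helper name `copy_submission_to_predictions`, the host-side directory layout, and the sample file are assumptions; only the `input/res` target mirrors the `/app/input/res` path mentioned in the description.

```python
import shutil
import tempfile
from pathlib import Path


def copy_submission_to_predictions(submission_dir: Path, input_dir: Path) -> Path:
    """Copy the submission files into <input_dir>/res and return that directory."""
    res_dir = input_dir / "res"
    # dirs_exist_ok (Python 3.8+) lets the copy merge into an existing res directory.
    shutil.copytree(submission_dir, res_dir, dirs_exist_ok=True)
    return res_dir


# Demonstration in a temporary directory with a made-up submission file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    submission = root / "submission"
    submission.mkdir()
    (submission / "predictions.txt").write_text("0.5\n")

    res = copy_submission_to_predictions(submission, root / "input")
    res_rel = res.relative_to(root).as_posix()
    copied = sorted(p.name for p in res.iterdir())
    original_still_present = (submission / "predictions.txt").exists()
```

Copying rather than moving keeps the original submission directory intact, so a scoring step that reads the submission directly and an older competition that reads from `input/res` would both find the files.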
Issue fixed
Checklist