When running the command ```dicee --sparql_endpoint "https://dbpedia.data.dice-research.org/sparql" --trainer PL --model "DeCaL" --num_epochs 10 --batch_size 32 --p 1 --q 1 --r 1 --embedding_dim 16 --scoring_technique KvsAll --eval_model None --optim Adam --lr 0.01 --num_core 32 --backend polars --path_to_store_single_run "DBpedia-Embs" --save_embeddings_as_csv```, the size of ```self.trainer.dataset.entity_to_idx``` decreases when it is "initialized the second time due to 2 GPUs". This causes ```target_dim``` in KvsAll to change and leads to a size mismatch error in the loss function: the model outputs have the same size as the initial value of ```len(self.trainer.dataset.entity_to_idx)```, while the targets take the new value of ```len(self.trainer.dataset.entity_to_idx)``` obtained in the second GPU process.
Is there a way to make sure that every initialization involving the dataset is done only once, independently of the number of GPUs?
Note: I tried to fix the issue, but I am not sure I will have enough time. Debugging has already taken me 5 hours :)
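One possible direction for the "initialize only once" question, as a minimal sketch rather than a patch against dicee: build the entity index a single time, store it on disk, and let any later process load the stored copy instead of rebuilding it. The helper names `load_or_build_index` and `index_entities` and the cache file are hypothetical and not part of dicee. The sketch also assumes the first initialization finishes before the second process starts. With the PL trainer, the `prepare_data` hook of a `LightningDataModule` is the place intended for such once-only work.

```python
import json
import os
import tempfile
from typing import Callable, Dict, Iterable, Tuple


def load_or_build_index(cache_path: str,
                        build_fn: Callable[[], Dict[str, int]]) -> Dict[str, int]:
    """Return the mapping stored at cache_path; build and save it only if missing.

    Hypothetical helper, not a dicee function.
    """
    if os.path.exists(cache_path):
        # A previous initialization already produced the index: reuse it.
        with open(cache_path, "r", encoding="utf-8") as f:
            return json.load(f)

    mapping = build_fn()

    # Write to a temporary file and rename it, so a reader never sees a partial file.
    directory = os.path.dirname(os.path.abspath(cache_path))
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(mapping, f)
    os.replace(tmp_path, cache_path)
    return mapping


def index_entities(triples: Iterable[Tuple[str, str, str]]) -> Dict[str, int]:
    """Assign indices to head and tail entities in sorted order (deterministic)."""
    entities = set()
    for head, _relation, tail in triples:
        entities.add(head)
        entities.add(tail)
    return {entity: idx for idx, entity in enumerate(sorted(entities))}
```

With this pattern, a second initialization that would otherwise see fewer entities still gets the mapping produced by the first one, so ```len(entity_to_idx)``` and therefore ```target_dim``` stay the same in every process.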