Section 5.4 of the paper says that agents with access to an OpenAI API key for evaluation are told: "IMPORTANT: You are NOT allowed to use the OpenAI API for anything but this evaluation script." However, this rule is not explicitly stated in the instructions given to the agents or the judge.
Are there other general rules that should apply to synthetic data generation? For example, are all closed-source LLM providers, or teacher models larger than the model being post-trained, off the table?