This teacher-student approach applies to most model architectures. I have used a 5-layer CNN as the student to show explicitly how knowledge distillation can improve a model's accuracy even when the model itself is subpar. The student model is in fact not great, yet it still reaches an accuracy of 53%. I will be adding more examples and improving the student model for a comparative study.
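For readers new to the idea, here is a minimal sketch of a distillation loss for a single example, written in plain Python so it runs without any framework. It assumes the common formulation that mixes a temperature-softened KL term (teacher vs. student) with ordinary cross-entropy on the true label; the exact loss used in this repository may differ. The names `log_softmax` and `distillation_loss` and the default `temperature` and `alpha` values are illustrative assumptions, not taken from the code here.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum for z in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term.

    temperature and alpha are assumed hyperparameters, chosen for illustration.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))

    # Standard cross-entropy of the student against the true label (T = 1).
    ce = -log_softmax(student_logits)[label]

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # made-up logits from a stronger model
    student = [0.8, 0.9, -0.2]   # made-up logits from a small CNN
    print(distillation_loss(student, teacher, label=0))
```

Setting `alpha` closer to 1 makes the student lean more on the teacher's soft predictions, while `alpha = 0` reduces the loss to plain cross-entropy training.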
This repository is a work in progress and will continue to track such examples.