Architecture Proposal: LookThem V5 (Ultra-Lightweight 1.4M Parameter Model) for Tiny-ImageNet Benchmark #15

@theJuniorProgrammer3

Hi everyone,
I would like to propose LookThem V5, a custom lightweight architecture, for inclusion or benchmarking in this repository.
LookThem V5 uses a custom ratio-based attention mechanism in place of the traditional QKV structure. Trained independently from scratch for only 40 epochs on Google Colab, it reached 35.46% validation accuracy with a very small model (around 1.4M parameters, 5.38 MB).
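
As a rough sanity check on the size figures above, here is a minimal sketch that converts between parameter count and checkpoint size. It assumes the weights are stored as 32-bit floats and that "MB" means 2^20 bytes; neither is stated anywhere in the post, and both helper names are hypothetical.

```python
def model_size_mib(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate size of the raw weights in MiB (assumes float32 by default)."""
    return num_params * bytes_per_param / (1024 ** 2)


def params_from_size_mib(size_mib: float, bytes_per_param: int = 4) -> int:
    """Inverse estimate: how many parameters fit in a file of the given size."""
    return round(size_mib * (1024 ** 2) / bytes_per_param)


if __name__ == "__main__":
    # Figures quoted in the post: around 1.4M parameters, 5.38 MB.
    print(f"1.4M params at float32: {model_size_mib(1_400_000):.2f} MiB")
    print(f"5.38 MiB at float32: about {params_from_size_mib(5.38):,} params")
```

Under those assumptions the two numbers agree to within rounding, which is what "around 1.4M parameters" suggests. A real checkpoint may also contain optimizer state or metadata, so the file on disk can be larger than this estimate.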
Since I have limited GPU resources, I cannot run the full 200-epoch cyclic learning rate protocol that this repository requires, so I would be very grateful if anyone with better hardware could benchmark the model according to the official procedure here.
All the code, notebooks, and training logs are available in my Hugging Face repository:
👉 https://huggingface.co/ASomeoneWhoInterestedWithAI/LookThem_Tiny-ImageNet
How to integrate and benchmark the code:

  1. Open the notebook named LookThemV5ContinueTrain.ipynb in the Hugging Face repo.
  2. Extract the model architecture code (LookThemLayer, etc.).
  3. To train completely from scratch, remove the section that imports and loads the model checkpoint.
  4. Reset the starting epoch to 0, and change the training loop from 40 to 200 epochs to match this repository's benchmark procedure.
  5. Feel free to adjust the rest of the pipeline configuration (such as the data loader or training script) to fit your environment.
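
Steps 3 and 4 can be sketched as a small configuration plus an epoch schedule. This is only an illustration and is not taken from the notebook: `TrainConfig`, `triangular_lr`, and `planned_epochs` are hypothetical names, and the learning-rate bounds and cycle length are placeholder values, since the repository's actual cyclic schedule should be used instead.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TrainConfig:
    start_epoch: int = 0                      # step 4: start again from epoch 0
    total_epochs: int = 200                   # step 4: 200 epochs instead of 40
    resume_checkpoint: Optional[str] = None   # step 3: no checkpoint is loaded
    base_lr: float = 1e-4                     # placeholder, not the repo's value
    max_lr: float = 1e-3                      # placeholder, not the repo's value
    cycle_epochs: int = 20                    # placeholder cycle length


def triangular_lr(epoch: int, cfg: TrainConfig) -> float:
    """Triangular cyclic schedule: linear ramp up, then linear ramp down."""
    half = cfg.cycle_epochs / 2
    pos = epoch % cfg.cycle_epochs
    frac = pos / half if pos <= half else (cfg.cycle_epochs - pos) / half
    return cfg.base_lr + (cfg.max_lr - cfg.base_lr) * frac


def planned_epochs(cfg: TrainConfig) -> List[Tuple[int, float]]:
    """List every (epoch, learning rate) pair the training loop would visit."""
    return [(e, triangular_lr(e, cfg))
            for e in range(cfg.start_epoch, cfg.total_epochs)]


if __name__ == "__main__":
    cfg = TrainConfig()
    assert cfg.resume_checkpoint is None, "from-scratch run must not resume"
    for epoch, lr in planned_epochs(cfg)[:3]:
        print(f"epoch {epoch}: lr={lr:.6f}")
```

In the real notebook, the loop body would call the existing training and validation code for each epoch; only the epoch range, the checkpoint loading, and the learning-rate schedule need to change.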

Thank you very much! I would love to see how this architecture performs under your full 200-epoch protocol.
