Galaxy Classification with Deep Learning

This repository documents a deep learning project on galaxy image classification using the Galaxy10 SDSS dataset. The project focuses on two core tasks: implementing a custom convolutional neural network (CNN) and applying transfer learning with DenseNet121.

Both approaches were trained and evaluated on the same dataset within a shared workflow for preprocessing, augmentation, training, and model evaluation.

Project Overview

The repository contains the training pipeline, model evaluation, and selected result visualizations for a deep learning classification task based on galaxy images.

The workflow covers:

data loading and preprocessing
stratified train/validation/test splitting
image augmentation
training a custom CNN
transfer learning with DenseNet121
model evaluation using accuracy, loss curves, and confusion matrices

Dataset

This project was developed for the Galaxy10 SDSS dataset.

The dataset file Galaxy10.h5 is not included in this repository.
To run the project, download the dataset separately and place it in the repository's data/ directory:

data/Galaxy10.h5
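
As a rough illustration of the loading step, the sketch below reads the file with h5py. It assumes the HDF5 keys are named "images" and "ans"; check these against your download. The helper name load_galaxy10 is hypothetical and is not taken from the training script.

```python
from pathlib import Path

import h5py
import numpy as np


def load_galaxy10(path="data/Galaxy10.h5"):
    """Return (images, labels) as NumPy arrays read from the HDF5 file."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Dataset not found at {path}; download it first.")
    with h5py.File(path, "r") as f:
        # Assumed key names: "images" for pixel data, "ans" for class labels.
        images = np.array(f["images"])
        labels = np.array(f["ans"])
    return images, labels
```

Calling load_galaxy10() from the repository root should then return the image array and the matching integer labels.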

Repository Structure

galaxy-classification-deep-learning/
├── README.md
├── requirements.txt
├── .gitignore
├── data/
│   └── Galaxy10.h5
├── src/
│   └── galaxy_classifier_train.py
├── images/
└── models/

Installation

Create and activate a virtual environment, then install the required dependencies:

pip install -r requirements.txt

Requirements

The final version of this project does not require astroNN as an installation dependency.

A suitable requirements.txt for this repository is:

tensorflow>=2.13,<2.16
keras>=2.13,<3.0
h5py>=3.8,<4.0
numpy>=1.24,<2.0
scikit-learn>=1.3,<1.6
matplotlib>=3.7,<3.10

Usage

Set the model type in the training script:

MODEL_TYPE = "custom_cnn"   # Options: "custom_cnn" or "densenet121_transfer"

Then run:

python src/galaxy_classifier_train.py

The script will:

load the dataset
split the data into train, validation, and test sets
train the selected model
evaluate the trained model
save the trained model in models/
save result plots in images/
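
The splitting step could look roughly like the sketch below, which applies scikit-learn's train_test_split twice with stratification. The function name stratified_split, the 70/15/15 proportions, and the seed are assumptions for illustration; the actual script may use different values.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def stratified_split(images, labels, val_frac=0.15, test_frac=0.15, seed=42):
    """Split into train/val/test while keeping class proportions similar."""
    # First carve off the test set, stratified by label.
    x_rest, x_test, y_rest, y_test = train_test_split(
        images, labels, test_size=test_frac, stratify=labels, random_state=seed
    )
    # Then split the remainder; rescale so val_frac refers to the full dataset.
    x_train, x_val, y_train, y_val = train_test_split(
        x_rest,
        y_rest,
        test_size=val_frac / (1.0 - test_frac),
        stratify=y_rest,
        random_state=seed,
    )
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)
```

Note that stratification needs at least two samples per class at each split, so very small classes can make this step fail or leave only one or two examples in the validation and test sets.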

Implemented Approaches

1. Custom CNN

A convolutional neural network built from scratch using multiple Conv2D, MaxPooling2D, Dropout, and Dense layers.
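
For orientation, a network of this kind might be sketched in Keras as follows. The number of blocks, filter counts, dropout rate, and optimizer are illustrative assumptions, not the repository's actual architecture. The sketch also assumes 69x69 RGB inputs and 10 classes; adjust both if your copy of the dataset differs.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_custom_cnn(input_shape=(69, 69, 3), num_classes=10):
    """Small Conv2D/MaxPooling2D stack followed by Dropout and Dense layers."""
    model = tf.keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Rescaling(1.0 / 255),  # map 0..255 pixel values to 0..1
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dropout(0.5),  # illustrative rate
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    # Integer labels are assumed, hence the sparse loss.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```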

2. Transfer Learning with DenseNet121

A pretrained DenseNet121 backbone with ImageNet weights, used as a frozen feature extractor and extended with a final classification layer.
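
A frozen-backbone setup along these lines could be sketched as below. The function name, the global average pooling, the input shape, and the optimizer are assumptions made for illustration. The weights argument exists only so the model can be built without downloading ImageNet weights.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_densenet_transfer(input_shape=(69, 69, 3), num_classes=10,
                            weights="imagenet"):
    """DenseNet121 as a frozen feature extractor plus one softmax layer."""
    base = tf.keras.applications.DenseNet121(
        include_top=False,   # drop the original ImageNet classifier
        weights=weights,
        input_shape=input_shape,
        pooling="avg",       # assumed: global average pooling on the features
    )
    base.trainable = False   # keep the pretrained weights fixed

    # Pixel preprocessing (for example
    # tf.keras.applications.densenet.preprocess_input) is assumed to happen
    # in the data pipeline before images reach the model.
    inputs = tf.keras.Input(shape=input_shape)
    features = base(inputs, training=False)  # keep BatchNorm in inference mode
    outputs = layers.Dense(num_classes, activation="softmax")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

With the backbone frozen, only the final Dense layer's kernel and bias are updated during training.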

Data Augmentation

The final training pipeline uses geometric augmentation:

random flipping
random rotation
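
One way to express these two augmentations is with Keras preprocessing layers, as in the sketch below. The flip mode and rotation range are assumptions; the source does not state the values used in the final pipeline.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_augmentation(rotation_factor=0.25):
    """Random flips plus random rotation, active only when training=True."""
    return tf.keras.Sequential([
        layers.RandomFlip("horizontal_and_vertical"),
        # The factor is a fraction of a full turn: 0.25 allows up to +/- 90 degrees.
        layers.RandomRotation(rotation_factor),
    ])


augment = build_augmentation()
batch = tf.zeros((2, 69, 69, 3))
augmented = augment(batch, training=True)  # same shape as the input batch
```

Flips and rotations suit galaxy images because a galaxy's class does not depend on its orientation in the frame.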

Color-channel-based augmentation was also tried during development, but it was left out of the final pipeline because it did not provide a robust performance improvement on this dataset.

Evaluation

Model performance is evaluated using:

training and validation accuracy
training and validation loss
test set evaluation
normalized confusion matrix
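
As a small illustration of the last item, a row-normalized confusion matrix can be computed with scikit-learn as sketched below. The helper name and the default of 10 classes are assumptions; the script's own plotting code is not shown in the source.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def normalized_confusion(y_true, y_pred, num_classes=10):
    """Confusion matrix whose rows (true classes) each sum to 1."""
    return confusion_matrix(
        y_true,
        y_pred,
        labels=np.arange(num_classes),  # keep a fixed class order
        normalize="true",               # divide each row by its true count
    )


# Toy example with two classes: one of the two class-0 samples is misclassified.
cm = normalized_confusion([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Row normalization makes the diagonal read as per-class recall, which is easier to compare across classes of very different sizes than raw counts.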

Notes

This repository is intended as a portfolio project and demonstrates a deep learning workflow for image classification rather than a deployment-ready end-user application.
The dataset is strongly imbalanced across classes.
Extremely small classes make some balancing strategies less useful in practice.
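
One possible way to see the issue with very small classes is to compute inverse-frequency class weights. The sketch below uses made-up class counts, not the real Galaxy10 distribution, and the helper name is hypothetical.

```python
import numpy as np


def balanced_class_weights(labels, num_classes):
    """Inverse-frequency weights: N / (K * n_c) for each class c."""
    labels = np.asarray(labels, dtype=int)
    counts = np.bincount(labels, minlength=num_classes)
    # Guard against empty classes to avoid division by zero.
    weights = len(labels) / (num_classes * np.maximum(counts, 1))
    return dict(enumerate(weights.tolist()))


# Hypothetical counts: two large classes and one with only 20 samples.
toy_labels = np.repeat([0, 1, 2], [1000, 980, 20])
weights = balanced_class_weights(toy_labels, num_classes=3)
```

In this toy case the smallest class receives a weight roughly 50 times larger than the others, so a handful of images would dominate the loss. That is one plausible reason such strategies help less when a class is extremely small.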

Acknowledgments

This project builds on the Galaxy10 dataset resources published by Henry Leung (henrysky). The Galaxy10 repository also credits Jo Bovy as co-author of the dataset work.

During development, the project also drew on the surrounding documentation and examples from astroNN, an astronomy-focused deep learning project created by Henry Leung, with Jo Bovy credited in the project documentation.

Authors

Andreas Schulz
Stefan Anell
