Skip to content

MohammadAsadolahi/knapsack-problem-solved-with-evolutionary-strategy-Genetic-algorithm-in-python

Repository files navigation

Evolutionary Knapsack Optimizer

Solving NP-Hard Combinatorial Optimization via Genetic Algorithms

Python 3.8+ License Algorithm Status


A from-scratch implementation of a Genetic Algorithm (GA) for the 0/1 Knapsack Problem — one of Karp's original 21 NP-complete problems — demonstrating how biologically-inspired metaheuristics can efficiently approximate solutions to computationally intractable optimization landscapes.



Table of Contents


Problem Statement

The 0/1 Knapsack Problem is a cornerstone of combinatorial optimization:

Given a set of $n$ items, each with a value $v_i$ and weight $w_i$, determine the subset that maximizes total value while respecting a capacity constraint $W$.

$$\max \sum_{i=1}^{n} v_i \cdot x_i \quad \text{subject to} \quad \sum_{i=1}^{n} w_i \cdot x_i \leq W, \quad x_i \in {0, 1}$$

With $n$ items, the brute-force search space is $O(2^n)$. For n = 20 (default configuration), that's 1,048,576 candidate solutions. This implementation converges to a near-optimal solution by evaluating only a fraction of that space through evolutionary search.


Why Genetic Algorithms?

Traditional exact methods (branch-and-bound, dynamic programming) guarantee optimality but scale poorly. Genetic Algorithms offer a fundamentally different paradigm:

Property Exact Methods Genetic Algorithm
Optimality Guarantee Yes Approximate
Time Complexity $O(nW)$ or $O(2^n)$ $O(g \cdot p \cdot n)$
Parallelizable Limited Inherently parallel
Handles Noisy Objectives No Yes
Scalability Degrades exponentially Graceful degradation

Where $g$ = generations, $p$ = population size, $n$ = item count. For large-scale real-world instances — resource allocation in cloud infrastructure, portfolio optimization, cargo loading — metaheuristics are often the only practical approach.


Algorithm Design

This implementation follows the canonical GA pipeline with custom enhancements:

┌─────────────────────────────────────────────────────────────────┐
│                    EVOLUTIONARY PIPELINE                        │
│                                                                 │
│  ┌──────────┐    ┌───────────┐    ┌──────────┐    ┌──────────┐ │
│  │  INIT    │───▶│ SELECTION │───▶│CROSSOVER │───▶│ MUTATION │ │
│  │Population│    │ (Elitism) │    │(1-Point) │    │(Bit-Flip)│ │
│  └──────────┘    └───────────┘    └──────────┘    └──────────┘ │
│       │                                                 │       │
│       │              ┌──────────────┐                   │       │
│       │              │  EVALUATION  │                   │       │
│       └──────────────│   Fitness &  │◀──────────────────┘       │
│                      │  Constraint  │                           │
│                      └──────┬───────┘                           │
│                             │                                   │
│                      ┌──────▼───────┐                           │
│                      │   SURVIVOR   │                           │
│                      │  SELECTION   │──── Repeat g generations  │
│                      └──────────────┘                           │
└─────────────────────────────────────────────────────────────────┘

Key Genetic Operators

Operator Strategy Detail
Encoding Binary chromosome Each gene $x_i \in {0,1}$ represents item inclusion/exclusion
Initialization Constrained random Only feasible, non-duplicate, non-empty solutions enter the initial pool
Crossover Single-point (midpoint) Offspring inherit the first half from one parent, second half from the other
Mutation Adaptive bit-flip Triggered by duplication, infeasibility, or stochastic mutation rate threshold
Selection $(\mu + \lambda)$ Parents and offspring compete; top $\mu$ survive (steady-state elitism)
Elitism Generational best The fittest chromosome per generation is archived for analysis

Fitness Function & Selection Pressure

The fitness function goes beyond naive value summation. It introduces a capacity-proximity penalty that rewards solutions packing close to the knapsack limit:

$$f(x) = \frac{V \cdot W_{used} - V \cdot |W_{max} - W_{used}|}{W_{used}}$$

Where:

  • $V = \sum v_i x_i$ — total value of selected items
  • $W_{used} = \sum w_i x_i$ — total weight consumed
  • $W_{max}$ — knapsack capacity (threshold)

This creates a dual selection pressure: maximize value while penalizing under-utilization of capacity. The penalty term $|W_{max} - W_{used}|$ ensures the algorithm doesn't converge to trivially light (and low-value) solutions — a common failure mode in constrained GAs.


Convergence Analysis

Empirical results demonstrate characteristic GA convergence behavior over 100 generations:

Elite Chromosome Fitness (Best-of-Generation)

Elite Chromosome Fitness

The monotonically non-decreasing step function confirms elitism preservation — the best solution is never lost. Rapid improvement occurs in the first ~20 generations (exploration phase), followed by plateau refinement (exploitation phase).

Population Average Fitness

Population Average Fitness

The smooth asymptotic curve shows healthy population convergence without premature stagnation. The gap between average and elite fitness narrows over time, indicating the entire population is being pulled toward high-quality regions of the search space.

Combined Convergence View

Elite vs Average Fitness

The convergence of both curves toward ~830 absolute worth by generation 60-70 demonstrates the algorithm's ability to balance exploration and exploitation — a hallmark of well-tuned evolutionary strategies.


Architecture

knapsack-problem-solved-with-evolutionary-strategy-Genetic-algorithm-in-python/
│
├── knapsack with GA.py       # Core GA engine (single-file, zero dependencies beyond matplotlib)
├── docs/
│   └── ALGORITHM.md          # Deep-dive: mathematical foundations & complexity analysis
├── requirements.txt          # Python dependencies
├── LICENSE                   # Apache 2.0
└── README.md                 # You are here

Class Design

class Chromosome:
    """Encapsulates a candidate solution as a binary vector with cached fitness metrics."""
    solution: List[int]       # Binary encoding [0,1,1,0,...] 
    value: int                # Σ selected item values
    weight: int               # Σ selected item weights
    absoluteWorth: float      # Composite fitness score

class GeneticSolver:
    """Orchestrates the full evolutionary optimization loop."""
    # Hyperparameters: populationSize, generationCount, mutationRate, itemCount, threshold
    # Core methods: initialPopulation(), crossOver(), mutate(), lunchEvolution(), solve()

Quick Start

# Clone the repository
git clone https://github.com/MohammadAsadolahi/knapsack-problem-solved-with-evolutionary-strategy-Genetic-algorithm-in-python.git
cd knapsack-problem-solved-with-evolutionary-strategy-Genetic-algorithm-in-python

# Install dependencies
pip install -r requirements.txt

# Run the optimizer
python "knapsack with GA.py"

The solver will output generational statistics to the console and display three matplotlib visualizations upon completion.


Configuration Space

Tune the GA by modifying the constructor parameters:

# GeneticSolver(populationSize, generationCount, mutationRate, itemCount, threshold)

GeneticSolver(10, 100, 20, 20, 400)   # Default: balanced exploration/exploitation
GeneticSolver(6, 200, 10, 10, 100)    # Small instance: fewer items, more generations
GeneticSolver(10, 80, 10, 25, 600)    # Large instance: more items, higher capacity
Parameter Symbol Description Trade-off
populationSize $\mu$ Individuals per generation Diversity ↔ Computation
generationCount $g$ Evolutionary iterations Convergence quality ↔ Runtime
mutationRate $p_m$ Mutation probability (%) Exploration ↔ Exploitation
itemCount $n$ Number of available items Problem complexity ($2^n$ space)
threshold $W$ Knapsack weight capacity Constraint tightness

Theoretical Foundations

For a comprehensive treatment of the mathematical underpinnings, complexity analysis, and connections to the broader evolutionary computation literature, see docs/ALGORITHM.md.

Topics covered:

  • Schema Theorem and Building Block Hypothesis
  • Convergence guarantees under elitist selection
  • Constraint handling in evolutionary optimization
  • Connections to simulated annealing and particle swarm optimization
  • Computational complexity of the 0/1 Knapsack Problem (Karp, 1972)

Author

Mohammad AsadolahiSenior Agentic AI Engineer

Focused on Agentic AI Architectures In The Wild.


License

This project is licensed under the Apache License 2.0 — see LICENSE for details.


Built with evolutionary principles. Optimized by nature.



this readme is AI assisted generated, so check for mistakes

About

solving knapsack problem with n items with GA(genetic algorithm)

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages