This directory contains comprehensive examples and best practices for optimizing Python code performance. From basic time complexity analysis to advanced concurrent programming techniques, these resources will help you write faster, more efficient Python applications.

| File | Description | Key Topics |
|---|---|---|
| `01-time-complexity-analysis.py` | Understanding and analyzing algorithm complexity | Big O notation, complexity comparison, optimization strategies |
| `02-memory-optimization.py` | Memory usage optimization techniques | Memory profiling, efficient data structures, garbage collection |
| `03-data-structures-selection.py` | Choosing optimal data structures | Lists vs sets vs dicts, performance characteristics |
| `04-algorithm-optimization.py` | Algorithm improvement strategies | Search algorithms, sorting optimizations, algorithmic thinking |
| `05-profiling-and-benchmarking.py` | Performance measurement and analysis | cProfile, timeit, custom benchmarking frameworks |
| `06-caching-strategies.py` | Caching techniques for performance | LRU cache, memoization, custom caching solutions |
| `07-lazy-evaluation.py` | Lazy evaluation and generators | Memory-efficient iteration, generator expressions |
| `08-string-operations.py` | String processing optimizations | Efficient concatenation, formatting, regex optimization |
| `09-io-optimization.py` | Input/Output performance improvements | File handling, network operations, buffering strategies |
| `10-concurrent-programming.py` | Parallel and concurrent execution | Threading, multiprocessing, asyncio patterns |

The tutorial/ subdirectory contains a comprehensive guide covering all performance optimization topics in detail.
Start with time complexity analysis to understand how your algorithms scale:

```python
# Example from 01-time-complexity-analysis.py
def efficient_search(data, target):
    """O(1) average-case lookup using a set."""
    # Building the set is itself O(n), so build it once and reuse it
    # when you need many lookups against the same data.
    data_set = set(data)
    return target in data_set

def inefficient_search(data, target):
    """O(n) lookup using list"""
    return target in data
```

Learn to optimize memory usage with appropriate data structures:

```python
# Example from 02-memory-optimization.py
# Use generators for large datasets
def memory_efficient_processing(large_dataset):
    # process_item stands in for your per-item work and is defined elsewhere
    return (process_item(item) for item in large_dataset)

# Use __slots__ for classes with many instances
class OptimizedPoint:
    __slots__ = ['x', 'y']

    def __init__(self, x, y):
        self.x, self.y = x, y
```

Always measure before optimizing:

```python
# Example from 05-profiling-and-benchmarking.py
import cProfile

def profile_function(func):
    profiler = cProfile.Profile()
    profiler.enable()
    result = func()
    profiler.disable()
    profiler.print_stats()
    return result
```

- Use profiling tools to identify bottlenecks
- Don't optimize based on assumptions
- Focus on the 80/20 rule: roughly 80% of run time is typically spent in about 20% of the code
- Lists: For ordered data with frequent appends
- Sets: For unique items and fast membership testing
- Dicts: For key-value mappings and fast lookups
- Deques: For efficient operations at both ends
- Choose algorithms with better time complexity
- Use built-in functions (in CPython, many are implemented in C)
- Consider caching for expensive computations
- Use generators for large datasets
- Implement `__slots__` for classes with many instances
- Be mindful of object lifecycle and references
- Use appropriate buffering for file operations
- Leverage async/await for I/O-bound tasks
- Use multiprocessing for CPU-bound tasks
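
As a minimal sketch of the deque recommendation in the list above: removing an item from the front of a list shifts every remaining element (O(n) per removal), while `collections.deque` supports O(1) appends and pops at both ends. The function names and sample values here are illustrative and are not taken from the example files.

```python
from collections import deque

def drain_list_queue(items):
    """FIFO drain using a list: each pop(0) shifts the remaining items, O(n) per pop."""
    queue = list(items)
    drained = []
    while queue:
        drained.append(queue.pop(0))
    return drained

def drain_deque_queue(items):
    """FIFO drain using a deque: popleft() is O(1)."""
    queue = deque(items)
    drained = []
    while queue:
        drained.append(queue.popleft())
    return drained

# A bounded deque keeps only the most recent items:
# once it is full, each append drops the oldest item from the left.
recent = deque(maxlen=3)
for value in range(5):
    recent.append(value)
```

Both functions return the items in the same order; the difference only shows up in running time as the queue grows.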

```python
# From 03-data-structures-selection.py
import time

def compare_membership_testing(repeats=1000):
    data_list = list(range(10000))
    data_set = set(range(10000))
    target = 9999

    # Repeat each lookup so the measurement rises above clock resolution.
    # List: O(n) - slow for large datasets
    start = time.perf_counter()
    for _ in range(repeats):
        result = target in data_list
    list_time = time.perf_counter() - start

    # Set: O(1) on average - fast regardless of size
    start = time.perf_counter()
    for _ in range(repeats):
        result = target in data_set
    set_time = time.perf_counter() - start

    print(f"Set is {list_time/set_time:.1f}x faster")
```

```python
# From 08-string-operations.py
def efficient_string_building(words):
    return "".join(words)  # O(n)

def inefficient_string_building(words):
    result = ""
    for word in words:
        result += word  # O(n²) in the worst case: each += may copy the string
    return result
```

```python
# From 06-caching-strategies.py
from functools import lru_cache

@lru_cache(maxsize=128)
def expensive_computation(n):
    # Cached results for repeated calls
    return sum(i**2 for i in range(n))
```

- cProfile: Function-level profiling
- timeit: Micro-benchmarking
- tracemalloc: Memory usage tracking
- line_profiler: Line-by-line profiling
- memory_profiler: Memory usage profiling
- py-spy: Sampling profiler for production
- Run multiple iterations for statistical significance
- Use warmup runs to account for cold caches and, on JIT-compiled runtimes such as PyPy, JIT compilation
- Control system load during benchmarking
- Measure both time and memory usage
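
A minimal sketch of these benchmarking practices using only the standard library: untimed warmup calls come first, then `timeit.repeat` runs several timed rounds, and the spread across rounds is reported. The helper name `benchmark` and its parameters are illustrative assumptions, not part of `05-profiling-and-benchmarking.py`.

```python
import statistics
import timeit

def benchmark(func, *, warmup=1, repeat=5, number=1000):
    """Return per-call timing statistics (in seconds) for a zero-argument callable."""
    # Warmup calls are not timed; they let caches and lazy initialization settle.
    for _ in range(warmup):
        func()

    # Each entry in `totals` is the total time for `number` calls.
    totals = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [total / number for total in totals]

    return {
        "min": min(per_call),
        "median": statistics.median(per_call),
        "max": max(per_call),
    }

data_set = set(range(10_000))
stats = benchmark(lambda: 9_999 in data_set)
print(f"min {stats['min']:.2e}s, median {stats['median']:.2e}s per call")
```

On a noisy machine the minimum is often the most stable figure, since interference from other processes tends to make runs slower rather than faster.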
- Time Complexity (`01-time-complexity-analysis.py`)
  - Understand Big O notation
  - Learn to analyze algorithm efficiency
  - Practice with common algorithms
- Data Structures (`03-data-structures-selection.py`)
  - Master when to use lists, sets, dicts
  - Understand performance trade-offs
  - Practice data structure selection
- Memory Optimization (`02-memory-optimization.py`)
  - Learn memory profiling techniques
  - Understand Python memory model
  - Practice memory-efficient coding
- Algorithm Optimization (`04-algorithm-optimization.py`)
  - Study classic algorithms
  - Learn optimization techniques
  - Practice algorithmic thinking
- Profiling (`05-profiling-and-benchmarking.py`)
  - Master profiling tools
  - Learn benchmarking techniques
  - Practice performance measurement
- Caching (`06-caching-strategies.py`)
  - Implement various caching strategies
  - Understand cache invalidation
  - Practice cache optimization
- String Operations (`08-string-operations.py`)
  - Master efficient string processing
  - Learn regex optimization
  - Practice text processing optimization
- I/O Optimization (`09-io-optimization.py`)
  - Optimize file and network operations
  - Learn buffering strategies
  - Practice I/O performance tuning
- Concurrent Programming (`10-concurrent-programming.py`)
  - Master threading and multiprocessing
  - Learn async programming
  - Practice concurrent optimization
- Lazy Evaluation (`07-lazy-evaluation.py`)
  - Understand generators and iterators
  - Learn memory-efficient processing
  - Practice lazy evaluation patterns

- Execution Time: Total time to complete operation
- Throughput: Operations per second
- Latency: Time to first response
- Percentiles: P50, P95, P99 response times
- Memory Usage: Peak and average memory consumption
- Memory Efficiency: Memory per operation
- Garbage Collection: GC frequency and duration
- Memory Leaks: Growing memory usage over time
- CPU Utilization: Processor usage efficiency
- I/O Operations: Disk and network performance
- Cache Hit Ratio: Effectiveness of caching
- Scalability: Performance under load
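
The time-based metrics in this list can be computed from raw latency samples with the standard library. The sketch below assumes the latencies have already been collected as a list of seconds; `summarize_latencies` and the sample values are hypothetical.

```python
import statistics

def summarize_latencies(latencies, wall_time):
    """Summarize per-operation latencies (seconds) measured over `wall_time` seconds."""
    # quantiles(n=100) returns the 99 cut points P1..P99.
    cuts = statistics.quantiles(latencies, n=100, method="inclusive")
    return {
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
        "throughput": len(latencies) / wall_time,  # operations per second
    }

# Hypothetical samples: 100 operations taking 1 ms up to 100 ms,
# run back to back so the wall time is the sum of the latencies.
samples = [i / 1000 for i in range(1, 101)]
summary = summarize_latencies(samples, wall_time=sum(samples))
```

Reporting P95 and P99 alongside the median matters because a small share of slow operations can dominate user-visible behavior while barely moving the average.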

```python
# Don't optimize without measuring first
# Profile to find actual bottlenecks
```

```python
# Bad: Using list for membership testing
if item in my_list:  # O(n)
    ...

# Good: Using set for membership testing
if item in my_set:  # O(1)
    ...
```

```python
# Bad: String concatenation in loop
result = ""
for item in items:
    result += str(item)  # Creates new string each time

# Good: Join operation
result = "".join(str(item) for item in items)
```

```python
# Bad: Manual implementation
def find_max(numbers):
    max_val = numbers[0]
    for num in numbers[1:]:
        if num > max_val:
            max_val = num
    return max_val

# Good: Built-in function
max_val = max(numbers)  # Optimized in C
```

- Start with Profiling: Always measure before optimizing
- Focus on Bottlenecks: Optimize the slowest parts first
- Learn Incrementally: Master one technique at a time
- Practice Regularly: Apply techniques to real projects
- Stay Updated: Keep learning new optimization techniques
- Tutorial Directory: Comprehensive guides for each topic
- Python Documentation: Official performance tips
- Community Resources: Performance-focused Python communities
- Books: "High Performance Python" and similar resources
When adding new performance examples:
- Include clear documentation and comments
- Provide benchmarking code to demonstrate improvements
- Follow the established file naming convention
- Add comprehensive examples with real-world applications
Remember: "Premature optimization is the root of all evil" - Donald Knuth. Always profile first, then optimize based on actual bottlenecks.