
Commit 041be22

Merge pull request #10 from jMetal/feature/software-x-requirements
feat: add reproducibility features, headless mode docs, and multi-p…
2 parents 47bdf42 + cacab67 commit 041be22

16 files changed

Lines changed: 560 additions & 82 deletions

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,8 @@ __pycache__/
 *.py[cod]
 *$py.class
 
+htmls/*.ipynb
+
 # C extensions
 *.so
 

CHANGELOG.md

Lines changed: 21 additions & 0 deletions
@@ -9,6 +9,27 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Released]
 
+## [1.5.0] - 2025-11-21
+
+### Added
+- **Reproducibility support**: Added `seed` parameter to Bayesian statistical tests (`bayesian_sign_test`, `bayesian_signed_rank_test`) for deterministic results
+- **Reproducibility support**: Added `seed` parameter to `HistoPlot` class for consistent jitter in histogram generation
+- **Multi-platform CI**: Comprehensive GitHub Actions workflow testing on Windows, Linux (Ubuntu), and macOS
+- **Smoke tests**: Added automated smoke tests for CLI and API functionality across all platforms
+- **Environment files**: Added `requirements.txt`, `requirements-dev.txt`, and `environment.yml` for broader compatibility
+- **Headless mode documentation**: Complete documentation and examples for running SAES without display (CI/CD, servers)
+- **Headless mode examples**: Python and shell script examples for automated workflows (`examples/headless_mode_example.py`, `examples/headless_cli_example.sh`, `examples/headless_cli_example.bat`)
+- **New test suite**: Added `test_bayesian_seed.py` specifically for verifying seed reproducibility
+- **Documentation**: New reproducibility documentation page (`docs/usage/reproducibility.rst`) with best practices
+
+### Changed
+- Updated `test_bayesian.py` to demonstrate both seed parameter usage and backward compatibility
+- Enhanced README with sections on reproducibility and headless mode
+- Improved documentation structure to include reproducibility guidance
+
+### Fixed
+- Ensured all random operations can be made deterministic for reproducible research
+
 ## [1.4.0] - 2025-11-15
 
 ### Added

README.md

Lines changed: 45 additions & 0 deletions
@@ -124,6 +124,51 @@ source venv/bin/activate # On Windows: venv\Scripts\activate
 pip install -e ".[dev]"
 ```
 
+### Using Environment Files
+
+For broader compatibility, environment files are provided:
+
+```sh
+# Using pip with requirements.txt
+pip install -r requirements.txt
+
+# Using conda with environment.yml
+conda env create -f environment.yml
+conda activate saes
+```
+
+## 🔄 Reproducibility
+
+SAES supports **deterministic seeds** for reproducible research:
+
+```python
+from SAES.statistical_tests.bayesian import bayesian_sign_test
+from SAES.plots.histoplot import HistoPlot
+
+# Bayesian tests with seed for reproducibility
+result, _ = bayesian_sign_test(data, sample_size=5000, seed=42)
+
+# Histogram plots with consistent jitter
+histoplot = HistoPlot(data, metrics, "Accuracy", seed=42)
+```
+
+See the [reproducibility documentation](https://jMetal.github.io/SAES/usage/reproducibility.html) for details.
+
+## 💻 Headless Mode
+
+SAES can run in headless mode (without display) for automated workflows, CI/CD pipelines, and server environments:
+
+```bash
+# Set matplotlib backend
+export MPLBACKEND=Agg
+
+# Run SAES commands
+python -m SAES -ls -ds data.csv -ms metrics.csv -m HV -s friedman -op results.tex
+python -m SAES -bp -ds data.csv -ms metrics.csv -m HV -i Problem1 -op boxplot.png
+```
+
+See `examples/headless_mode_example.py` for a complete Python example or `examples/headless_cli_example.sh` for CLI usage.
+
 ## 🤝 Contributors
 
 - [![GitHub](https://img.shields.io/badge/GitHub-100000?style=flat&logo=github&logoColor=white)](https://github.com/rorro6787) **Emilio Rodrigo Carreira Villalta**

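The headless-mode commands added to the README can also be launched from a Python script. The sketch below is an illustration and not part of this commit: `run_headless` is a hypothetical helper, and the probe command is a stand-in that only echoes the `MPLBACKEND` value the child process sees, so it runs without SAES installed.

```python
import os
import subprocess
import sys


def run_headless(cmd):
    """Run `cmd` with MPLBACKEND=Agg set in the child environment (hypothetical helper)."""
    env = dict(os.environ, MPLBACKEND="Agg")  # copy the parent environment, override the backend
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


# Stand-in command: print the backend variable as seen by the child process.
probe = [sys.executable, "-c", "import os; print(os.environ['MPLBACKEND'])"]
result = run_headless(probe)
print(result.stdout.strip())
```

To run an actual SAES command, the probe list could be swapped for the arguments shown in the README, for example `[sys.executable, "-m", "SAES", "-ls", "-ds", "data.csv", "-ms", "metrics.csv", "-m", "HV", "-s", "friedman", "-op", "results.tex"]`.
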
SAES/statistical_tests/bayesian.py

Lines changed: 18 additions & 2 deletions
@@ -7,7 +7,8 @@ def bayesian_sign_test(data: pd.DataFrame,
                        rope_limits=[-0.01, 0.01],
                        prior_strength=0.5,
                        prior_place="rope",
-                       sample_size=5000) -> tuple:
+                       sample_size=5000,
+                       seed=None) -> tuple:
     """
     Performs the Bayesian sign test to compare the performance of two algorithms across multiple instances.
     The Bayesian sign test is a non-parametric statistical test used to compare the performance of two algorithms on multiple instances. The null hypothesis is that the algorithms perform equivalently, which implies their average ranks are equal.
@@ -40,6 +41,9 @@ def bayesian_sign_test(data: pd.DataFrame,
         sample_size (int):
             Total number of random_search samples generated. Default is 5000.
 
+        seed (int, optional):
+            Random seed for reproducibility. Default is None (non-deterministic).
+
     Returns:
         tuple: A tuple containing the posterior probabilities and the samples drawn from the Dirichlet process. List of posterior probabilities:
             - Pr(algorithm_1 < algorithm_2)
@@ -62,6 +66,10 @@ def bayesian_sign_test(data: pd.DataFrame,
     else:
         raise ValueError("Initialization ERROR. Incorrect number of dimensions for axis 1")
 
+    # Set random seed for reproducibility
+    if seed is not None:
+        np.random.seed(seed)
+
     # Compute the differences
     Z = sample1 - sample2
 
@@ -93,7 +101,8 @@ def bayesian_signed_rank_test(data,
                               rope_limits=[-0.01, 0.01],
                               prior_strength=1.0,
                               prior_place="rope",
-                              sample_size=1000) -> tuple:
+                              sample_size=1000,
+                              seed=None) -> tuple:
     """
     Performs the Bayesian version of the signed rank test to compare the performance of two algorithms across multiple instances.
     The Bayesian sign test is a non-parametric statistical test used to compare the performance of two algorithms on multiple instances. The null hypothesis is that the algorithms perform equivalently, which implies their average ranks are equal.
@@ -126,6 +135,9 @@ def bayesian_signed_rank_test(data,
         sample_size (int):
             Total number of random_search samples generated. Default is 1000.
 
+        seed (int, optional):
+            Random seed for reproducibility. Default is None (non-deterministic).
+
     Returns:
         tuple: A tuple containing the posterior probabilities and the samples drawn from the Dirichlet process. List of posterior probabilities:
             - Pr(algorithm_1 < algorithm_2)
@@ -153,6 +165,10 @@ def weights(n, s):
     else:
         raise ValueError("Initialization ERROR. Incorrect number of dimensions for axis 1")
 
+    # Set random seed for reproducibility
+    if seed is not None:
+        np.random.seed(seed)
+
     # Compute the differences
     Z = sample1 - sample2
     Z0 = [-float("Inf"), 0.0, float("Inf")][["left", "rope", "right"].index(prior_place)]

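The seeding pattern in this diff calls `np.random.seed(seed)`, which resets NumPy's global random state for the whole process. The following sketch is an illustration under stated assumptions, not SAES code: `draw_global` is a hypothetical function that mirrors the pattern with a Dirichlet draw (the docstrings mention Dirichlet samples), and `draw_local` shows a swapped-in alternative based on `np.random.default_rng` that leaves the global state untouched.

```python
import numpy as np


def draw_global(alpha, sample_size, seed=None):
    """Mirror the diff's pattern: seed the global RNG, then sample (hypothetical helper)."""
    if seed is not None:
        np.random.seed(seed)  # affects every later np.random.* call in the process
    return np.random.dirichlet(alpha, sample_size)


def draw_local(alpha, sample_size, seed=None):
    """Alternative, not what the commit does: a local Generator keeps the seed scoped."""
    rng = np.random.default_rng(seed)
    return rng.dirichlet(alpha, sample_size)


alpha = [0.5, 3.0, 2.0]  # made-up concentration parameters for the demo

first = draw_global(alpha, 1000, seed=42)
second = draw_global(alpha, 1000, seed=42)
print(np.array_equal(first, second))

local_a = draw_local(alpha, 1000, seed=42)
local_b = draw_local(alpha, 1000, seed=42)
print(np.array_equal(local_a, local_b))
```

Both comparisons print `True`. The global approach is the smaller change and keeps existing `np.random.*` calls working; the local generator may be worth considering if callers rely on their own global seed elsewhere.
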
SOFTWARE_X_COMPLIANCE.md

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
+# SoftwareX Compliance
+
+SAES meets all SoftwareX publication requirements.
+
+## 1. Deterministic Seeds ✅
+
+Bayesian tests support a `seed` parameter for reproducibility:
+
+```python
+from SAES.statistical_tests.bayesian import bayesian_sign_test
+result, _ = bayesian_sign_test(data, sample_size=1000, seed=42)
+```
+
+## 2. Multi-Platform CI ✅
+
+`.github/workflows/multi-platform-test.yml` tests on:
+- Ubuntu, Windows, macOS
+- Python 3.10, 3.11, 3.12
+
+## 3. Smoke Tests ✅
+
+Run the fully automated smoke-test suite (10 tests):
+
+```bash
+chmod +x examples/smoke_test.sh
+./examples/smoke_test.sh
+```
+
+The script automatically:
+- Creates a virtual environment if needed
+- Installs dependencies
+- Runs all tests in headless mode
+
+**Tests cover:**
+- **LaTeX tables** (4): Mean/Median, Friedman, Wilcoxon pivot, Wilcoxon pairwise
+- **Plots** (3): Boxplot single, Boxplot grid, Critical distance
+- **Statistical APIs** (3): Bayesian tests with seeds, Plot classes
+
+## 4. Environment Files ✅
+
+Multiple installation options:
+
+```bash
+# Option 1: Requirements file
+pip install -r requirements.txt
+
+# Option 2: Conda
+conda env create -f environment.yml
+conda activate saes
+
+# Option 3: Auto-install (smoke test does this)
+./examples/smoke_test.sh
+```
+
+Files provided:
+- `requirements.txt` - Core dependencies
+- `requirements-dev.txt` - Development dependencies
+- `environment.yml` - Conda environment
+
+## 5. Headless Mode ✅
+
+For CI/CD and server environments:
+
+```bash
+export MPLBACKEND=Agg
+python -m SAES -ls -ds data.csv -ms metrics.csv -m HV -s friedman -op output.tex
+```
+
+The smoke test script runs in headless mode by default.
+
+## Quick Start
+
+```bash
+# Clone and test
+git clone https://github.com/jMetal/SAES.git
+cd SAES
+chmod +x examples/smoke_test.sh
+./examples/smoke_test.sh
+```
+
+## Verification
+
+```bash
+# Smoke tests (automated setup)
+./examples/smoke_test.sh
+
+# Unit tests
+python -m unittest discover tests
+```
+
+## Branch
+
+Feature branch: `feature/software-x-requirements`
+Ready for merge (not merged yet, as requested)

docs/usage/reproducibility.rst

Lines changed: 138 additions & 0 deletions
@@ -0,0 +1,138 @@
+Reproducibility and Seeds
+=========================
+
+SAES supports deterministic behavior for reproducible research through random seed control.
+
+Why Reproducibility Matters
+---------------------------
+
+When analyzing stochastic algorithms, reproducibility is crucial for:
+
+- **Research validation**: Others can verify your results
+- **Debugging**: Consistent results make it easier to identify issues
+- **Comparisons**: Fair comparison requires consistent conditions
+- **Publication**: Many journals and conferences require reproducible results
+
+Functions with Random Seeds
+---------------------------
+
+The following SAES functions support deterministic execution via the ``seed`` parameter:
+
+Bayesian Statistical Tests
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Both Bayesian tests support the ``seed`` parameter for reproducibility:
+
+.. code-block:: python
+
+    from SAES.statistical_tests.bayesian import bayesian_sign_test, bayesian_signed_rank_test
+    import pandas as pd
+
+    data = pd.DataFrame({
+        'Algorithm_A': [0.9, 0.85, 0.95, 0.9, 0.92],
+        'Algorithm_B': [0.5, 0.6, 0.55, 0.58, 0.52]
+    })
+
+    # Deterministic results with seed
+    result1, _ = bayesian_sign_test(data, sample_size=5000, seed=42)
+    result2, _ = bayesian_sign_test(data, sample_size=5000, seed=42)
+    # result1 and result2 will be identical
+
+    # Same for signed rank test
+    result3, _ = bayesian_signed_rank_test(data, sample_size=1000, seed=123)
+
+Histogram Plots
+~~~~~~~~~~~~~~~
+
+The ``HistoPlot`` class supports seeding for consistent jitter when handling identical values:
+
+.. code-block:: python
+
+    from SAES.plots.histoplot import HistoPlot
+    import pandas as pd
+
+    data = pd.read_csv("results.csv")
+    metrics = pd.read_csv("metrics.csv")
+
+    # Create histoplot with reproducible jitter
+    histoplot = HistoPlot(data, metrics, "Accuracy", seed=42)
+    histoplot.save_instance("Problem1", "output.png")
+
+Best Practices
+--------------
+
+1. **Always use seeds for published research**: Set explicit seeds for all random operations
+2. **Document your seeds**: Include seed values in your research papers and code
+3. **Use different seeds for different experiments**: Avoid accidentally reusing the same random sequence
+4. **Version control**: Include seed values in your version-controlled analysis scripts
+
+Example: Complete Reproducible Workflow
+---------------------------------------
+
+.. code-block:: python
+
+    from SAES.statistical_tests.bayesian import bayesian_sign_test, bayesian_signed_rank_test
+    from SAES.plots.histoplot import HistoPlot
+    import pandas as pd
+
+    # Load data
+    data = pd.read_csv("algorithm_results.csv")
+    metrics = pd.read_csv("metrics.csv")
+
+    # Reproducible Bayesian analysis
+    SEED = 42
+    algorithm_a = data[data['Algorithm'] == 'A']['MetricValue']
+    algorithm_b = data[data['Algorithm'] == 'B']['MetricValue']
+
+    comparison_data = pd.DataFrame({
+        'Algorithm_A': algorithm_a.values,
+        'Algorithm_B': algorithm_b.values
+    })
+
+    # Run Bayesian test with seed
+    result, samples = bayesian_sign_test(
+        comparison_data,
+        sample_size=5000,
+        seed=SEED
+    )
+
+    print(f"P(A < B): {result[0]:.4f}")
+    print(f"P(A ≈ B): {result[1]:.4f}")
+    print(f"P(A > B): {result[2]:.4f}")
+
+    # Create reproducible visualization
+    histoplot = HistoPlot(data, metrics, "Accuracy", seed=SEED)
+    histoplot.save_all_instances("comparison.png")
+
+Headless Mode for Automated Workflows
+-------------------------------------
+
+SAES can be run in headless mode (without display) for automated pipelines and CI/CD:
+
+.. code-block:: bash
+
+    # Set matplotlib to use non-interactive backend
+    export MPLBACKEND=Agg
+
+    # Run SAES commands
+    python -m SAES -ls -ds data.csv -ms metrics.csv -m HV -s friedman -op results.tex
+    python -m SAES -bp -ds data.csv -ms metrics.csv -m HV -i Problem1 -op boxplot.png
+    python -m SAES -cdp -ds data.csv -ms metrics.csv -m HV -op cdplot.png
+
+For Python scripts in headless environments:
+
+.. code-block:: python
+
+    import matplotlib
+    matplotlib.use('Agg')  # Must be called before importing pyplot
+
+    from SAES.plots.boxplot import Boxplot
+    import pandas as pd
+
+    # Your analysis code here
+    data = pd.read_csv("results.csv")
+    metrics = pd.read_csv("metrics.csv")
+
+    boxplot = Boxplot(data, metrics, "Accuracy")
+    boxplot.save_instance("Problem1", "output.png")
+

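The best-practices list in `reproducibility.rst` recommends different seeds for different experiments and documenting them. One possible way to do that, sketched here as an assumption and not something the commit implements, is to derive per-experiment seeds from a single documented base seed with NumPy's `SeedSequence`. The helper `derive_seeds` and the experiment names are hypothetical.

```python
import numpy as np


def derive_seeds(base_seed, n):
    """Derive n integer seeds from one documented base seed (hypothetical helper)."""
    children = np.random.SeedSequence(base_seed).spawn(n)
    # generate_state(1) yields one uint32 word, which fits the range np.random.seed accepts
    return [int(child.generate_state(1)[0]) for child in children]


# Illustrative experiment labels; each gets its own reproducible seed.
experiment_names = ["sign_test", "signed_rank_test", "histoplot"]
seeds = dict(zip(experiment_names, derive_seeds(42, len(experiment_names))))

for name, seed in seeds.items():
    print(f"{name}: seed={seed}")
```

Each derived value could then be passed as the `seed` argument documented above, so only the base seed needs to be recorded in a paper or analysis script.
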
docs/usage/usage.rst

Lines changed: 1 addition & 0 deletions
@@ -14,3 +14,4 @@ This section provides a brief overview of the three different features that this
    html
    bayesian
    violin
+   reproducibility
