This repository was archived by the owner on Aug 16, 2025. It is now read-only.

Commit ea88d83

Initial commit
0 parents  commit ea88d83

29 files changed

Lines changed: 3883 additions & 0 deletions

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
*.jl.cov
*.jl.*.cov
*.jl.mem
deps/deps.jl

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2019 Martin Trapp

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,54 @@
1+
# Bayesian Sum-Product Networks
2+
3+
This Julia package implements Bayesian Sum-Product Networks and infinite mixtures of Bayesian Sum-Product Networks.
4+
Besides all necessary core routines, this package additionally implements approximate Bayesian inference using a combination of ancestral and Gibbs sampling as well as an implementation of the distributed slice sampler for infinite mixtures of SPNs.
5+
6+
Note that this package requires Julia version >= 1.0.
7+
8+
To install the package and run the provided code, make sure to first install Julia from [JuliaLang.org](https://julialang.org/downloads/).
9+
10+
After starting Julia, you can install the package using Julia's internal package manager.
11+
To do so, press `]` within the Julia command line to switch to the package manager prompt. You can leaf the package manager by pressing `BACKSPACE`.
12+
13+
To install the package, run the following command:
14+
```julia
15+
pkg> add https://github.com/trappmartin/BayesianSumProductNetworks.git
16+
```
17+
18+
### Dataset and Predictions
19+
All dataset and predictions can be found under: [download](https://github.com/trappmartin/BayesianSumProductNetworks/releases/download/v1.0.0/data_predictions_scripts.tar)
20+
21+
### Running Experiments
22+
To run the experiments, start the respective shell script located in the `hpc` folder. Those scripts are written such that they can be used as master scripts for slurm jobs.
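
As a rough illustration of what one such master script launches, the sketch below enumerates the hyperparameter grid without calling Julia. The grid values and the `Kleaf >= Ksum` filter are copied from `hpc/run_abda_rg.sh`; the function name `list_configs` and the dry-run idea itself are assumptions made for illustration and are not part of the package.

```shell
# Dry-run sketch: print the per-dataset output directory names
# (Ksum_Kleaf_L_J_chain) that a driver script would create.
ksums="5 10"
kleafs="5 10"
layers="2 4"
partitions="2 4 8"
chains="1"

# Hypothetical helper, not part of the package.
list_configs() {
    for chain in $chains; do
        for ksum in $ksums; do
            for kleaf in $kleafs; do
                # Same filter as the hpc scripts: skip settings with Kleaf < Ksum.
                if [ "$kleaf" -ge "$ksum" ]; then
                    for layer in $layers; do
                        for partition in $partitions; do
                            echo "${ksum}_${kleaf}_${layer}_${partition}_${chain}"
                        done
                    done
                fi
            done
        done
    done
}

# Count the settings that survive the filter.
num_configs=$(list_configs | wc -l | tr -d ' ')
echo "configurations per dataset: ${num_configs}"
```

With this grid, 18 settings per dataset pass the filter, which corresponds to 216 Julia runs for the 12 datasets listed in `hpc/run_abda_rg.sh`.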

### API
All types and functions listed here have docstrings. If you are interested in more details about their use, consult Julia's built-in documentation system: press `?` within the Julia command line and then enter the name of the function or type you want to know more about.

The package implements Bayesian Sum-Product Networks using the following types:

```julia
RegionGraphNode{<:Real} <: AbstractRegionGraphNode
PartitionGraphNode <: AbstractRegionGraphNode
FactorizedDistributionGraphNode <: AbstractRegionGraphNode
FactorizedMixtureGraphNode <: AbstractRegionGraphNode
InfiniteSumNode
NormalInverseGamma <: Distribution
DirichletSufficientStats <: AbstractSufficientStats
NormalInvGammaSufficientStats <: AbstractSufficientStats
GammaSufficientStats <: AbstractSufficientStats
BetaSufficientStats <: AbstractSufficientStats
```

For Bayesian inference, the following functions are implemented:

```julia
randomscopes!
templateRegion
templatePartition
ancestralsampling!
gibbssamplescopes!
slicesample!
logpdf
logpdf!
logmllh!
```

REQUIRE

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
julia 1.0

Distributions 0.16.0
AxisArrays
Reexport
SpecialFunctions
StatsFuns
StatsBase

hpc/run_abda_infty.sh

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
#!/bin/bash

# SETTINGS
datasets=("abalone" "adult" "australian" "autism" "breast" "chess" "crx" "dermatology" "diabetes" "german" "student" "wine")
ksums=(5 10)
kleafs=(5 10)
layers=(2 4)
partitions=(2 4 8)
chains=(1)

threads=8
iterations=10000
numsamples=10000
burnin=5000

homedir="/home/"
resultsdir="/PATH/TO/results_abda_infty"
datadir="/PATH/TO/abda"
codedir="/PATH/TO/BayesianSumProductNetworks"

# Actual code

# cd to code (abort if the directory does not exist)
cd "${codedir}" || exit 1

# ensure we use threads
export JULIA_NUM_THREADS=${threads}

# Loop over datasets and configurations...
for chain in "${chains[@]}"; do
  for ksum in "${ksums[@]}"; do
    for kleaf in "${kleafs[@]}"; do
      for layer in "${layers[@]}"; do
        for partition in "${partitions[@]}"; do
          for dataset in "${datasets[@]}"; do
            echo "Trying (abda): Ksum ${ksum} L ${layer} Kleaf ${kleaf} J ${partition} chain ${chain} dataset ${dataset}"
            if [ "${kleaf}" -ge "${ksum}" ]; then  # only run settings with Kleaf >= Ksum
              OUTDIR="${resultsdir}/${dataset}/${ksum}_${kleaf}_${layer}_${partition}_${chain}"
              julia scripts/run_abda_infty.jl --Ksum ${ksum} --L ${layer} --Kleaf ${kleaf} --J ${partition} --chain ${chain} --numsamples ${numsamples} "${dataset}" "${datadir}" "${OUTDIR}" ${burnin} ${iterations}
            else
              echo "skipping"
            fi
          done
        done
      done
    done
  done
done

hpc/run_abda_rg.sh

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
#!/bin/bash

# SETTINGS
datasets=("abalone" "adult" "australian" "autism" "breast" "chess" "crx" "dermatology" "diabetes" "german" "student" "wine")
ksums=(5 10)
kleafs=(5 10)
layers=(2 4)
partitions=(2 4 8)
chains=(1)

threads=8
iterations=10000
numsamples=10000
burnin=5000

homedir="/home/"
resultsdir="/PATH/TO/results_abda_rg"
datadir="/PATH/TO/abda"
codedir="/PATH/TO/BayesianSumProductNetworks"

# Actual code

# cd to code (abort if the directory does not exist)
cd "${codedir}" || exit 1

# ensure we use threads
export JULIA_NUM_THREADS=${threads}

# Loop over datasets and configurations...
for chain in "${chains[@]}"; do
  for ksum in "${ksums[@]}"; do
    for kleaf in "${kleafs[@]}"; do
      for layer in "${layers[@]}"; do
        for partition in "${partitions[@]}"; do
          for dataset in "${datasets[@]}"; do
            echo "Trying (abda): Ksum ${ksum} L ${layer} Kleaf ${kleaf} J ${partition} chain ${chain} dataset ${dataset}"
            if [ "${kleaf}" -ge "${ksum}" ]; then  # only run settings with Kleaf >= Ksum
              OUTDIR="${resultsdir}/${dataset}/${ksum}_${kleaf}_${layer}_${partition}_${chain}"
              julia scripts/run_abda_rg.jl --Ksum ${ksum} --L ${layer} --Kleaf ${kleaf} --J ${partition} --chain ${chain} --numsamples ${numsamples} "${dataset}" "${datadir}" "${OUTDIR}" ${burnin} ${iterations}
            else
              echo "skipping"
            fi
          done
        done
      done
    done
  done
done

hpc/run_discrete_infty.sh

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
#!/bin/bash

# SETTINGS
datasets=("accidents" "ad" "baudio" "bbc" "bnetflix" "book" "c20ng" "cr52" "cwebkb" "dna" "jester" "kdd" "kosarek" "msnbc" "msweb" "nltcs" "plants" "pumsb_star" "tmovie" "tretail")
ksums=(5 10)
kleafs=(5 10)
layers=(2 4)
partitions=(2 4 8)
chains=(1)

threads=10
iterations=10000
numsamples=10000
burnin=5000

homedir="/home/"
resultsdir="/PATH/TO/results_infty"
datadir="/PATH/TO/discrete"
codedir="/PATH/TO/BayesianSumProductNetworks"

# Actual code

# cd to code (abort if the directory does not exist)
cd "${codedir}" || exit 1

# ensure we use threads
export JULIA_NUM_THREADS=${threads}

# Loop over datasets and configurations...
for chain in "${chains[@]}"; do
  for ksum in "${ksums[@]}"; do
    for kleaf in "${kleafs[@]}"; do
      for layer in "${layers[@]}"; do
        for partition in "${partitions[@]}"; do
          for dataset in "${datasets[@]}"; do
            echo "Trying (infty): Ksum ${ksum} L ${layer} Kleaf ${kleaf} J ${partition} chain ${chain} dataset ${dataset}"
            if [ "${kleaf}" -ge "${ksum}" ]; then  # only run settings with Kleaf >= Ksum
              OUTDIR="${resultsdir}/${dataset}/${ksum}_${kleaf}_${layer}_${partition}_${chain}"
              julia scripts/run_discrete_rg_infty.jl --Ksum ${ksum} --L ${layer} --Kleaf ${kleaf} --J ${partition} --chain ${chain} --numsamples ${numsamples} "${dataset}" "${datadir}" "${OUTDIR}" ${burnin} ${iterations}
            else
              echo "skipping"
            fi
          done
        done
      done
    done
  done
done

hpc/run_discrete_rg.sh

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
#!/bin/bash

# SETTINGS
datasets=("accidents" "ad" "baudio" "bbc" "bnetflix" "book" "c20ng" "cr52" "cwebkb" "dna" "jester" "kdd" "kosarek" "msnbc" "msweb" "nltcs" "plants" "pumsb_star" "tmovie" "tretail")
ksums=(5 10)
kleafs=(5 10)
layers=(2 4)
partitions=(2 4 8)
chains=(1)

threads=10
iterations=10000
numsamples=10000
burnin=5000

homedir="/home/"
resultsdir="/PATH/TO/results_rg"
datadir="/PATH/TO/discrete"
codedir="/PATH/TO/BayesianSumProductNetworks"

# Actual code

# cd to code (abort if the directory does not exist)
cd "${codedir}" || exit 1

# ensure we use threads
export JULIA_NUM_THREADS=${threads}

# Loop over datasets and configurations...
for chain in "${chains[@]}"; do
  for ksum in "${ksums[@]}"; do
    for kleaf in "${kleafs[@]}"; do
      for layer in "${layers[@]}"; do
        for partition in "${partitions[@]}"; do
          for dataset in "${datasets[@]}"; do
            echo "Trying: Ksum ${ksum} L ${layer} Kleaf ${kleaf} J ${partition} chain ${chain} dataset ${dataset}"
            if [ "${kleaf}" -ge "${ksum}" ]; then  # only run settings with Kleaf >= Ksum
              OUTDIR="${resultsdir}/${dataset}/${ksum}_${kleaf}_${layer}_${partition}_${chain}"
              julia scripts/run_discrete_rg.jl --Ksum ${ksum} --L ${layer} --Kleaf ${kleaf} --J ${partition} --chain ${chain} --numsamples ${numsamples} "${dataset}" "${datadir}" "${OUTDIR}" ${burnin} ${iterations}
            else
              echo "skipping"
            fi
          done
        done
      done
    done
  done
done

hpc/run_missing.sh

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
#!/bin/bash

# SETTINGS
datasets=("bbc" "cwebkb" "tmovie")
ksums=(5)
kleafs=(5)
layers=(2)
partitions=(2)
chains=(1)

setups=("20_missing" "40_missing" "60_missing" "80_missing")

threads=10
iterations=5000
numsamples=2500
burnin=1000

homedir="/home/"
resultsdir="/PATH/TO/results_missing/"
datadir="/PATH/TO/discrete_missing/"
codedir="/PATH/TO/BayesianSumProductNetworks"

# Actual code

# cd to code (abort if the directory does not exist)
cd "${codedir}" || exit 1

# ensure we use threads
export JULIA_NUM_THREADS=${threads}

# Loop over datasets and configurations...
for chain in "${chains[@]}"; do
  for setup in "${setups[@]}"; do
    for ksum in "${ksums[@]}"; do
      for kleaf in "${kleafs[@]}"; do
        for layer in "${layers[@]}"; do
          for partition in "${partitions[@]}"; do
            for dataset in "${datasets[@]}"; do
              echo "Trying: Ksum ${ksum} L ${layer} Kleaf ${kleaf} J ${partition} chain ${chain} dataset ${dataset}"
              OUTDIR="${resultsdir}/${setup}/${dataset}/${ksum}_${kleaf}_${layer}_${partition}_${chain}"
              DATA="${datadir}/${setup}"
              julia scripts/run_discrete_missing_rg.jl --Ksum ${ksum} --L ${layer} --Kleaf ${kleaf} --J ${partition} --chain ${chain} --numsamples ${numsamples} "${dataset}" "${DATA}" "${OUTDIR}" ${burnin} ${iterations}
            done
          done
        done
      done
    done
  done
done
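
In `hpc/run_missing.sh`, `resultsdir` and `datadir` end with a trailing slash, so the composed `OUTDIR` and `DATA` paths contain a double slash. Repeated slashes inside a path are treated as a single slash, so the script works as written. The sketch below only shows an optional way to normalise the prefix with the `${var%/}` parameter expansion. The helper name `join_outdir` is hypothetical and does not exist in the repository.

```shell
# Hypothetical helper: build the output directory the driver script uses,
# stripping one trailing slash from the results prefix first.
join_outdir() {
    prefix="${1%/}"   # "/a/b/" becomes "/a/b"; unchanged without a trailing slash
    echo "${prefix}/$2/$3/$4"
}

# Values taken from hpc/run_missing.sh (first setup, first dataset).
resultsdir="/PATH/TO/results_missing/"
outdir=$(join_outdir "${resultsdir}" "20_missing" "bbc" "5_5_2_2_1")
echo "${outdir}"
```

The same expansion could be applied to `datadir` before composing `DATA`.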
