Proof-of-concept implementation of the EM algorithm that uses the known probability of each 2D training point to train a multivariate Gaussian Mixture Model (GMM).
The probability p of each point is used for normalization during the maximization step. When the sample count is low, the square root of p can be used instead of p as an optimization.
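One possible reading of this normalization, as a minimal sketch: each responsibility is scaled by a per-point weight (p or sqrt(p)) before the usual weighted sums are formed. The function name `weighted_m_step`, its parameters, and the example data are hypothetical and not taken from main.py; the actual implementation may differ.

```python
import math

def weighted_m_step(points, probs, resp, use_sqrt=False):
    """One M-step for a 2D GMM with per-point weights (hypothetical helper).

    points: list of (x, y) tuples
    probs:  known probability p_i of each point (assumed > 0)
    resp:   resp[i][k], responsibility of component k for point i (rows sum to 1)
    """
    # Assumption: the point weight is p_i, or sqrt(p_i) for low sample counts.
    w = [math.sqrt(p) if use_sqrt else p for p in probs]
    total = sum(w)
    weights, means, covs = [], [], []
    for k in range(len(resp[0])):
        # Responsibility of component k, scaled by the point weight.
        g = [wi * r[k] for wi, r in zip(w, resp)]
        nk = sum(g)  # normalization term; assumed non-zero in this sketch
        mx = sum(gi * x for gi, (x, _) in zip(g, points)) / nk
        my = sum(gi * y for gi, (_, y) in zip(g, points)) / nk
        sxx = sum(gi * (x - mx) ** 2 for gi, (x, _) in zip(g, points)) / nk
        syy = sum(gi * (y - my) ** 2 for gi, (_, y) in zip(g, points)) / nk
        sxy = sum(gi * (x - mx) * (y - my) for gi, (x, y) in zip(g, points)) / nk
        weights.append(nk / total)
        means.append((mx, my))
        covs.append(((sxx, sxy), (sxy, syy)))
    return weights, means, covs

# Made-up example: two points per component, hard assignments.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0), (12.0, 10.0)]
probs = [0.01, 0.09, 0.25, 0.25]
resp = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
weights_p, means_p, covs_p = weighted_m_step(points, probs, resp)
weights_s, means_s, covs_s = weighted_m_step(points, probs, resp, use_sqrt=True)
print(means_p[0], means_s[0])
```

With p as the weight, the mean of the first component is pulled strongly toward the more probable point; with sqrt(p) the pull is weaker. Setting all weights to 1 gives the standard M-step.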
The expectation step is unchanged. The test setup uses:
- K-Means initialization from random points
- A low maximum iteration count (default: 5)
- A small number of training points (default: 20)
- The EM implementation in OpenCV as the reference for comparison
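For reference, the unchanged expectation step is the standard one for a GMM: each point gets a responsibility per component, proportional to the component weight times the component density at that point. The following is a minimal sketch for the 2D case; the names `gaussian_pdf_2d` and `e_step` are hypothetical and not taken from main.py.

```python
import math

def gaussian_pdf_2d(point, mean, cov):
    """Density of a 2D normal distribution with a symmetric 2x2 covariance."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Squared Mahalanobis distance, using the closed-form 2x2 inverse.
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * maha) / (2.0 * math.pi * math.sqrt(det))

def e_step(points, weights, means, covs):
    """Return resp[i][k], the responsibility of component k for point i."""
    resp = []
    for pt in points:
        joint = [w * gaussian_pdf_2d(pt, m, c)
                 for w, m, c in zip(weights, means, covs)]
        total = sum(joint)  # assumed non-zero; far outliers could underflow
        resp.append([j / total for j in joint])
    return resp

# Made-up example: two unit-variance components centered at (0, 0) and (4, 0).
identity = ((1.0, 0.0), (0.0, 1.0))
resp = e_step(
    points=[(0.0, 0.0), (2.0, 0.0)],
    weights=[0.5, 0.5],
    means=[(0.0, 0.0), (4.0, 0.0)],
    covs=[identity, identity],
)
print(resp)
```

The point halfway between the two means is shared equally, while the point at the first mean is assigned almost entirely to the first component.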
Run:
./main.py
Store:
./main.py --save test.json
Replay:
./main.py --load test.json
Run large test:
./compare.py
Because the known probabilities provide additional information for approximating the incomplete data, the weighted variant shows slightly better results than the reference algorithm, especially with sparsely sampled data.
Keep in mind that with a low iteration count the initial guess from K-Means has a large influence on the result.
Initial is the desired distribution that was used to sample the red dots.
OpenCV-EM is the reference algorithm from OpenCV.
Weighted-EM is the enhanced variant that uses the probabilities for normalization.

At a low sampling rate some of the normal distributions can become faint. Taking the square root of the probability can bring them to the front again.
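A small arithmetic sketch of why the square root helps, assuming the point probabilities act as relative weights: the square root compresses the ratio between a low-probability and a high-probability point. The helper `relative_weights` and the two probabilities are made up for illustration.

```python
import math

def relative_weights(probs, use_sqrt=False):
    """Normalize per-point weights so they sum to 1 (hypothetical helper)."""
    w = [math.sqrt(p) if use_sqrt else p for p in probs]
    total = sum(w)
    return [wi / total for wi in w]

# One point from a faint normal (p = 0.01), one from a dominant one (p = 0.25).
probs = [0.01, 0.25]
plain = relative_weights(probs)
rooted = relative_weights(probs, use_sqrt=True)
print(plain[0], rooted[0])
```

With p as the weight the two points stand in a ratio of 1:25, so the faint point contributes about 4% of the total. With sqrt(p) the ratio shrinks to 1:5 and the contribution rises to about 17%.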





