Commit 5b81c45

Renamed Experiments
1 parent 294e388 commit 5b81c45

8 files changed

Lines changed: 52 additions & 585 deletions

experiments/README.md

Lines changed: 50 additions & 92 deletions
@@ -1,108 +1,66 @@
 # SimKit Experiments with Neo4j & Python
 
-This repository demonstrates how to run spectral clustering experiments using both the SimKit plugin (loaded into Neo4j) and scikit-learn. It includes:
-- A Dockerfile to build a Neo4j container with the SimKit plugin.
-- The `experiments_2.py` script which runs experiments against the Neo4j instance.
+This repository demonstrates how to run spectral clustering experiments using both the SimKit plugin (loaded into Neo4j) and scikit-learn. It also includes experiments using Neo4j Graph Data Science (GDS) algorithms.
+
+## Features
+- Dockerized Neo4j setup with the SimKit plugin.
+- Scripts to compare clustering results using:
+- SimKit (custom Neo4j plugin)
+- scikit-learn
+- Neo4j GDS library
+- Batch and timing experiments for performance comparison.
+
+## Contents
+- Dockerfile and docker-compose.yml: Builds and runs a Neo4j instance with the SimKit plugin.
+- requirements.txt: Python dependencies.
+- SimKit-0.1.1.jar: SimKit plugin (must be compatible with the Neo4j version used).
+- Python scripts:
+- experiment_gds.py: Runs k-means clustering using Neo4j GDS.
+- experiments_simkit-0.1.1.py: Runs and times clustering using SimKit and scikit-learn.
 
 ## Prerequisites
-
-- Docker must be installed on your system.
+- Docker installed on your system.
 - Python 3.x and pip installed.
-- Ensure your system has enough memory (Neo4j can be memory intensive).
-- Place the `simkit.jar` plugin file (compatible with your Neo4j version) in the same folder as the Dockerfile.
-
-## Building and Running the Neo4j Docker Container
-
-1. **Build the Docker Image**
-
-Open a terminal in the directory containing the Dockerfile and `simkit.jar` and run:
-
-```bash
-docker build -t my-neo4j .
-```
-
-This builds a Docker image named `my-neo4j` that includes the SimKit plugin.
-
-2. **Run the Docker Container**
-
-Start the container with:
-
-```bash
-docker run --name neo4j -p 7687:7687 -p 7474:7474 -d my-neo4j
-```
-
-- **Ports:**
-- `7687` is the Bolt port for Neo4j.
-- `7474` is the HTTP port (Neo4j Browser).
-
-3. **Verify the Container**
-
-- Check logs for any errors (especially plugin errors):
-
-```bash
-docker logs neo4j
-```
-
-- Open [http://localhost:7474](http://localhost:7474) in your browser to access the Neo4j Browser and verify that the SimKit procedures are available (e.g., try a test query like `RETURN simkit.experimental_spectralClustering({ ... })`).
-
-## Running the Experiments
-
-The `experiments_2.py` script runs spectral clustering experiments using both SimKit (via Neo4j procedures) and scikit-learn spectral clustering.
-
-1. **Install Python Dependencies**
-
-The script will attempt to install missing packages automatically. Alternatively, install manually:
-
-```bash
-pip install neo4j pandas psutil tqdm scikit-learn scipy
-```
-
-2. **Prepare Datasets**
-
-Place your dataset CSV files under the `datasets/` directory. The script expects file names such as `iris.csv`, `cora_nodes.csv`, `cora_edges.csv`, etc.
-
-3. **Run the Experiment Script**
-
-Ensure the Neo4j container is running, then execute:
-
-```bash
-python experiments_2.py
-```
-
-The script will:
-- Delete existing nodes and indexes in Neo4j.
-- Create feature/graph nodes from the datasets.
-- Run experiments using both SimKit and scikit-learn.
-- Save results as CSV files in a `results/` directory.
+- Memory: Ensure your system has enough RAM (Neo4j can be memory-intensive).
+- Place the simkit.jar plugin (compatible with your Neo4j version) in the same directory as the Dockerfile.
 
-## Troubleshooting
+## Setup Instructions
+1. Clone the repository:
 
-- **Container Exits Early:**
-Check the container logs with:
+```bash
+git clone https://github.com/yourusername/simkit-experiments.git
+cd simkit-experiments
+```
 
-```bash
-docker logs neo4j
-```
+2. Install Python dependencies:
+```bash
+pip install -r requirements.txt
+```
 
-Review for errors related to plugin incompatibility, configuration issues, or memory constraints.
+3. Start the Neo4j database:
+```bash
+docker compose up
+```
 
-- **SimKit Procedure Not Found:**
-Ensure the `simkit.jar` file is correctly placed and compatible with the Neo4j version used.
+This will build and start the Neo4j container with SimKit.
 
-- **Python Errors:**
-Verify that all dataset files exist and have the expected schema.
 
-## Cleanup
+4. Run experiments:
+- Run GDS clustering experiments on Neo4j:
+```bash
+python experiment_gds.py
+```
 
-To stop and remove the Docker container:
+- Run SimKit and scikit-learn timing experiments:
+```bash
+python experiments_simkit-0.1.1.py
+```
 
-```bash
-docker stop neo4j
-docker rm neo4j
-```
+## Notes
+- Ensure the Neo4j container is fully up before starting any experiments.
+- The scripts assume the Neo4j instance is accessible at the configured bolt:// address and uses default credentials (neo4j/neo4j by default; change if modified).
+- You may need to load or generate a sample graph dataset in Neo4j before running experiments.
 
-To remove the Docker image:
+## License
 
-```bash
-docker rmi my-neo4j
-```
+This project is licensed under the Apache License 2.0.
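
The new README's note about waiting until the Neo4j container is fully up before starting experiments could be automated with a small readiness check. The following is a minimal sketch using only the Python standard library; the helper name `wait_for_port` is hypothetical and not part of the repository, and port 7687 on localhost is an assumption taken from the removed `docker run` instructions (the compose file's port mapping is not visible in this diff). An open TCP port only shows that something is listening; it does not prove that the database has finished starting or that the SimKit procedures loaded.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it retries the connection every `interval` seconds
    until roughly `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if nothing accepts the connection
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An experiment script could call `wait_for_port("localhost", 7687)` before opening its Bolt session and exit with a clear message when the call returns `False`.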

experiments/SimKit-0.1.1.jar

16.5 MB
Binary file not shown.

experiments/docker-compose.yaml

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ services:
 dbms.security.procedures.unrestricted: "simkit.*"
 volumes:
 # mount your locally built simkit.jar
-- ./simkit.jar:/var/lib/neo4j/plugins/simkit.jar
+- ./SimKit-0.1.1.jar:/var/lib/neo4j/plugins/simkit.jar
 - simkit_data:/data
 - simkit_logs:/logs
 ports:
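
With this change the compose file bind-mounts `./SimKit-0.1.1.jar` to `/var/lib/neo4j/plugins/simkit.jar`, so the jar has to exist under exactly that name next to the compose file. With short-syntax bind mounts Docker typically creates a missing host path as a directory, which would leave the plugin path pointing at an empty directory rather than a jar. Below is a minimal pre-flight sketch, assuming short-syntax `source:target` volume strings like the ones in this hunk; `split_volume` and `missing_bind_sources` are hypothetical names, not part of the repository, and the check treats every bind source as a file, which fits this hunk but not directory mounts.

```python
from pathlib import Path
from typing import List, Tuple


def split_volume(spec: str) -> Tuple[str, str]:
    """Split a short-syntax 'source:target[:mode]' volume string into (source, target)."""
    parts = spec.split(":")
    if len(parts) < 2:
        raise ValueError(f"not a source:target volume spec: {spec!r}")
    return parts[0], parts[1]


def missing_bind_sources(specs: List[str], base_dir: Path) -> List[str]:
    """Return bind-mount sources that do not exist as regular files under base_dir.

    Sources that do not start with '.' or '/' are assumed to be named volumes
    (such as 'simkit_data') and are skipped, since Docker manages those.
    """
    missing = []
    for spec in specs:
        source, _target = split_volume(spec)
        if not source.startswith((".", "/")):
            continue  # named volume, nothing to check on the host
        if not (base_dir / source).is_file():
            missing.append(source)
    return missing
```

Running `missing_bind_sources` with the three volume strings from this hunk and the directory that holds the compose file would report `./SimKit-0.1.1.jar` if the jar had been left under its old `simkit.jar` name.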
File renamed without changes.

experiments/experiment_gds_kmeans_0.1.0_iris.py

Lines changed: 0 additions & 125 deletions
This file was deleted.
