Added / fixed some docs.

jjhenkel · jjhenkel · commit f4cc735ebfbf · 2021-01-22T11:41:24.000-06:00
diff --git a/data/README.md b/data/README.md
@@ -27,3 +27,11 @@ in the `./data/build-results/*/` folder.
 This directory includes the (compressed) pre-processed version of the broken Dockerfiles that we used as input to 
 our clustering algorithm (BERT + HDBSCAN). We hope that, by providing the original pre-processed data, others can
 build new clustering techniques or refine the clustering approach and parameters.
+
+## `./data/non-clustered-data`
+
+This directory contains a (compressed) pre-processed version of the broken Dockerfiles that _did not cluster_ (HDBSCAN, under most configurations, does not place every element into a cluster). As part of `rq3`, we analyze both clustered and non-clustered data to compare how shipwright performs on each set. This data is used by `./shipwright.sh run-rq3`.
+
+## `./data/clustered-data`
+
+This directory contains the clusters we generated, along with the corresponding metadata for each broken Dockerfile in each cluster. This data is used by `./shipwright.sh run-rq3`.
diff --git a/rq2/README.md b/rq2/README.md
@@ -6,7 +6,7 @@ If you wish to run things from this directory, you would need `python3` with the
 
 ## Too Long Didn't Read (TLDR)
 
-We provided the pre-processed form of our broken Dockerfiles in the `./data/for-clustering` directory (one gzipped json file per broken Dockerfile). You can use these files to run your own clustering, or our clustering, or you can use our pre-generated clusters in the `./rq3/Clusters` directory.
+We provide the pre-processed form of our broken Dockerfiles in the `./data/for-clustering` directory (one gzipped JSON file per broken Dockerfile). You can use these files to run your own clustering or ours, or you can use our pre-generated clusters in the `./data/clustered-data` directory.
 
 To spit out some quick output data (and verify things can run) run `./clustering.py` --- this is quick and should print something like the following: