Commit 7a34402

Add files via upload
Signed-off-by: Shahm Najeeb <Nirt_12023@outlook.com>
1 parent ca55753 commit 7a34402

1 file changed

+11
-0
lines changed

Training Data/README.md

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
# About Training Data

Here is an archive of the training data used for VulnScan. It's huge: it includes 5 different training sets, of which 2 are test runs (not actual data to train on) and 3 are proper sets ranging from 10 thousand to 1 million files.

In total, the training data is around 12.6 GB UNZIPPED, so make sure you have enough space to extract it. Note that RAM is a huge factor in training: always have 2 GB of RAM free AFTER the data is loaded. Generally, you should vectorize the data, save it, restart your device to free up RAM, then train the model on the vectorized data, but you do you.
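The disk-space check and the vectorize, save, restart, train workflow described above can be sketched roughly as follows. This is only a minimal sketch using the Python standard library, not VulnScan's actual pipeline: the helper names `has_free_space`, `save_vectors`, and `load_vectors` are hypothetical, and "vectors" here stands for whatever picklable object your vectorizer produces.

```python
import pickle
import shutil

# Approximate unzipped size quoted in this README (12.6 GB, decimal units).
REQUIRED_BYTES = 12_600 * 1000**2


def has_free_space(target_dir: str, required_bytes: int = REQUIRED_BYTES) -> bool:
    """Return True if the drive holding target_dir has enough free space."""
    return shutil.disk_usage(target_dir).free >= required_bytes


def save_vectors(vectors, path: str) -> None:
    """Stage 1: persist the vectorized data so it survives a restart."""
    with open(path, "wb") as fh:
        pickle.dump(vectors, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_vectors(path: str):
    """Stage 2 (after the restart): reload the vectors instead of the raw files."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

The idea is to call `save_vectors` at the end of the vectorization run, restart, and then call `load_vectors` in a fresh process so the memory used while reading the raw files is not held during training. Only unpickle files you created yourself, since `pickle` can execute arbitrary code on load.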

## About Z## Files

I have split the archive into 40.9 MB parts to reduce the size of each file. All you have to do is extract the zip file normally to get the training data.

Make sure you have all `Z###` files from `Z01` to `Z146` as well as `Training Data.zip`.
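Before extracting, it may help to confirm that no part is missing. The sketch below assumes the parts follow the usual split-zip naming, i.e. `Training Data.z01` through `Training Data.z146` next to `Training Data.zip`; that naming scheme and the helper names `expected_parts` and `missing_parts` are assumptions, so adjust them if your downloaded files are named differently.

```python
from pathlib import Path


def expected_parts(stem: str = "Training Data", last_part: int = 146) -> list[str]:
    """Names of every split part plus the final .zip (naming scheme is assumed)."""
    parts = [f"{stem}.z{n:02d}" for n in range(1, last_part + 1)]
    return parts + [f"{stem}.zip"]


def missing_parts(folder: str, stem: str = "Training Data", last_part: int = 146) -> list[str]:
    """Return the expected file names that are not present in folder.

    Names are compared case-insensitively, so `.Z01` and `.z01` both count.
    """
    present = {p.name.lower() for p in Path(folder).iterdir() if p.is_file()}
    return [name for name in expected_parts(stem, last_part) if name.lower() not in present]
```

For example, `missing_parts("Training Data")` returns an empty list when all 147 files are in that folder, and otherwise lists the ones you still need to download.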
