README.md (88 additions & 1 deletion)
@@ -5,6 +5,10 @@ A Java implementation of Locality Sensitive Hashing (LSH).
Locality Sensitive Hashing (LSH) is a family of hashing methods that tend to produce the same hash (or signature) for similar items. Different LSH functions exist, each corresponding to a similarity metric. For example, the MinHash algorithm is designed for Jaccard similarity (the size of the intersection of two sets divided by the size of their union). For cosine similarity, the traditional LSH algorithm is Random Projection, but others exist, like Super-Bit, that deliver better results.
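
To make the Jaccard similarity mentioned above concrete, here is a minimal, self-contained sketch that computes it exactly for two small sets. The names `JaccardSketch` and `jaccard` are hypothetical and chosen for illustration; they are not taken from this library's API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only: JaccardSketch is a hypothetical name,
// not a class provided by this library.
class JaccardSketch {

    // Exact Jaccard similarity: |A intersect B| / |A union B|.
    static double jaccard(Set<Integer> a, Set<Integer> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // convention assumed here: two empty sets are identical
        }
        Set<Integer> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> b = new HashSet<>(Arrays.asList(3, 4, 5, 6));
        // The sets share 2 elements out of 6 distinct ones.
        System.out.println(jaccard(a, b));
    }
}
```

Computing this exactly requires both full sets, which is the cost that LSH signatures are meant to avoid for large inputs.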

LSH functions have two main use cases:
* Compute the signature of large input vectors. These signatures can be used to quickly estimate the similarity between vectors.
* Given a fixed number of buckets, bin similar vectors together.
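
The two use cases above can be sketched with a plain random-projection scheme for cosine similarity. This is a simplified illustration under stated assumptions, not this library's implementation: the class `RandomProjectionSketch` and its methods are hypothetical names, the hyperplanes are drawn from a seeded Gaussian generator, and the bucket is obtained by hashing the whole signature.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch only: all names here are hypothetical,
// not part of this library's API.
class RandomProjectionSketch {

    private final double[][] hyperplanes;

    RandomProjectionSketch(int signatureSize, int dimensions, long seed) {
        Random rand = new Random(seed);
        hyperplanes = new double[signatureSize][dimensions];
        for (double[] plane : hyperplanes) {
            for (int j = 0; j < dimensions; j++) {
                plane[j] = rand.nextGaussian();
            }
        }
    }

    // Use case 1: one bit per hyperplane, set when the vector
    // lies on the non-negative side of that hyperplane.
    boolean[] signature(double[] vector) {
        boolean[] sig = new boolean[hyperplanes.length];
        for (int i = 0; i < hyperplanes.length; i++) {
            double dot = 0;
            for (int j = 0; j < vector.length; j++) {
                dot += hyperplanes[i][j] * vector[j];
            }
            sig[i] = dot >= 0;
        }
        return sig;
    }

    // Estimates cosine similarity from the fraction of differing bits.
    static double estimateCosine(boolean[] s1, boolean[] s2) {
        int differing = 0;
        for (int i = 0; i < s1.length; i++) {
            if (s1[i] != s2[i]) {
                differing++;
            }
        }
        return Math.cos(Math.PI * differing / s1.length);
    }

    // Use case 2: map the signature to one of `buckets` bins.
    int bucket(double[] vector, int buckets) {
        return Math.floorMod(Arrays.hashCode(signature(vector)), buckets);
    }

    public static void main(String[] args) {
        RandomProjectionSketch lsh = new RandomProjectionSketch(64, 3, 42L);
        double[] v1 = {1.0, 2.0, 3.0};
        double[] v2 = {2.0, 4.0, 6.0}; // same direction as v1
        System.out.println(estimateCosine(lsh.signature(v1), lsh.signature(v2)));
        System.out.println(lsh.bucket(v1, 10) == lsh.bucket(v2, 10));
    }
}
```

Hashing the entire signature only groups vectors whose signatures are identical. Practical schemes, such as the banding technique described in "Mining of Massive Datasets", hash the signature in several smaller pieces so that vectors that are similar but not identical also have a good chance of sharing a bucket.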

This library implements Locality Sensitive Hashing (LSH), as described in Leskovec, Rajaraman & Ullman (2014), "Mining of Massive Datasets", Cambridge University Press.

Currently implemented:
@@ -24,7 +28,6 @@ Using maven:
Or see the [releases](https://github.com/tdebatty/java-LSH/releases) page.

##MinHash

MinHash is a hashing scheme that tends to produce similar signatures for sets that have a high Jaccard similarity.
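
As a rough illustration of why MinHash signatures reflect Jaccard similarity, the sketch below builds signatures from a family of linear hash functions and estimates similarity as the fraction of matching positions. It is a minimal sketch under stated assumptions, not this library's `MinHash` implementation: the class `MinHashSketch`, its methods, the hash family `(a*x + b) mod p`, and the restriction to non-negative integer elements are all choices made for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Illustrative sketch only: MinHashSketch is a hypothetical name,
// not the MinHash class shipped with this library.
class MinHashSketch {

    private static final long PRIME = 2147483647L; // 2^31 - 1
    private final long[] a;
    private final long[] b;

    MinHashSketch(int signatureSize, long seed) {
        Random rand = new Random(seed);
        a = new long[signatureSize];
        b = new long[signatureSize];
        for (int i = 0; i < signatureSize; i++) {
            a[i] = 1 + rand.nextInt(Integer.MAX_VALUE - 1); // in [1, PRIME - 1]
            b[i] = rand.nextInt(Integer.MAX_VALUE);         // in [0, PRIME - 1]
        }
    }

    // For each hash function, keep the smallest hash over the set.
    // Assumption: elements are non-negative integers below PRIME.
    long[] signature(Set<Integer> set) {
        long[] sig = new long[a.length];
        Arrays.fill(sig, Long.MAX_VALUE);
        for (int x : set) {
            for (int i = 0; i < a.length; i++) {
                long h = (a[i] * x + b[i]) % PRIME;
                sig[i] = Math.min(sig[i], h);
            }
        }
        return sig;
    }

    // Estimated Jaccard similarity: fraction of positions that agree.
    static double similarity(long[] s1, long[] s2) {
        int equal = 0;
        for (int i = 0; i < s1.length; i++) {
            if (s1[i] == s2[i]) {
                equal++;
            }
        }
        return (double) equal / s1.length;
    }

    public static void main(String[] args) {
        MinHashSketch mh = new MinHashSketch(200, 7L);
        Set<Integer> s1 = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> s2 = new HashSet<>(Arrays.asList(3, 4, 5, 6));
        // Estimate of the exact Jaccard similarity of s1 and s2 (1/3).
        System.out.println(similarity(mh.signature(s1), mh.signature(s2)));
    }
}
```

The estimate is approximate, and a longer signature generally reduces its error at the cost of more hashing work.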
```
LSH object serialized to /tmp/lshobject5903174677942358274.ser
[55 ]
```
[Check the examples](https://github.com/tdebatty/java-LSH/tree/master/src/main/java/info/debatty/java/lsh/examples) or [read Javadoc](http://api123.io/api/java-LSH/head/index.html)