|
3 | 3 | A high-performance, pure-Python module for loading and operating on word vector spaces created with Google's word2vec tool. |
4 | 4 |
|
5 | 5 | ## Installation |
6 | | -> Prerequisites: Python2.7 |
| 6 | +> Prerequisites: Python3.5 |
7 | 7 |
|
8 | 8 | ```bash |
9 | 9 | $ sudo apt install libopenblas-base |
10 | | -$ sudo pip install wordvecspace |
| 10 | +$ sudo pip3 install wordvecspace |
11 | 11 | ``` |
12 | 12 |
|
13 | 13 | ## Usage |
@@ -62,25 +62,6 @@ $ wordvecspace convert <input_dir> <output_dir> |
62 | 62 |
|
63 | 63 | # You can also generate shards by specifying the number of vectors per shard |
64 | 64 | $ wordvecspace convert <input_dir> <output_dir> -n 5000 |
65 | | -``` |
66 | | -### Interactive console |
67 | | -```bash |
68 | | -$ wordvecspace interact <input_dir> |
69 | | - |
70 | | -# <input_dir> is the directory which has vocab.txt and vectors.npy |
71 | | -``` |
72 | | -Example: |
73 | | -```bash |
74 | | -$ wordvecspace interact /home/user/data |
75 | | - |
76 | | -Total number of vectors and dimensions in .npy file (71291, 5) |
77 | | - |
78 | | ->>> help |
79 | | -['DEFAULT_K', 'VECTOR_FNAME', 'VOCAB_FNAME', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_load_vocab', '_make_array', '_perform_dot', '_perform_sgemm', '_perform_sgemv', 'data_dir', 'does_word_exist', 'get_distance', 'get_distances', 'get_nearest_neighbors', 'get_vector_magnitudes', 'get_word_at_index', 'get_word_index', 'get_word_occurrences', 'get_word_vector', 'get_word_vectors', 'load', 'magnitudes', 'num_dimensions', 'num_vectors', 'vectors', 'word_indices', 'word_occurrences', 'words'] |
80 | | - |
81 | | -WordVecSpace console |
82 | | ->>> wv = WordVecSpace |
83 | | - |
84 | 65 | ``` |
85 | 66 | ### Importing |
86 | 67 | ```python |
@@ -256,6 +237,73 @@ Int64Index([ 509, 486, 14208, 20639, 8573, 3389, 5226, 20919, 10172, |
256 | 237 | dtype='int64') |
257 | 238 | ``` |
258 | 239 |
|
| 240 | +### Service |
| 241 | + |
| 242 | +```bash |
| 243 | +# Run wordvecspace as a service that listens on a port for API requests |
| 244 | +$ wordvecspace runserver <input_dir> -p <port_no> |
| 245 | + |
| 246 | +# -p sets the port. If it is omitted, wordvecspace listens on port 8900 by default. |
| 247 | +# <port_no> is the port number on which wordvecspace listens. |
| 248 | +# <input_dir> is the directory that contains vocab.txt and vectors.npy. |
| 249 | +``` |
| 250 | + |
| 251 | +Example: |
| 252 | + |
| 253 | +```bash |
| 254 | +$ wordvecspace runserver /home/user/data -p 8000 |
| 255 | + |
| 256 | +# Make API request |
| 257 | +$ curl "http://localhost:8000/api/v1/does_word_exist?word=india" |
| 258 | +{"result": true, "success": true} |
| 259 | +``` |
| 260 | + |
| 261 | +#### Calling the API methods |
| 262 | + |
| 263 | +```bash |
| 264 | +$ http://localhost:8000/api/v1/does_word_exist?word=india |
| 265 | + |
| 266 | +$ http://localhost:8000/api/v1/get_word_index?word=india |
| 267 | + |
| 268 | +$ http://localhost:8000/api/v1/get_word_at_index?index=509 |
| 269 | + |
| 270 | +$ http://localhost:8000/api/v1/get_word_vector?word_or_index=509 |
| 271 | + |
| 272 | +$ http://localhost:8000/api/v1/get_vector_magnitudes?words_or_indices=[88, "india"] |
| 273 | + |
| 274 | +$ http://localhost:8000/api/v1/get_word_occurrences?word_or_index=india |
| 275 | + |
| 276 | +$ http://localhost:8000/api/v1/get_word_vectors?words_or_indices=[1, 'india'] |
| 277 | + |
| 278 | +$ http://localhost:8000/api/v1/get_distance?word1=ap&word2=india |
| 279 | + |
| 280 | +$ http://localhost:8000/api/v1/get_distances?row_words=india |
| 281 | + |
| 282 | +$ http://localhost:8000/api/v1/get_nearest_neighbors?word=india&k=100 |
| 283 | +``` |
| 284 | + |
| 285 | +> To see all API methods of wordvecspace, open http://localhost:8000/api/v1/apidoc |
| 286 | +
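Some of the URLs listed above carry list arguments such as `[88, "india"]`, which contain spaces, quotes and brackets that have to be percent-encoded before most HTTP clients will send them. The following is a minimal sketch of building such URLs with the Python standard library. The names `BASE_URL` and `build_api_url` are hypothetical helpers, not part of wordvecspace, and the sketch assumes the service accepts list arguments as JSON text, which the README does not confirm.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: the service from the example above runs on localhost:8000.
BASE_URL = "http://localhost:8000/api/v1"


def build_api_url(method, **params):
    """Return a percent-encoded URL for one API method (hypothetical helper)."""
    encoded = {}
    for name, value in params.items():
        # Assumption: list arguments are sent as JSON text, e.g. [88, "india"].
        if isinstance(value, (list, tuple)):
            encoded[name] = json.dumps(value)
        else:
            encoded[name] = value
    return "{}/{}?{}".format(BASE_URL, method, urlencode(encoded))


url = build_api_url("get_vector_magnitudes", words_or_indices=[88, "india"])

# Decode the query string again to confirm the list survives the round trip.
decoded = json.loads(parse_qs(urlsplit(url).query)["words_or_indices"][0])
print(url)
print(decoded)
```

The resulting string can be passed to `curl` in quotes or to `urllib.request.urlopen` while the service is running.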
|
| 287 | +### Interactive console |
| 288 | +```bash |
| 289 | +$ wordvecspace interact <input_dir> |
| 290 | + |
| 291 | +# <input_dir> is the directory which has vocab.txt and vectors.npy |
| 292 | +``` |
| 293 | + |
| 294 | +Example: |
| 295 | +```bash |
| 296 | +$ wordvecspace interact /home/user/data |
| 297 | + |
| 298 | +Total number of vectors and dimensions in .npy file (71291, 5) |
| 299 | + |
| 300 | +>>> help |
| 301 | +['DEFAULT_K', 'VECTOR_FNAME', 'VOCAB_FNAME', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_load_vocab', '_make_array', '_perform_dot', '_perform_sgemm', '_perform_sgemv', 'data_dir', 'does_word_exist', 'get_distance', 'get_distances', 'get_nearest_neighbors', 'get_vector_magnitudes', 'get_word_at_index', 'get_word_index', 'get_word_occurrences', 'get_word_vector', 'get_word_vectors', 'load', 'magnitudes', 'num_dimensions', 'num_vectors', 'vectors', 'word_indices', 'word_occurrences', 'words'] |
| 302 | + |
| 303 | +WordVecSpace console |
| 304 | +>>> wv = WordVecSpace |
| 305 | + |
| 306 | +``` |
259 | 307 | ## Running tests |
260 | 308 |
|
261 | 309 | ```bash |
|