# ViTSTR: Vision Transformer for Fast and Efficient Scene Text Recognition
[Code](https://github.com/roatienza/deep-text-recognition-benchmark)


## Build

    mkdir -p build && cd build
    cmake ..
    make -j4


## Usage


<p align="center">
  <img src="images/demo_1.png" alt="example input" width="50%" height="auto">
</p>

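As a reading aid for the run log that follows, the patch grid and per-head width implied by the printed hyperparameters can be worked out with plain shell arithmetic. This is only a minimal sketch: it assumes standard ViT patching (non-overlapping square patches) and an even split of the hidden size across attention heads. The variable names are illustrative and are not options of the `vitstr` binary, and any extra class token is not counted.

```shell
# Values copied from the vit_model_load lines of the sample run.
img_size=224
patch_size=16
hidden_size=768
num_attention_heads=12

# Assumption: non-overlapping square patches, as in a standard ViT.
patches_per_side=$(( img_size / patch_size ))
num_patches=$(( patches_per_side * patches_per_side ))

# Assumption: the hidden size is split evenly across attention heads.
head_dim=$(( hidden_size / num_attention_heads ))

echo "patches per side: $patches_per_side"
echo "patch tokens:     $num_patches"
echo "head dimension:   $head_dim"
```

With the values above this gives a 14 x 14 grid, 196 patch tokens, and 64 dimensions per head.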
<pre>
./bin/vitstr -t 4 -m ../ggml-model-f16.gguf -i ../images/demo_1.png
main: seed = 1706997535
main: n_threads = 4 / 8
vit_model_load: loading model from '../ggml-model-f16.gguf' - please wait
vit_model_load: hidden_size = 768
vit_model_load: num_hidden_layers = 12
vit_model_load: num_attention_heads = 12
vit_model_load: patch_size = 16
vit_model_load: img_size = 224
vit_model_load: num_classes = 96
vit_model_load: ftype = 1
vit_model_load: qntvr = 0
operator(): ggml ctx size = 164.48 MB
vit_model_load: ................... done
vit_model_load: model size = 163.56 MB / num tensors = 152
main: loaded image '../images/demo_1.png' (184 x 72)
processed, out dims : (224 x 224)
------------------
Available
score : 1.00
------------------


main: model load time = 144.64 ms
main: processing time = 1176.77 ms
main: total time = 1321.41 ms
</pre>