Commit a4841f6

Add smoll README
1 parent 72d8410 commit a4841f6

1 file changed: +45 -0 lines changed

extensions/vitstr.cpp/README.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+# ViTSTR: Vision Transformer for Fast and Efficient Scene Text Recognition
+[Code](https://github.com/roatienza/deep-text-recognition-benchmark)
+
+
+## Build
+
+mkdir -p build && cd build
+cmake ..
+make -j4
+
+
+## Usage
+
+
+<p align="center">
+<img src="images/demo_1.png" alt="example input" width="50%" height="auto">
+</p>
+
+<pre>
+./bin/vitstr -t 4 -m ../ggml-model-f16.gguf -i ../images/demo_1.png
+main: seed = 1706997535
+main: n_threads = 4 / 8
+vit_model_load: loading model from '../ggml-model-f16.gguf' - please wait
+vit_model_load: hidden_size = 768
+vit_model_load: num_hidden_layers = 12
+vit_model_load: num_attention_heads = 12
+vit_model_load: patch_size = 16
+vit_model_load: img_size = 224
+vit_model_load: num_classes = 96
+vit_model_load: ftype = 1
+vit_model_load: qntvr = 0
+operator(): ggml ctx size = 164.48 MB
+vit_model_load: ................... done
+vit_model_load: model size = 163.56 MB / num tensors = 152
+main: loaded image '../images/demo_1.png' (184 x 72)
+processed, out dims : (224 x 224)
+------------------
+Available
+score : 1.00
+------------------
+
+
+main: model load time = 144.64 ms
+main: processing time = 1176.77 ms
+main: total time = 1321.41 ms
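
The `patch_size = 16` and `img_size = 224` values in the log imply how many patches the model sees per image. The sketch below works that arithmetic out, assuming the usual ViT scheme of non-overlapping square patches; `num_patches` is a hypothetical helper written for illustration and is not part of vitstr.cpp.

```cpp
#include <cstdio>

// Hypothetical helper (not from vitstr.cpp): number of non-overlapping
// square patches cut from a square image of side img_size.
// Returns -1 when the sizes are invalid or patch_size does not divide img_size.
int num_patches(int img_size, int patch_size) {
    if (img_size <= 0 || patch_size <= 0 || img_size % patch_size != 0) {
        return -1;
    }
    const int per_side = img_size / patch_size;  // patches along one edge
    return per_side * per_side;                  // patches in the full grid
}
```

With the logged values, `num_patches(224, 16)` gives 14 patches per side and 196 patches in total. Any extra tokens the model prepends to that sequence are not visible in the log, so they are left out here.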
