Commit d13fbfd

Saving 5th tutorial
1 parent d03d1fe commit d13fbfd

5 files changed

Lines changed: 23 additions & 24 deletions

README.md

Lines changed: 2 additions & 1 deletion
@@ -16,4 +16,5 @@ Each tutorial has its own requirements.txt file for a specific mltu version. As
 1. [Text Recognition With TensorFlow and CTC network](https://pylessons.com/ctc-text-recognition), code in ```Tutorials\01_image_to_word``` folder;
 2. [TensorFlow OCR model for reading Captchas](https://pylessons.com/tensorflow-ocr-captcha), code in ```Tutorials\02_captcha_to_text``` folder;
 3. [Handwriting words recognition with TensorFlow](https://pylessons.com/handwriting-recognition), code in ```Tutorials\03_handwriting_recognition``` folder;
-4. [Handwritten sentence recognition with TensorFlow](https://pylessons.com/handwritten-sentence-recognition), code in ```Tutorials\04_sentence_recognition``` folder;
+4. [Handwritten sentence recognition with TensorFlow](https://pylessons.com/handwritten-sentence-recognition), code in ```Tutorials\04_sentence_recognition``` folder;
+5. [Introduction to speech recognition with TensorFlow](https://pylessons.com/speech-recognition), code in ```Tutorials\05_speech_recognition``` folder;

Tutorials/05_sound_to_text/inferenceModel.py

Lines changed: 9 additions & 10 deletions
@@ -23,30 +23,29 @@ def predict(self, data: np.ndarray):
 import pandas as pd
 from tqdm import tqdm
 from mltu.configs import BaseModelConfigs
-import matplotlib.pyplot as plt
-import matplotlib
-matplotlib.interactive(False)
 
-configs = BaseModelConfigs.load("Models/05_sound_to_text/202301221900/configs.yaml")
+configs = BaseModelConfigs.load("Models/05_sound_to_text/202302051936/configs.yaml")
 
-model = WavToTextModel(model_path=configs.model_path, char_list=configs.vocab, force_cpu=True)
+model = WavToTextModel(model_path=configs.model_path, char_list=configs.vocab, force_cpu=False)
 
-df = pd.read_csv("Models/05_sound_to_text/202301221900/val.csv").values.tolist()
+df = pd.read_csv("Models/05_sound_to_text/202302051936/val.csv").values.tolist()
 
 accum_cer, accum_wer = [], []
 for wav_path, label in tqdm(df):
 
     spectrogram = WavReader.get_spectrogram(wav_path, frame_length=configs.frame_length, frame_step=configs.frame_step, fft_length=configs.fft_length)
-    WavReader.plot_raw_audio(wav_path, label)
+    # WavReader.plot_raw_audio(wav_path, label)
 
     padded_spectrogram = np.pad(spectrogram, ((configs.max_spectrogram_length - spectrogram.shape[0], 0),(0,0)), mode='constant', constant_values=0)
 
-    WavReader.plot_spectrogram(spectrogram, label)
+    # WavReader.plot_spectrogram(spectrogram, label)
 
     text = model.predict(padded_spectrogram)
 
-    cer = get_cer(text, label.lower())
-    wer = get_wer(text, label.lower())
+    true_label = "".join([l for l in label.lower() if l in configs.vocab])
+
+    cer = get_cer(text, true_label)
+    wer = get_wer(text, true_label)
 
     accum_cer.append(cer)
     accum_wer.append(wer)
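
Note on the `true_label` change: the hunk filters the reference label down to characters in `configs.vocab` before scoring, presumably so that punctuation the model can never emit is not counted as an error. The sketch below illustrates that effect in isolation. The `vocab` string, the sample label, and the helpers `filter_to_vocab`, `edit_distance`, and `char_error_rate` are all assumptions made for illustration; `char_error_rate` is a plain Levenshtein-based stand-in, not mltu's `get_cer`.

```python
from typing import Sequence


def filter_to_vocab(label: str, vocab: str) -> str:
    """Lowercase the label and drop every character that is not in vocab."""
    return "".join(ch for ch in label.lower() if ch in vocab)


def edit_distance(pred: Sequence, target: Sequence) -> int:
    """Levenshtein distance between two sequences, computed row by row."""
    prev = list(range(len(target) + 1))
    for i, p in enumerate(pred, start=1):
        curr = [i]
        for j, t in enumerate(target, start=1):
            cost = 0 if p == t else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def char_error_rate(pred: str, target: str) -> float:
    """Edit distance divided by target length (hypothetical stand-in for get_cer)."""
    if not target:
        return 0.0 if not pred else 1.0
    return edit_distance(pred, target) / len(target)


# Assumed vocabulary and sample data, not taken from the commit.
vocab = "abcdefghijklmnopqrstuvwxyz' "
label = "Hello, World!"
prediction = "hello world"

# Scoring against the raw lowercased label penalises the comma and the "!".
raw_cer = char_error_rate(prediction, label.lower())

# Scoring against the vocab-filtered label compares like with like.
filtered_cer = char_error_rate(prediction, filter_to_vocab(label, vocab))

print(f"raw CER: {raw_cer:.3f}, filtered CER: {filtered_cer:.3f}")
```

With this sample the prediction is a perfect transcription of everything the assumed vocabulary can express, so only the unfiltered comparison reports a non-zero error.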

Tutorials/05_sound_to_text/train.py

Lines changed: 7 additions & 11 deletions
@@ -2,8 +2,14 @@
 try: [tf.config.experimental.set_memory_growth(gpu, True) for gpu in tf.config.experimental.list_physical_devices('GPU')]
 except: pass
 
-from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau, TensorBoard
+import stow
+import tarfile
+import pandas as pd
+from tqdm import tqdm
+from urllib.request import urlopen
+from io import BytesIO
 
+from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau, TensorBoard
 from mltu.dataProvider import DataProvider
 from mltu.preprocessors import WavReader
 from mltu.transformers import LabelIndexer, LabelPadding, SpectrogramPadding
@@ -14,16 +20,6 @@
 from model import train_model
 from configs import ModelConfigs
 
-import stow
-import pandas as pd
-from tqdm import tqdm
-
-import stow
-import tarfile
-from tqdm import tqdm
-from urllib.request import urlopen
-from io import BytesIO
-
 def download_and_unzip(url, extract_to='Datasets', chunk_size=1024*1024):
     http_response = urlopen(url)
 

Tutorials/README.md

Lines changed: 4 additions & 1 deletion
@@ -1,3 +1,6 @@
 # Tutorials and Examples made with MLTU library:
 1. [Text Recognition With TensorFlow and CTC network](https://pylessons.com/ctc-text-recognition), code in ```Tutorials\01_image_to_word``` folder;
-2. [TensorFlow OCR model for reading Captchas](https://pylessons.com/tensorflow-ocr-captcha), code in ```Tutorials\02_captcha_to_text``` folder;
+2. [TensorFlow OCR model for reading Captchas](https://pylessons.com/tensorflow-ocr-captcha), code in ```Tutorials\02_captcha_to_text``` folder;
+3. [Handwriting words recognition with TensorFlow](https://pylessons.com/handwriting-recognition), code in ```Tutorials\03_handwriting_recognition``` folder;
+4. [Handwritten sentence recognition with TensorFlow](https://pylessons.com/handwritten-sentence-recognition), code in ```Tutorials\04_sentence_recognition``` folder;
+5. [Introduction to speech recognition with TensorFlow](https://pylessons.com/speech-recognition), code in ```Tutorials\05_speech_recognition``` folder;

mltu/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "0.1.5"
+__version__ = "0.1.6"
