This repository was archived by the owner on Aug 28, 2024. It is now read-only.

Commit 8e2700a

Merge pull request #255 from jeffxtang/pocket_fft
Version 2 of the Streaming ASR app
2 parents 86aff27 + d9ad95c commit 8e2700a

11 files changed

Lines changed: 4228 additions & 645 deletions


StreamingASR/CMakeLists.txt

Lines changed: 0 additions & 6 deletions
This file was deleted.

StreamingASR/README.md

Lines changed: 14 additions & 25 deletions
@@ -6,9 +6,9 @@ In the Speech Recognition Android [demo app](https://github.com/pytorch/android-
 
 ## Prerequisites
 
-* PyTorch 1.11 and torchaudio 0.11 or above (Optional)
+* PyTorch 1.12 and torchaudio 0.12 or above (Optional)
 * Python 3.8 (Optional)
-* Android Pytorch library org.pytorch:pytorch_android_lite:1.11.0
+* Android Pytorch library org.pytorch:pytorch_android_lite:1.12.2
 * Android Studio 4.0.1 or later
 
 ## Quick Start
@@ -22,39 +22,32 @@ git clone https://github.com/pytorch/android-demo-app
 cd android-demo-app/StreamingASR
 ```
 
-If you don't have PyTorch 1.11 and torchaudio 0.11 installed or want to have a quick try of the demo app, you can download the optimized scripted model file [streaming_asr.ptl](https://drive.google.com/file/d/1awT_1S6H5IXSOOqpFLmpeg0B-kQVWG2y/view?usp=sharing), then drag and drop it to the `StreamingASR/app/src/main/assets` folder inside `android-demo-app/StreamingASR`, and continue to Step 3.
-
-Also you need to download [Eigen](https://eigen.tuxfamily.org/), a C++ template library for linear algebra, for Android NDK build required to run the app (see last section of this README for more info):
-```
-mkdir external; cd external
-git clone https://github.com/jeffxtang/eigen
-```
+If you don't have PyTorch 1.12 and torchaudio 0.12 installed or want to have a quick try of the demo app, you can download the optimized scripted model file [streaming_asrv2.ptl](https://drive.google.com/file/d/1XRCAFpMqOSz5e7VP0mhiACMGCCcYfpk-/view?usp=sharing), then drag and drop it to the `StreamingASR/app/src/main/assets` folder inside `android-demo-app/StreamingASR`, and continue to Step 3.
 
 ### 2. Test and Prepare the Model
 
-To install PyTorch 1.11, torchaudio 0.11, and other required Python packages (numpy and pyaudio), do something like this:
+To install PyTorch 1.12, torchaudio 0.12, and other required packages (numpy, pyaudio, and fairseq), do something like this:
 
 ```
-conda create -n pt1.11 python=3.8.5
-conda activate pt1.11
-pip install torch torchaudio numpy pyaudio
+conda create -n pt1.12 python=3.8.5
+conda activate pt1.12
+pip install torch torchaudio numpy pyaudio fairseq
 ```
 
-Now download the streaming ASR model file
-[scripted_wrapper_tuple_no_transform.pt](https://drive.google.com/file/d/1_49DwHS_a3p3THGdHZj3TXmjNJj60AhP/view?usp=sharing) to the `android-demo-app/StreamingASR` directory.
+First, create the model file `scripted_wrapper_tuple.pt` by running `python generate_ts.py`.
 
-To test the model, run `python run_sasr.py`. After you see:
+Then, to test the model, run `python run_sasr.py`. After you see:
 ```
 Initializing model...
 Initialization complete.
 ```
-say something like "good afternoon happy new year", and you'll likely see the streaming recognition results `good afternoon happy new year` while you speak. Hit Ctrl-C to end.
+say something like "good afternoon happy new year", and you'll likely see the streaming recognition results `good afternoon happy new year` while you speak. Hit Ctrl-C to end.
 
-To optimize and convert the model to the format that can run on Android, run the following commands:
+Finally, to optimize and convert the model to the format that can run on Android, run the following commands:
 ```
 mkdir -p StreamingASR/app/src/main/assets
 python save_model_for_mobile.py
-mv streaming_asr.ptl StreamingASR/app/src/main/assets
+mv streaming_asrv2.ptl StreamingASR/app/src/main/assets
 ```
 
 ### 3. Build and run with Android Studio
@@ -67,10 +60,6 @@ Start Android Studio, open the project located in `android-demo-app/StreamingASR
 
 ## Librosa C++, Eigen, and JNI
 
-Note that this demo uses a [C++ port](https://github.com/ewan-xu/LibrosaCpp/) of [Librosa](https://librosa.org), a popular audio processing library in Python, to perform the MelSpectrogram transform. In the Python script `run_sasr.py` above, the torchaudio's [MelSpectrogram](https://pytorch.org/audio/stable/transforms.html#melspectrogram) is used, but you can achieve the same transform result by replacing `spectrogram = transform(tensor).transpose(1, 0)`, line 46 of run_sasr.py with:
-```
-mel = librosa.feature.melspectrogram(np_array, sr=16000, n_fft=400, n_mels=80, hop_length=160)
-spectrogram = torch.tensor(mel).transpose(1, 0)
-```
+The first version of this demo uses a [C++ port](https://github.com/ewan-xu/LibrosaCpp/) of [Librosa](https://librosa.org), a popular audio processing library in Python, to perform the MelSpectrogram transform, because torchaudio before version 0.11 doesn't support fft on Android (see [here](https://github.com/pytorch/audio/issues/408)). Using the Librosa C++ port and [JNI](https://developer.android.com/training/articles/perf-jni) (Java Native Interface) on Android makes the MelSpectrogram possible on Android. Furthermore, the Librosa C++ port requires [Eigen](https://eigen.tuxfamily.org/), a C++ template library for linear algebra, so both the port and the Eigen library are included in the first version of the demo app and built as JNI.
 
-Because torchaudio currently doesn't support fft on Android (see [here](https://github.com/pytorch/audio/issues/408)), using the Librosa C++ port and [JNI](https://developer.android.com/training/articles/perf-jni) (Java Native Interface) on Android makes the MelSpectrogram possible on Android. Furthermore, the Librosa C++ port requires [Eigen](https://eigen.tuxfamily.org/), a C++ template library for linear algebra, so both the port and the Eigen library are included in the demo app and built as JNI, using the `CMakeLists.txt` and `MainActivityJNI.cpp` in `StreamingASR/app/src/main/cpp`.
+See [here](https://github.com/jeffxtang/android-demo-app/tree/librosa_jni/StreamingASR) for the first version of the demo if interested in an example of using native C++ to expand operations not yet supported in PyTorch or one of its domain libraries.
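
Reviewer note: the `librosa.feature.melspectrogram` call removed from the README in this hunk used `sr=16000`, `n_fft=400`, `hop_length=160`, and `n_mels=80`. The sketch below only works out what those parameters imply for window length, hop length, and the shape of the transposed spectrogram. It assumes centered framing (the librosa default, `center=True`), and the helper names are hypothetical; none of them come from `run_sasr.py`.

```python
# Parameters copied from the removed melspectrogram call in the README.
SAMPLE_RATE = 16000
N_FFT = 400
HOP_LENGTH = 160
N_MELS = 80


def window_ms(n_fft=N_FFT, sample_rate=SAMPLE_RATE):
    """Duration of one analysis window in milliseconds."""
    return 1000.0 * n_fft / sample_rate


def hop_ms(hop_length=HOP_LENGTH, sample_rate=SAMPLE_RATE):
    """Time step between consecutive frames in milliseconds."""
    return 1000.0 * hop_length / sample_rate


def num_frames(num_samples, hop_length=HOP_LENGTH):
    """Frame count under centered framing (assumption: center=True)."""
    return 1 + num_samples // hop_length


def transposed_shape(num_samples):
    """Shape after transpose(1, 0): (frames, n_mels) instead of (n_mels, frames)."""
    return (num_frames(num_samples), N_MELS)


if __name__ == "__main__":
    one_second = SAMPLE_RATE
    print(window_ms(), hop_ms(), transposed_shape(one_second))
```

Under these assumptions, one second of 16 kHz audio uses 25 ms windows with a 10 ms hop, and the transposed spectrogram has one row per frame and 80 mel bins per row.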

StreamingASR/StreamingASR/app/build.gradle

Lines changed: 1 addition & 15 deletions
@@ -13,13 +13,6 @@ android {
 versionName "1.0"
 
 testInstrumentationRunner "androidx.test.runner.AndroidJUnitRunner"
-
-externalNativeBuild {
-cmake {
-cppFlags ""
-arguments "-DLOGGER_BUILD_HEADER_LIB=ON", "-DBUILD_TESTING=OFF"
-}
-}
 }
 
 buildTypes {
@@ -32,13 +25,6 @@ android {
 sourceCompatibility JavaVersion.VERSION_1_8
 targetCompatibility JavaVersion.VERSION_1_8
 }
-
-externalNativeBuild {
-cmake {
-path "../../CMakeLists.txt"
-version "3.10.2"
-}
-}
 }
 
 dependencies {
@@ -50,5 +36,5 @@ dependencies {
 androidTestImplementation 'androidx.test.ext:junit:1.1.3'
 androidTestImplementation 'androidx.test.espresso:espresso-core:3.4.0'
 
-implementation 'org.pytorch:pytorch_android_lite:1.11'
+implementation 'org.pytorch:pytorch_android_lite:1.12.2'
 }
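
Reviewer note: the bump to `pytorch_android_lite:1.12.2` pairs with the README prerequisite of PyTorch 1.12 and torchaudio 0.12 or above on the Python side. As a hedged sketch, a script could check the installed versions numerically before generating the model. Comparing version strings directly would rank `"1.9"` above `"1.12"`. The helper names below are hypothetical and not part of this commit.

```python
import re


def version_tuple(version):
    """Leading numeric release components, e.g. '1.12.1+cu113' -> (1, 12, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """True if `installed` is numerically at least `minimum`."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.12' and '1.12.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Possible usage, assuming torch and torchaudio are installed:
#   import torch, torchaudio
#   assert meets_minimum(torch.__version__, "1.12")
#   assert meets_minimum(torchaudio.__version__, "0.12")
```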

StreamingASR/StreamingASR/app/src/main/cpp/CMakeLists.txt

Lines changed: 0 additions & 5 deletions
This file was deleted.

StreamingASR/StreamingASR/app/src/main/cpp/MainActivityJNI.cpp

Lines changed: 0 additions & 92 deletions
This file was deleted.
