It's possible to use the code here in your own Go projects, but different parts have different dependencies.
The "basic" bindings directly to whisper.cpp are in the sys/whisper folder, which contains everything you need to transcribe or translate with whisper. You'll notice that the CGO statements in some of the code there use pkg-config to bind directly to the whisper libraries, so you just need an installed set of libraries and header files, and the PKG_CONFIG_PATH environment variable pointing at the directory that holds your pkg-config file (usually libwhisper.pc). I have a makefile target called libwhisper to build and install the whisper library, and a generate target which generates the pkg-config files, since those are broken in whisper.cpp:
BUILD_DIR=<build_path> make libwhisper generate
If you want GPU support, there's a whole load of other libraries you also need to include. My generate target will create the pkg-config files for Apple Metal, CUDA and Vulkan, and you can include any of those optionally with Go build tags (-tags vulkan or -tags cuda). Apple GPU libraries get included automatically, so no tags are needed there.
Depending on your situation, I would also look at the additional libraries needed, which are installed when compiling for Docker - take a look at some of the examples under the etc folder.
Once you've done that, you can import as normal:
package main

import (
	whisper "github.com/mutablelogic/go-whisper/sys/whisper"
)

func main() {
	var path string                  // Path to model
	var params whisper.ContextParams // Model parameters
	context := whisper.Whisper_init_from_file_with_params(path, params)
	defer whisper.Whisper_free(context)

	var samples []float32                    // Raw audio samples
	var transcribe_params whisper.FullParams // Transcription parameters
	if err := whisper.Whisper_full(context, transcribe_params, samples); err != nil {
		// ....
	}
	// Get text from context...
}
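The samples slice above has to be filled with raw audio as float32 values. As far as I know, whisper.cpp expects 16 kHz mono samples in the range -1 to 1. If your audio is already 16-bit PCM at that rate, a conversion could look like the sketch below; pcm16ToFloat32 is a hypothetical helper, not part of the bindings, and it assumes little-endian mono input.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// pcm16ToFloat32 (hypothetical helper) converts little-endian signed 16-bit
// PCM bytes into float32 samples in the range [-1, 1). A trailing odd byte
// is ignored.
func pcm16ToFloat32(pcm []byte) []float32 {
	samples := make([]float32, len(pcm)/2)
	for i := range samples {
		v := int16(binary.LittleEndian.Uint16(pcm[2*i:]))
		samples[i] = float32(v) / 32768.0
	}
	return samples
}

func main() {
	// Three samples: zero, the largest positive value, the most negative value.
	pcm := []byte{0x00, 0x00, 0xff, 0x7f, 0x00, 0x80}
	fmt.Println(pcm16ToFloat32(pcm))
}
```

Resampling and down-mixing to mono are not handled here; that is the part the ffmpeg libraries take care of.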
When you build/run your code, just link to the underlying whisper.cpp libraries by pointing PKG_CONFIG_PATH to the right place. For example:
PKG_CONFIG_PATH=<build_path>/install/lib/pkgconfig go run ./cmd/example
and here's an example for CUDA:
PKG_CONFIG_PATH=<build_path>/install/lib/pkgconfig go run -tags cuda ./cmd/example
You can see how I did it in the Makefile (look at the gowhisper target).
Then it gets a bit more difficult, but not much. Whisper needs raw audio samples in order to work, and the ffmpeg libraries can decode audio/video into those. If you also want to compile in the ffmpeg libraries, the target is libffmpeg, and then you can also use the code under pkg as a Go package. Notably, pkg/whisper includes a New function which provides additional functions for loading models, and for transcribing and translating media files rather than raw audio samples:
BUILD_DIR=<build_path> make libwhisper generate libffmpeg
You can of course use any recent ffmpeg libraries with pkg-config (version 8.0, I think).
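If linking the ffmpeg libraries is more than you need, one alternative (not what this repository does) is to shell out to the ffmpeg command-line tool and read raw samples from its output. The sketch below assumes an ffmpeg binary is on your PATH; ffmpegArgs, f32leToSamples and decode are hypothetical names, and 16 kHz mono float32 is my understanding of what whisper.cpp expects.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
	"os"
	"os/exec"
)

// ffmpegArgs returns arguments asking the ffmpeg CLI to decode input into
// 16 kHz mono 32-bit float little-endian PCM written to standard output.
func ffmpegArgs(input string) []string {
	return []string{
		"-nostdin", "-loglevel", "error",
		"-i", input,
		"-vn", "-f", "f32le", "-ac", "1", "-ar", "16000",
		"pipe:1",
	}
}

// f32leToSamples reinterprets little-endian float32 bytes as samples.
// Trailing bytes that do not make up a whole sample are ignored.
func f32leToSamples(raw []byte) []float32 {
	samples := make([]float32, len(raw)/4)
	for i := range samples {
		samples[i] = math.Float32frombits(binary.LittleEndian.Uint32(raw[4*i:]))
	}
	return samples
}

// decode runs ffmpeg and returns the decoded samples.
func decode(input string) ([]float32, error) {
	out, err := exec.Command("ffmpeg", ffmpegArgs(input)...).Output()
	if err != nil {
		return nil, fmt.Errorf("ffmpeg: %w", err)
	}
	return f32leToSamples(out), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: decode <media file>")
		return
	}
	samples, err := decode(os.Args[1])
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("decoded", len(samples), "samples")
}
```

This buffers the whole decoded file in memory, which is fine for short clips; the resulting slice can be passed as the raw audio samples in the earlier example.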
There's also pkg/manager.go, which brings in text-to-speech services from ElevenLabs and OpenAI, but I imagine that's not so interesting to you! The architecture diagram in the README.md should give you a good understanding of which Go packages you need to import in order to use them within your own code.
I think the hardest thing is the build phase, as you need to link against libwhisper and various other libraries (and the ffmpeg libraries if you need those). Using pkg-config makes this a little easier, but it's still a pain.