Commit cdcab75

merging development into master (#62)
chore: update README.md for v2
feat: modified orpheus generator script to handle batch generation
1 parent 5b6e522 commit cdcab75

2 files changed: 89 additions & 49 deletions

README.md

Lines changed: 8 additions & 6 deletions
@@ -31,6 +31,8 @@
 <a href="https://github.com/existence-master/Sentient/issues/">Report Bug</a>
 <span> · </span>
 <a href="https://github.com/existence-master/Sentient/issues/">Request Feature</a>
+<span> · </span>
+<a href="https://www.youtube.com/watch?v=l481bvpCjbc">Watch our Ad!</a>
 </h4>
 </div>

@@ -75,27 +77,27 @@ We at [Existence](https://existence.technology) believe that AI won't simply die
 ### :camera: Screenshots
 
 <div align="center">
-<img src="https://private-user-images.githubusercontent.com/59280736/431842199-b76c7a9a-1689-42de-93ed-5d04d6c7ad10.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDQzMTkyMTYsIm5iZiI6MTc0NDMxODkxNiwicGF0aCI6Ii81OTI4MDczNi80MzE4NDIxOTktYjc2YzdhOWEtMTY4OS00MmRlLTkzZWQtNWQwNGQ2YzdhZDEwLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTA0MTAlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwNDEwVDIxMDE1NlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTY2ZThhMDIyZmJkZWYxYzE5MzMyNTYzZDM5NjY0MmM3ZDc2NmJjMmYwNGU5MjUzMmJhYTE1NDU3NDhhZGIwODgmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.U2Bn6mIdJF2SvXpJ9fyKe2c36-feA2wKtvQNcYjaEYY" alt="screenshot" />
+<img src="https://i.postimg.cc/jqNX99VF/image.png" alt="screenshot" />
 <p align="center">Context is streamed in from your apps - Sentient uses this context to 👇</p>
 </div>
 <div align="center">
-<img src="https://private-user-images.githubusercontent.com/59280736/431841076-c7337318-38e2-4515-848d-df6ce9ec8685.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDQzMTkyMTYsIm5iZiI6MTc0NDMxODkxNiwicGF0aCI6Ii81OTI4MDczNi80MzE4NDEwNzYtYzczMzczMTgtMzhlMi00NTE1LTg0OGQtZGY2Y2U5ZWM4Njg1LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTA0MTAlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwNDEwVDIxMDE1NlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWFkYmIzYWJkMDExMmU0NzllMmZmNjU0NmUyNzIyYzJlZjUwMzM1ZDY0NjY0NjlhYTM4ODNiOGNmNDRkYzhhZTQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.s87SsI2uPocqdoRQK-b_1R89ApFKnvOoVzislh77bAw" alt="screenshot" />
+<img src="https://i.postimg.cc/FRVMVKxj/image.png" alt="screenshot" />
 <p align="center">Learn Long-Term Memories about you</p>
 </div>
 <div align="center">
-<img src="https://private-user-images.githubusercontent.com/59280736/431841142-33edc431-6be9-45b3-9b9c-5262f459ede6.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDQzMTkyMTYsIm5iZiI6MTc0NDMxODkxNiwicGF0aCI6Ii81OTI4MDczNi80MzE4NDExNDItMzNlZGM0MzEtNmJlOS00NWIzLTliOWMtNTI2MmY0NTllZGU2LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTA0MTAlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwNDEwVDIxMDE1NlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTFlOTVhMWEyODVhMWZmN2ZmMzNjMGMyZWMxZjQwYzFkNGM4OGZhZTQ4YjVkYTc5MmRhY2ZmZGQxZTBmOTY4NjUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.GUEsRDZzletVFm4uKBQhRehk4l2FhzEJuX5jFnglbZ4" alt="screenshot" />
+<img src="https://i.postimg.cc/hth7Fzzt/image.png" alt="screenshot" />
 <p align="center">Learn Short-Term Memories about you</p>
 </div>
 <div align="center">
-<img src="https://private-user-images.githubusercontent.com/59280736/431841274-ea980432-1357-451b-93d2-d952a65f4607.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDQzMTkyMTYsIm5iZiI6MTc0NDMxODkxNiwicGF0aCI6Ii81OTI4MDczNi80MzE4NDEyNzQtZWE5ODA0MzItMTM1Ny00NTFiLTkzZDItZDk1MmE2NWY0NjA3LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTA0MTAlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwNDEwVDIxMDE1NlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTZjMTM5OWRlMGI1ODI0Zjg4YmJiYjk2MDBmMWNjNDdhMDRjODM2YjBhNjJjY2JiMzMxMGNlM2UzYjU5OGFmYzcmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.F76G4nymktipQtkQZQ_9sfmMFKiQ1AH-0hMoPWt0DQE" alt="screenshot" />
+<img src="https://i.postimg.cc/FFM9FYBK/image.png" alt="screenshot" />
 <p align="center">Perform Actions for you, asynchronously and by combining all the different tools it needs to complete a task.</p>
 </div>
 <div align="center">
-<img src="https://private-user-images.githubusercontent.com/59280736/431842176-c1ec90b6-edcc-4f9c-bc94-aa2e40b6422f.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDQzMTkyMTYsIm5iZiI6MTc0NDMxODkxNiwicGF0aCI6Ii81OTI4MDczNi80MzE4NDIxNzYtYzFlYzkwYjYtZWRjYy00ZjljLWJjOTQtYWEyZTQwYjY0MjJmLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTA0MTAlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwNDEwVDIxMDE1NlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWRkZGMyY2Y1NTkyMDk1YjY4NWEwZjY1NDUxNWQ5NDc2NWU1OTAwZmM3ZjVjYWNmZDQzYWE1ZGNkMjJiYjQ3ZDImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.4djs4rCVqHY4L5_gshezAiMNgIcLui_eiFbZc8rsKrY" alt="screenshot" />
+<img src="https://i.postimg.cc/TPpSW9yv/image.png" alt="screenshot" />
 <p align="center">You can also voice-call Sentient anytime for a low-latency, human-like interactive experience.</p>
 </div>
 <div align="center">
-<img src="https://private-user-images.githubusercontent.com/59280736/431842396-03af93ff-6acd-44c7-a973-dca20ac205bd.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDQzMTkyMTYsIm5iZiI6MTc0NDMxODkxNiwicGF0aCI6Ii81OTI4MDczNi80MzE4NDIzOTYtMDNhZjkzZmYtNmFjZC00NGM3LWE5NzMtZGNhMjBhYzIwNWJkLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTA0MTAlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwNDEwVDIxMDE1NlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWZmNGNkZjI5OTQ1YmFjNmYzMmIzNThiOWEyZmIyZTBiMjVlMjczNTc2NmY3MjU1NjkzOTMwNjUwYzgyZDliMzImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.XCieUKi8dB-r8H75QHWKwX7UBtC6m1NXbSFxbUV_lkI" alt="screenshot" />
+<img src="https://i.postimg.cc/tJSWPhZ8/image.png" alt="screenshot" />
 <p align="center">Your profile can also be enriched with data from other social media sites.</p>
 </div>

src/server/tests/test_orpheus.py

Lines changed: 81 additions & 43 deletions
@@ -137,7 +137,7 @@ def run_async():
 from llama_cpp import Llama
 
 # Set the path to your GGUF model file (update this to the correct path)
-MODEL_PATH = "./models/orpheus-3b-0.1-ft-q4_k_m.gguf" # Replace with your GGUF file path
+MODEL_PATH = "../voice/models/orpheus-3b-0.1-ft-q4_k_m.gguf" # Replace with your GGUF file path
 
 # Number of layers to offload to GPU (adjust based on your GPU memory, e.g., 30 for 8GB VRAM)
 N_GPU_LAYERS = 20
@@ -161,6 +161,23 @@ def run_async():
 END_TOKEN_IDS = [128009, 128260, 128261, 128257]
 CUSTOM_TOKEN_PREFIX = "<custom_token_"
 
+# Default text to be spoken if no text is provided
+DEFAULT_TEXT = "This is a default sentence."
+BATCH_SENTENCES = [
+    "Good morning Kabeer!",
+    "You've got a busy day ahead.",
+    "Meetings, presentations and even a night out with the boys! <chuckle>",
+    "You ready to crush this?",
+]
+
+def create_filename(sentence, max_words=3, max_length=50):
+    words = sentence.split()[:max_words]
+    base = "_".join(words)
+    safe_base = "".join(c for c in base if c.isalnum() or c in ("_", "-"))
+    if len(safe_base) > max_length:
+        safe_base = safe_base[:max_length]
+    return safe_base + ".wav"
+
 def format_prompt(prompt, voice=DEFAULT_VOICE):
     """Format prompt for Orpheus model with voice prefix and special tokens."""
     if voice not in AVAILABLE_VOICES:
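
The `create_filename` helper added in this hunk can be exercised on its own, without loading the model. The following is a minimal sketch that copies the function and `BATCH_SENTENCES` from the diff; the `__main__` loop is illustrative only and is not part of the commit.

```python
# Sentences copied from the diff above.
BATCH_SENTENCES = [
    "Good morning Kabeer!",
    "You've got a busy day ahead.",
    "Meetings, presentations and even a night out with the boys! <chuckle>",
    "You ready to crush this?",
]


def create_filename(sentence, max_words=3, max_length=50):
    """Build a .wav filename from the first few words of a sentence."""
    # Keep only the first max_words whitespace-separated words.
    words = sentence.split()[:max_words]
    base = "_".join(words)
    # Drop every character that is not alphanumeric, "_" or "-".
    safe_base = "".join(c for c in base if c.isalnum() or c in ("_", "-"))
    # Truncate overly long names before appending the extension.
    if len(safe_base) > max_length:
        safe_base = safe_base[:max_length]
    return safe_base + ".wav"


if __name__ == "__main__":
    # Illustrative loop (not in the commit): show the name each sentence maps to.
    for sentence in BATCH_SENTENCES:
        print(f"{sentence!r} -> {create_filename(sentence)}")
```

Two caveats follow from this logic. Sentences that share their first three words map to the same filename, so they would target the same path within one batch directory. A sentence with no alphanumeric characters yields just `.wav`.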
@@ -351,55 +368,76 @@ def list_available_voices():
     print("<laugh>, <chuckle>, <sigh>, <cough>, <sniffle>, <groan>, <yawn>, <gasp>")
 
 def main():
-    # Parse command line arguments
-    parser = argparse.ArgumentParser(description="Orpheus Text-to-Speech using local GGUF model")
-    parser.add_argument("--text", type=str, help="Text to convert to speech")
-    parser.add_argument("--voice", type=str, default=DEFAULT_VOICE, help=f"Voice to use (default: {DEFAULT_VOICE})")
-    parser.add_argument("--output", type=str, help="Output WAV file path")
+    parser = argparse.ArgumentParser(description="Generate speech from text.")
+    parser.add_argument("text", nargs="*", help="Text to convert to speech")
+    parser.add_argument("--voice", default="default_voice", help="Voice to use")
+    parser.add_argument("--output", help="Output file or directory (in batch mode)")
+    parser.add_argument("--batch", action="store_true", help="Process predefined batch of sentences")
+    parser.add_argument("--temperature", type=float, default=0.7, help="Temperature for generation")
+    parser.add_argument("--top_p", type=float, default=0.9, help="Top-p sampling")
+    parser.add_argument("--repetition_penalty", type=float, default=1.0, help="Repetition penalty")
     parser.add_argument("--list-voices", action="store_true", help="List available voices")
-    parser.add_argument("--temperature", type=float, default=TEMPERATURE, help="Temperature for generation")
-    parser.add_argument("--top_p", type=float, default=TOP_P, help="Top-p sampling parameter")
-    parser.add_argument("--repetition_penalty", type=float, default=REPETITION_PENALTY,
-                        help="Repetition penalty (>=1.1 required for stable generation)")
 
     args = parser.parse_args()
-
+
     if args.list_voices:
         list_available_voices()
         return
-
-    # Use text from command line or prompt user
-    prompt = args.text
-    if not prompt:
-        if len(sys.argv) > 1 and sys.argv[1] not in ("--voice", "--output", "--temperature", "--top_p", "--repetition_penalty"):
-            prompt = " ".join([arg for arg in sys.argv[1:] if not arg.startswith("--")])
+
+    if args.batch:
+        # Batch mode
+        if args.output:
+            batch_dir = args.output
+            if not os.path.isdir(batch_dir):
+                os.makedirs(batch_dir, exist_ok=True)
         else:
-            prompt = input("Enter text to synthesize: ")
-            if not prompt:
-                prompt = "Hello, I am Orpheus, an AI assistant with emotional speech capabilities."
-
-    # Default output file if none provided
-    output_file = args.output
-    if not output_file:
-        os.makedirs("outputs", exist_ok=True)
-        timestamp = time.strftime("%Y%m%d_%H%M%S")
-        output_file = f"outputs/{args.voice}_{timestamp}.wav"
-        print(f"No output file specified. Saving to {output_file}")
-
-    # Generate speech
-    start_time = time.time()
-    audio_segments = generate_speech_from_api(
-        prompt=prompt,
-        voice=args.voice,
-        temperature=args.temperature,
-        top_p=args.top_p,
-        repetition_penalty=args.repetition_penalty,
-        output_file=output_file
-    )
-    end_time = time.time()
-
-    print(f"Speech generation completed in {end_time - start_time:.2f} seconds")
-    print(f"Audio saved to {output_file}")
+            batch_dir = "outputs"
+            os.makedirs(batch_dir, exist_ok=True)
+
+        for sentence in BATCH_SENTENCES:
+            filename = create_filename(sentence)
+            output_file = os.path.join(batch_dir, filename)
+            print(f"Generating audio for: {sentence}")
+            start_time = time.time()
+            audio_segments = generate_speech_from_api(
+                prompt=sentence,
+                voice=args.voice,
+                temperature=args.temperature,
+                top_p=args.top_p,
+                repetition_penalty=args.repetition_penalty,
+                output_file=output_file
+            )
+            end_time = time.time()
+            print(f"Speech generation for '{sentence}' completed in {end_time - start_time:.2f} seconds")
+            print(f"Audio saved to {output_file}")
+    else:
+        # Non-batch mode
+        if args.text:
+            prompt = " ".join(args.text)
+        else:
+            prompt = DEFAULT_TEXT
+            print(f"No text provided. Using default text: {DEFAULT_TEXT}")
+
+        if args.output:
+            output_file = args.output
+        else:
+            os.makedirs("outputs", exist_ok=True)
+            timestamp = time.strftime("%Y%m%d_%H%M%S")
+            output_file = f"outputs/{args.voice}_{timestamp}.wav"
+            print(f"No output file specified. Saving to {output_file}")
+
+        start_time = time.time()
+        audio_segments = generate_speech_from_api(
+            prompt=prompt,
+            voice=args.voice,
+            temperature=args.temperature,
+            top_p=args.top_p,
+            repetition_penalty=args.repetition_penalty,
+            output_file=output_file
+        )
+        end_time = time.time()
+        print(f"Speech generation completed in {end_time - start_time:.2f} seconds")
+        print(f"Audio saved to {output_file}")
 
 if __name__ == "__main__":
     main()
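
To see how the new `main()` chooses prompts and output paths without running the model, the selection logic can be pulled into a pure function. This is a sketch under stated assumptions: `build_parser` and `plan_jobs` are hypothetical names that do not exist in the commit, only a subset of the flags is reproduced, and no directories are created or audio generated.

```python
import argparse
import os
import time

# Constants copied from the diff above.
DEFAULT_TEXT = "This is a default sentence."
BATCH_SENTENCES = [
    "Good morning Kabeer!",
    "You've got a busy day ahead.",
    "Meetings, presentations and even a night out with the boys! <chuckle>",
    "You ready to crush this?",
]


def create_filename(sentence, max_words=3, max_length=50):
    """Same helper as in the diff: first words, sanitized, plus '.wav'."""
    base = "_".join(sentence.split()[:max_words])
    safe_base = "".join(c for c in base if c.isalnum() or c in ("_", "-"))
    return safe_base[:max_length] + ".wav"


def build_parser():
    """Hypothetical helper: the subset of the commit's flags needed here."""
    parser = argparse.ArgumentParser(description="Generate speech from text.")
    parser.add_argument("text", nargs="*", help="Text to convert to speech")
    # The literal "default_voice" is what the diff sets as the default.
    parser.add_argument("--voice", default="default_voice", help="Voice to use")
    parser.add_argument("--output", help="Output file or directory (in batch mode)")
    parser.add_argument("--batch", action="store_true", help="Process predefined batch of sentences")
    return parser


def plan_jobs(args, timestamp=None):
    """Hypothetical helper: return (prompt, output_file) pairs, no side effects."""
    if args.batch:
        # In batch mode --output names a directory; "outputs" is the fallback.
        batch_dir = args.output or "outputs"
        return [(s, os.path.join(batch_dir, create_filename(s))) for s in BATCH_SENTENCES]
    # Positional words are joined into one prompt; DEFAULT_TEXT is the fallback.
    prompt = " ".join(args.text) if args.text else DEFAULT_TEXT
    if args.output:
        return [(prompt, args.output)]
    timestamp = timestamp or time.strftime("%Y%m%d_%H%M%S")
    return [(prompt, f"outputs/{args.voice}_{timestamp}.wav")]


if __name__ == "__main__":
    demo_args = build_parser().parse_args(["--batch", "--output", "clips"])
    for prompt, path in plan_jobs(demo_args):
        print(f"{path}  <-  {prompt}")
```

Two details of the diff are worth checking against the rest of the file, which is not shown here. The `--voice` default changes from the `DEFAULT_VOICE` constant to the literal string `"default_voice"`, and the `--repetition_penalty` default changes from `REPETITION_PENALTY` to `1.0` even though the removed help text stated that `>=1.1` is required for stable generation.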
