feature request: please add voice in/out #4

develperbayman · 2023-02-01T22:25:26Z

The title covers it: it would be sweet to talk to it and get replies in audio.

RaSan147 · 2023-02-02T04:11:01Z

The web UI is under development; try this branch if you want voice:
https://github.com/RaSan147/VoiceAI-Asuna/tree/kivy-gui

I'm trying hard to replicate a cute voice, but online services require payment or an API key, and the OS-based voices feel kind of off. So I'm looking into alternative ways now.

As for the sweet talk part, I just jumped from the Python-based Kivy GUI to a server-based (self-made) web UI. There is a ton of groundwork to do before I can add more commands, so things are going a bit slowly.

Sorry

RaSan147 · 2023-02-02T16:01:49Z

@develperbayman if you know any website that lets users generate voice WAV files via a free API, that would be a life-saving help 🛐🙇‍♂️

RaSan147 · 2023-02-03T18:54:00Z

@develperbayman could you check whether there's any voice at https://www.voicerss.org/api/demo.aspx that you like (one that goes well with the character)?
I may be able to adjust pitch and speed to mimic some expression.

develperbayman · 2023-06-19T05:54:31Z

Wow, I totally did not realize you replied. It's probably a little late (my apologies, I haven't been very active for a bit), but perhaps you would be more interested in a TTS engine and an STT engine to accomplish this. I am using one of each for Python in the AI script I'm working on; take a peek.

develperbayman · 2023-06-19T05:56:50Z

import threading
import time
import sys
import chat_commands
from gtts import gTTS
import os
import tkinter as tk
from tkinter import filedialog, messagebox
import speech_recognition as sr
import webbrowser
import re
import subprocess
import openai

doListenToCommand = True
listening = False

# List of common farewells used to end the session
despedida = ["Goodbye", "goodbye", "bye", "Bye", "See you later", "see you later"]

# Create the GUI window
window = tk.Tk()
window.title("Computer: AI")
window.geometry("400x400")

# Create the text entry box
text_entry = tk.Entry(window, width=50)
text_entry.pack(side=tk.BOTTOM)

# Create the submit button
submit_button = tk.Button(window, text="Submit", command=lambda: submit())
submit_button.pack(side=tk.BOTTOM)

# Create the text output box
text_output = tk.Text(window, height=300, width=300)
text_output.pack(side=tk.BOTTOM)

# Set your OpenAI API key here
openai.api_key = "your_api_key_here"

def submit(event=None, text_input=None):
    global doListenToCommand
    global listening

    # Get the user input and check if the input matches the list of goodbyes
    if text_input is not None and text_input != "":
        usuario = text_input
    else:
        usuario = text_entry.get()

    if usuario in despedida:
        on_closing()
        return  # nothing to send; without this, prompt below would be undefined

    prompt = f"You are ChatGPT and answer my following message: {usuario}"

    # Getting responses using the OpenAI API
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        max_tokens=2049
    )

    respuesta = response["choices"][0]["text"]

    # Converting text to audio
    texto = str(respuesta)
    tts = gTTS(texto, lang='en', tld='ie')
    tts.save("audio.mp3")

    # Displaying the answer on the screen
    text_output.insert(tk.END, "ChatGPT: " + respuesta + "\n")

    # Clear the input text
    text_entry.delete(0, tk.END)

    # Playing the audio (listening is paused so the reply is not picked up by the microphone)
    doListenToCommand = False
    time.sleep(1)
    os.system("play audio.mp3")  # assumes a command-line player named "play" (e.g. SoX) is installed
    doListenToCommand = True

    # Call function to listen to the user
    if not listening:
        listen_to_command()

# Bind the Enter key to the submit function
window.bind("<Return>", submit)

def load_core_principles(file_path):
    with open(file_path, 'r') as file:
        principles = file.readlines()
    return principles

def listen_to_command():
    global doListenToCommand
    global listening

    # If we are not to be listening then exit the function.
    if not doListenToCommand:
        return

    # Initialize the recognizer
    r = sr.Recognizer()

    # Use the default microphone as the audio source
    with sr.Microphone() as source:
        print("Listening...")
        listening = True
        audio = r.listen(source)
        listening = False

    try:
        # Use speech recognition to convert speech to text
        command = r.recognize_google(audio)
        print("You said:", command)
        text_output.insert(tk.END, "You: " + command + "\n")
        text_entry.delete(0, tk.END)

        # Process the commands
        # Prepare object to be passed.
        class PassedCommands:
            tk = tk
            text_output = text_output
            submit = submit

        chat_commands.process_commands(PassedCommands, command)

    except sr.UnknownValueError:
        print("Speech recognition could not understand audio.")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service:", str(e))

    # Listen again (this recurses on every utterance, so a very long session can hit the recursion limit)
    listening = False
    listen_to_command()

def on_closing():
    if messagebox.askokcancel("Quit", "Do you want to quit?"):
        window.destroy()

window.protocol("WM_DELETE_WINDOW", on_closing)

if __name__ == "__main__":
    # Create the menu bar
    menu_bar = tk.Menu(window)

    # Create the "File" menu
    file_menu = tk.Menu(menu_bar, tearoff=0)
    file_menu.add_command(label="Open LLM", command=lambda: filedialog.askopenfilename())
    file_menu.add_command(label="Save LLM", command=lambda: filedialog.asksaveasfilename())
    file_menu.add_separator()
    file_menu.add_command(label="Exit", command=on_closing)
    menu_bar.add_cascade(label="File", menu=file_menu)

    # Create the "Run" menu
    # Note: run_as_normal_app and run_on_flask are not defined in this snippet
    run_menu = tk.Menu(menu_bar, tearoff=0)
    run_menu.add_command(label="Run as normal app", command=lambda: threading.Thread(target=run_as_normal_app).start())
    run_menu.add_command(label="Run on Flask", command=lambda: threading.Thread(target=run_on_flask).start())
    menu_bar.add_cascade(label="Run", menu=run_menu)

    # Set the menu bar
    window.config(menu=menu_bar)

    # Start the main program loop
    start_listening_thread = threading.Thread(target=listen_to_command)
    start_listening_thread.daemon = True
    start_listening_thread.start()
    window.mainloop()
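
The script above pauses the microphone with the `doListenToCommand` flag while the reply plays, and restarts listening by having `listen_to_command()` call itself. A loop with a `threading.Event` is one possible alternative. This is only a rough sketch: `may_listen`, `stop_requested`, `listen_loop`, `speak`, `listen_once` and `handle_command` are hypothetical names, not part of the original script, and the two callables stand in for the recognizer code and `chat_commands.process_commands`.

```python
import threading

# Set while the app may listen; cleared while the reply audio is playing.
may_listen = threading.Event()
may_listen.set()

# Set to ask the listening loop to finish.
stop_requested = threading.Event()


def listen_loop(listen_once, handle_command):
    """Call listen_once() repeatedly until stop_requested is set.

    listen_once and handle_command are hypothetical callables standing in
    for the microphone/recognizer code and the command handler.
    """
    while not stop_requested.is_set():
        # Wait with a timeout so stop_requested is re-checked while paused.
        if not may_listen.wait(timeout=0.1):
            continue
        command = listen_once()
        if command:
            handle_command(command)


def speak(play_audio):
    """Pause listening for the duration of play_audio(), then resume."""
    may_listen.clear()
    try:
        play_audio()
    finally:
        may_listen.set()
```

Because the loop never calls itself, the call stack stays flat no matter how long the session runs, and `speak()` resumes listening even if playback raises.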

develperbayman · 2023-06-19T06:00:04Z

I hate markup, it never works for me, but yeah, it generates the MP3 automatically. This example uses OpenAI.
Actually, this script is complete; however, it uses another Python script to supply any extra commands.

develperbayman · 2023-06-19T06:09:57Z

import subprocess
import webbrowser
import re
import validators
import sys

def process_commands(passed_commands, command):
    # Website and program commands are only checked when the wake word "computer" is present
    if "computer" in command.lower():
        print("Activated Command: Computer")
        passed_commands.text_output.insert(
            passed_commands.tk.END, "Activated Command: Computer" + "\n")
        passed_commands.submit(text_input=command)
        # listen_to_command()

        # Open a website
        #if command.lower().startswith("open website"):
        if "open website" in command.lower():
            # Extract the website URL from the command
            #url = command.replace("open website", "")
            url = command.partition("open website")
            # access third tuple element
            url = url[2]
            url = url.strip() # Strip whitespace on both ends. Not working? As there is a space in the leading part of the URL variable after this.
            # Test for http:// or https:// and add http:// to the URL if missing.
            if not url.startswith("http://") and not url.startswith("https://"):
                url = "http://" + url

            print("Trying to open website: " + url)

            # Validating if the URL is correct
            if validators.url(url):
                webbrowser.open(url, new=0, autoraise=True)

                passed_commands.text_output.insert(
                    passed_commands.tk.END, "Opening website: " + url + "\n")
            else:
                print("Invalid URL command. URL: " + url)
                passed_commands.text_output.insert(
                    passed_commands.tk.END, "Invalid URL command. URL: " + url + "\n")

            return

        # Open an application
        if "run program" in command.lower():
            # Extract the application name from the command
            app_name = command.partition("run program")[2]
            app_name = app_name.strip()

            print("Trying to open program: " + app_name)

            try:
                subprocess.Popen(app_name)
                passed_commands.text_output.insert(
                    passed_commands.tk.END, "Opening program: " + app_name + "\n")
            except FileNotFoundError:
                print("Program not found: " + app_name)
                passed_commands.text_output.insert(
                    passed_commands.tk.END, "Program not found: " + app_name + "\n")

            return

        print("Invalid command")
        passed_commands.text_output.insert(
            passed_commands.tk.END, "Invalid command" + "\n")

    # Testing
    # Stop listening to the microphone
    if command.lower() == "stop listening":
        passed_commands.text_output.insert(
            passed_commands.tk.END, "Stopping the microphone." + "\n")
        # What goes here?

        return

    # Testing
    # Allow program exit via voice.
    if command.lower() == "stop program":
        passed_commands.text_output.insert(
            passed_commands.tk.END, "Stopping the program." + "\n")

        sys.exit()
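
One likely reason the URL extraction misbehaves: the check uses `command.lower()`, but `command.partition("open website")` runs on the original text, so it finds nothing when the recognizer capitalizes the phrase. Below is a small sketch of a case-insensitive alternative using only the standard library. `extract_argument` and `normalize_url` are hypothetical helper names, not part of the original script.

```python
import re


def extract_argument(command, keyword):
    """Return the text after `keyword` in `command`, matched case-insensitively.

    Returns None when the keyword does not occur.
    """
    match = re.search(re.escape(keyword), command, flags=re.IGNORECASE)
    if match is None:
        return None
    return command[match.end():].strip()


def normalize_url(raw):
    """Prefix http:// when no scheme is present, as the original script does."""
    url = raw.strip()
    if not url.lower().startswith(("http://", "https://")):
        url = "http://" + url
    return url


spoken = "Computer Open Website example.com"
# partition is case-sensitive, so the original approach finds nothing here
print(spoken.partition("open website")[2] == "")  # True
print(normalize_url(extract_argument(spoken, "open website")))  # http://example.com
```

The same `extract_argument` helper would also work for the "run program" command.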

develperbayman · 2023-06-19T06:10:58Z

Again, sorry for the very late reply, but this should get you started. Please let me know if it helps or if you do anything cool with it.

develperbayman · 2023-06-19T06:12:50Z

Next I'm working on a Hugging Face Transformers version so you can self-host your own model, but dear god, the hardware needed for that is insane.

RaSan147 · 2023-06-19T08:23:45Z

> next im working on a huggingface transformers version to self host your own model but dear god the hardware needed for that is insane

That's why I dropped all hope of running AI just for TTS.
I'll use edge_tts for speech output (halfway done).
As for voice recognition, it will run on the client side, so your OpenAI solution is no help here. I'll use JS speech recognition for voice-to-text (I still need to start working on it).

edge_tts has a really good collection of voices. Thank you, Microsoft.
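
For anyone following along, here is a rough sketch of how edge_tts could be combined with the earlier idea of adjusting pitch and speed to mimic expression. It assumes a version of the `edge-tts` package whose `Communicate` accepts `rate` and `pitch` keyword arguments. The preset table, the function names and the `en-US-AnaNeural` voice choice are assumptions for illustration, not what the project actually uses.

```python
# Hypothetical expression presets; the percentages and Hz offsets are
# guesses to tune by ear, not values taken from the project.
PROSODY = {
    "neutral": {"rate": "+0%", "pitch": "+0Hz"},
    "excited": {"rate": "+15%", "pitch": "+20Hz"},
    "sad": {"rate": "-15%", "pitch": "-10Hz"},
}


def prosody_for(expression):
    """Return rate/pitch settings for an expression, defaulting to neutral."""
    return PROSODY.get(expression, PROSODY["neutral"])


async def synthesize(text, out_path, expression="neutral", voice="en-US-AnaNeural"):
    """Save `text` as speech to out_path using edge-tts (needs network access)."""
    import edge_tts  # third-party package: pip install edge-tts

    settings = prosody_for(expression)
    communicate = edge_tts.Communicate(text, voice, **settings)
    await communicate.save(out_path)
```

It could be called with something like `asyncio.run(synthesize("Welcome back!", "reply.mp3", expression="excited"))`. The import sits inside the function so the preset lookup can be used and tested without the package installed.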

develperbayman · 2023-06-19T12:05:13Z

Maybe I'll switch to Edge; I'm very interested in better-sounding voice output.

RaSan147 added the good first issue 🥇 Good for newcomers label Feb 3, 2023

RaSan147 added the Enhancement ✨ New feature or request label Feb 3, 2023
