---Installation and run Instructions which worked for me---- #107

Open
manjunath7472 opened this issue Oct 16, 2023 · 6 comments

@manjunath7472

manjunath7472 commented Oct 16, 2023

Hi there,
Below are the steps that made it work for me on Windows.
1. Download the repo.
2. Inside the repo, create a Python venv.
3. Download and run vs_buildtools.exe from the link below.
download
4. Go to "Individual components" and check:
MSVC v143 C++ x64/x86 build tools
MSVC v140 C++ x64/x86 build tools
Windows 10 SDK (latest)
5. Activate the venv.
6. Run pip install Cython
7. Run pip install -r requirements.txt

8. If you're using the .ipynb file, remove these lines from the transcribe cell:
del whisper_model
torch.cuda.empty_cache()

9. Update the definition of "_get_next_start_timestamp":
def _get_next_start_timestamp(word_timestamps, current_word_index):
    # if current word is the last word
    if current_word_index == len(word_timestamps) - 1:
        return word_timestamps[current_word_index]["start"]
    next_word_index = current_word_index + 1
    # bound the loop on next_word_index so it cannot run past the list
    while next_word_index < len(word_timestamps):
        if word_timestamps[next_word_index].get("start") is None:
            # if next word doesn't have a start timestamp
            # merge it with the current word and delete it
            if word_timestamps[next_word_index]["word"] is not None:
                word_timestamps[current_word_index]["word"] += (
                    " " + str(word_timestamps[next_word_index]["word"])
                )
                word_timestamps[next_word_index]["word"] = None
            next_word_index += 1
        else:
            return word_timestamps[next_word_index]["start"]
    # assumption: if no later word has a start timestamp,
    # fall back to the current word's start, as in the last-word case
    return word_timestamps[current_word_index]["start"]
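
To show what this merging does, here is a small self-contained sketch on made-up data. The name next_start and the sample word list are illustrative only, and the fallback on the last line is an assumption rather than the repo's behaviour.

```python
def next_start(word_timestamps, current_word_index):
    """Return the start time of the next word that has one, merging
    words without a timestamp into the current word along the way."""
    current = word_timestamps[current_word_index]
    for nxt in word_timestamps[current_word_index + 1:]:
        if nxt.get("start") is not None:
            return nxt["start"]
        if nxt["word"] is not None:
            # Fold the untimed word into the current one and blank it out.
            current["word"] += " " + str(nxt["word"])
            nxt["word"] = None
    # Assumption: with no timed word left, fall back to the current start.
    return current["start"]


# Made-up sample: "20" carries no timestamp of its own.
words = [
    {"word": "about", "start": 0.0, "end": 0.4},
    {"word": "20"},
    {"word": "people", "start": 0.9, "end": 1.3},
]
result = next_start(words, 0)
```

After the call, the first entry holds the merged text "about 20", the untimed entry's "word" is None, and result is the start time of "people".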
@manjunath7472
Author

Thank you, Ashraf, for the pipeline! It's good. As a matter of fact, I am not using the "Realigning speech segments with punctuation" step. It defeats the whole purpose of diarization because it merges chats. Otherwise it's good.

@MahmoudAshraf97
Owner

Thanks, @manjunath7472, for your words. The function modification you provided is not necessary, as the if condition always evaluates to True. Also, word_timestamps[next_word_index]["word"] is always a string, so no conversion is needed.

@manjunath7472
Author

But in my case I got None as the value of word_timestamps[next_word_index]["word"] in one instance.
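
For context, a minimal sketch of why a None value matters here, using a made-up entry: concatenating None to a string raises TypeError, while str() avoids the error but inserts the literal text "None", which is why the merge also checks for None first.

```python
entry = {"word": None}  # made-up entry with a missing word

error = None
try:
    merged = "hello" + " " + entry["word"]  # str + None is not allowed
except TypeError as exc:
    error = exc

# str() avoids the exception but yields the literal text "None"
safe = "hello" + " " + str(entry["word"])
```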

@MahmoudAshraf97
Owner

That was already fixed in a recent commit; please pull the latest code.

@manjunath7472
Author

Oh, OK, cool. I'll pull it. Thank you.

@gallojorge

Create the venv with Python 3.10.
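
A minimal sketch of that check, assuming the goal is simply to confirm the interpreter is Python 3.10 before creating the venv. The helper names and the ".venv" directory are illustrative and not part of the repo.

```python
import sys
import venv


def is_python_310(version_info=sys.version_info):
    """True when the interpreter's major.minor version is 3.10."""
    return tuple(version_info[:2]) == (3, 10)


def create_venv(target_dir=".venv"):
    """Create a venv with pip, refusing to run on other Python versions."""
    if not is_python_310():
        found = sys.version.split()[0]
        raise RuntimeError("Python 3.10 is required, found " + found)
    venv.create(target_dir, with_pip=True)
```

Calling create_venv() from a Python 3.10 interpreter (for example one started with py -3.10 on Windows) creates the environment in .venv.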
