Speaker Identification #672
unrelated but fun:
whisper detected my voice as French (I spoke English)
there are a few stray "Thank you" transcriptions (something with VAD, I suppose), but maybe not more than on main
Okay, there are some bug fixes to make.
@louis030195 I was able to identify the source of the bug and fix it.
looks great! @EzraEllette I want to merge this ASAP. I think some things may have changed that we don't know about yet, so it makes sense to merge and ask a few people to test it and see whether it works roughly as before. One last thing to fix before merging, though: https://youtu.be/vk711s6h8W4 there is an issue with the audio data encoded to disk. For some reason the speed or something has changed; check the video.
Okay, I don't have my computer with me right now, but I have experienced something similar before. If you want to take a look, the sample rate passed to the stt function is probably wrong: we have to use a 16000 Hz rate for segmentation, and I'm probably not reflecting that change when STT is called. Sent from my phone at a concert, so pls forgive the grammar.
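To illustrate the suspected cause, here is a minimal sketch of how a mismatched sample rate changes apparent playback speed. It is not screenpipe's code: `AudioBuffer`, `SEGMENTATION_RATE`, and the linear-interpolation `resample` are hypothetical names made up for this example, and the 48 kHz capture rate is an assumption. The idea is only that samples resampled to 16000 Hz must stay tagged with 16000 Hz when they are later handed to STT or written to disk.

```rust
/// Mono samples tagged with the rate they were captured or resampled at.
/// (Hypothetical type for illustration; not part of screenpipe.)
#[derive(Debug, Clone, PartialEq)]
pub struct AudioBuffer {
    pub samples: Vec<f32>,
    pub sample_rate: u32,
}

/// Rate used for segmentation, per the comment above.
pub const SEGMENTATION_RATE: u32 = 16_000;

impl AudioBuffer {
    /// Duration implied by the sample count and the tagged rate.
    pub fn duration_secs(&self) -> f64 {
        self.samples.len() as f64 / self.sample_rate as f64
    }

    /// Naive linear-interpolation resampler. Illustrative only; a real
    /// pipeline would normally use a band-limited resampler.
    pub fn resample(&self, target_rate: u32) -> AudioBuffer {
        if self.sample_rate == target_rate || self.samples.is_empty() {
            return AudioBuffer {
                samples: self.samples.clone(),
                sample_rate: target_rate,
            };
        }
        let ratio = self.sample_rate as f64 / target_rate as f64;
        let out_len = (self.samples.len() as f64 / ratio).round() as usize;
        let last = self.samples.len() - 1;
        let samples = (0..out_len)
            .map(|i| {
                // Position of output sample i in the input signal.
                let pos = i as f64 * ratio;
                let lo = (pos.floor() as usize).min(last);
                let hi = (lo + 1).min(last);
                let frac = (pos - lo as f64) as f32;
                self.samples[lo] * (1.0 - frac) + self.samples[hi] * frac
            })
            .collect();
        // The new rate travels with the samples so later stages cannot
        // accidentally reuse the capture rate.
        AudioBuffer { samples, sample_rate: target_rate }
    }
}
```

With one second of audio captured at an assumed 48000 Hz, `resample(SEGMENTATION_RATE)` yields 16000 samples. Labelled as 16000 Hz they still last one second; labelled with the original 48000 Hz they appear to last a third of a second, which would sound sped up.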
any news?
@louis030195 Making the UI today
This should be safe to merge once tested again. UI can come soon.
I fixed the audio storage issue.
amazing /approve
@louis030195: The claim has been successfully added to reward-all. You can visit your dashboard to complete the payment.
@EzraEllette any suggested next steps?
let's continue here @EzraEllette
description
This PR adds speaker identification to screenpipe. Audio is segmented by speaker, then transcribed. Transcriptions now have a `speaker_id` column. A new table, `speakers`, was added with `name` and `metadata` columns. A `speaker_embeddings` table was created with a one-to-many relationship between speakers and embeddings.
related issue: /claim #306
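As a rough picture of the data model described above, the sketch below mirrors the tables as plain Rust structs. Only the table and column names `speakers`, `name`, `metadata`, `speaker_embeddings`, and `speaker_id` come from the description; the field types, the nullability, the `id`, `text`, and `embedding` fields, and the helper functions are assumptions for illustration, not the actual schema or code in `screenpipe-server/src/db.rs`.

```rust
use std::collections::HashMap;

/// Row in the `speakers` table. Field types and nullability are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct Speaker {
    pub id: i64,
    pub name: Option<String>,
    pub metadata: Option<String>,
}

/// Row in `speaker_embeddings`. Many rows may reference one speaker.
#[derive(Debug, Clone, PartialEq)]
pub struct SpeakerEmbedding {
    pub speaker_id: i64,
    pub embedding: Vec<f32>, // assumed representation
}

/// Transcription row carrying the new `speaker_id` column.
#[derive(Debug, Clone, PartialEq)]
pub struct Transcription {
    pub text: String,
    pub speaker_id: Option<i64>,
}

/// Groups embeddings by speaker, i.e. the "many" side of the relationship.
pub fn embeddings_by_speaker(
    rows: &[SpeakerEmbedding],
) -> HashMap<i64, Vec<&SpeakerEmbedding>> {
    let mut grouped: HashMap<i64, Vec<&SpeakerEmbedding>> = HashMap::new();
    for row in rows {
        grouped.entry(row.speaker_id).or_default().push(row);
    }
    grouped
}

/// Resolves the speaker a transcription points at, if any.
pub fn speaker_for<'a>(
    transcription: &Transcription,
    speakers: &'a [Speaker],
) -> Option<&'a Speaker> {
    let id = transcription.speaker_id?;
    speakers.iter().find(|s| s.id == id)
}
```

Keeping embeddings in their own table lets a single speaker accumulate several embeddings over time while each transcription still references exactly one speaker.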
type of change
how to test
Run the speaker_identification test.
Run the `screenpipe-server/src/db.rs` tests.
Use screenpipe.