[Feature request] Adaptation to the Whisper's JSON output #15

sensboston · 2024-02-22T20:39:00Z

Hello, is it possible to adapt your project to Whisper's JSON output? I'm working on a karaoke program for Windows and need every word in the lyrics to be timestamped.
I'd be glad to open a PR for this feature, but unfortunately I'm not proficient in Python (I mostly use C# and C++).

EtienneAb3d · 2024-02-23T07:39:43Z

@sensboston

I will have a look at it as soon as possible.
To be sure of what you expect, can you provide me with an example (JSON+TXT)?

WhisperTimeSync is not written in Python but in Java.
;-)

sensboston · 2024-02-23T16:02:09Z

Here we go: samples.zip
There are two directories: English (Smokie, "Living Next Door to Alice") and Russian (Bit-quartet Secret, "Alice"), with JSONs and original lyrics (my daughter's name is Alice 😉). Whisper's English output is kinda acceptable, but the Russian is a complete mess.

EtienneAb3d · 2024-02-23T19:01:17Z

@sensboston
Hmmm... The problem I see with this JSON format is that each word has a mandatory description including its timestamp. It will be very hard to decide what to do with non-matching words.
🤔

sensboston · 2024-02-23T19:38:14Z

Yeah, it's an issue, I agree. But I haven't looked at your implementation (or the Java code you've ported) yet. Theoretically it's possible even without involving AI, for example by using the "soundex" algorithm. I thought about this, but first I wanted to check whether someone had already done it.
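
For readers unfamiliar with the "soundex" idea mentioned above, here is a minimal sketch of classic American Soundex. It is an illustration only, not code from WhisperTimeSync (which is written in Java) or from the karaoke program; the function name `soundex` is hypothetical. Two words that sound alike get the same four-character key, so a misheard word could still be paired with the lyric word.

```python
def soundex(word: str) -> str:
    """Return the 4-character American Soundex key of a word (illustrative sketch)."""
    # Standard Soundex consonant groups; vowels, H, W and Y have no code.
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""

    key = letters[0]                  # the first letter is kept as-is
    prev = codes.get(letters[0], "")  # its code still suppresses an equal neighbour
    for ch in letters[1:]:
        if ch in "HW":
            continue                  # H and W do not separate equal codes
        code = codes.get(ch, "")
        if code and code != prev:
            key += code
        prev = code                   # a vowel resets prev, so repeats after it count
    return (key + "000")[:4]          # pad or truncate to four characters


print(soundex("Robert"), soundex("Rupert"))  # both give R163
```

Note that this mapping only covers Latin letters, so it would not help with the Russian sample without transliteration or a different phonetic key.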

EtienneAb3d · 2024-02-23T20:11:43Z

@sensboston
The problem is not matching word by word; that is what WhisperTimeSync already does.
The problem is knowing what to do with unmatched words in this specific JSON description.
I may adapt an algorithm I already have for similar cases, but this is quite a lot of work.
Do you have a budget for this?
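
To make the unmatched-words problem concrete, here is one possible way to handle it, sketched in Python. This is not WhisperTimeSync's algorithm and not the algorithm EtienneAb3d refers to. It assumes each Whisper word entry is a dict with `word`, `start` and `end` keys (an assumption about the JSON layout), uses the standard library's `difflib.SequenceMatcher` for the matching, and simply spreads unmatched lyric words evenly across the time gap between their matched neighbours. The names `normalize` and `align` are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(word: str) -> str:
    """Lowercase and drop punctuation/whitespace so 'Alice,' matches ' alice'."""
    return "".join(c for c in word.lower() if c.isalnum())


def align(whisper_words, lyric_words):
    """Give every lyric word a (start, end) pair.

    whisper_words: list of dicts with 'word', 'start', 'end' (assumed layout).
    lyric_words:   list of strings from the reference lyrics.
    """
    a = [normalize(w["word"]) for w in whisper_words]
    b = [normalize(w) for w in lyric_words]
    out = [{"word": w, "start": None, "end": None} for w in lyric_words]

    # Copy timestamps for words found in both sequences.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for i, j, size in matcher.get_matching_blocks():
        for k in range(size):
            out[j + k]["start"] = whisper_words[i + k]["start"]
            out[j + k]["end"] = whisper_words[i + k]["end"]

    # Spread each run of unmatched lyric words evenly over the surrounding gap.
    last_end = whisper_words[-1]["end"] if whisper_words else 0.0
    idx = 0
    while idx < len(out):
        if out[idx]["start"] is not None:
            idx += 1
            continue
        run_start = idx
        while idx < len(out) and out[idx]["start"] is None:
            idx += 1
        left = out[run_start - 1]["end"] if run_start > 0 else 0.0
        right = out[idx]["start"] if idx < len(out) else last_end
        right = max(right, left)  # guard against a negative gap
        step = (right - left) / (idx - run_start)
        for k in range(run_start, idx):
            out[k]["start"] = left + step * (k - run_start)
            out[k]["end"] = left + step * (k - run_start + 1)
    return out
```

Even interpolation is the crudest choice: it ignores word length and may produce zero-length words when there is no gap to fill, which is part of why the problem is harder than it first looks.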

sensboston · 2024-02-23T20:17:49Z

No budget at all. I do development just for fun and will publish it as open source here when it's done.

P.S. If you want, I'll add you to this private repo (but you'd need a Windows PC, at least to test).

sensboston · 2024-02-29T05:57:33Z

Any progress? Or do you have no idea how to implement this? Please let me know. I don't want to waste time.

EtienneAb3d · 2024-03-05T07:32:39Z

@sensboston
I understood that you were working on this on your side.
On my side, without a budget, I have to find time for it in my free time.
