Integrate transcription in PeerTube #3

Chocobozzz · 2024-06-19T08:09:12Z

Changes in your packages:

Changed optional options into required ones to tighten constraints and force implementers (server, runner) to specify important options (logger, default transcoding directory...)
Removed ...Sync methods in favour of async ones
Use objects for options instead of lists of args (easier to add other options, clearer argument purpose)
Prefer fs-extra functions over fs ones for consistency (for example move instead of rename, which can have issues when moving a file between different devices, etc.)
Use a type for the engine name
Deleted unused methods and whisper-timestamped
Added the ability to install whisper engines on the fly so PeerTube instance admins don't have to install them manually
Added a custom engine path arg so the PeerTube server can install whisper engines locally and specify their path instead of relying on the global path
Put custom models to download in the fixtures directory (gitignored, but we have a CI cache) so we don't have to download them manually each time we run the tests
Reduced fixture sizes for the transcription tests (just decreased video quality while keeping the audio stream as is)
Moved the transcript directory option from the constructor to the transcribe method, so the transcriber can be instantiated once and the transcribe method called multiple times with a different directory each time
Created a transcription-devtools package that includes the benchmark, jiwer and test tools, so packages that just use the transcription don't include these files
Removed the requirements.txt file but added the pip install command to the test documentation
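
For readers who want a picture of the API shape described in the list above, here is a minimal sketch. Every name in it (`EngineName`, `TranscriberOptions`, `TranscribeOptions`, `SketchTranscriber`, the engine name strings and the `.vtt` output path) is an assumption made for illustration and is not taken from the actual packages. It only shows required constructor options, an options object instead of positional args, a type for the engine name, and a per-call transcript directory.

```typescript
// Hypothetical names throughout: this is not the actual PeerTube transcription API.

// A union type for the engine name catches typos at compile time.
type EngineName = 'openai-whisper' | 'whisper-ctranslate2'

interface Logger {
  info (message: string): void
}

// Constructor options are all required, so the caller (server or runner)
// has to make an explicit choice for each of them.
interface TranscriberOptions {
  engineName: EngineName
  logger: Logger
}

// Per-call options: the output directory is passed here rather than to the
// constructor, so one instance can write to a different directory each time.
interface TranscribeOptions {
  mediaFilePath: string
  model: string
  transcriptDirectory: string
}

class SketchTranscriber {
  private readonly options: TranscriberOptions

  constructor (options: TranscriberOptions) {
    this.options = options
  }

  // Async only: a real implementation would spawn the engine here.
  // This sketch just computes where the transcript would be written.
  async transcribe (options: TranscribeOptions): Promise<string> {
    const { mediaFilePath, model, transcriptDirectory } = options

    this.options.logger.info(`Transcribing ${mediaFilePath} with ${this.options.engineName} (${model})`)

    const baseName = mediaFilePath.split('/').pop() ?? mediaFilePath
    return `${transcriptDirectory}/${baseName}.vtt`
  }
}

// One transcriber instance, two calls, two different output directories.
async function demo (): Promise<string[]> {
  const transcriber = new SketchTranscriber({
    engineName: 'whisper-ctranslate2',
    logger: { info: message => console.log(message) }
  })

  const first = await transcriber.transcribe({ mediaFilePath: '/videos/talk.mp4', model: 'tiny', transcriptDirectory: '/tmp/job-1' })
  const second = await transcriber.transcribe({ mediaFilePath: '/videos/talk.mp4', model: 'tiny', transcriptDirectory: '/tmp/job-2' })

  return [ first, second ]
}
```

With named options, adding a new field later does not change the meaning of existing call sites, which is the main argument given above for preferring objects over argument lists.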

Added:

pip/Hugging Face models cache in GitHub Actions
Transcription support in PeerTube runner (uses whisper engines installed globally)
Transcription support in PeerTube server (uses whisper engines installed on the fly in the storage/bin/pip directory)
Config to enable/disable video transcription
Notification for video owner when the transcription is finished
Display auto-transcription info in upload/import page and "features found on this instance" in about page
Ability to select the automatic transcription engine/model; admins can also specify custom engine and model paths
Server and runner transcription tests

lutangar

Hey @Chocobozzz, since I've already made a few comments, I might as well submit them right now. I'll dig deeper tomorrow and add some more...

lutangar · 2024-06-19T14:15:25Z

apps/peertube-runner/src/server/process/shared/process-transcription.ts

+
+    const transcriptFile = await transcriber.transcribe({
+      mediaFilePath: inputPath,
+      model: config.modelPath


custom constructor to use or ad

I'm not sure I understand your comment.

Sorry, I left this comment in a hurry as a side note to myself.

I tried to implement a custom constructor for these use cases in either TranscriptionModel or WhisperBuiltinModel (since this implementation is currently tied to Whisper), but I failed to do so... In fact, this should be achievable with the default constructor 🤔

But the current constructor may lead to an invalid state (with a path that doesn't exist), and I'm not sure how to deal with this, since there is a sync/async dilemma and there is no such thing as an async constructor...

By the way, the invalid object state is also possible with the TranscriptFile default constructor.
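
One common way around the missing async constructor is a private constructor combined with an async static factory. The sketch below is only an illustration of that pattern under stated assumptions: `ModelSketch` is a hypothetical class, not the real TranscriptionModel or TranscriptFile, and the `fromPath` name simply echoes the "make fromPath method async" entry in the commit log.

```typescript
import { stat } from 'node:fs/promises'

// Hypothetical class: it mirrors the idea discussed above, not the actual implementation.
class ModelSketch {
  readonly name: string
  readonly path: string

  // Private constructor: outside code cannot build an instance without going
  // through fromPath(), so an unchecked path never reaches an instance.
  private constructor (name: string, path: string) {
    this.name = name
    this.path = path
  }

  // Async factory: performs the filesystem check first, then constructs.
  static async fromPath (path: string): Promise<ModelSketch> {
    const stats = await stat(path).catch(() => undefined)
    if (stats === undefined) throw new Error(`Model path does not exist: ${path}`)

    // Use the last path segment as a display name (assumption for this sketch).
    const name = path.split('/').filter(Boolean).pop() ?? path

    return new ModelSketch(name, path)
  }
}
```

The check only holds at creation time: the file can still be removed afterwards, so code that later reads the path should still handle a missing file.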

apps/peertube-runner/src/server/process/shared/process-transcription.ts

config/default.yaml

config/production.yaml.example

packages/jiwer/requirements.txt

packages/transcription/src/whisper/transcriber/ctranslate2-transcriber.ts

packages/transcription/src/whisper/transcriber/openai-transcriber.ts

server/core/initializers/config.ts

lutangar · 2024-06-24T07:42:01Z

apps/peertube-runner/src/server/process/shared/process-transcription.ts

+
+    const transcriptFile = await transcriber.transcribe({
+      mediaFilePath: inputPath,
+      model: config.modelPath


Sorry, I left this comment in a hurry as a side note to myself.

I tried to implement a custom constructor for these use cases in either TranscriptionModel or WhisperBuiltinModel (since this implementation is currently tied to Whisper), but I failed to do so... In fact, this should be achievable with the default constructor 🤔

But the current constructor may lead to an invalid state (with a path that doesn't exist), and I'm not sure how to deal with this, since there is a sync/async dilemma and there is no such thing as an async constructor...

lutangar · 2024-06-24T07:42:56Z

apps/peertube-runner/src/server/process/shared/process-transcription.ts

+
+    const transcriptFile = await transcriber.transcribe({
+      mediaFilePath: inputPath,
+      model: config.modelPath


By the way, the invalid object state is also possible with the TranscriptFile default constructor.

packages/transcription/src/abstract-transcriber.ts

packages/transcription/src/whisper/transcriber/openai-transcriber.ts

lutangar · 2024-06-24T08:10:25Z

server/core/lib/video-captions.ts

+      const transcriptFile = await transcriber.transcribe({
+        mediaFilePath: videoInputPath,
+
+        model: CONFIG.VIDEO_TRANSCRIPTION.MODEL_PATH


This might be another use case for the previously mentioned custom constructor.

support/doc/development/tests.md

packages/transcription-devtools/package.json

scripts/ci.sh

chore: fiddling around some more
chore: add ctranslate2 and timestamped
chore: add performance markers
chore: refactor test
chore: change worflow name
chore: ensure Python3
chore(duration): convert to chai/mocha syntahx
chore(transcription): add individual tests for others transcribers
chore(transcription): implement formats test of all implementations
  Also compare result of other implementation to the reference implementation
chore(transcription): add more test case with other language and models size and local model
chore(test): wip ctranslate 2 adapat
chore(transcription): wip transcript file and benchmark
chore(test): clean a bit
chore(test): clean a bit
chore(test): refacto timestamed spec
chore(test): update workflow
chore(test): fix glob expansion with sh
chore(test): extract some hw info
chore(test): fix async tests
chore(benchmark): add model info
feat(transcription): allow use of a local mode in timestamped-whisper
feat(transcription): extract run and profiling info in own value object
feat(transcription): extract run concept in own class an run more bench
chore(transcription): somplify run object only a uuid is now needed and add more benchmark scenario
docs(transcription): creates own package readme
docs(transcription): add local model usage
docs(transcription): update README
fix(transcription): use fr video for better comparison
chore(transcription): make openai comparison passed
docs(timestamped): clea
chore(transcription): change transcribers transcribe method signature
  Introduce whisper builtin model.
fix(transcription): activate language detection
  Forbid transcript creation without a language. Add `languageDetection` flag to an engine and some assertions. Fix an issue in `whisper-ctranslate2` : Softcatala/whisper-ctranslate2#93
chore(transcription): use PeerTube time helpers instead of custom ones
  Update existing time function to output an integer number of seconds and add a ms human-readable time formatter with hints of tests.
chore(transcription): use PeerTube UUID helpers
chore(transcription): enable CER evaluation
  Thanks to this recent fix in Jiwer <3 https://github.com/jitsi/jiwer/issues/873
chore(jiwer): creates JiWer package
  I'm not very happy with the TranscriptFileEvaluator constructor... suggestions ?
chore(JiWer): add usage in README
docs(jiwer): update JiWer readme
chore(transcription): use FunMOOC video in fixtures
chore(transcription): add proper english video fixture
chore(transcription): use os tmp directory where relevant
chore(transcription): fix jiwer cli test reference.txt
chore(transcription): move benchmark out of tests
chore(transcription): remove transcription workflow
docs(transcription): add benchmark info
fix(transcription): use ms precision in other transcribers
chore(transcription): simplify most of the tests
chore(transcription): remove slashes when building path with join
chore(transcription): make fromPath method async
chore(transcription): assert path to model is a directory for CTranslate2 transcriber
chore(transcription): ctranslate2 assertion
chore(transcription): ctranslate2 assertion
chore(transcription): add preinstall script for Python dependencies
chore(transcription): add download and unzip utils functions
chore(transcription): add download and unzip utils functions
chore(transcription): download & unzip models fixtures
chore(transcription): zip
chore(transcription): raise download file test timeout
chore(transcription): simplify download file test
chore(transcription): add transcriptions test to CI
chore(transcription): raise test preconditions timeout
chore(transcription): run preinstall scripts before running ci
chore(transcription): create dedicated tmp folder for transcriber tests
chore(transcription): raise timeout some more
chore(transcription): raise timeout some more
chore(transcription): raise timeout some more
chore(transcription): raise timeout some more
chore(transcription): raise timeout some more
chore(transcription): raise timeout some more
chore(transcription): raise timeout some more
chore(transcription): raise timeout some more
chore(transcription): use short video for local model test
chore(transcription): raise timeout some more
chore(transcription): raise timeout some more
chore(transcription): raise timeout some more
chore(transcription): setup verbosity based on NODE_ENV value

Chocobozzz · 2024-06-28T07:17:29Z

Merged manually in upstream develop branch: Chocobozzz@1bfb791

Thanks again!

Chocobozzz changed the base branch from transcription-backend-workbench to transcription-backend-workbench-v2 June 19, 2024 08:18

Fix short uuid use in custom playlist markup

42c78c7

lutangar self-requested a review June 19, 2024 13:39

lutangar assigned Chocobozzz Jun 19, 2024

Support Service AP actors

346be1d

lutangar reviewed Jun 19, 2024

Chocobozzz and others added 24 commits June 19, 2024 17:37

Fix detecting account actor

0d0a965

Fix channel update federation

802601c

Fix bulk finder

1b9b904

Detect internal link on plugin pages

797c2f4

Prevent hotkeys in contenteditable element

4aadf69

Fix line typo

419da53

Fix lint

7a7e040

Translated using Weblate (Spanish)

7e7e5af

Currently translated at 87.3% (2111 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/es/

Translated using Weblate (Spanish)

5071fed

Currently translated at 100.0% (274 of 274 strings) Translation: PeerTube/server Translate-URL: https://weblate.framasoft.org/projects/peertube/server/es/

Translated using Weblate (Spanish)

5520043

Currently translated at 100.0% (274 of 274 strings) Translation: PeerTube/server Translate-URL: https://weblate.framasoft.org/projects/peertube/server/es/

Translated using Weblate (Russian)

93e5c38

Currently translated at 98.7% (2388 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/ru/

Translated using Weblate (Spanish)

84e41b4

Currently translated at 100.0% (144 of 144 strings) Translation: PeerTube/player Translate-URL: https://weblate.framasoft.org/projects/peertube/player/es/

Translated using Weblate (Spanish)

73c6efa

Currently translated at 87.3% (2111 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/es/

Translated using Weblate (Galician)

6e26911

Currently translated at 100.0% (2418 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/gl/

Translated using Weblate (Chinese (Traditional))

ca02b1f

Currently translated at 100.0% (2418 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/zh_Hant/

Translated using Weblate (Croatian)

de40c05

Currently translated at 98.1% (2374 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/hr/

Translated using Weblate (Croatian)

1746af5

Currently translated at 98.2% (2375 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/hr/

Translated using Weblate (Croatian)

1ee2754

Currently translated at 98.3% (2377 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/hr/

Translated using Weblate (Croatian)

0c7d0c7

Currently translated at 98.5% (2383 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/hr/

Translated using Weblate (Croatian)

b8400e7

Currently translated at 98.5% (2383 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/hr/

Translated using Weblate (Croatian)

38bf925

Currently translated at 100.0% (143 of 143 strings) Translation: PeerTube/player Translate-URL: https://weblate.framasoft.org/projects/peertube/player/hr/

Translated using Weblate (Croatian)

4549328

Currently translated at 98.5% (2383 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/hr/

Translated using Weblate (Croatian)

00c5ff0

Currently translated at 98.6% (2385 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/hr/

Translated using Weblate (Croatian)

9367ddd

Currently translated at 98.6% (2386 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/hr/

Chocobozzz and others added 13 commits June 21, 2024 14:39

Update server dependencies

a722194

Update apps dependencies

4f4d3ad

Fix lint and tests

985e79f

Fix loading actor involved in video

05d84f6

Fix legacy upload req timeout

209043e

Fix build

5412465

Translated using Weblate (Japanese)

0cbd280

Currently translated at 98.3% (2378 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/ja/

Translated using Weblate (Persian)

0c6dc79

Currently translated at 100.0% (145 of 145 strings) Translation: PeerTube/player Translate-URL: https://weblate.framasoft.org/projects/peertube/player/fa/

Translated using Weblate (Albanian)

8c93ecf

Currently translated at 99.3% (144 of 145 strings) Translation: PeerTube/player Translate-URL: https://weblate.framasoft.org/projects/peertube/player/sq/

Translated using Weblate (Ukrainian)

af67a6e

Currently translated at 89.7% (2169 of 2418 strings) Translation: PeerTube/angular Translate-URL: https://weblate.framasoft.org/projects/peertube/angular/uk/

Update translations

c49b67b

Fix lint

bc8c853

Update client dependencies

ec33467

lutangar reviewed Jun 24, 2024

Chocobozzz and others added 12 commits June 26, 2024 08:33

Upgrade to angular 18 & vite

9772280

Fix lint

9b2a054

Remove bundlewatch

2728810

CI fails, our projects generates too many chunks unfortunately

Add views tag to middlewares too

43e186e

Fix E2E tests

564089d

Add server restart test

ef0a6b2

Fix lint

b10482e

Integrate transcription in PeerTube

1bfb791

Metadata to know if the caption is auto generated

fd4831e

Runner can choose job type

b66963f

Remove verbose option from transcription

0b30e58

Can be specified on-demand using NODE_DEBUG=execa env variable

Chocobozzz force-pushed the feature/transcription branch from e654342 to 0b30e58 June 28, 2024 07:06

Chocobozzz closed this Jun 28, 2024

Chocobozzz deleted the feature/transcription branch June 28, 2024 07:17
