Extract audio lambda #4

cjjenkinson · 2021-07-28T13:36:30Z

The extract audio lambda should have access to the following resources:

ffmpeg lambda layer
S3 video input bucket
S3 extracted audio bucket

Lambda business logic

The purpose of the extract audio lambda is to extract the audio from uploaded videos using the ffmpeg tool and store the resulting audio file in the extracted audio bucket.

Receive and parse the S3 put event to get the location of the video, then create a new video row in the DynamoDB table with the following attributes:

ID (generate an ID using uuid package)
state (starts with pending)
videoBucketKey (location of uploaded video on s3)
extractedAudioKey (location of extracted audio on s3, will be null initially)
transcriptionState (starts with pending)
transcriptionKey (location of transcription SRT file from Assembly.AI, will be null initially)
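
The parsing and row-creation step could be sketched roughly as follows. This is a minimal sketch rather than the final implementation: `buildVideoRow` is a hypothetical helper name, Node's built-in `crypto.randomUUID()` stands in for the uuid package so the snippet has no dependencies, and writing the row to DynamoDB is left out.

```javascript
const { randomUUID } = require("crypto");

// Hypothetical helper: turn the first record of an S3 put event into a new video row.
const buildVideoRow = (event) => {
  const record = event.Records[0];
  // S3 URL-encodes object keys in event notifications and uses "+" for spaces.
  const videoBucketKey = decodeURIComponent(
    record.s3.object.key.replace(/\+/g, " ")
  );

  return {
    ID: randomUUID(), // stand-in for the uuid package
    state: "pending",
    videoBucketKey,
    extractedAudioKey: null,
    transcriptionState: "pending",
    transcriptionKey: null,
  };
};

// Trimmed-down example of an S3 put event (bucket name is made up).
const sampleEvent = {
  Records: [
    { s3: { bucket: { name: "video-input" }, object: { key: "uploads/my+video.mp4" } } },
  ],
};
console.log(buildVideoRow(sampleEvent));
```

Decoding the key matters because a file uploaded as "my video.mp4" arrives in the event as "my+video.mp4", and a later read from S3 with the undecoded key would fail.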

Stream the video blob from S3 and write it to the tmp disk so that the ffmpeg process can run on it
Run the extract audio ffmpeg command line operation: ffmpeg -i sample.mp4 -q:a 0 -map a sample.mp3
The extracted audio will be written to the tmp disk space. It then needs to be stored in the S3 extracted audio bucket, and the extractedAudioKey property on the video row updated.
Clean up the temporary disk space
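
As a rough illustration of these steps, the sketch below derives the temporary file paths and ffmpeg arguments for a given video key and removes the temporary files afterwards. The helper names (`planExtraction`, `cleanUp`) and the choice to name the audio file after the video with an `.mp3` extension are assumptions for illustration; downloading from and uploading to S3 is omitted.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: work out where the video and audio live on the tmp disk
// and which arguments to pass to ffmpeg.
const planExtraction = (videoBucketKey, tmpDir = os.tmpdir()) => {
  const { name, base } = path.parse(videoBucketKey);
  const inputPath = path.join(tmpDir, base);
  const outputPath = path.join(tmpDir, `${name}.mp3`);

  return {
    inputPath,
    outputPath,
    extractedAudioKey: `${name}.mp3`, // assumed naming scheme
    // Same operation as: ffmpeg -i sample.mp4 -q:a 0 -map a sample.mp3
    args: ["-i", inputPath, "-q:a", "0", "-map", "a", outputPath],
  };
};

// Remove the temporary files; `force` ignores files that were never written.
const cleanUp = (...filePaths) => {
  for (const filePath of filePaths) {
    fs.rmSync(filePath, { force: true });
  }
};

const plan = planExtraction("uploads/sample.mp4", "/tmp");
console.log(plan.args.join(" "));
```

Calling `cleanUp(plan.inputPath, plan.outputPath)` in a `finally` block would make sure the tmp disk is cleared even when ffmpeg or the upload fails, which matters because a warm lambda can reuse the same tmp disk across invocations.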

Concerns:

The tmp disk space available to a lambda is capped at roughly 500 MB (512 MB by default), so the execution will fail if the video and its extracted audio do not fit within it. This needs to be considered when reading videos, but for the MVP we can limit video uploads to 400 MB.
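
One way to enforce the MVP limit would be to reject oversized uploads before downloading anything, using the object size that the S3 put event already carries. The sketch below assumes the 400 MB limit proposed here; `MAX_VIDEO_BYTES` and `isWithinSizeLimit` are hypothetical names.

```javascript
// Assumed MVP limit of 400 MB, leaving headroom on the tmp disk for the extracted audio.
const MAX_VIDEO_BYTES = 400 * 1024 * 1024;

// Hypothetical guard: true when the uploaded object fits within the limit.
const isWithinSizeLimit = (record, maxBytes = MAX_VIDEO_BYTES) =>
  record.s3.object.size <= maxBytes;

const smallUpload = { s3: { object: { key: "small.mp4", size: 50 * 1024 * 1024 } } };
const largeUpload = { s3: { object: { key: "large.mp4", size: 450 * 1024 * 1024 } } };

console.log(isWithinSizeLimit(smallUpload)); // true
console.log(isWithinSizeLimit(largeUpload)); // false
```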

One option to investigate: is there a way to stream the file from S3 using createReadStream and pipe the response straight into the ffmpeg command?
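
A sketch of what that could look like: ffmpeg can read its input from stdin when given `pipe:0` as the input, so a readable stream can be piped into the child process instead of writing the video to disk first. This has not been tried against the lambda layer; `buildPipeArgs` and `extractAudioFromStream` are hypothetical helpers, and the `/opt/ffmpeg` default is an assumption about where the layer places the binary. One caveat worth checking: MP4 files whose index (the moov atom) sits at the end of the file generally cannot be read from a non-seekable pipe.

```javascript
const { spawn } = require("child_process");

// ffmpeg arguments that read the video from stdin ("pipe:0") instead of a file.
const buildPipeArgs = (outputPath) => [
  "-i", "pipe:0", "-q:a", "0", "-map", "a", outputPath,
];

// Hypothetical helper: pipe any readable stream (for example the result of
// s3.getObject(params).createReadStream() in AWS SDK v2) into ffmpeg.
const extractAudioFromStream = (videoStream, outputPath, ffmpegPath = "/opt/ffmpeg") =>
  new Promise((resolve, reject) => {
    const ffmpeg = spawn(ffmpegPath, buildPipeArgs(outputPath), {
      stdio: ["pipe", "ignore", "inherit"], // stdin piped, ffmpeg logs go to stderr
    });

    ffmpeg.on("error", reject); // e.g. the binary was not found
    ffmpeg.on("close", (code) =>
      code === 0
        ? resolve(outputPath)
        : reject(new Error(`ffmpeg exited with code ${code}`))
    );

    // ffmpeg may exit before consuming all input; swallow the resulting EPIPE.
    ffmpeg.stdin.on("error", () => {});
    videoStream.on("error", reject);
    videoStream.pipe(ffmpeg.stdin);
  });

// Usage (not executed here):
// await extractAudioFromStream(videoStream, "/tmp/sample.mp3");
```

This would only remove the video from the tmp disk; the extracted audio is still written there before being uploaded.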


cjjenkinson · 2021-07-31T08:30:21Z

Running the ffmpeg lambda layer once the lambda has been given access to it:

import childProcess from "child_process";

const handler = () => {
  // Extract the audio stream(s) of the input at the highest variable-bitrate quality.
  // Each flag and its value must be a separate array entry.
  const args = [
    "-i",
    "/tmp/sample.mp4",
    "-q:a",
    "0",
    "-map",
    "a",
    "/tmp/sample.mp3",
  ];

  // ... prepare the files locally before running the executable

  // Lambda layers are mounted under /opt
  const stdout = childProcess.execFileSync("/opt/ffmpeg", args, {});

  // ... rest of the logic to clear up locally written files from running the executable
};

h3dg3-Wytch · 2021-11-07T04:18:10Z

All resolved in https://github.com/cjjenkinson/captionease-backend/pull/13/files
