LPMS Bottleneck Tasks #997

Closed · Fixed by livepeer/lpms#141

j0sh opened this issue Jul 20, 2019

This issue tracks the tasks needed for livepeer/lpms#119

In order to minimize the overhead of per-segment transcode sessions, we need to keep the transcode session persistent across segments. This involves a major reworking of the Cgo transcoder internals from being "one-shot" to supporting resumable, per-segment transcoding.

The general behavior of the LPMS transcoding API will remain as-is, taking an input and synchronously returning multiple outputs. However, the input will also include a context pointer for the transcode loop. Under the hood, the API will dispatch the input to the transcode loop and wait until results are complete before returning to the caller.

There will also be a new API function introduced to stop the transcode loop.
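
To make the dispatch-and-wait behavior concrete, here is a rough, self-contained sketch of the pattern (not actual LPMS code; segmentJob, transcodeLoop, and the channel plumbing are placeholders): a long-lived loop owns the session state, receives per-segment jobs, and replies on a per-job channel so the caller-facing call stays synchronous.

package main

import "fmt"

// segmentJob is a hypothetical per-segment request handed to the loop.
type segmentJob struct {
  input   string
  results chan string
}

// transcodeLoop owns the persistent session state and processes segments one
// at a time until stop is closed.
func transcodeLoop(jobs <-chan segmentJob, stop <-chan struct{}) {
  segmentsSeen := 0 // stands in for decoder/encoder state carried across segments
  for {
    select {
    case job := <-jobs:
      segmentsSeen++ // reuse session state instead of reinitializing per segment
      job.results <- fmt.Sprintf("transcoded %s (segment %d of this session)", job.input, segmentsSeen)
    case <-stop:
      return // persistent-session teardown would happen here
    }
  }
}

func main() {
  jobs := make(chan segmentJob)
  stop := make(chan struct{})
  go transcodeLoop(jobs, stop)

  // The caller-facing call stays synchronous: dispatch a segment to the loop,
  // then block on its result before returning to the caller.
  for _, seg := range []string{"seg0.ts", "seg1.ts", "seg2.ts"} {
    res := make(chan string)
    jobs <- segmentJob{input: seg, results: res}
    fmt.Println(<-res)
  }
  close(stop) // analogous to the proposed API for stopping the transcode loop
}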

Modified LPMS Golang API

Probably similar to the Cgo API in usage (see the following section on Cgo), although the function signatures should be a bit friendlier. The signatures are derived from livepeer/lpms#124

type TranscodeContext *C.transcode_context
type TranscodeResults struct {
  ... other data ...

  // Hidden field for the transcoder.
  TranscodeCtx TranscodeContext
}
func Transcode3(input *TranscodeOptionsIn, ps []TranscodeOptions) (res []TranscodeResults, err error)
func TranscodeStop(ctx TranscodeContext) error

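For illustration, per-segment usage at the Go level might look roughly like the sketch below, assuming the signatures above land as written. The transcodeStream wrapper, the segments/profiles variables, and the Fname field on TranscodeOptionsIn are assumptions for illustration, and the exact field used to pass the context into TranscodeOptionsIn is left as a comment since it isn't pinned down yet.

// Hypothetical driver over the proposed API; not actual LPMS code.
func transcodeStream(segments []string, profiles []TranscodeOptions) error {
  var ctx TranscodeContext // nil for the first segment: a new session gets allocated
  defer func() {
    if ctx != nil {
      TranscodeStop(ctx) // tear down the persistent session when the stream ends
    }
  }()

  for _, seg := range segments {
    in := &TranscodeOptionsIn{Fname: seg /* plus the persistent context pointer (ctx) */}
    res, err := Transcode3(in, profiles)
    if err != nil {
      return err
    }
    if len(res) > 0 {
      // Retain the returned context so the next segment reuses the same session.
      ctx = res[0].TranscodeCtx
    }
  }
  return nil
}
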
Modified Cgo API

Invoke lpms_transcode for the first time to receive a transcode context handle. Use this handle to continue transcoding within the same session. Invoke lpms_transcode_stop with the handle to halt the session.

// Same as the existing `lpms_transcode`, with the addition of an `output_results` struct.
// See https://github.com/livepeer/lpms/issues/124#issuecomment-502888723 for details
int lpms_transcode(input_params *inp, output_params *params, output_results *res, int nb_outputs);

// API to terminate the transcode loop.
int lpms_transcode_stop(transcode_context *ctx);

typedef struct {
  ... existing fields ...

  // Pointer to a persistent transcode context.
  // If NULL, a new context is allocated.
  transcode_context *ctx;
} input_params;

typedef struct {

  // Number of encoded pixels. (Payment accounting)
  int64_t pixels;

  // Pointer to the persistent transcode context.
  // Use within `input_params` to continue the loop.
  // Must be released via `lpms_transcode_stop`.
  // This is a little redundant since we're returning a ctx
  // per transcoded rendition, but I don't think it's a big deal...
  transcode_context *ctx;

} output_results;
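
The call sequence above can be exercised end to end with stubbed-out C definitions. The sketch below is illustrative only: the typedefs and stub bodies in the cgo preamble are stand-ins for the real LPMS declarations, and they just fake the "NULL allocates, non-NULL reuses" contract described above.

package main

/*
#include <stdint.h>

// Stand-in definitions mirroring the structs above, stubbed so this sketch is
// self-contained; the real declarations live in the LPMS C sources.
typedef struct transcode_context { int dummy; } transcode_context;
typedef struct { transcode_context *ctx; } input_params;
typedef struct { int dummy; } output_params;
typedef struct { int64_t pixels; transcode_context *ctx; } output_results;

static transcode_context stub_session;

static int lpms_transcode(input_params *inp, output_params *params, output_results *res, int nb_outputs) {
  res->ctx = inp->ctx ? inp->ctx : &stub_session; // first call "allocates" the session
  res->pixels = 1280 * 720;
  return 0;
}

static int lpms_transcode_stop(transcode_context *ctx) { return 0; }
*/
import "C"

import "fmt"

func main() {
  var inp C.input_params
  var params C.output_params
  var res C.output_results

  inp.ctx = nil // NULL on the first segment, so a new session is allocated
  for seg := 0; seg < 3; seg++ {
    if rc := C.lpms_transcode(&inp, &params, &res, 1); rc != 0 {
      break
    }
    inp.ctx = res.ctx // thread the returned handle back in to keep the session
    fmt.Println("segment", seg, "encoded pixels:", res.pixels)
  }
  C.lpms_transcode_stop(inp.ctx) // release the persistent session when done
}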

Unknowns

  • Check NVENC state reset, and ensure it works for our needs
  • x264 does not appear to be flushable. Check the cost of setting up a new x264 session per segment
  • Determine whether we can recreate the muxer per segment

Code Changes

  • Separate IO (AVIOContext*) from demuxer within input_ctx
  • New muxer per stream (?)
  • Define "close loop" API - lpms_transcode_stop
  • Separate transcode loop thread from Transcode function entry point
  • Separate out initialization and teardown routines as necessary
  • Return thread handle or context pointer from Transcode function entry point
  • Implement older Transcode APIs in terms of the newer API to facilitate testing (a sketch follows below)
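
As a minimal sketch of the last item, assuming the Go signatures from earlier and using Transcode2 as a stand-in for an existing one-shot entry point (the real older-API names and signatures may differ):

// Hypothetical: a one-shot API expressed in terms of the persistent API.
func Transcode2(input *TranscodeOptionsIn, ps []TranscodeOptions) ([]TranscodeResults, error) {
  res, err := Transcode3(input, ps)
  if err != nil {
    return nil, err
  }
  // One-shot semantics: release the session immediately after the single segment.
  if len(res) > 0 && res[0].TranscodeCtx != nil {
    TranscodeStop(res[0].TranscodeCtx)
  }
  return res, nil
}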

To Test For

  • Preroll audio handling. Suspect we still need to drop the very first audio frame but subsequent ones should be OK since the encoder is persistent. Changes to existing preroll handling should be minimal (if anything at all), but it's something that still needs to be checked.
  • Ensure we can still correctly handle discontinuous and out-of-order segments for the same stream. Might require manually resetting some pts-related state.
  • Memory issues via Valgrind / asan
  • Memory usage with many idle transcode sessions, especially on GPUs
  • Concurrency issues with ThreadSanitizer

Go-Livepeer Client Integration

The integration between the goclient and the new LPMS API can be detailed later, but here is a sketch of some possibilities.

We might want to make usage more straightforward on the go-livepeer side, such as automatically closing transcoders on a timer after some period of inactivity, or upon transcode-loop expiration. This can be implemented within transcoder.go on the goclient side, as it seems preferable to keep transcoder usage explicit within LPMS.
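
As one possible shape for the inactivity timeout, here is a sketch with placeholder names (session, reapWhenIdle, idleTimeout) rather than existing go-livepeer code:

package main

import (
  "fmt"
  "time"
)

// session stands in for a persistent transcode session handle.
type session struct{ name string }

func (s *session) stop() { fmt.Println("stopping session:", s.name) }

// reapWhenIdle stops the session if no segment arrives within idleTimeout;
// each incoming segment resets the timer.
func reapWhenIdle(s *session, segments <-chan string, idleTimeout time.Duration) {
  timer := time.NewTimer(idleTimeout)
  defer timer.Stop()
  for {
    select {
    case seg, ok := <-segments:
      if !ok {
        s.stop() // stream ended explicitly
        return
      }
      fmt.Println("transcoding", seg)
      if !timer.Stop() {
        <-timer.C
      }
      timer.Reset(idleTimeout)
    case <-timer.C:
      s.stop() // no segments for a while: release the transcoder
      return
    }
  }
}

func main() {
  segs := make(chan string)
  go reapWhenIdle(&session{name: "stream0"}, segs, 50*time.Millisecond)
  segs <- "seg0.ts"
  segs <- "seg1.ts"
  time.Sleep(200 * time.Millisecond) // no further segments, so the session is reaped
}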

For GPU transcoding, we might want to track previously assigned GPUs, determine their state (busy/idle), and start a new transcode session on an idle GPU if necessary; subsequent segments would then have two local GPUs to choose from. We still need to determine the RAM implications of this.
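
And a rough sketch of the GPU-selection idea, again with placeholder types (gpuTracker and friends are not existing go-livepeer code):

package main

import "fmt"

// gpuTracker records which GPUs already hold a transcode session for a stream
// and whether each GPU is currently busy with a segment.
type gpuTracker struct {
  assigned []int        // GPUs that already have a session for this stream
  busy     map[int]bool // GPU device ID -> currently transcoding?
}

// pickGPU prefers an idle GPU that already has a session; otherwise it returns
// any idle GPU from the pool (which would require starting a new session), or
// -1 if everything is busy.
func (t *gpuTracker) pickGPU(pool []int) int {
  for _, id := range t.assigned {
    if !t.busy[id] {
      return id // reuse an existing idle session
    }
  }
  for _, id := range pool {
    if !t.busy[id] {
      return id // start a new session on an idle GPU
    }
  }
  return -1
}

func main() {
  t := &gpuTracker{assigned: []int{0}, busy: map[int]bool{0: true, 1: false}}
  fmt.Println("next segment goes to GPU", t.pickGPU([]int{0, 1}))
}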
