Capturing Audio Data

davidliu edited this page Sep 21, 2023 · 3 revisions

This page covers the ways to capture audio data from local and remote sources.

Capturing from Local Audio Source

The local audio source data can be obtained by calling setSamplesReadyCallback on the JavaAudioDeviceModule.Builder when creating a Room object:

Example:

val room = LiveKit.create(
    appContext = application,
    overrides = LiveKitOverrides(
        audioOptions = AudioOptions(
            javaAudioDeviceModuleCustomizer = { builder: JavaAudioDeviceModule.Builder ->
                // Listen to the local microphone's audio
                builder.setSamplesReadyCallback { audioSamples: JavaAudioDeviceModule.AudioSamples ->
                    processAudioData(audioSamples.data)
                }
            }
        )
    )
)

By default, the audio is encoded as 16-bit PCM. You can confirm this by comparing the AudioSamples.getAudioFormat() value against the encoding constants in Android's AudioFormat class, such as AudioFormat.ENCODING_PCM_16BIT. For example, with the default 16-bit format:

import java.nio.ByteBuffer
import java.nio.ByteOrder

fun processAudioData(audioData: ByteArray) {
    // Wrap in a ByteBuffer to handle endianness
    val byteBuffer = ByteBuffer
        .wrap(audioData)
        .order(ByteOrder.nativeOrder())
    // Stop once fewer than 2 bytes (one full sample) remain
    while (byteBuffer.remaining() >= 2) {
        // Each Short is one 16-bit sample
        val value = byteBuffer.getShort()
        // ...
    }
}
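
As an illustration of what the loop body might do with each sample, here is a minimal sketch that decodes a chunk into samples and computes its RMS level. The helper names pcm16ToShorts and rmsLevel are hypothetical (they are not part of the LiveKit or WebRTC APIs), and the sketch assumes 16-bit PCM in native byte order, as in processAudioData above.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder
import kotlin.math.sqrt

/**
 * Hypothetical helper: decodes 16-bit PCM bytes into samples.
 * Assumes native byte order unless another order is passed in.
 * A trailing odd byte, if any, is ignored.
 */
fun pcm16ToShorts(
    audioData: ByteArray,
    order: ByteOrder = ByteOrder.nativeOrder()
): ShortArray {
    val shortBuffer = ByteBuffer.wrap(audioData).order(order).asShortBuffer()
    val samples = ShortArray(shortBuffer.remaining())
    shortBuffer.get(samples)
    return samples
}

/** Hypothetical helper: RMS level normalized to 0.0..1.0; 0.0 for empty input. */
fun rmsLevel(samples: ShortArray): Double {
    if (samples.isEmpty()) return 0.0
    var sumOfSquares = 0.0
    for (sample in samples) {
        val normalized = sample / 32768.0
        sumOfSquares += normalized * normalized
    }
    return sqrt(sumOfSquares / samples.size)
}
```

Inside the callback this could be used as rmsLevel(pcm16ToShorts(audioSamples.data)), for example to drive a microphone level meter.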

Capturing from Remote Audio Source

There are two ways to capture remote audio data: mixed and individual tracks. Mixed audio data combines the audio from all remote participants and is what the client will eventually play back. Individual track data contains the audio for a single track only, which is useful if you want to capture the audio of a specific participant.

Mixed Audio

The mixed remote audio data can be obtained by calling setPlaybackSamplesReadyCallback on the JavaAudioDeviceModule.Builder when creating a Room object:

Example:

val room = LiveKit.create(
    appContext = application,
    options = RoomOptions(adaptiveStream = true, dynacast = true),
    overrides = LiveKitOverrides(
        audioOptions = AudioOptions(
            javaAudioDeviceModuleCustomizer = { builder: JavaAudioDeviceModule.Builder ->
                builder.setPlaybackSamplesReadyCallback { audioSamples: JavaAudioDeviceModule.AudioSamples ->
                    // Process audio data...
                    processAudioData(audioSamples.data)
                }
            }
        )
    )
)

By default, the audio is encoded as 16-bit PCM. You can confirm this by comparing the AudioSamples.getAudioFormat() value against the encoding constants in Android's AudioFormat class, such as AudioFormat.ENCODING_PCM_16BIT. For example, with the default 16-bit format:

import java.nio.ByteBuffer
import java.nio.ByteOrder

fun processAudioData(audioData: ByteArray) {
    // Wrap in a ByteBuffer to handle endianness
    val byteBuffer = ByteBuffer
        .wrap(audioData)
        .order(ByteOrder.nativeOrder())
    // Stop once fewer than 2 bytes (one full sample) remain
    while (byteBuffer.remaining() >= 2) {
        // Each Short is one 16-bit sample
        val value = byteBuffer.getShort()
        // ...
    }
}
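
To interpret a chunk beyond individual samples, it can help to know how many frames it holds and how much time it covers. The sketch below shows that arithmetic; frameCount and chunkDurationMs are hypothetical helper names, and the sample rate and channel count are assumed to come from AudioSamples.getSampleRate() and AudioSamples.getChannelCount().

```kotlin
/**
 * Hypothetical helper: number of frames in a PCM chunk, where one frame
 * holds one sample per channel. Defaults to 2 bytes per sample (16-bit PCM).
 */
fun frameCount(byteCount: Int, channelCount: Int, bytesPerSample: Int = 2): Int =
    byteCount / (channelCount * bytesPerSample)

/** Hypothetical helper: playback duration of a PCM chunk in milliseconds. */
fun chunkDurationMs(
    byteCount: Int,
    sampleRate: Int,
    channelCount: Int,
    bytesPerSample: Int = 2
): Double =
    frameCount(byteCount, channelCount, bytesPerSample) * 1000.0 / sampleRate
```

For instance, a 960-byte mono chunk at 48000 Hz holds 480 frames and covers 10 ms of audio.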

Individual Track Audio

The audio for a specific remote track can be accessed through the addSink method on RemoteAudioTrack.

Example:

val audioTrackSink = object : AudioTrackSink {
    override fun onData(
        audioData: ByteBuffer?, bitsPerSample: Int, sampleRate: Int,
        numberOfChannels: Int, numberOfFrames: Int,
        absoluteCaptureTimestampMs: Long
    ) {
        // audioData is nullable; skip processing if no buffer was delivered
        audioData?.let { processAudioData(it) }
    }
}
remoteAudioTrack.addSink(audioTrackSink)

By default, the audio is encoded as 16-bit PCM. You can confirm this by checking the bitsPerSample value passed to onData, which should be 16 for 16-bit PCM. For example, with the default 16-bit format:

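fun processAudioData(audioData: ByteBuffer) {
    // Assumption: samples are in native byte order (a ByteBuffer defaults to big-endian)
    val byteBuffer = audioData.duplicate().order(ByteOrder.nativeOrder())
    while (byteBuffer.remaining() >= 2) {
        // Each Short is one 16-bit sample
        val value = byteBuffer.getShort()
        // ...
    }
}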
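
The numberOfChannels and numberOfFrames values passed to onData describe how the buffer is laid out. As a minimal sketch, the hypothetical helper deinterleave16 below splits the buffer into one array per channel. It assumes the samples are 16-bit, interleaved by channel, and in native byte order; these are assumptions to verify against your setup rather than guarantees of the API.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

/**
 * Hypothetical helper: splits interleaved 16-bit PCM into one ShortArray
 * per channel. Assumes interleaved samples in the given byte order.
 */
fun deinterleave16(
    audioData: ByteBuffer,
    numberOfChannels: Int,
    numberOfFrames: Int,
    order: ByteOrder = ByteOrder.nativeOrder()
): Array<ShortArray> {
    // duplicate() leaves the caller's position untouched; the byte order
    // is not inherited by the duplicate, so it is set explicitly here
    val samples = audioData.duplicate().order(order).asShortBuffer()
    require(samples.remaining() >= numberOfChannels * numberOfFrames) {
        "Buffer holds fewer samples than numberOfChannels * numberOfFrames"
    }
    val channels = Array(numberOfChannels) { ShortArray(numberOfFrames) }
    for (frame in 0 until numberOfFrames) {
        for (channel in 0 until numberOfChannels) {
            channels[channel][frame] = samples.get()
        }
    }
    return channels
}
```

Called from onData as deinterleave16(audioData, numberOfChannels, numberOfFrames), this would yield a single array for mono audio and separate left and right arrays for stereo.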