This repository contains the source code for the Android client for the Speechly SLU API. Speechly allows you to easily build applications with voice-enabled UIs.
- Android 8.0 (API level 26) and above
- Android Emulator version must be >= 30.4.5
Add android-client to your build.gradle dependencies.
dependencies {
    implementation 'com.speechly:android-client:0.1.11'
}
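If the project uses the Gradle Kotlin DSL (`build.gradle.kts`) rather than the Groovy DSL, the same dependency would be declared with function-call syntax. This is a sketch that assumes the artifact coordinates shown above are unchanged:

```kotlin
dependencies {
    // Same coordinates as the Groovy snippet, written as a Kotlin function call
    implementation("com.speechly:android-client:0.1.11")
}
```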
Remember to add permissions for microphone input and network connections as well as audio stream sampling in AndroidManifest.xml:
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO"/>
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS"/>
Create a client that can be used for the entire lifetime of the app:
val speechlyClient: Client = Client.fromActivity(activity = this, UUID.fromString("your APP_ID"))
Then, create a button which handles the opening and closing of the microphone:
var button: SpeechlyButton = findViewById(R.id.speechly)
var buttonTouchListener = object : View.OnTouchListener {
    override fun onTouch(v: View?, event: MotionEvent?): Boolean {
        when (event?.action) {
            // Finger down: open the microphone and start a new context
            MotionEvent.ACTION_DOWN -> {
                speechlyClient.startContext()
            }
            // Finger up: close the microphone and stop the context
            MotionEvent.ACTION_UP -> {
                speechlyClient.stopContext()
            }
        }
        return true
    }
}
button.setOnTouchListener(buttonTouchListener)
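The press-and-hold logic above can also be pulled out into a small function that is testable without an Android device. The following is a minimal sketch, not part of the client: `VoiceContext`, `RecordingContext`, and `handleTouchAction` are hypothetical names, and the integer constants mirror the values of `MotionEvent.ACTION_DOWN`, `ACTION_UP`, and `ACTION_CANCEL` in the Android SDK. Treating `ACTION_CANCEL` like a release is an added design choice, so that the microphone also closes when a parent view intercepts the gesture.

```kotlin
// Hypothetical interface covering only the two calls used in the listener above.
interface VoiceContext {
    fun startContext()
    fun stopContext()
}

// Same values as MotionEvent.ACTION_DOWN, ACTION_UP and ACTION_CANCEL in the Android SDK.
const val ACTION_DOWN = 0
const val ACTION_UP = 1
const val ACTION_CANCEL = 3

// Maps a touch action to a microphone call; returns true when the action was handled.
fun handleTouchAction(action: Int, client: VoiceContext): Boolean = when (action) {
    ACTION_DOWN -> { client.startContext(); true }
    // Assumption: a cancelled gesture should close the microphone like a release.
    ACTION_UP, ACTION_CANCEL -> { client.stopContext(); true }
    else -> false
}

// Test double that records which calls were made, in order.
class RecordingContext : VoiceContext {
    val calls = mutableListOf<String>()
    override fun startContext() { calls += "start" }
    override fun stopContext() { calls += "stop" }
}

fun main() {
    val fake = RecordingContext()
    handleTouchAction(ACTION_DOWN, fake)
    handleTouchAction(ACTION_CANCEL, fake)
    println(fake.calls)
}
```

In an activity, `onTouch` could delegate to such a function by passing `event.action` and a thin wrapper around the real client.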
Finally, react to the events the API sends back:
speechlyClient.onSegmentChange { segment: Segment ->
    // Join the recognized words into a single space-separated transcript
    val transcript: String = segment.words.values.map { it.value }.joinToString(" ")
    print(transcript)
}
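To make the transcript-building step concrete, here is a minimal, self-contained sketch. The `Word` and `Segment` classes below are hypothetical stand-ins, not the library's own types; they assume only what the callback above implies, namely that `segment.words` is a map whose values expose a `value` string. Sorting by key is a further assumption that the keys are word positions within the utterance.

```kotlin
// Hypothetical stand-ins for the client's types, for illustration only.
data class Word(val value: String)
data class Segment(val words: Map<Int, Word>)

// Joins the words into one string, ordered by map key
// (assumed here to be the word's position in the utterance).
fun transcriptOf(segment: Segment): String =
    segment.words.toSortedMap().values.joinToString(" ") { it.value }

fun main() {
    // Words deliberately inserted out of order to show the effect of sorting.
    val segment = Segment(mapOf(1 to Word("ME"), 0 to Word("SHOW"), 2 to Word("SHIRTS")))
    println(transcriptOf(segment))
}
```

Passing the transform directly to `joinToString` avoids the intermediate list created by a separate `map` call.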
Check out the Speechly Android Client Example for a demo app built using this client.
You can find the detailed API documentation in the GitHub repository.
The web client tutorial also uses an event-driven client similar to this one.
Speechly is a developer tool for building real-time multimodal voice user interfaces. It enables developers and designers to enhance their current touch user interface with voice functionality for a better user experience. Key features of Speechly:
- Fully streaming API
- Multimodal from the ground up
- Easy to configure for any use case
- Fast to integrate into any touchscreen application
- Supports natural corrections such as "Show me red – I mean blue t-shirts"
- Real-time visual feedback encourages users to keep using their voice