Ability to read audio file? #12

Open
bewantbe opened this issue Apr 9, 2017 · 5 comments

@bewantbe
Owner

bewantbe commented Apr 9, 2017

As per a user request on the Play Store, it would be good to have the ability to load recorded files and show their spectrum/spectrogram.

Several things need to be decided first:

  • Load the whole file or only part of it?

    1. Loading the whole file would require tricky time zoom in/zoom out, and could be CPU and RAM intensive.
  • How should the user move along the time axis? Or should the file just play in real time? This is related to the previous question.

  • Do we need loudspeaker playback?
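To put a rough number on the RAM concern, here is a back-of-the-envelope sketch. The FFT length, hop size, and the choice of one `float` per bin are illustrative assumptions, not the app's actual settings, and the class and method names are hypothetical.

```java
// Rough estimate of the memory needed to keep a whole file's spectrogram in RAM.
// FFT length, hop size and float storage are assumptions made for illustration.
final class SpectrogramMemoryEstimate {
    /** Number of full FFT frames that fit in the recording for a given hop. */
    static long frameCount(long totalSamples, int fftLen, int hop) {
        if (totalSamples < fftLen) return 0;
        return (totalSamples - fftLen) / hop + 1;
    }

    /** Bytes needed to store one float magnitude per bin per frame. */
    static long spectrogramBytes(long totalSamples, int fftLen, int hop) {
        long binsPerFrame = fftLen / 2 + 1; // a real-input FFT yields fftLen/2 + 1 bins
        return frameCount(totalSamples, fftLen, hop) * binsPerFrame * Float.BYTES;
    }

    public static void main(String[] args) {
        int sampleRate = 44100;
        long tenMinutes = 600L * sampleRate;
        long bytes = spectrogramBytes(tenMinutes, 2048, 1024);
        System.out.printf("10 min at %d Hz: about %.1f MiB%n",
                sampleRate, bytes / (1024.0 * 1024.0));
    }
}
```

Under these assumptions a 10-minute mono file at 44.1 kHz needs roughly 100 MiB for the spectrogram alone, which suggests computing only the visible range or caching at a coarser resolution.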

Having all of these features would of course be good, but it could get too complex. It is better to solve the most wanted/useful part first.

@nfsmaster208
Contributor

The idea came from wanting to isolate and analyze a recording of a bird chirping.

I imagine the user recording a sample outdoors with the app, hearing a bird, and trying to identify it: 'What type of bird was that? A hawk, a blue jay, or a woodpecker?'

So 1) if the user has been recording with the app and later wants to look at the part where the bird chirped, the relevant question is how they find that part. Do they note the time in another app and continue recording? Or do they stop recording and then record again?

Let's first assume the recording was brief. Maybe 15 seconds of outdoor recording and then the user hears the bird and stops recording. The user will look at the overall picture of the recording and zoom to the timeframe where they believe the sound was recorded.

  1. The app's visible frame determines how many seconds of the recording are loaded into memory and from where: from the beginning of the frame to the end of the frame. Playback stays paused there until the user presses "run". If the user lets the app keep running, it plays that timeframe over and over.

  2. Loudspeaker playback would likely be necessary, and it is possible that playback is enabled by an external operation, e.g. to an embedded element for VLC Media player. This is something that Steven and I are looking into, hopefully without the dependency of an external library or SDK like VLC.
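The frame-based loading and looping described in point 1 could be sketched roughly as follows. This is only an illustration: `LoopRegion` and its methods are hypothetical names, and it assumes a mono file whose sample rate and length are already known.

```java
// Sketch: map the visible time frame to a sample range and loop over it.
// Names are hypothetical; assumes mono audio with a known sample rate.
final class LoopRegion {
    /** Converts a visible frame in seconds to a sample range [start, end), clamped to the file. */
    static long[] frameToSamples(double frameStartSec, double frameEndSec,
                                 int sampleRate, long totalSamples) {
        long start = Math.max(0L, Math.round(frameStartSec * sampleRate));
        long end = Math.min(totalSamples, Math.round(frameEndSec * sampleRate));
        return new long[] {start, Math.max(start, end)};
    }

    /** Sample index reached after `played` samples of looping over [start, end). */
    static long loopPosition(long start, long end, long played) {
        long len = end - start;
        if (len <= 0) return start; // empty region: stay at the start
        return start + played % len;
    }

    public static void main(String[] args) {
        // A 15 s recording at 44.1 kHz, with the view zoomed to 4.0 s .. 6.5 s.
        long[] r = frameToSamples(4.0, 6.5, 44100, 15L * 44100);
        System.out.println("load samples " + r[0] + " .. " + r[1]);
        System.out.println("after one full pass: " + loopPosition(r[0], r[1], r[1] - r[0]));
    }
}
```

Only the samples inside the returned range would need to be decoded and held in memory, which keeps the cost proportional to the zoom level rather than the file length.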

@bewantbe
Owner Author

I did something similar to the situation you described... Holding a laptop running Audacity (and Audition, shh) in one hand and a microphone in the other. Not a pleasant experience, I would say. And at that time there were very few spectrogram apps; they either lacked essential functions or kept crashing...

Enough prehistory.

My design goal (so far) for this app is a full-featured real-time spectrum/spectrogram monitor that serves as a baseline for more "advanced"/"professional" audio analyzers.

Your usage example goes more or less beyond the "real time monitor", so I have to re-think the goal. I'm not questioning the suggestions; they are good, and I used to do this on a PC. The question is just whether these are suitable tasks for a cell phone. Once you have the ability to view clips, some basic editing becomes desirable (adjust volume, copy/cut/paste), then more advanced audio processing is wanted... I can't see clearly where the boundary of the design goal is.

Say we limit the goal to treating the audio file as an audio source, and discuss the UI.

  • What is the reasonable action once the file is opened? Simulate playback as if doing real-time monitoring, or display the whole file's spectrogram?

  • What happens if the user presses "record"? Close the file and start a new recording, or insert the recording into the opened file?

  • If loudspeaker playback is considered, how is the playback range selected? Use the current view range, or place one or two extra cursors?

Technically the playback function is simple: just use the AudioTrack class. For decoding an audio file, the MediaCodec class can do that (I have tried both).

@nfsmaster208
Contributor

  • Once the file is opened, simulating the playback as real-time monitoring should suffice. If the user can "stop" the monitoring and swipe left or right to move the frame backward or forward, respectively, that could simulate the app's buffer for the whole file's spectrogram. If that functionality proves too difficult, it may make sense to just offer a limited subset of that functionality.
  • In the "load/play file" functionality, it would make sense to have a context key to switch quickly back to real-time recording, which would require hopping from the loaded media file to the appropriate audio source as defined in user preferences and immediately beginning to record. I do not think this app should need to record an audio sample over or into a previously recorded sample.
  • I think a single cursor that moves window by window (e.g. five seconds at a time and then abrupt movement to next five seconds) could be an option for some people. For others, they may prefer the real-time moving window, like spectrogram shifting in the app now.
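The two display options in the last bullet differ only in how the visible sample range is derived from the playback position. A minimal sketch, with hypothetical names and a five-second window assumed purely for illustration:

```java
// Sketch of the two view modes: a paged cursor versus a continuously sliding view.
// Class and method names are hypothetical.
final class ViewWindow {
    /** Paged mode: range [start, end) of the fixed-size page that contains the position. */
    static long[] pagedWindow(long positionSamples, long windowSamples, long totalSamples) {
        long start = (positionSamples / windowSamples) * windowSamples;
        long end = Math.min(start + windowSamples, totalSamples);
        return new long[] {start, end};
    }

    /** Sliding mode: range [start, end) that always ends at the current position. */
    static long[] slidingWindow(long positionSamples, long windowSamples, long totalSamples) {
        long end = Math.min(positionSamples, totalSamples);
        long start = Math.max(0L, end - windowSamples);
        return new long[] {start, end};
    }

    public static void main(String[] args) {
        int sampleRate = 44100;
        long window = 5L * sampleRate;   // five-second window
        long total = 12L * sampleRate;   // twelve-second file
        long pos = 7L * sampleRate;      // playback is at 7 s
        long[] p = pagedWindow(pos, window, total);
        long[] s = slidingWindow(pos, window, total);
        System.out.println("paged:   " + p[0] + " .. " + p[1]);
        System.out.println("sliding: " + s[0] + " .. " + s[1]);
    }
}
```

Since both modes produce the same kind of range, the choice between them could be a user preference without affecting the rest of the drawing code.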

One aspect I thought of that may pose an issue down the line is alternatively recorded audio samples, e.g. 44.1kHz vs 48kHz vs 192kHz sample rate recordings. Would those recordings need to be re-sampled to the appropriate sample rate for the app to plot their values correctly, or should the app simply attempt to determine the sample rate (or ask directly) and display a warning if the app does not recognize or support the current sample rate?

@bewantbe
Owner Author

As for the last issue, regarding sampling rate changes: I'm not quite sure what "alternatively recorded audio" means. I think:

Once a file is loaded, set the sample rate to match the file metadata. If the user changes the sample rate, view the samples at this new sample rate (no re-sampling; just let the pitch shift, i.e. only the axis labels change).

Then, if the user presses "Rec", record at this new sample rate.

Then, if the user changes the sampling rate during recording, stop the current recording and start a new one (this is also the current behavior).
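To make the "no re-sampling, only the axis label changes" rule concrete: the frequency shown for FFT bin k is k · fs / N, so viewing the same samples at a different rate scales every label by the ratio of the two rates. A small sketch with hypothetical names:

```java
// Sketch: how frequency labels change when the same samples are viewed at another rate.
// Names are hypothetical; no re-sampling is performed, only relabeling.
final class SampleRateRelabel {
    /** Frequency label in Hz for an FFT bin at the given sample rate. */
    static double binFrequencyHz(int bin, int fftLen, int sampleRate) {
        return (double) bin * sampleRate / fftLen;
    }

    /** Where a tone recorded at fileRate appears when viewed at viewRate. */
    static double relabeledFrequencyHz(double freqHz, int fileRate, int viewRate) {
        return freqHz * viewRate / fileRate;
    }

    public static void main(String[] args) {
        // The same bin gets a different label depending on the assumed rate.
        System.out.println(binFrequencyHz(512, 2048, 48000));
        System.out.println(binFrequencyHz(512, 2048, 44100));
        // A 1 kHz tone from a 48 kHz file, viewed as if it were 44.1 kHz.
        System.out.println(relabeledFrequencyHz(1000.0, 48000, 44100));
    }
}
```

The spectrogram pixels stay exactly the same in this scheme; only the numbers on the frequency axis (and the playback pitch, if played at the new rate) move.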

@nfsmaster208
Contributor

Missed the above message!

Yes, that is far more articulate and accurate compared to what I was saying. This behavior sounds correct for those situations.
