Deformers to implement #1
Note: fix timing vs. group delay with convolved impulse responses.
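For illustration, a minimal sketch of the compensation that note is about: estimate the impulse response's effective delay and shift annotation onsets to match. The helper below is hypothetical, not muda's API, and peak-picking is only a rough proxy for true group delay.

```python
import numpy as np
from scipy.signal import fftconvolve

def convolve_with_ir(y, ir, sr, onset_times):
    """Convolve a signal with an impulse response and shift event
    onsets to compensate for the IR's delay.

    The 'delay' is estimated as the index of the IR's
    largest-magnitude sample -- an approximation, not the true
    (frequency-dependent) group delay.
    """
    y_out = fftconvolve(y, ir)

    # Estimated delay in seconds: position of the IR's energy peak
    delay = np.argmax(np.abs(ir)) / float(sr)

    # Shift annotation onsets by the estimated delay
    shifted_onsets = [t + delay for t in onset_times]
    return y_out, shifted_onsets
```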
You should not use only ps. Nice project! I have some scripts to do that, but a Python library would be ideal. I will fork, make some changes, and then ask for a merge.
I'd prefer not to use command-line tools, but rather library calls. Python bindings weren't quite there at the time I needed this to work, so the cmdline stuff was hacked in. I'd also prefer to avoid proprietary (i.e., non-free software) dependencies. But otherwise: yeah, it'd be great to have a general audio effects binding! Do you think that's possible?
Yup! The details are in the muda paper, which (I hope!) explains what the difference between muda and adt is, and why we didn't simply fork adt.
Great! I'm also planning to do a bit more development on this and polish it into a proper Python library with tests and documentation, hopefully before the end of October.
Hi, thanks for the link to the paper; it's clear now. Calling command-line tools from Python must be avoided, sure. I have some bash scripts that use mrswatson and proprietary plugins. I also plan to do much of this work at the end of October.
"time clip" is duration? |
Roger on the "rather not use any command-line tools" point... I'd be keen to sync on this in a side-bar. Depending on the conversation, we can summarize for posterity here, or in a separate issue / proposal if need be.
offset + duration, yeah. Think of randomly slicing the data and getting time-aligned chunks out. This is usually done in sampling / training pipelines, but it could be considered an "augmentation" as well.
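As a sketch of that idea (the helper below is hypothetical, not muda's API):

```python
import numpy as np

def random_time_clip(y, sr, duration, rng=None):
    """Sample a random (offset, duration) clip from an audio buffer.

    Returns the audio chunk plus the (offset, duration) pair in
    seconds, so annotations can be cropped to the same window.
    """
    rng = rng or np.random.default_rng()
    n_clip = int(duration * sr)
    # Pick a random offset such that the clip fits in the signal;
    # if the signal is shorter than the clip, start at 0.
    start = int(rng.integers(0, max(1, len(y) - n_clip)))
    return y[start:start + n_clip], (start / sr, duration)
```

The corresponding annotations would then be cropped to the same time window, so the audio chunks stay time-aligned with their labels.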
What all did you have in mind?
I don't share the aversion to leveraging command-line interfaces under the hood if they provide functionality we can't otherwise get (easily) through native libraries / interfaces. I agree that proprietary hard dependencies are no-gos, but I quite like the idea of making the framework as versatile as possible, even if it means that a user might have to configure tools separately if they really want to harness muda. For example, with time-stretching we could provide different algorithms / backends for how this gets accomplished. Rubberband is fine, but what if I want to use Dirac, Elastique, or some other thing that doesn't / won't have a Python implementation?
That's why the deformer classes are extensible. Seriously though, cmdline dependencies are a total pain for maintainability. I'd have to check, but I'm pretty sure that 100% of the error reports I've received on muda have come down to broken cmdline dependencies with rubberband -- and that's a well-behaved and maintained package.
This sounds like bloat/feature creep to me. IMO, the current stretch/shift stuff is good enough for government work*, and our efforts are better spent broadening the types of available deformations, rather than adding six variations of a thing we already have.

*downstream feature extraction
Quick update: I have a first cut at chord simplification as part of a tag-encoding module here. It wouldn't be difficult to patch this into a muda deformer.
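For reference, a toy sketch of what chord-label simplification might look like. This is not the linked module; the mapping rules here are illustrative assumptions.

```python
def simplify_chord(label):
    """Reduce a chord label to root and coarse quality,
    e.g. 'C#:maj7/5' -> 'C#:maj'.  A toy sketch only."""
    if label in ('N', 'X'):           # no-chord / unknown pass through
        return label
    label = label.split('/')[0]       # drop inversions
    if ':' not in label:
        return label + ':maj'         # bare roots default to major
    root, quality = label.split(':', 1)
    # Collapse extended qualities onto a coarse vocabulary
    for base in ('min', 'maj', 'dim', 'aug'):
        if quality.startswith(base):
            return '{}:{}'.format(root, base)
    return '{}:maj'.format(root)      # e.g. dominant 7ths -> maj
```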
Hi, I would like to propose and add a new audio deformer to muda.
That sounds interesting, and it should be pretty easy to implement, since you don't have to do any annotation modification. Otherwise, the parameters you describe sound reasonable. The key thing is to push all of the parameters that the deformation function needs into the state object.
Ok, thanks for the reply. I'll work on that and make a pull request once I've validated some sound examples and produced the corresponding test functions.
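For illustration, a minimal sketch of the state-dict pattern described above, assuming a muda-style design in which a `states()` generator yields one dict of sampled parameters per deformation. The class and method names are hypothetical, not copied from muda's source.

```python
import numpy as np
import librosa

class PitchShiftSketch(object):
    """Illustration of the state-dict pattern: sample all randomness
    up front in `states`, so the audio and annotation deformers are
    pure functions of (input, state) and stay in sync."""

    def __init__(self, n_samples=3, sigma=1.0):
        self.n_samples = n_samples
        self.sigma = sigma

    def states(self):
        rng = np.random.default_rng()
        for _ in range(self.n_samples):
            # Everything the deformation needs lives in this dict
            yield dict(n_semitones=rng.normal(0.0, self.sigma))

    @staticmethod
    def audio(y, sr, state):
        return librosa.effects.pitch_shift(
            y, sr=sr, n_steps=state['n_semitones'])

    @staticmethod
    def deform_pitch(midi_values, state):
        # Annotations shift by the same sampled amount
        return [p + state['n_semitones'] for p in midi_values]
```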
@bmcfee quick question about the loudness deformer. Multi-loudness training (MLT) has been shown to be especially useful for far-field sound recognition (e.g., the original PCEN paper), so it would be a great deformer to have for projects such as BirdVox and SONYC. Perhaps a reasonable interface for this is for the user to provide min and max dBFS values, and then the deformer chooses a value uniformly in the provided interval and adjusts the gain of the input signal to match the selected value?
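A rough sketch of that proposed interface, assuming dBFS is measured as RMS relative to full scale (the function name and defaults are illustrative):

```python
import numpy as np

def random_loudness(y, min_dbfs=-40.0, max_dbfs=-3.0, rng=None):
    """Draw a target RMS level uniformly in [min_dbfs, max_dbfs]
    and scale the signal to match it."""
    rng = rng or np.random.default_rng()
    target_dbfs = rng.uniform(min_dbfs, max_dbfs)

    # Current level: RMS relative to full scale (1.0)
    rms = np.sqrt(np.mean(y ** 2))
    current_dbfs = 20.0 * np.log10(max(rms, 1e-10))

    # Linear gain needed to reach the target level
    gain = 10.0 ** ((target_dbfs - current_dbfs) / 20.0)
    return y * gain, target_dbfs
```

No annotation modification is needed, since a global gain change leaves event timing and labels intact.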
Yes, that's how ADT specified it (where this list originally came from). More generally, attenuation as a function of sub-bands (maybe notch filtering?), à la Sturm, might be useful as well.
That's more in the direction of EQ, no? Also a useful deformer, though I'd probably keep it separate from a global loudness deformer (color vs. intensity).
Sure, but the former is a special case of the latter. Seems reasonable to me to keep the implementation unified.
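For concreteness, a minimal sub-band attenuation sketch using a standard notch filter. The function and parameters are illustrative, not an existing muda deformer.

```python
from scipy.signal import iirnotch, filtfilt

def notch_deform(y, sr, f0, q=30.0):
    """Attenuate a narrow sub-band around f0 Hz with a notch filter,
    in the spirit of the Sturm-style sub-band attenuation discussed
    above.  Annotations need no modification for this deformer."""
    b, a = iirnotch(f0, q, fs=sr)
    # Zero-phase filtering avoids introducing timing offsets
    return filtfilt(b, a, y)
```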
Side note: once bmcfee/pyrubberband#15 gets merged, it would be possible to simulate tape-speed wobble (as done by ADT) by piecewise-linear approximation. We'd have to reimplement the timing logic for annotations, but this shouldn't be too difficult.
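Assuming the `timemap_stretch(y, sr, time_map)` interface proposed in that PR, where `time_map` is a monotonic list of (input_sample, output_sample) anchor pairs, a wobble sketch might look like this (annotation timing is omitted; names and defaults are illustrative):

```python
import numpy as np
import pyrubberband as pyrb

def tape_wobble(y, sr, rate_hz=0.5, depth=0.02, n_anchors=32):
    """Approximate tape-speed wobble with a piecewise-linear time
    map: anchors follow a slow sinusoidal timing deviation of up
    to +/- `depth` seconds."""
    n = len(y)
    src = np.linspace(0, n, n_anchors, dtype=int)
    # Sinusoidal timing deviation, in samples
    dev = depth * sr * np.sin(2 * np.pi * rate_hz * src / sr)
    dst = np.clip((src + dev).astype(int), 0, n)
    # Pin endpoints and keep the map non-decreasing, since
    # rubberband requires monotonic anchors
    dst[0], dst[-1] = 0, n
    dst = np.maximum.accumulate(dst)
    time_map = list(zip(src.tolist(), dst.tolist()))
    return pyrb.timemap_stretch(y, sr, time_map)
```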
Simple(ish) deformers (many from the audio degradation toolbox):

- … duration == 0
- … and duplicate them at some random offset, with degradation in confidence

Advanced deformers: