Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

code styling #4

Open
mattetti opened this issue Jan 5, 2017 · 10 comments
Open

code styling #4

mattetti opened this issue Jan 5, 2017 · 10 comments

Comments

@mattetti
Copy link
Member

mattetti commented Jan 5, 2017

I'd like to discuss code styling for a minute. It would be good to agree on a general approach of the code we'd like to write. Here are some examples, I'd like to hear more about what you think

##Code styling

Embedded type and optimized struct sizes

type Format struct {
	SampleRate uint32
	Channels   int16
}

type Buffer32 struct {
	Format
	Data []float32
}

vs pointers and convenient field types

type Format struct {
	SampleRate int
	Channels   int
}

type Buffer32 struct {
	Format *Format
	Data []float32
}

Convenient but straight forward constructor

func NewBuffer32(format Format, sampleCount int) *Buffer32 {
	return &Buffer32{
		Format: format,
		Data:   make([]float32, sampleCount, sampleCount),
	}
}

vs DIY approach. (note that in this code, the sample count should probably
multiple by the number of channels which is an easy thing to forget.)

&Buffer32{ 
    Format: format,
	Data:   make([]float32, sampleCount),
}

Explicit, manually, duplicated functions

func (buffer *Buffer32) Process32(output *Buffer32) error {
	// stuff
	return nil
}

func (buffer *Buffer64) Process64(output *Buffer64) error {
    // stuff
    return nil
}

vs interface and type switching

    func Process(buf1, buf2 Buffer) error {
        switch t := t.(type) {
        *Buffer32:
            // some stuff
        *Buffer64:
            // some other stuff
        }
        return nil
    }

vs using go generate for any functions implemented the same way in float32 and 64

I personally like:

2: I don't think the convenience of a having a field in a type we can easily manipulate trumps the memory size gain. Plus using a pointer would allow us to provide predefined formats.

3: Constructors are often convenient, especially as entry points and when the setup isn't trivial. This example is a good example since it was was written by hand and technically has a bug. This bug would be mitigated by a constructor. It does come at a price of a bigger API and potentially surprising behaviors.

Finally 5: It might be a bit of a pain at first but it works better with code completion (easier for people not knowing the API), it can be optimized (custom logic and/or SIMD) but can be the source of more bugs (fix something on the float32 side but not float64 for instance). 7 (code generation) is nice but it always feels a bit clunky to have to move things outside of the code generation path because of some edge cases.

@egonelbre @kisielk @nigeltao what do you think?

@kisielk
Copy link

kisielk commented Jan 5, 2017

For 1/2. I don't really have an opinion yet. I suspect that a pointer would be preferable but I don't really have any basis for that. I agree that having predefined formats would probably be useful.

3/4. Constructors all the way. It's pretty easy to screw up the buffer size if you're doing it manually.

5/6/7. I think we need to iterate a couple of sample applications to figure this out.

I think the best thing to do would be to write a simple application like a format converter and suss out some of the API. It would:

  • Accept a file as an input
  • Have [a] format flag(s) to specify the desired output format
  • An argument for output filename

Then the application could detect the format of the input file from a known set, and perform a conversion to the desired output type. This is pretty much the bare minimum "round trip" that the audio API would need to be able to handle (apart from the slightly more simple case of synthesizing audio to an output file).

@nigeltao
Copy link

nigeltao commented Jan 5, 2017

1 vs 2: not really either. Embedding is useful to pick up methods, but that's not a big win here. I'd rather explicitly name the field, but drop the pointer indirection:

type Buffer32 struct{
	Format Format
	etc
}

You don't need a pointer in the type to provide pre-defined Format values.

3 vs 4: 4 although on a similar note, you might be able to drop the "&".

5 vs 6 vs 7: hard to say in the abstract. It depends on what the Buffer type actually is and how sources, sinks and filters are supposed to use it.

On an unrelated note (but not having anywhere better to say this), good luck with the repo, but I don't really have the spare time to pay this a lot of attention.

@mattetti
Copy link
Member Author

mattetti commented Jan 5, 2017

@kisielk I have an example like going between wav and aiff formats. It gets interesting when we add processing such as downsampling. I'll post what I have so we can look at it.

@nigeltao Thanks, your feedback is very much appreciated. We'll ping you once we'll have made more progress to see what you think.

In the meantime, I tried to compare the 2 synth examples @egonelbre and I worked on:

// synthesis API 1 (similar to web audio)
sine := &osc.Sine{}
sine.Frequency.Set(440.0)

gain := &effect.Gain{}
gain.Value.Set(0.5)

buf := audio.NewBuffer32(audio.Format{
    Channels:   1,
    SampleRate: 44100,
}, 512)

// get/set values on the transform
gain.Value.Set(gain.Value.Get() + 0.10)

// chain generator
sine.Process32(buf)
// and processor
gain.Process32(buf)
// synthesis API 2
buf := &audio.FloatBuffer{
    Data:   make([]float64, bufferSize),
    Format: audio.FormatMono44100,
}

osc := generator.NewOsc(generator.WaveSine, 440.0,
 buf.Format.SampleRate)

// get/set values on the generator
osc.Amplitude += 0.10

// chain generator
osc.Fill(buf)
// and processor
transforms.Gain(buf, 0.70)

API 1 reminded me of https://developer.mozilla.org/en-US/docs/Web/API/AudioNode which is quite a nice API but feels clunky at time. In API 2, thread safety isn't taken in consideration and it would be up to the developer to lock their code. Because a lot of audio code doesn't run on multiple threads, that might be fine.

@egonelbre
Copy link

For optimized struct sizes there is also:

type Format struct {
	SampleRate32 uint32
	Channels16   int16
}

func (f Format) SampleRate() int { return int(f.SampleRate32) }
func (f Format) Channels() int { return int(f.Channels16) }

I would go with convenient, without pointers and pay the extra 8 byte cost. When you need to optimize for memory usage, you probably have more knowledge of the whole system and can use a single format in a group of buffers.


Constructors or constant defaults.

const Mono44100 = Format{44100, 1}
const Stereo44100 = Format{44100, 2}

Interface/type-switching. Here are my arguments:

For the core of processing we shouldn't be dealing with N-different buffer formats, otherwise trying to make everything consistent will be quite difficult. Hence the interface will be unnecessary.

With different ProcessX it is clearer what formats you support -- although, not completely clear in terms of how many channels you support.

Single ProcessX function will be non-trivial. If I got an interface as an argument I would probably end-up immediately switching to Process32/Process64 rather than writing them inline.

Using go generate would still be an option. With some limitations, this could work:

//go:generate go-audio-generate Process64
func (buffer *Buffer32) Process32(output *Buffer32) error {
    // ...
    return nil
}

@kisielk
Copy link

kisielk commented Jan 5, 2017

I agree, most of the time a given application will only deal with one buffer format. There may be some rare cases where you want to be able to process differently in float32 or float64, but I feel like in most cases you'd just convert from one to the other. Having to type switch every time you write a Process function is a nuisance so I would prefer to just have separate Process32 and Process64 methods. They wouldn't need to both be implemented, a client could check for support via interfaces for example.

It's true that it doesn't leave any way to express the number of supported channels in the function signature, but I haven't really seen any API that does that yet. Most of the time the number of channels is either set by convention or reported via some other functions or properties. For example in web audio, the AudioNode type has the channelCount and channelCountMode properties.

@egonelbre
Copy link

I tried to avoid this for few reasons:

// chain generator
osc.Fill(buf)
// and processor
transforms.Gain(buf, 0.70)

It will be harder to write generic processors such as:

type NodeSequence []audio.Node

func (nodes NodeSequence) Process32(output *Buffer32) error {
	for _, node := range nodes {
		if err := node.Process32(output); err != nil {
			return err
		}
	}
	return nil
}

With regards to thread-safety. I think having the "atomic" values in the node is necessary, however we should have examples available how to synchronize, when your audio code runs in a thread.

There could be some package control for synchronization:

package control

type Float64 struct {
	mu sync.Mutex
	value  float64
	target *float64
}

func (p *Float64) Poll() {
	p.mu.Lock() // do this without locks
	*p.target = p.value 
	p.mu.Unlock()
}

func (p *Float64) Set(v float64) {
	p.mu.Lock()
	p.value = v
	p.mu.Unlock()
}

@kisielk
Copy link

kisielk commented Jan 5, 2017

Synchronizing on a per-sample basis seems a bit extreme to me. I think if you were doing any kind of parallel processing it would be at the buffer level. The main use case I can see is where you have a single source feeding multiple destinations and then those are joined together later.

@egonelbre
Copy link

Having to type switch every time you write a Process function is a nuisance so I would prefer to just have separate Process32 and Process64 methods. They wouldn't need to both be implemented, a client could check for support via interfaces for example.

We can provide automatic converters:

func (sine *Sine) Process64(buf *audio.Buffer64) error {
	return convert.Node32_Process64(sine, buf)
}

// package convert
func Node32_Process64(n audio.Node32, buf *audio.Process64) error {
	// ...
}

Not sure where to allocate/keep the intermediate buffers, for this approach.

Alternative approach would be to implement Converter nodes:

type Convert32To64 struct {
	Node    audio.Node32
	Scratch audio.Buffer32
}

// usage
type Sine struct {
	// ...
	convert Convert32To64
}

func (sine *Sine) Process64(buf *audio.Buffer64) error {
	return sine.convert.Process64(buf)
}

@egonelbre
Copy link

Synchronizing on a per-sample basis seems a bit extreme to me. I think if you were doing any kind of parallel processing it would be at the buffer level.

Yeah, the control package was meant as "call Poll once per buffer". Example:

type Control interface { Poll() } // maybe name Sync would be better

func main() {
	gain := transform.Gain{0.5}

	var controls []Control
	gainValue := control.NewFloat64(&gain.Value)
	controls = append(controls, gainValue)

	go func(){
		time.Sleep(time.Second)
		gainValue.Set(0)
	}()

	for {
		for _, ctrl := range controls {
			ctrl.Poll()
		}

		// process the buffer
		gain.Process32(work)
		// ... write to output
	}
}

@egonelbre
Copy link

With regards to thread-safety. I think having the "atomic" values in the node is necessary, however we should have examples available how to synchronize, when your audio code runs in a thread.

Yeah... that should have been "...atomic values in the node are not necessary...". So, I'm fine with not having atomic controls in Node-s. There are other ways of getting the same result.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants