
Confidence intervals for deviations #47

Open
aewallin opened this issue Jul 17, 2016 · 9 comments
@aewallin (Owner)

We should add confidence intervals (error bars) for the computed deviations.
For most common deviations the equivalent degrees of freedom are already computed in uncertainty_estimate().

However, it looks like this will require a change in the API, both for the calling signature, which is now:
mdev(data, rate=1.0, data_type="phase", taus=None)
to something like:
mdev(data, rate=1.0, data_type="phase", taus=None, ci=None)
where the ci parameter would select between simple 1/sqrt(N) error bars and the more complicated calculation based on edf (equivalent degrees of freedom). In theory one could allow asymmetric confidence intervals, but perhaps this is not required? For example, ci=0.683 would give standard-error (1-sigma) confidence intervals.
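For illustration, here is a rough sketch (not allantools code; the function names are made up) of how the two ci options could differ, using the chi-squared distribution from scipy to turn an edf value into a two-sided interval:

```python
import numpy as np
from scipy.stats import chi2

def simple_errorbar(dev, n):
    """Naive 1/sqrt(N) error estimate for a deviation."""
    return dev / np.sqrt(n)

def edf_confidence_interval(dev, edf, ci=0.683):
    """Two-sided confidence interval on a deviation, given its
    equivalent degrees of freedom (chi-squared method)."""
    alpha = 1.0 - ci
    # the variance estimate scales as edf / chi-squared quantile
    lo = dev * np.sqrt(edf / chi2.ppf(1.0 - alpha / 2.0, edf))
    hi = dev * np.sqrt(edf / chi2.ppf(alpha / 2.0, edf))
    return lo, hi

lo, hi = edf_confidence_interval(1e-11, edf=10.0, ci=0.683)
```

For small edf the chi-squared interval is noticeably asymmetric around the deviation, which is exactly what the simple 1/sqrt(N) error bar cannot express.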

Also on the output-side we need an API change since the return values are now:
(taus, devs, deverrs, ns)
A straightforward change would be to add the lower and upper confidence intervals:
(taus, devs, devs_lo, devs_hi, ns)
But I wonder if this starts to be too many variables to remember when calling the function.
An alternative would be a DevResult object where the outputs would have names:
myresult = allantools.adev(...)
myresult.taus
myresult.devs
myresult.devs_lo
..and so on...
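As a quick sketch, such a DevResult could be as simple as a namedtuple (all field names and values here are illustrative):

```python
from collections import namedtuple

# hypothetical result container with named fields instead of positions
DevResult = namedtuple("DevResult",
                       ["taus", "devs", "devs_lo", "devs_hi", "ns"])

result = DevResult(taus=[1, 2, 4],
                   devs=[1e-11, 8e-12, 6e-12],
                   devs_lo=[9e-12, 7e-12, 5e-12],
                   devs_hi=[1.1e-11, 9e-12, 7e-12],
                   ns=[999, 499, 249])

result.devs_lo  # named access, no positional unpacking needed
```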

any other ideas or comments on this?

@aewallin (Owner, Author)

In the above I forgot about noise estimation completely.

On the input side we need a noise estimate from the user (e.g. the alpha parameter, the noise-PSD exponent), or an indication of which noise-estimating function to use, e.g. "B1" (see NIST SP1065).

On the output side, the estimated noise type for each (tau, dev) pair, shown as an integer alpha, could also be output.

With so many output parameters I am now leaning towards a ResultObject, so that one doesn't have to remember positional output arguments...

@mivade (Contributor) commented Jul 17, 2016

I definitely like the idea of using a ResultObject (or maybe just a dict) for the output. Besides not having to remember the order of positional outputs, it would also mean that further changes to the output values would not break existing code.

@aewallin (Owner, Author)

It looks like the latest Stable32 doesn't use the simple formulas of NIST SP1065, but instead an algorithm by Greenhall, available here:
http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20050061319.pdf
I will try to implement it over the next week, so that we at least have some tests and algorithms that reproduce the confidence intervals from Stable32. After that we can change the API.

@fmeynadier (Contributor)

Hello,

Trying to work out whether it is really necessary to break the API again:

  • For the input: adding optional input parameters is straightforward and will not break the API.
  • For the output: deverrs could be defined as either a single array or a pair of arrays [deverr_lo, deverr_hi].

I am not sure yet what to do with the "estimated noise" integer output value. Another possibility would be to write a wrapper object, leaving the current functions as they are.

@mivade (Contributor) commented Jul 22, 2016

I've started implementing a proof-of-concept results object here. It's not ready yet, but I have added an rtype keyword argument to the calculations, allowing easy switching between the current tuple return values and a unified object return value. Advantages of using an object:

  • Robust against future additions to return values
  • Results are immutable thanks to the use of namedtuple
  • Ability to add convenience methods to the results object (e.g. producing quick plots as implemented so far).
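A minimal sketch of what such an rtype switch might look like (fake_dev and its outputs are placeholders for illustration, not the actual proof-of-concept code):

```python
from collections import namedtuple

# immutable, named result type (field names illustrative)
Result = namedtuple("Result", ["taus", "devs", "deverrs", "ns"])

def fake_dev(data, rtype="tuple"):
    # placeholder computation standing in for a real deviation
    taus, devs, deverrs, ns = [1.0], [1e-11], [1e-12], [len(data)]
    if rtype == "object":
        return Result(taus, devs, deverrs, ns)
    return taus, devs, deverrs, ns

t, d, e, n = fake_dev(range(100))            # legacy tuple unpacking
res = fake_dev(range(100), rtype="object")   # named, immutable access
```

Since a namedtuple is still a tuple, even the "object" return value could be unpacked positionally by legacy code.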

Thoughts? I could make a pull request if it would make more sense to continue discussion there (I have not yet since this is nowhere near ready).

@fmeynadier (Contributor)

Hi,

Here is my take on this issue: fmeynadier/allantools@fab5bc9

I propose a simple front-end object (which could also use the results object proposed above) that encapsulates the existing functions and leaves room for adding other inputs/outputs/methods.

You can test it simply by getting frontend.py and calling it along these lines (a fancier call should be possible, but you get the idea):

import allantools
import allantools.frontend
import numpy as np

a = allantools.frontend.AllantoolsFrontend()
a.set_input(np.random.rand(1000))
a.compute(allantools.mdev)
print(repr(a.out))

@mivade (Contributor) commented Jul 28, 2016

@fmeynadier I'm personally not a fan of that style of giving inputs, since it tends to overuse object-oriented features to the point of becoming too verbose (it isn't in this simple case, but I worry it could become so if more is added). Instead I prefer keyword arguments to functions, which still allow a lot of flexibility. That said, the nice thing about your proposal is that we can have both options, so whichever style the user prefers is viable. Some suggestions:

  1. It would seem cleaner to me if there were aliased methods, so that instead of a.compute(function) one could call a.mdev().
  2. It would be nice to be able to provide the input data on initialization.
  3. The name "frontend" doesn't seem terribly descriptive. Maybe something like allantools.dataset.Dataset?

@fmeynadier (Contributor)

Thanks for the comments. I agree: I am not a fan of full OO style either, and I confess I am not very good at picking names...

So, please have a look at fmeynadier/allantools@a77c07a; here are my answers:

  1. I agree, but for the time being (i.e. in the early phases of development) I'd prefer to keep the generic compute method: less code to write and modify if we change our minds about Dataset(). But in the end, yes, we should have one method per allantools statistics function (the benefits, apart from cleaner calls: it naturally limits the set of functions that can be called, and it allows calls to be customized when necessary).
  2. Done. data is a keyword argument there too, so it can be optional.
  3. Done. I also added the relevant lines to the package's __init__.py, so now you can simply use it as:
import allantools
import numpy as np

a = allantools.Dataset(data=np.random.rand(1000))
a.compute(allantools.mdev)
print(repr(a.out))

@mivade (Contributor) commented Jul 30, 2016

Looks good.

I think you make a good point with using the compute method. We might even be able to use some Python magic to automatically map method calls to the relevant computations, which would mean not having to manually update or add extra functions.
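One possible flavour of that "Python magic", sketched with __getattr__ (the statistics registry and its stub functions are illustrative, not the real allantools functions):

```python
class Dataset:
    # registry of available statistics (stubs standing in for
    # real allantools functions like adev/mdev)
    STATS = {
        "adev": lambda data: "adev result for %d points" % len(data),
        "mdev": lambda data: "mdev result for %d points" % len(data),
    }

    def __init__(self, data=None):
        self.data = data
        self.out = None

    def compute(self, func):
        self.out = func(self.data)
        return self.out

    def __getattr__(self, name):
        # only called when normal attribute lookup fails,
        # so data/out/compute are unaffected
        if name in self.STATS:
            return lambda: self.compute(self.STATS[name])
        raise AttributeError(name)

a = Dataset(data=list(range(1000)))
a.mdev()  # dispatches to a.compute(...) with the registered mdev stub
```

New statistics then only need an entry in the registry; no per-function method has to be written by hand.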

One final naming suggestion: maybe out should be called result instead?

@aewallin aewallin mentioned this issue Jul 3, 2019
@aewallin aewallin mentioned this issue Jul 27, 2019