stat_analysis:

This repository contains several statistical tools useful for scientific computations.

installation

pip install git+https://github.com/pretidav/stat_analysis

pfit

Is an linear (in the parameters) fitting script which takes into account statistical uncertainties on y-axis. Chi^2 is used a likelihood function, minimized analytically.

Once requirements are installed with

pip install requirements

For unit testing the code can be executed as

python pfit.py

Import the module as

from stat_analysis.pfit import pfit

For common usage, class pfit can be customized with the following arguments:

x          # input data in list or np.array format 
y          # input data in list or np.array format
dy         # [Optional] input data in list or np.array format, if missing error is ignored (i.e. =1)
ndeg=1     # degree of polynomial, i.e. 0 is a constant fit, 1 is a*x+b and so on.
kconst=[2] # [Optional] a list of coefficients, starting from the highest to the lowest degree. (e.g. ndeg=3, kconst=[2,3] is a*x**3+b*x**2+2*x+3) these coefficients are kept fix in the fit.

.log(): fit parameters with uncertainties, reduced chi2, and covariance matrix are displayed.

==================== Fit Log ====================
 k       = [4.98881479 1.89398851 6.6610509 ]
dk       = [ 1.25188083  9.53787807 10.14975134]
t        = [3.98505565 0.19857546 0.65627725]
p-values = [0.00528968 0.84823908 0.53261758]
chi2/dof = 0.0007074425784127696
dof      = 7
cov      = 
[[  1.56720562 -11.82348128  11.5408929 ]
 [-11.82348128  90.971118   -91.18678554]
 [ 11.5408929  -91.18678554 103.01745234]]
pred   = 631.1415143152262
dpred  = 58.04015781775017

.plot(): plot the data and fit with error band. Optional parameters are xlabel,ylabel and savefile. if nor specified the plot will not be saved.

.predict(): plot the data and fit with error band and the predicted point x. Optional parameters are xlabel,ylabel and savefile. if not specified the plot will not be saved.

example usage:

x = np.linspace(0,10,num=10)
y = 5*x**2+ 2*x + 5 + 2*np.random.rand(len(x))
dy = 100*np.random.rand(len(x))

ff = pfit(x=x,y=y,dy=dy,ndeg=2)
ff.fit()
ff.plot(xlabel='x',ylabel='y',savefile='../fig/quad.png')
plt.clf()
ff.predict(x=11,xlabel='x',ylabel='y',savefile='../fig/quad_pred.png')
ff.log()

nlfit

This routine allows to perform non-linear fits with errors on both axis. If errors are not provided they are ignored. Input function has to be passed in this form (for instance):

Once requirements are installed with

pip install requirements

For unit testing the code can be executed as

python nlfit.py

Import the module as

from stat_analysis.nlfit import nlfit

For common usage, class pfit can be customized with the following arguments:

f  # input function to be fitted (see below) 
x  # input data in list or np.array format
y  # input data in list or np.array format
dx # [Optional] input data in list or np.array format
dy # [Optional] input data in list or np.array format
k0 # parameters initial values for the optimizer e.g. [0.,0., ... ]. Notice that the length of this vector *must* match the number of parameter of f function.

The input function f should be of this form:

import numpy as np
    def f(B, x):
        return np.exp(B[0]*x) + B[1]*np.sin(x) + B[2]

.log(): fit parameters with uncertainties, R^2, and covariance matrix are displayed.

==================== Fit Log ====================
 k       = [0.20230562 0.91546002 5.48977518]
dk       = [0.0030456  0.06418912 0.06665966]
t        = [66.42561823 14.26191952 82.35528149]
p-values = [0.00000000e+00 6.87618851e-11 0.00000000e+00]
R**2     = 5.540221043876636
dof      = 17
cov      = 
[[ 9.27565727e-06 -4.12808332e-05 -1.41792933e-04]
 [-4.12808332e-05  4.12024270e-03 -6.44158588e-05]
 [-1.41792933e-04 -6.44158588e-05  4.44351047e-03]]
 k       = [13.05072629]
dk       = [0.24797352]
=================================================

.plot(): plot the data and fit with error band. Optional parameters are xlabel,ylabel and savefile. if nor specified the plot will not be saved.

.predict(): plot the data and fit with error band and the predicted point x. Optional parameters are xlabel,ylabel and savefile. if not specified the plot will not be saved.

example usage:

    def f(B, x):
        return np.exp(B[0]*x) + B[1]*np.sin(x) + B[2] 

    x = np.linspace(0,10,num=20)
    y = f([0.2,1,5],x) + 1*np.random.rand(len(x))
    dy = 1*np.random.rand(len(x))
    dx = 0.5*np.random.rand(len(x))

    ff = nlfit(func=f,x=x,y=y,dx=dx,dy=dy,k0=[0.,0.,0.])
    ff.fit()
    ff.plot(savefile='../fig/nonlin.png')
    plt.clf()
    ff.predict(x=10.5,xlabel='x',ylabel='y',savefile='../fig/nonlin_pred.png')
    ff.log()

bootstrap

This routine allows for two kind of bootstrap resampling methods.

bootstrap: usual bootstrap with repetition with possibility of data blocking.
synt_bootstrap: syntetic bootstrap data generated stating from a vector of parameters with a given covariance.

The modules can be imported as

from stat_analysis.bootstrap import bootstrap
BS=bootstrap(k=data,Nb=10000000)
b=BS.sample()
print(np.mean(b,axis=0))
print(np.std(b,axis=0))

or

from stat_analysis.bootstrap import synt_bootstrap

k = [2,3]
cov = [[0.3,0.5],[0.5,2]]
BS=synt_bootstrap(k=k,Nb=10000000,cov=cov)
b=BS.sample()
print(np.cov(b.T))
print(np.mean(b,axis=0))

comparator

This routine allows for the comparison of two arrays of data. A pipeline of statistical tests, including Welch's t-test, Student's t-test and Kolmogorov-Smirnov test are included to distinguish with a given confidence if the data are coming from the same distribution or not.

from stat_analysis.comparator import comparator
a = [30.02,29.99,30.11,29.97,30.01,29.99]
b = [29.89,29.93,29.72,29.98,30.02,29.98]
aa = comparator(A=a,B=b,normaltest=False)
bb = aa.compare()
aa.plot(bins=100)
print(bb)

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
.github/workflows		.github/workflows
fig		fig
stat_analysis		stat_analysis
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

stat_analysis:

installation

pfit

example usage:

nlfit

example usage:

bootstrap

comparator

About

Releases 3

Packages

Languages

License

pretidav/stat_analysis

Folders and files

Latest commit

History

Repository files navigation

stat_analysis:

installation

pfit

example usage:

nlfit

example usage:

bootstrap

comparator

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 3

Packages 0

Languages

Packages