Currently in rewriting phase.
Package for uncertainty analysis in Python. It was developed while I was doing my Bachelor in physics. The functionality is based on the lab courses.
The function exec is used within some functionalities. This can cause harm if exploited.
The main classes of the package are Data
, Table
and Formula
.
The package holds functionality for:
- Error/Uncertainty propagation in equations
- Create tables and calculate different types of averages including arithmetic, geometric, modus, ...
- Normal and simplified gauss propagation of uncertainties
- Regressions including covariant matrices as well as automated chi2 test and residuum plot
- Propagation of units within equations
The Data
-class holds a mean value and an error.
Furthermore, can a Data
carry a Unit
.
The basic arithmetic methods including *, /, +, -
are implemented.
Using these the error is propagated using simplified gauss.
For a normal propagation use a Formula
.
For natural and other constants the Const
-class can be used.
A Const
also carries a Unit
but not an uncertainty.
Controls if an instance of a class is modified during a method.
Raises an AttributeError
if the instance is changed.
USAGE AS DECORATOR:
class Dummy:
@instancemethod
def fuzz(self, a):
return Dummy(...)
Controls for a given sequence of tuples if each variable has a specific type.
Raises an TypeError
if the type is incorrect.
type_check(*args) -> True
USAGE:
>>> type_check((3, int), (1.2, float))
True
>>> type_check([("3", int)])
TypeError: Type of '<class 'str'>' is false! Try <class 'int'> instead
Controls if each element of a given list as a given type.
list_type_check(lst, data_type) -> bool
lst
:List[object]
= list of objects which should have the same typedata_type
data_type
:type
= specific type
Returns True
if all objects
in lst
have the type data_type
otherwise returns False
.
USAGE:
>>> list_type_check([2, 3, 1], int)
True
>>> list_type_check([2, "3", 1], int)
False
Represents a unit like 'm' for meter.
Unit(numerator="", denominator="")
numerator
=str
represents the upper units in a fractiondenominator
=str
represents the lower units a fraction
For each numerator
and denominator
can a sequence of units be defined as specified by the following sign-EBNF:
S := '"' units '"'
units := unit | unit ';' units
unit := string | string '^' integer
METHODS:
flip()
:
inverts the unit which means an exchange of numerator and denominator
SPECIAL OPERATIONS:
types | + | - | * | / | > | < | == | <= | >= |
---|---|---|---|---|---|---|---|---|---|
Unit |
Unit |
Unit |
bool |
USAGE:
>>> a = Unit("m", "s")
>>> a
m/s
>>> b = Unit("J;s", "m")
>>> b
(J*s)/m
>>> c = a * b
>>> c
J
Assistance functionality for working with units.
Controls if the given argument other
carries the same unit as self
.
Raises an TypeError
if other
does not carry any unit.
Raises an ValueError
if other
does not carry the same unit.
USAGE AS DECORATOR:
class Dummy:
@unit_control
def __add__(self, other):
pass
Returns if the given variable has a unit
attribute.
has_unit(variable) -> bool
variable
= object to check
USAGE:
>>> has_unit(3)
False
Provides the Data
-class and the Const
-class.
Represents a real measurement with an uncertainty.
Data(value: str, error: str, sign: Union[str, Unit] = "", power: int = 0, n: int = 0)
value
= string of the mean valueerror
= string of the uncertaintysign
= either string orUnit
. Since only one string is given the usage differs a bit from the one inUnit
. The extended sign-EBNF reads:The string will be split at '/' intoS := '"' units '"' | '"' units '/' units '"' units := unit | unit ';' units unit := string | string '^' integer
numerator
anddenominator
.power
= additional factor to compensate pre-factors (eg. km -> 10^3 m -> power = 3). It is advantageous to work with SI-units only and usepower
instead of kilo, milli, ...n
= number of significant digits. Has to be greater than 0. If 0 (or not set seperatly), the number of digits will be determined from the length oferror
SPECIAL OPERATIONS:
types | + | - | * | / | ** | > | < | == | <= | >= |
---|---|---|---|---|---|---|---|---|---|---|
Data |
Data |
Data |
Data |
Data |
bool |
bool |
bool |
bool |
bool |
|
Const |
Data |
Data |
Data |
Data |
bool |
bool |
bool |
bool |
bool |
|
int |
Data |
Data |
Data |
Data |
Data |
bool |
bool |
bool |
bool |
bool |
float |
Data |
Data |
Data |
Data |
Data |
bool |
bool |
bool |
bool |
bool |
USAGE:
>>> a = Data("123.456", "1.234", power=-2, sign="m/s")
>>> a
(123.456±1.234)*10^-2 m/s
>>> b = Data("4", "1.00000001", sign="s", n=2)
>>> b
(4.0±1.0) s
>>> c = a * b
>>> c
(4.9±1.2) m
Represents a natural constant (through SI definition some may have no error, eg. the speed of light), or a measurement with neglected uncertainty.
Const(value: Union[int, float], sign: Union[str, Unit])
value
= value of the constantsign
= either string orUnit
. Since only one string is given the usage differs a bit from the one inUnit
. The extended sign-EBNF reads:The string will be split at '/' intoS := '"' units '"' | '"' units '/' units '"' units := unit | unit ';' units unit := string | string '^' integer
numerator
anddenominator
.
SPECIAL OPERATIONS:
types | + | - | * | / | ** | > | < | == | <= | >= |
---|---|---|---|---|---|---|---|---|---|---|
Data |
Data |
Data |
Data |
Data |
bool |
bool |
bool |
bool |
bool |
|
Const |
Const |
Const |
Const |
Const |
bool |
bool |
bool |
bool |
bool |
|
int |
float |
float |
Const |
Const |
Const |
bool |
bool |
bool |
bool |
bool |
float |
float |
float |
Const |
Const |
Const |
bool |
bool |
bool |
bool |
bool |
The addition and subtraction with ints, floats is only possible if the Const
has a Unit
equal to Unit()
USAGE:
>>> a = Const(1, "m")
>>> a
1.0 m
>>> a + 1
ArithmeticError: Addition of values with different units is not possible
>>> b = Const(1, "")
>>> b
1.0
>>> b + 1
2.0
>>> type(b + 1)
<class 'float'>
Assistance functionality for working with Data.
Returns the number of significant digits from a given error.
digits(error: Union[str, int, float]) -> int
error
= the error the significant digits should be determined for
The result is the number of significant digits.
USAGE:
>>> digits(1.23)
3
>>> digits("0.0012")
2
>>> digits(99.25)
4
Shortens the value
and error
of a Data
to its length Data.n
. Furthermore, it calculates which power should be used in the representation.
round_data(data: Data)
data
= a instance of typeData
to round values to correct length
Usage:
It is used inside Data
automatically. There should not be any need to use this.
Used for more complex functions and connections of uncertain data as well as a better
uncertainty analysis.
Formula(formula_string: str)
formula_string
=str
the given formula. This can be any validsympy
expression.
METHODS:
-
latex(type_dict: Dict[str: type]) -> str
type_dict
= Dictionary specifying which type each variable in theFormula
has
Returns a string of Latex-code of the givenFormula
-
show_error(type_dict: Dict[str: type], unicode=False) -> str
type_dict
= Dictionary specifying which type each variable in theFormula
hasunicode
= Specifies if unicode should be used Returns a string of the formula and the propagation formula for the uncertainty
-
calc_unit(value_dict: Dict[Data, Const, float, int]) -> Unit
value_dict
= Dictionary assign a python object to each variable in theFormula
Returns aUnit
for the result. If theUnit
could not be determined shows awarning
-
calc(value_dict: Dict[Data, Const, float, int]) -> Unit
value_dict
= Dictionary assign a python object to each variable in theFormula
Returns the result of the calculation. If theUnit
could not be determined shows awarning
USAGE:
>>> f = Formula("x**2 + exp(y)")
>>> f.latex({'x': Data, 'y': Data})
('x^{2} + e^{y}', '\sqrt{4 \left(__delta^{x}\right)^{2} x^{2} + \left(__delta^{y}\right)^{2} e^{2 y}}')
>>> a = Data("2.12", "0.05")
>>> b = Data("1.23", "0.02")
>>> f.calc({'x': a, 'y': b})
(79±2)*10^-1
>>>
>>> f2 = Formula("x**2 + 2 * y")
>>> x = Data("1.3", "0.3", sign="m")
>>> y = Data("0.1", "0.22", sign="m^2")
>>> f2.calc({'x': x, 'y': y})
(19±9)*10^-1 m^2
>>>
>>> f = Formula("acos((2*x-1) / sin(y))")
>>> f
acos((2*x-1) / sin(y))
>>> f.show_error({"x": Data, "y": Data})
Due to problems with markdown is a screenshot of the result given bellow:
Holds classes for different types of regressions.
For implementing new Regressions an abstract base class Regression
is given.
Abstract base class for regressions. All regressions have to inherit from Regression
.
Regression(tab: Table, data_dict: Dict[str, int], n_o_f_p: int)
tab
= Table holding data for regressiondata_dict
= Dictionary specifying what to use for the regression. The dictionary must specify columns for 'x' and 'y':{'x': int, 'y' : int}
n_o_f_p
= number of free parameters. Specifies the number of parameters which will be determined within the regression
ABSTRACT-METHODS:
-
__str__() -> str
- Gives a string representation of thr regression parameters.
-
calc(x) -> Union[float, int, Data, Const]
- calculates the theoretical expected value at
x
using the regression results
- calculates the theoretical expected value at
-
residues()
- creates a plot of the residues and shows it.
Simple regression ignoring uncertainties of x
and y
.
SimpleRegression(table: Table, data_dict: Dict[str], n=2)
tab
= Table holding data for regressiondata_dict
= Dictionary specifying what to use for the regression. The dictionary must specify columns for 'x' and 'y':{'x': int, 'y' : int}
n
= number of significant digits for the parameters
The information of the regression include the y-intercept
and slope
.
They can be accessed through __str__()
.
USAGE:
>>> b = (2 * random.random() - 1) * 100
>>> a = random.random() * 100
>>> b, a
(-35.60688387713966, 39.481128050640734)
>>> f = lambda x: b * x + a
>>> xs = list(range(10))
>>> ys = [f(x) + random.gauss(0, 10) for x in xs]
>>> tab = Table(column_names=["x", "y"])
>>> for x, y in zip(xs, ys):
tab.add((x, y))
>>> r = SimpleRegression(tab, {"x": 0, "y": 1})
>>> r
Regression:
y = a + b * x
a: (38.4±4.4)
b: (-345.9±8.3)*10^-1
>>> r.residues() # shown bellow
>>> r.plot() # shown bellow
residues | plot |
---|---|
Simple regression ignoring uncertainties of x
and working with uncertainties of y
.
GaussRegression(table: Table, data_dict: Dict[str], n=2)
tab
= Table holding data for regressiondata_dict
= Dictionary specifying what to use for the regression. The dictionary must specify columns for 'x' and 'y':{'x': int, 'y' : int}
n
= number of significant digits for the parameters
The information of the regression include the y-intercept
, slope
, reduced chi2
, chi2
and a probability
for
such a chi2
.
They can be accessed through __str__()
.
USAGE:
>>> b = (2 * random.random() - 1) * 100
>>> a = random.random() * 100
>>> b, a
(-57.571015317118835, 60.44085386096929)
>>> f = lambda x: b * x + a
>>> xs = [Const(str(x), sign="s") for x in range(15)]
>>> ys = [Data(str(f(x.value) + random.gauss(0, 20)), "20", n=2, sign="m") for x in xs]
>>> tab = Table(column_names=["x", "y"], signs=["s", "m"])
>>> for x, y in zip(xs, ys):
tab.add((x, y))
>>> r = GaussRegression(tab, {"x": 0, "y": 1})
>>> r
Regression:
y = a + b * x
a: (67.4±9.8) m
b: (-57.7±1.2) m/s
Chi2 : 10.127125000000001, Probability: 0.684, Chi2 red.: 0.7790096153846154
>>> r.residues() # shown bellow
>>> r.plot() # shown bellow
residues | plot |
---|---|
!!!Attention: Unfortunatly, does this class has a bug in calculating the error with the .calc()-Method!!!
Simple regression ignoring uncertainties of x
and working with uncertainties of y
.
CovRegression(formula_string: str, table: Table, data_dict: Dict[str, int], pars: List[str], n=2)
-
formula_string
= A string specifying a formula which should be fitted.
The format of a formula is specified as given by the formula-EBNF.S := 'y = ' exprs exprs := expr | expr '+' exprs | expr '-' exprs expr := para '*' func | para para := char ?has to be unique? func := ?mathematical function of x. Has to be a valid sympy expression?
The
?
start and end EBNF-extensions. -
table
= Table holding data for regression -
data_dict
= Dictionary specifying what to use for the regression. The dictionary must specify columns for 'x' and 'y':{'x': int, 'y' : int}
-
pars
= List of the used parameters -
n
= number of significant digits for the parameters
The information of the regression include the parameters
, reduced chi2
, chi2
and a probability
for
such a chi2
.
They can be accessed through __str__()
.
USAGE:
>>> data = [
-0.849, -0.738, -0.537, -0.354, -0.196,
-0.019, 0.262, 0.413, 0.734, 0.882,
1.258, 1.305, 1.541, 1.768, 1.935,
2.147, 2.456, 2.676, 2.994, 3.200,
3.318]
>>> datas = [(Const(i, sign="°C"), Data(str(data[i // 5]), "0.05", sign="mV")) for i in range(0, 105, 5)]
>>> tab = Table(columns=2, column_names=["x", "y"], signs=["°C", "mV"])
>>> for data in datas:
tab.add(data)
>>> r = CovRegression("y = a + b*x + c*x**2", tab, {"x": 0, "y": 1}, ["a", "b", "c"])
Regression:
y = a + b*x + c*x**2
a: -0.919926595143981
b: 0.0376650836577612
c: 0.0000551933192665457
Chi2 : 26.32205952, Probability: 0.093, Chi2 red.: 1.46233664
residues | plot |
---|---|
Calculates the reduced chi2_red
, the chi2
, and the probability
for a Regression
.
Chi2(reg: Regression, chi2=True)
reg
= Regression to calculatechi2
and residues forchi2
= specifies ifchi2
should be calculated or just the residues
METHODS:
show_residues()
- creates and shows a plot of the residues
USAGE:
This class is automatically used within a Regression
.
It should not be necessary to use it manually.
Class for storing many relevant data of different types. It can perform calculations columnwise and simplifies the
handling of the calculations.
Table(column_names: List[str] = [], columns=2, signs: List[Union[str, Unit]] = [])
column_names
= List of strings representing the names for the columncolumns
= Number of columns for the tablesigns
= List of units defined as given by the sign-EBNF:
S := '"' units '"' | '"' units '/' units '"'
units := unit | unit ';' units
unit := string | string '^' integer
METHODS:
-
calc(formula: Formula, para_dict: Dict[str: Union[int, str, Any]], column_name="", sign=True)
Uses the given formula on each row of the table to calculate a new column.formula
= Formula to use for calculationpara_dict
= Dictionary specifying how the parameters are set. How the parameter will be treated depends on the type of the value.int
: The value will be used as column index, and the parameter will be set row-wisestr
: The value will be used as value usingfloat(value)
Any
: All other types will be used as they are
-
add(data_tuple: Tuple[Any])
Adds a new data-row to theTable
. All columns have to be set. Use an empty string if you wish to fill it later.data_tuple
= Tuple of data to add in. If aData
orConst
is inserted it must have the sameUnit
as the column orUnit()
-
delete(line_index: int)
Deletes a data-row in theTable
.line_index
= The line to delete.
-
drop(line_index: int) -> List[Any]
Deletes a data-row in theTable
and returns it.line_index
= The line to delete.
-
add_column(name="")
Adds a new column to theTable
.name
= Name of the new column. If not set the name will be 'Column <i>' with i is the number of columns.
-
insert(line: int, column: int, value: Any)
Inserts a value atline
,column
.line
= The line the value has to be insertedcolumn
= The column the value has to be insertedvalue
= The value to insert into theTable
. If the value contains aUnit
it will be replaced by the one corresponding to the column.
-
export(path: str, name: str, replace_dot=False)
Exports theTable
in '.csv'-format.path
= path were the table should be savedname
= name of the file. Should end with '.csv'replace_dot
= Specifies if a '.' or a ',' is used to separate the decimals
-
max() -> List[Any]
Returns the max of each column.
The type of each element in the list depend on the types in the column. -
min() -> List[Any]
Returns the min of each column.
The type of each element in the list depend on the types in the column. -
arithmetic_average() -> List[Data]
Returns the arithmetic average of all columns (mean + std).
The type of each element in the list depend on the types in the column.
If the type of the column isData
the average will be a weighted mean with its error.
Otherwise the uncertainty will be the sample standard deviation. -
geometric_average() -> List[Union[Unit, float]]
Returns the geometric average of all columns (mean only).
The type of each element depend on theUnit
of the column an will either be afloat
orConst
. -
harmonic_average() -> List[Union[Unit, float]]
Returns the harmonic average of all columns (mean only).
The type of each element depend on theUnit
of the column an will either be afloat
orConst
. -
median() -> List[Union[Unit, float]]
Returns the median of all columns (mean only).
The type of each element depend on theUnit
of the column an will either be afloat
orConst
. -
modus() -> List[Union[Unit, float]]
Returns the modus of all columns (mean only).
The type of each element depend on theUnit
of the column an will either be afloat
orConst
.
USAGE:
>>> data = tuple(zip([1, 1, 3, 2], [random.random() for i in range(4)]))
>>> tab = Table(column_names=["x", "y"], columns=2)
>>> for element in data:
tab.add(element)
>>> tab
x |y |
------------------------------
1 |0.9650392731516132 |
1 |0.42768822918919414 |
3 |0.0979070782420336 |
2 |0.12872070585370476 |
>>> tab.add_column(sign="m/s")
True
>>> tab
x |y |Column 2 [m/s] |
-------------------------------------------------
1 |0.9650392731516132 | |
1 |0.42768822918919414 | |
3 |0.0979070782420336 | |
2 |0.12872070585370476 | |
>>> data = [Data(str(random.randint(1, 20)), str(random.random()), n=2) for i in range(4)]
>>> for index, data_ in enumerate(data):
tab.insert(index, 2, data_)
>>> tab
x |y |Column 2 [m/s] |
-----------------------------------------------------
1 |0.9650392731516132 |(170.0±9.4)*10^-2 |
1 |0.42768822918919414 |(1300.0±3.8)*10^-4 |
3 |0.0979070782420336 |(60.0±6.5)*10^-2 |
2 |0.12872070585370476 |(40.0±3.1)*10^-2 |
>>> f = Formula("a * c")
>>> tab.calc(f, {"a": 1, "c": Const(1, "s")}, "time")
>>> tab
x |y |Column 2 [m/s] |time [s] |
-----------------------------------------------------------------------------
1 |0.9650392731516132 |(170.0±9.4)*10^-2 |0.9650392731516132 |
1 |0.42768822918919414 |(1300.0±3.8)*10^-4 |0.42768822918919414 |
3 |0.0979070782420336 |(60.0±6.5)*10^-2 |0.0979070782420336 |
2 |0.12872070585370476 |(40.0±3.1)*10^-2 |0.12872070585370476 |
>>> tab.calc(f, {"a": 2, "c": 3}, "distance")
>>> tab
x |y |Column 2 [m/s] |time [s] |distance [m] |
---------------------------------------------------------------------------------------------------
1 |0.9650392731516132 |(170.0±9.4)*10^-2 |0.9650392731516132 |(164.1±9.1)*10^-2 |
1 |0.42768822918919414 |(1300.0±3.8)*10^-4 |0.42768822918919414 |(556.0±1.6)*10^-4 |
3 |0.0979070782420336 |(60.0±6.5)*10^-2 |0.0979070782420336 |(58.7±6.4)*10^-3 |
2 |0.12872070585370476 |(40.0±3.1)*10^-2 |0.12872070585370476 |(51.5±4.0)*10^-3 |
>>> tab.arithmetic_average()
[(17.5±4.8)*10^-1, (4.0±2.0)*10^-1, (1300.8±3.8)*10^-4 m/s, (4.0±2.0)*10^-1 s, (556.0±1.6)*10^-4 m]
>>> tab.export(<path>, "dummy.csv") # <path> needs to be a real path
The last order produces a file called 'dummy.csv'. This file is given bellow:
-
make to python package
-
Tests for chi_2
-
Tests for regression
-
Improve all Tests (they no longer test all functionalities)
-
Data should format in such a way, that the mean value is given by 1,... not the error
-
Refractor Table (access via column_name, access via tab[index], smarter design)
-
Allow fast access tables for weighted average between data or similar functionality
-
Add confindece band to plots
-
Add automatic x/y label to plot (for SimpleRegression already done)
-
update the images
- Covariant Matrix has problem with error (comparison with presentaion result shows a missing 2* in the total error)
- Data ** 0.5 does not work for the units