Introduction to Data Science ¶

1MS041, 2023¶

Standard error simulations¶

Balance scale problem¶

With the help of a balance scale we have measured the weights of two items with true weights $m_1,m_2$. We have measured $m_1,m_2,m_1-m_2,m_1+m_2$. Whenever we measure we make a measurement error with standard deviation $\sigma$. $X_1,X_2,X_3,X_4$ are the four measurements

We can use $X_1$ as an estimate for $m_1$
We could also use $(X_1-X_2+X_4)/2$ as an estimate for $m_1$
To estimate $m_1+m_2$ we could use $(X_3+X_4)/2$.
To estimate $m_1+m_2$ we could also use $(X_1+X_2+X_3)/2$

import numpy as np
def experiment():
    m1 = 25
    m2 = 16

    err = np.random.normal(size=4)
    X1 = m1+err[0]
    X2 = m2+err[1]
    X3 = m1-m2+err[2]
    X4 = m1+m2+err[3]

    theta1 = X1
    theta2 = (X1-X2+X4)/2
    theta3 = (X3 + X4)/2
    theta4 = (X1 + X2 + X3)/2

    return (theta1,theta2,theta3,theta4)

experiments = np.array([experiment() for i in range(10000)])

experiments.shape

(10000, 4)

import matplotlib.pyplot as plt
for i in range(experiments.shape[1]):
    plt.figure(figsize=(3,3))
    _=plt.hist(experiments[:,i],bins=20)
    plt.xlim(22,28)
    plt.show()

np.var(experiments,axis=0)

array([1.00503568, 0.73489716, 0.50705038, 0.75441519])

Reliability of parallel systems¶

A reliability system consists of two parallell circuits which break independently of each other and has the probabilities $p_1$ and $p_2$ to break during a week. The weekwise probability that the system breaks is thus $p_1p_2$. We now test $n$ such systems and find that the first circuit breaks $x_1$ times and the second system breaks $x_2$ times and that the full system has broken down $x$ times. Two estimators for $p_1p_2$ has been proposed. \begin{align*} p^\ast &:= \frac{x}{n} \\ \hat p &:= \frac{x_1}{n}\frac{x_2}{n} \end{align*} Show that they are unbiased and compute the corresponding variances.

import numpy as np
def experiment(n=100):
    p1 = 0.1
    p2 = 0.3

    X1 = np.random.binomial(1,p1,size=n)
    X2 = np.random.binomial(1,p2,size=n)

    past = np.mean(X1*X2)
    phat = np.mean(X1)*np.mean(X2)
    return (past,phat)

experiments = np.array([experiment() for i in range(1000)])

true_value = 0.1*0.3
true_value

0.03

np.mean(experiments,axis=0)

array([0.02971  , 0.0297426])

np.std(experiments,axis=0)

array([0.01732674, 0.01040136])

Introduction to Data Science¶

1MS041, 2023¶

Standard error simulations¶

Balance scale problem¶

Reliability of parallel systems¶

Introduction to Data Science ¶