04.15.08

Assignment 2

Posted in Math, Uncategorized at 4:23 pm by missreid

Shelly K Bernard
MTH332-01
Assignment 2

Definitions:

x: a data entry
n: total number of data entries
μ and \overline{x}: mean of x
σ and s: standard deviation of x


Equations:

Mean:

In mathematics and statistics, the (arithmetic) mean of a list of numbers is the sum of all the members of the list divided by the number of items in the list. The term “mean” or “arithmetic mean” is preferred to “average” because it distinguishes this quantity from other averages such as the median and the mode. The sample mean is typically denoted by a horizontal bar over the variable, \overline{x}, pronounced “x bar”.

\mu = \overline{x} = \frac{1} {n}\sum_{i=1}^{n}x_i = \frac{x_1+x_2+...+x_n}{n}

Median:

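In probability theory and statistics, the median is the number separating the higher half of a sample, a population, or a probability distribution from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one. If the number of observations is even, there is no single middle value, and the median is usually taken to be the mean of the two middle values. Writing x_{(k)} for the k-th smallest observation:

median = \begin{cases} x_{\left(\frac{n+1}{2}\right)} & \text{if } n \text{ is odd} \\ \frac{x_{(n/2)} + x_{(n/2+1)}}{2} & \text{if } n \text{ is even} \end{cases}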
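
As a minimal sketch of the two definitions above, the following R code computes the mean and the median from first principles. The names mean_by_hand, median_by_hand, and v are illustrative and not part of the assignment. The median helper averages the two middle values when the number of entries is even, which is the convention R's built-in median() also follows.

```r
# Illustrative helpers (hypothetical names); base R only.
mean_by_hand <- function(x) {
  sum(x) / length(x)            # sum of the entries divided by their count
}

median_by_hand <- function(x) {
  s <- sort(x)                  # arrange from lowest to highest
  n <- length(s)
  if (n %% 2 == 1) {
    s[(n + 1) / 2]              # odd n: the single middle value
  } else {
    (s[n / 2] + s[n / 2 + 1]) / 2   # even n: mean of the two middle values
  }
}

v <- c(7, 1, 3, 10, 4)          # made-up example data
mean_by_hand(v)                 # 5
median_by_hand(v)               # 4
median_by_hand(c(4, 1, 3, 2))   # 2.5
```

Both helpers should agree with mean(v) and median(v) on any numeric vector without missing values.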

Skewness:

In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable. With positive skew, the right tail is longer and the mass of the distribution is concentrated on the left of the figure; the distribution is said to be right-skewed. With negative skew, the left tail is longer and the mass of the distribution is concentrated on the right of the figure; the distribution is said to be left-skewed.

\gamma = \frac{\mu_3}{\sigma^3} = \frac{\sum_{i=1}^{n}(x_i-\overline{x})^3}{(n-1)s^3}
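
The sample form of the skewness formula can be sketched directly in base R. The function name skew_by_hand and the two test vectors are illustrative. Packaged skewness functions may use a slightly different normalisation, so their output can differ a little from this version.

```r
# Sample skewness: sum of cubed deviations divided by (n - 1) * s^3.
# skew_by_hand is a hypothetical name; sd() is the usual (n - 1) standard deviation.
skew_by_hand <- function(x) {
  n <- length(x)
  sum((x - mean(x))^3) / ((n - 1) * sd(x)^3)
}

sym   <- c(1, 2, 3, 4, 5)    # symmetric about 3
right <- c(1, 1, 1, 1, 10)   # one large value: long right tail
left  <- -right              # mirror image: long left tail

skew_by_hand(sym)    # 0, no asymmetry
skew_by_hand(right)  # positive, right-skewed
skew_by_hand(left)   # negative, left-skewed
```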


Kurtosis:

In probability theory and statistics, kurtosis (from the Greek word kyrtos or kurtos, meaning bulging) is a measure of the “peakedness” of the probability distribution of a random variable and of how outlier-prone a distribution is. Higher kurtosis means more of the variance is due to infrequent extreme deviations, as opposed to frequent modestly sized deviations. The kurtosis of the normal distribution is 3. Distributions that are more outlier-prone than the normal distribution have kurtosis greater than 3; distributions that are less outlier-prone have kurtosis less than 3.

kurtosis = \frac{\mu_4}{\sigma^4} = \frac{\sum_{i=1}^{n}(x_i-\overline{x})^4}{(n-1)s^4}
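
The same approach gives a rough sketch of the sample kurtosis. The name kurt_by_hand, the seed, and the sample size are arbitrary choices for illustration. The simulated values vary with the seed, so the comments only indicate what to expect approximately.

```r
# Sample kurtosis: sum of fourth-power deviations divided by (n - 1) * s^4.
# kurt_by_hand is a hypothetical name, not a function from any package.
kurt_by_hand <- function(x) {
  n <- length(x)
  sum((x - mean(x))^4) / ((n - 1) * sd(x)^4)
}

set.seed(1)                        # arbitrary seed so the run is repeatable
k_norm <- kurt_by_hand(rnorm(10000))   # normal sample: expected to be near 3
k_unif <- kurt_by_hand(runif(10000))   # uniform sample: flatter, expected below 3
k_norm
k_unif
```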


Uniform Distribution (continuous):

In probability theory and statistics, the continuous uniform distribution is a family of probability distributions such that for each member of the family, all intervals of the same length on the distribution’s support are equally probable. The support is defined by the two parameters, a and b, which are its minimum and maximum values. The distribution is often abbreviated X~U(a,b).

A continuous random variable X has this distribution if its probability density function is given by:

f(x) = \frac{1}{b-a}

for a \leq x \leq b, and f(x) = 0 otherwise. For example, with a = 0 and b = 1, f(x) has a value of 1 on the interval [0,1].
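
As a quick check of the density, R's built-in dunif() can be evaluated inside and outside the support. The endpoints a = 2 and b = 6 are made-up values chosen only so that 1/(b-a) is easy to read.

```r
a <- 2
b <- 6                                   # illustrative endpoints, so 1 / (b - a) = 0.25

inside  <- dunif(3, min = a, max = b)    # 0.25, a point inside [a, b]
outside <- dunif(7, min = a, max = b)    # 0, a point outside [a, b]

# The density should integrate to 1 over its support.
area <- integrate(dunif, lower = a, upper = b, min = a, max = b)$value

inside
outside
area
```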


Normal Distribution:

The normal distribution, also called the Gaussian distribution, is an important family of continuous probability distributions, applicable in many fields. Each member of the family may be defined by two parameters, location and scale: the mean \mu and the variance \sigma^2 (the standard deviation squared), respectively. The importance of the normal distribution as a model of quantitative phenomena in the natural and behavioral sciences is due to the central limit theorem.

The normal distribution also arises in many areas of statistics. For example, for sufficiently large samples the sampling distribution of the sample mean is approximately normal, even if the distribution of the population from which the sample is taken is not normal (provided the population variance is finite). In addition, the normal distribution maximizes information entropy among all distributions with known mean and variance, which makes it the natural choice of underlying distribution for data summarized in terms of sample mean and variance. The normal distribution is the most widely used family of distributions in statistics, and many statistical tests are based on the assumption of normality.

In probability theory, normal distributions arise as the limiting distributions of several continuous and discrete families of distributions. This distribution is often abbreviated X~N(\mu, \sigma^2). The probability density function of the general normal distribution, followed by the case with mean 1 and standard deviation 2 (variance 4), is given by:

\varphi_{\mu,\sigma^2}(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{(x-\mu)^2}{2 \sigma^2}}

\varphi_{1,4}(x) = \frac{1}{2 \sqrt{2 \pi}} e^{-\frac{(x-1)^2}{8}}

for x \in \mathbb{R}.
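
The density with mean 1 and standard deviation 2 can be checked against R's built-in dnorm(). The function name normal_pdf and the evaluation points xs are illustrative. Note that dnorm() takes the standard deviation (sd = 2), whereas the N(\mu, \sigma^2) notation lists the variance (4).

```r
# Normal density written out from the formula; normal_pdf is a hypothetical name.
normal_pdf <- function(x, mu, sigma) {
  exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
}

xs <- c(-3, 0, 1, 2.5)                     # arbitrary evaluation points
by_formula <- normal_pdf(xs, mu = 1, sigma = 2)
built_in   <- dnorm(xs, mean = 1, sd = 2)  # sd, not variance

all.equal(by_formula, built_in)            # TRUE
normal_pdf(1, mu = 1, sigma = 2)           # peak height, 1 / (2 * sqrt(2 * pi))
```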

 

 PROBLEM 1:
Using a moderately sized data set, use R to complete the following tasks.

Data entered into R:

Rate of marriages per 1,000 total population residing in the United States in 2004.

x<-c(9.4, 8.5, 6.6, 13.4, 4.8, 7.4, 5.8, 6.1, 4.5, 9.0, 7.8, 22.8, 10.8, 6.1, 7.8, 6.9, 7.0, 8.8, 8.0, 8.5, 6.9, 6.5, 6.1, 6.0, 6.1, 7.1, 7.5, 7.1, 62.4, 8.0, 5.8, 7.4, 6.8, 7.3, 7.0, 6.6, 6.5, 8.1, 5.9, 7.6, 8.2, 8.4, 11.4, 7.9, 10.0, 9.4, 8.3, 6.5, 7.5, 6.2, 9.4)

sort(x)

Mean:

\mu = \overline{x} = \frac{1}{51} \sum_{i=1}^{51}x_i = \frac{x_1+x_2+...+x_{51}}{51}
mean(x)= 8.939216

Median:

median = x_{\left(\frac{51+1}{2}\right)} = x_{(26)} \quad \text{(the 26th value of the sorted data)}
median(x)= 7.4


Skewness:

\gamma = \frac{\mu_3}{\sigma^3} = \frac{\sum_{i=1}^{51}(x_i-\overline{x})^3}{(51-1)s^3}
skewness(x) = 5.91365


Kurtosis:

kurtosis = \frac{\mu_4}{\sigma^4} = \frac{\sum_{i=1}^{51}(x_i-\overline{x})^4}{(51-1)s^4}
kurtosis(x)= 39.08219
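
The large skewness and kurtosis are consistent with the single entry 62.4, which lies far from the rest of the data. As a rough sketch of how sensitive the mean and the median are to that one entry, the code below recomputes both with the largest value dropped. The name x_trim is illustrative, and dropping the value is only a demonstration, not part of the assignment.

```r
x <- c(9.4, 8.5, 6.6, 13.4, 4.8, 7.4, 5.8, 6.1, 4.5, 9.0, 7.8, 22.8, 10.8,
       6.1, 7.8, 6.9, 7.0, 8.8, 8.0, 8.5, 6.9, 6.5, 6.1, 6.0, 6.1, 7.1, 7.5,
       7.1, 62.4, 8.0, 5.8, 7.4, 6.8, 7.3, 7.0, 6.6, 6.5, 8.1, 5.9, 7.6, 8.2,
       8.4, 11.4, 7.9, 10.0, 9.4, 8.3, 6.5, 7.5, 6.2, 9.4)

x_trim <- x[x != max(x)]   # drop the single largest entry (62.4)

length(x_trim)   # 50
mean(x)          # 8.939216
mean(x_trim)     # 7.87, the mean moves by more than one unit
median(x)        # 7.4
median(x_trim)   # 7.4, the median does not move
```

The median is unchanged because it depends only on the middle of the sorted data, while the mean, and even more so the third and fourth powers used in skewness and kurtosis, are pulled by the extreme value.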


Histogram:

See Histogram




PROBLEM 2:
Use R to generate 1,000 uniformly distributed real numbers between 0 and 1, using the runif command.

x=runif(1000, 0, 1)

Mean:

\mu = \overline{x} = \frac{1}{1000} \sum_{i=1}^{1000} x_i = \frac{x_1+x_2+...+x_{1000}}{1000}
mean(x) = 0.5069946


Median:

median = \frac{x_{(500)} + x_{(501)}}{2} \quad \text{(mean of the two middle values of the sorted data)}
median(x) = 0.4988877

Skewness:

\gamma = \frac{\mu_3}{\sigma^3} = \frac{\sum_{i=1}^{1000}(x_i-\overline{x})^3}{(1000-1)s^3}
skewness(x) = -0.01528395

Kurtosis:

kurtosis = \frac{\mu_4}{\sigma^4} = \frac{\sum_{i=1}^{1000}(x_i-\overline{x})^4}{(1000-1)s^4}
kurtosis(x) = 1.867278
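
For comparison, the theoretical values for the continuous uniform distribution can be worked out from its density f(x) = 1/(b-a). This is the standard derivation rather than R output, with \mu_2 and \mu_4 denoting the second and fourth central moments and a = 0, b = 1:

\mu = \frac{a+b}{2} = 0.5, \qquad \sigma^2 = \mu_2 = \frac{(b-a)^2}{12}, \qquad \mu_4 = \frac{(b-a)^4}{80}

kurtosis = \frac{\mu_4}{\sigma^4} = \frac{(b-a)^4/80}{(b-a)^4/144} = \frac{144}{80} = 1.8

By symmetry the median is also 0.5 and the skewness is 0, so the sample values above are all close to their theoretical counterparts.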

Histogram:

See histogram
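
A histogram like the one referenced here could be produced along the following lines. This is only a sketch: the seed, the bin width of 0.1, and the plot labels are assumptions, and a fresh runif() draw will not reproduce the exact numbers reported above.

```r
set.seed(332)                     # arbitrary seed so the run is repeatable
x <- runif(1000, 0, 1)

# Ten equal-width bins on [0, 1]; plot = FALSE returns the counts without drawing.
h <- hist(x, breaks = seq(0, 1, by = 0.1), plot = FALSE)
h$counts                          # each bin should hold roughly 100 values

# Draw the histogram from the stored object.
plot(h, main = "1,000 draws from U(0, 1)", xlab = "x")
```

For a uniform sample the bars should be of roughly equal height, in contrast to the bell shape expected from rnorm().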




PROBLEM 3:

Use R to generate 1,000 normally distributed real numbers with mean 1 and standard deviation 2, using the rnorm command.

x=rnorm(1000, 1, 2)

Mean:

\mu = \overline{x} = \frac{1}{1000} \sum_{i=1}^{1000}x_i = \frac{x_1+x_2+...+x_{1000}}{1000}
mean(x) = 0.9273835
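
Whether a sample mean of 0.9273835 is plausible for a true mean of 1 can be judged from the sampling distribution of the sample mean, whose standard deviation (the standard error) is \sigma / \sqrt{n}. The short calculation below is a sketch of that check; the variable names are illustrative.

```r
n     <- 1000
mu    <- 1
sigma <- 2
xbar  <- 0.9273835          # sample mean reported above

se <- sigma / sqrt(n)       # standard error of the mean, about 0.063
z  <- (xbar - mu) / se      # distance from mu in standard errors, about -1.15

se
z
```

A deviation of a little over one standard error is well within ordinary sampling variation.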

Median:

median = \frac{x_{(500)} + x_{(501)}}{2} \quad \text{(mean of the two middle values of the sorted data)}
median(x) = 0.921323


Skewness:

\gamma = \frac{\mu_3}{\sigma^3} = \frac{\sum_{i=1}^{1000}(x_i-\overline{x})^3}{(1000-1)s^3}
skewness(x) = 0.0870111


Kurtosis:

kurtosis = \frac{\mu_4}{\sigma^4} = \frac{\sum_{i=1}^{1000}(x_i-\overline{x})^4}{(1000-1)s^4}
kurtosis(x) = 2.720586


Histogram:

See histogram