# Binomial, Normal Distribution, Matrices for Data Science

Building on the Foundation: Binomial & Normal Distribution, CRISP-DM, ANOVA, Matrices, Coordinate Geometry, Calculus.

Building on the Foundation:

In this course we continue to build your foundation in Data Science. In our Part 2 course you learned Probability, Descriptive Statistics, Data Visualization, Histogram, Boxplot & Scatter plot, Covariance & Correlation. In Part 3 we will help you learn Binomial & Normal Distribution, Tests of Hypothesis (TOH), CRISP-DM, ANOVA, Matrices, Coordinate Geometry & Calculus.

You will learn the following concepts with examples in this course:

**Normal distribution** describes continuous data which have a symmetric distribution, with a characteristic ‘bell’ shape.

**Binomial distribution** describes the distribution of binary data from a finite sample. Thus it gives the probability of getting r events out of n trials.
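As a small illustration, the probability of exactly r events in n independent trials, each with success probability p, is C(n, r) · p^r · (1 − p)^(n − r). The sketch below assumes a fair coin as the example; the function name `binomial_pmf` is ours, chosen for illustration.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r successes in n independent trials,
    each with success probability p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

# Hypothetical example: 10 flips of a fair coin, exactly 3 heads.
prob_three_heads = binomial_pmf(3, 10, 0.5)
print(round(prob_three_heads, 4))  # 0.1172
```

Summing `binomial_pmf(r, n, p)` over every r from 0 to n gives 1, which is a quick sanity check when you write your own version.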

**Z-distribution** (the standard normal distribution, with mean 0 and standard deviation 1) is used to help find probabilities and percentiles for regular normal distributions (X). It serves as the standard by which all other normal distributions are measured.
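To make this concrete, here is a minimal sketch of converting a value x from a normal distribution into a z-score, z = (x − mean) / standard deviation, and then reading its percentile from the standard normal curve. The exam-score numbers are made up for illustration, and the helper `z_score` is our own name.

```python
from statistics import NormalDist

def z_score(x: float, mean: float, std_dev: float) -> float:
    """Number of standard deviations that x lies above the mean."""
    return (x - mean) / std_dev

# The Z-distribution: a normal distribution with mean 0 and std dev 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Hypothetical example: exam scores with mean 70 and std dev 10.
z = z_score(85, 70, 10)
percentile = standard_normal.cdf(z)  # share of scores at or below 85

print(z, round(percentile, 4))  # 1.5 0.9332
```

Asking the original distribution directly, `NormalDist(70, 10).cdf(85)`, gives the same answer, which is why one standard curve can serve for all normal distributions.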

**Central limit theorem** (**CLT**) establishes that, in some situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (informally a bell curve) even if the original variables themselves are not normally distributed.
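One way to see this is a small simulation. The sketch below assumes a fair six-sided die (whose outcomes are uniform, not bell-shaped) and averages 30 rolls at a time; the sample size, the number of repetitions and the seed are arbitrary choices made for illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the run is repeatable

def sample_mean_of_dice(n_rolls: int) -> float:
    """Average of n_rolls fair six-sided die rolls."""
    return sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls

n_rolls = 30
sample_means = [sample_mean_of_dice(n_rolls) for _ in range(5000)]

centre = mean(sample_means)
spread = stdev(sample_means)
# For a bell curve, roughly 68% of values lie within one std dev of the mean.
within_one_sd = sum(abs(m - centre) <= spread for m in sample_means) / len(sample_means)

print(f"mean of sample means:    {centre:.3f}")
print(f"std dev of sample means: {spread:.3f}")
print(f"share within one std dev: {within_one_sd:.3f}")
```

The averages should cluster around 3.5 with a spread near 1.71 / √30 ≈ 0.31, and about two thirds of them should fall within one standard deviation of the centre, as a normal distribution would predict.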

**Decision making:** You can calculate the probability that an event will happen by dividing the number of ways that the event can happen by the total number of possible outcomes, provided all outcomes are equally likely. Probability can help you to make better decisions, such as deciding whether or not to play a game where the outcome may not be immediately obvious.
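As a sketch of how this counting works in practice, the example below enumerates every outcome of rolling two fair dice and counts the ones that sum to 7. The game itself (win 4 on a seven, lose 1 otherwise) is hypothetical, and using the expected payoff to decide whether to play is our own illustration of the idea.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
favorable = [roll for roll in outcomes if sum(roll) == 7]

# Probability = ways the event can happen / total possibilities.
p_seven = Fraction(len(favorable), len(outcomes))

# Hypothetical game: net gain of 4 on a seven, net loss of 1 otherwise.
net_win, net_loss = 4, -1
expected_value = p_seven * net_win + (1 - p_seven) * net_loss

print(p_seven, expected_value)  # 1/6 -1/6
```

A negative expected value means that, on average, you lose money each time you play, so the numbers argue against playing this particular game.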

**CRISP-DM** is the Cross-Industry Standard Process for Data Mining. The CRISP-DM methodology provides a structured approach to planning a data mining project. It is a robust and well-proven methodology.

**Hypothesis testing** is a statistical procedure in which an analyst tests an assumption regarding a population parameter. Hypothesis testing is used to assess the plausibility of a hypothesis by using sample data. Such data may come from a larger population, or from a data-generating process.
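For example, one of the simplest tests is the one-sample z-test, sketched below. It assumes the population standard deviation is known and the sample mean is approximately normal; all the numbers (a claimed mean of 100, a sample of 36 with mean 104) are invented for illustration, and `one_sample_z_test` is our own helper name.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, null_mean, population_sd, n):
    """Two-sided z-test of the assumption 'population mean == null_mean'.

    Returns the z statistic and the p-value.
    """
    standard_error = population_sd / sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: claimed mean 100, known std dev 15, sample of 36 with mean 104.
z, p_value = one_sample_z_test(sample_mean=104, null_mean=100, population_sd=15, n=36)
print(round(z, 2), round(p_value, 4))  # 1.6 0.1096

# At a 5% significance level we would not reject the assumption.
print(p_value < 0.05)  # False
```

A p-value above the chosen significance level means the sample is reasonably consistent with the assumption; it does not prove the assumption true.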

**Analysis of variance** (**ANOVA**) is a collection of statistical models and their associated estimation procedures (such as the “variation” among and between groups) used to analyze the differences among group means in a sample. ANOVA was developed by statistician and evolutionary biologist Ronald Fisher.
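The core of a one-way ANOVA is the F statistic, which compares the variation between group means to the variation within groups. The following is a minimal sketch with three small made-up groups; `one_way_anova_f` is a name chosen for illustration, and turning F into a p-value (which needs the F-distribution) is left out.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA: variation between group means
    divided by variation within groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Variation of each group mean around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical measurements from three groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(groups)
print(round(f_stat, 2))  # 13.0
```

A large F value suggests that at least one group mean differs from the others by more than the within-group scatter would explain.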

**Basics** of Matrices, Coordinate Geometry, Calculus & Algebra

Through our **four-part series** we will take you **step by step**. This course is the **third part**, which will solidify your foundation.