In these worksheets, students will derive statistics from data sets.

#### Bias in Data Science

In data science, bias is a systematic error that makes collected data differ from what it is supposed to represent. At a fundamental level, it is a fault in the validity of what you are studying. These errors are often subtle and easy to overlook.

Training data is the central focus of predictive models, yet in practice that data often leaves out part of reality, and this gap is how bias forms. Systematically biased data compromises a model's accuracy and fidelity. Models built on biased data also lose credibility with stakeholders. The worst situation occurs when biased models actively discriminate against a specific group of people.

You need to be well aware of these perils, especially during data entry. With this awareness, a data scientist can find and correct many of these errors more easily. Less biased data leads to higher-quality models, which in turn improve analytics adoption and increase the value of analytics investment.

The major types of bias:

- Confirmation bias: favoring data or interpretations that confirm what you already believe.
- Selection bias: error that arises when data is selected subjectively, so the sample does not represent the whole population.
- Outliers: extreme values that can distort results.
- Overfitting and underfitting: an overfitted model follows its training data too closely, while an underfitted model gives an oversimplified picture of reality.
- Confounding variables: variables that affect both the explanatory variable and the dependent variable. This type is less well known than the others.

In these worksheets, students will study categorizing data and bias, normal distribution and standard deviation, organizing and interpreting data, percentiles and quartiles, frequency tables, and statistics. They will learn the concepts of qualitative and quantitative data, univariate and bivariate data, and biased and unbiased data, and will determine types of data. They will find the percentage of normally distributed data that lies within a specified number of standard deviations of the mean. They will organize and interpret data and find specified percentiles and quartiles. They will also learn to read and understand frequency tables. With some worksheets, extra paper will be required for students to construct frequency tables. For the statistics worksheets, students will draw line graphs and bar graphs. This set of worksheets contains step-by-step solutions to sample problems, both simple and more complex problems, a review, and a quiz. It also includes ample worksheets for students to practice independently. Worksheets are provided at both the basic and intermediate skill levels. When finished with this set of worksheets, students will be able to derive statistics from data sets.

# Categorizing Data and Bias Worksheets

## Lesson

This worksheet explains the difference between qualitative and quantitative data. A sample problem is solved, and two practice problems are provided.

## Categorizing Data and Bias Worksheet

For questions 1-5, determine if the problem deals with qualitative or quantitative data; 6-8, determine uni-variate or bi-variate data; 9-10, determine biased or unbiased data.

## Practice

Students will practice determining whether a problem deals with qualitative or quantitative data; uni-variate or bi-variate data; and biased or unbiased data. Ten problems are provided.

## Review and Practice

The concept of how to categorize data is reviewed. A sample problem is solved. Six practice problems are provided.

## Quiz

Students will classify the data described in each situation. For example: Ms. Claudia keeps a list of the amount of time her husband spends exercising in the gym.

## Check

Students will determine types of data that are present in each situation. Three problems are provided, and space is included for students to copy the correct answer when given.

## Lesson

This worksheet explains how to find the percentage of the normally distributed data that lies within a specified standard deviation of the mean. A sample problem is solved, and two practice problems are provided.

## Normal Distribution and Standard Deviation Worksheet

You will solve problems such as: A group of 200 boys has a mean age of 18.4 years with a standard deviation of 0.2 years. The ages are normally distributed. How many of the boys are younger than 18.2 years? Express your answer to the nearest whole number.
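One way to check an answer to a problem like this is with Python's standard `statistics.NormalDist`. The sketch below is only an illustration and not part of the worksheet; the variable names are invented here, and the numbers come from the sample problem.

```python
from statistics import NormalDist

# Values taken from the sample problem; the names are illustrative.
group_size = 200
mean_age = 18.4
sd_age = 0.2
cutoff = 18.2

ages = NormalDist(mu=mean_age, sigma=sd_age)

# cdf(x) is the proportion of the distribution that lies below x.
share_younger = ages.cdf(cutoff)
count_younger = round(group_size * share_younger)

print(f"Share younger than {cutoff}: {share_younger:.4f}")
print(f"Expected count: {count_younger}")
```

Because 18.2 is one standard deviation below the mean, the 68-95-99.7 rule gives roughly 16% of 200, which also rounds to 32.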

## Practice

You will work on problems like this: Battery lifetime is normally distributed for large samples. The mean lifetime is 450 days and the standard deviation is 50 days. To the nearest percent, what percent of batteries have lifetimes longer than 350 days? Ten exercises are provided for practice.
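For a "longer than" question, the share above a value is one minus the share below it. The following is a minimal sketch under that reading, using the numbers from the battery problem; the names `lifetimes`, `z`, and `percent_longer` are made up for illustration.

```python
from statistics import NormalDist

# Mean and standard deviation from the battery problem (in days).
lifetimes = NormalDist(mu=450, sigma=50)
cutoff = 350

# z-score: how many standard deviations the cutoff is from the mean.
z = (cutoff - 450) / 50

# Share of batteries lasting longer than the cutoff.
share_longer = 1 - lifetimes.cdf(cutoff)
percent_longer = round(100 * share_longer)

print(f"z = {z}, percent longer than {cutoff} days: {percent_longer}%")
```

The cutoff sits two standard deviations below the mean, so the 68-95-99.7 rule gives about 97.5%, close to the exact figure of roughly 97.7%.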

## Review and Practice

The concept of how to calculate the normal distribution of a data set and standard deviation is reviewed. A sample problem is solved. Six practice problems are provided.

## Quiz

You will approach problems such as: A group of 230 students has a mean age of 19.5 years with a standard deviation of 0.6 years. The ages are normally distributed. How many students are younger than 18.8 years? Express your answer to the nearest student.

## Skills Check

You will work on exercises like this: Jack's scores in English this semester were rather inconsistent: 60, 25, 38, 62, 72, 86, 34, 50. For this population, how many scores are within one standard deviation of the mean?
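Since the exercise treats the scores as a population, the population standard deviation (dividing by N, not N - 1) is the one to use. A short sketch of that calculation, with illustrative variable names, might look like this:

```python
from statistics import mean, pstdev

scores = [60, 25, 38, 62, 72, 86, 34, 50]

mu = mean(scores)       # population mean
sigma = pstdev(scores)  # population standard deviation (divides by N)

# Bounds of the interval one standard deviation either side of the mean.
low, high = mu - sigma, mu + sigma
within = [s for s in scores if low <= s <= high]

print(f"mean = {mu:.3f}, sd = {sigma:.3f}")
print(f"interval = [{low:.2f}, {high:.2f}]")
print(f"scores within one sd: {within} (count = {len(within)})")
```

With these numbers the mean is 53.375 and the standard deviation is about 19.23, so five scores fall inside the interval; 34 lands just below the lower bound.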

## Organizing and Interpreting Data Lesson

This worksheet explains how to organize and interpret data. A sample problem is solved, and two practice problems are provided.

## Worksheet

John spent \$300 as shown in the graph. What percentage of the total was spent on each of the following items?

## Practice

Students will practice organizing and interpreting data sets. You will get ten chances in all with this skill.

## Review and Practice Page 1

The grades of 10 students in a class are: 44, 55, 65, 46, 59, 35, 36, 79, 24, 68. Complete the chart showing the tally and frequency.
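A tally chart like this can be sketched in a few lines of Python. The worksheet's actual class intervals are not given here, so the width of 10 used below is an assumption, as are the variable names.

```python
from collections import Counter

grades = [44, 55, 65, 46, 59, 35, 36, 79, 24, 68]
width = 10  # assumed class width; the worksheet may use different intervals

# Map each grade to the lower bound of its interval, then count.
counts = Counter((g // width) * width for g in grades)

for lower in sorted(counts):
    freq = counts[lower]
    # One tally mark per grade, followed by the frequency.
    print(f"{lower}-{lower + width - 1}: {'|' * freq} ({freq})")
```

The frequencies should add up to 10, the number of students, which is a quick way to check a hand-drawn chart.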

## Review and Practice Page 2

Let's break down how Andy spent his money.

## Quiz

According to the box-and-whisker plot shown, what are the median, the first quartile, the third quartile, the minimum value, and the maximum value?

## Percentiles and Quartiles Lesson

This worksheet explains how to find the quartile that meets specified criteria. A sample problem is solved.

## Lesson and Practice

The snowfall amounts in a city (in mm) for a period of 15 days were: 3, 2, 6, 6, 7, 8, 9, 9, 11, 12, 14, 17, 18, 19, 21. What percent of the days have snowfall of at least 11 mm?
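A question of this kind comes down to counting the values that meet the condition and dividing by the total number of values. A minimal sketch, with names chosen only for illustration:

```python
snowfall_mm = [3, 2, 6, 6, 7, 8, 9, 9, 11, 12, 14, 17, 18, 19, 21]
threshold = 11

# Count the recorded values that are at least the threshold.
count_at_least = sum(1 for mm in snowfall_mm if mm >= threshold)

# Express that count as a percentage of all recorded values.
percent = 100 * count_at_least / len(snowfall_mm)

print(f"{count_at_least} of {len(snowfall_mm)} values, about {percent:.1f}%")
```

Seven of the 15 values are 11 mm or more, which is about 46.7%.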

## Worksheet

The average monthly temperatures (in °C) of a city over the year were: 26, 28, 30, 34, 35, 35, 38, 40, 42, 42, and 48. You will get ten chances in all with this skill.

## Review Page 1

Read the data set and write in the table the frequency of each number. A sample problem is solved.

## Review Page 2

In a test, scores obtained by 11 students were: 3, 3, 4, 5, 5, 6, 7, 8, 8, 9, and 9.

## Quiz

Students will demonstrate their proficiency with understanding percentiles and quartiles. Ten problems are provided.

## Check

Which interval contains the third (upper) quartile?

## Reading and Understanding Frequency Tables Basic Skills Worksheet

Students will use basic skills to read and understand frequency tables. Five problems are provided.

## Basic Skills Practice

Students will work on core fundamentals and practice reading and understanding frequency tables. Five problems are provided.

## Intermediate Skills Worksheet

Students will construct several frequency tables.

## Intermediate Skills Practice

You will get another go at these types of exercises.

## Intermediate Skills Drill

You will get more work under this skill.

## Lesson

Using the frequency table, find the mean of the given data set. Mean is the 'average' of the given data set. So all the numbers are added and divided by the number of elements in the data set. The formula may be expressed thus: Mean = Sum of Values / Number of Values.
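When the data comes as a frequency table, each value is counted as many times as its frequency, so the sum of values becomes the sum of value × frequency and the number of values becomes the sum of the frequencies. The table below is hypothetical, invented purely to illustrate the arithmetic.

```python
# Hypothetical frequency table: value -> how many times it occurs.
frequency_table = {2: 3, 4: 5, 6: 2}

# Sum of values: each value contributes value * frequency.
total = sum(value * freq for value, freq in frequency_table.items())

# Number of values: the frequencies added together.
count = sum(frequency_table.values())

mean = total / count
print(f"sum = {total}, count = {count}, mean = {mean}")
```

Here the sum is 38 over 10 values, giving a mean of 3.8, the same result as listing out all ten numbers and averaging them.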

## Statistics Problems Lesson and Practice

Anna recorded her weight for 6 months. Here are the results. Make a line graph according to the given data table.

## Worksheet

Bob recorded the temperature in his yard for 10 days in the month of June. Here are the results. Make a line graph and a bar graph. Find the mean. Find the average temperature for the first five days. What is the highest frequency?

## Practice

The given table shows a set of scores for a class of 12 students. Answer the questions based on the data set.

## Drill

Jack recorded the number of people in different age groups who walk in the morning. Here are the results: 60, 20, 20, 40, 40, 40, 30, 20, 30, 20, 20, 30, 30. Now make sense of the data.
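A first step in making sense of a list like this is to count how often each value occurs. The sketch below uses `collections.Counter` from the standard library and simply treats the numbers as the recorded values; the variable names are illustrative.

```python
from collections import Counter

recorded = [60, 20, 20, 40, 40, 40, 30, 20, 30, 20, 20, 30, 30]

# Frequency of each distinct value.
counts = Counter(recorded)

# The most frequent value and how many times it appears.
most_common_value, highest_frequency = counts.most_common(1)[0]

for value in sorted(counts):
    print(f"{value}: {counts[value]}")
print(f"most frequent value: {most_common_value} ({highest_frequency} times)")
```

The value 20 appears five times, more than any other, followed by 30 (four times), 40 (three times), and 60 (once).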

## Warm Up

Draw the line graphs for the following data.

## Lesson

In a test, scores obtained by 11 students were: 3, 3, 4, 5, 5, 6, 7, 8, 8, 9, and 9. Complete the table and find the interval that contains the median.
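The idea can be sketched as follows: count the scores in each interval, find the median of the raw scores, and see which interval it falls in. The worksheet's intervals are not shown here, so the ones below (3-4, 5-6, 7-8, 9-10) are an assumption made only for illustration.

```python
from statistics import median

scores = [3, 3, 4, 5, 5, 6, 7, 8, 8, 9, 9]

# Assumed class intervals (inclusive bounds); the worksheet may differ.
intervals = [(3, 4), (5, 6), (7, 8), (9, 10)]

# Frequency of scores in each interval.
table = {iv: sum(1 for s in scores if iv[0] <= s <= iv[1]) for iv in intervals}

# With 11 scores, the median is the 6th value in sorted order.
mid = median(scores)
median_interval = next(iv for iv in intervals if iv[0] <= mid <= iv[1])

print(table)
print(f"median = {mid}, in interval {median_interval[0]}-{median_interval[1]}")
```

With these assumed intervals the median score of 6 falls in the 5-6 interval.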

## Statistics Quartiles and Percentiles Worksheet

The average monthly temperatures (in °C) of a city over the year were: 26, 28, 30, 34, 35, 35, 38, 40, 42, 42, and 48.
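Quartiles for a data set like this can be checked with `statistics.quantiles`. Textbooks define quartiles in slightly different ways, so treat this as a sketch: it uses Python's default "exclusive" method, which for this data agrees with the common classroom approach of taking the median of each half.

```python
from statistics import quantiles

temps_c = [26, 28, 30, 34, 35, 35, 38, 40, 42, 42, 48]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(temps_c, n=4)

# Interquartile range: the spread of the middle half of the data.
iqr = q3 - q1

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}")
```

For these 11 values the first quartile is 30, the median is 35, and the third quartile is 42.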

## Practice

The following data represents the rainfall (in mm) of a city on 13 days: 23, 25, 31, 33, 36, 36, 43, 47, 56, 56, 61, 63, 67.

## Review and Practice

Complete the table and find the interval that contains the median. Read the data set and write in the table the frequency of each number.

## Quiz

Students will demonstrate their proficiency in answering questions about quartiles and percentiles based on the given data set. You will get ten chances in all with this skill.

## Check

Students will answer questions about quartiles and percentiles based on the given data set. Three problems are provided, and space is included for students to copy the correct answer when given.