# 1: How to process the data - Mathematics

Generally, you do not need a computer to process the data. However, contemporary statistics is “heavy” and almost always requires technical help from some kind of software.

## A Complete Guide To Math And Statistics For Data Science


Math and Statistics for Data Science are essential because these disciplines form the basic foundation of all Machine Learning algorithms. In fact, Mathematics is behind everything around us, from shapes, patterns, and colors, to the count of petals in a flower. Mathematics is embedded in each and every aspect of our lives.

Although a good understanding of programming languages and Machine Learning algorithms, along with a data-driven approach, is necessary to become a Data Scientist, Data Science isn’t all about these fields. In this blog post, you will understand the importance of Math and Statistics for Data Science and how they can be used to build Machine Learning models.

Here’s a list of topics I’ll be covering in this Math and Statistics for Data Science blog:

1. Introduction To Statistics
2. Terminologies In Statistics
3. Categories In Statistics
4. Understanding Descriptive Analysis
5. Descriptive Statistics In R
6. Understanding Inferential Analysis
7. Inferential Statistics In R

## Children's Developing Data Collection

Even before birth our brains collect and organize information constantly. In the womb, babies store information on the prosody of their mother’s voice (intonation, rhythm, and stress). Then, as newborns, they differentiate and prefer her voice to another female’s.

Babies and children are constantly taking in data about the world around them. They make use of this data—often unconsciously—in a wide variety of ways. They learn which objects not to touch (if the object is steaming, came from an oven, or has been sitting in the sun), and what activities require them to hang on (playing on a merry-go-round or swingset).

Consider language development. Young children learn, through repeated exposure, that plural nouns usually end in the sound /s/. We know this because they then generalize this rule and turn foot into foots, mouse into mouses, and sheep into sheeps. Eventually, through repeated exposure to feet, mice, and sheep, they collect data to know that there are exceptions to the /s/ rule.

Through play, toddlers and preschoolers learn that rotating shape sorter pieces frequently results in a better fit into the sorter. They discover that mixing blue and yellow paint results in green paint, and if they keep adding many colors they end up with brown paint.

## Research Shows

Research into the effectiveness of CBM has been ongoing for nearly 40 years. Below are just a few of those findings.

• Struggling students reported that they enjoyed monitoring their progress in mathematics and felt more motivated to learn. Additionally, their mathematics performance improved significantly as a result of progress monitoring.
(Fuchs, Fuchs, Karns, Hamlett, Katzaroff & Dutka, 1997)
• CBM can be used to predict success in traditional year-end standardized assessments.
(Good, Simmons, & Kameenui, 2001)
• The performance of all students (high-, average-, and low-achieving students, as well as those with disabilities) improved when teachers modified their instruction based on CBM data.
(Stecker, Fuchs, & Fuchs, 2005)

### GOM and Struggling Students

GOM data can also help teachers to improve the academic growth of at-risk students or students with disabilities. These data can assist teachers in identifying students who may need a change of instruction or additional educational support. Teachers can use GOM data to:

• Compare the effectiveness of different instructional strategies
• Identify students who are not making adequate progress in a general education setting and who may therefore require additional supports
• Track progress toward individualized education program (IEP) goals for students receiving special education services
• Identify skills with which students are having the most difficulty

Lynn Fuchs, PhD
Dunn Family Chair in Psychoeducational Assessment
Department of Special Education
Vanderbilt University

Transcript: Lynn Fuchs

CBM can help teachers improve the learning outcomes of their at-risk students and students with learning disabilities. Teachers can use CBM data to improve the learning outcomes of their students, and they can use CBM in two ways for that purpose. First, they can compare rates of development under contrasting instructional interventions, and in that way they can identify which instructional components result in optimal growth rates. In addition, CBM data can help pinpoint the kinds of instructional programs and the academic skills specifically that students need more help in. And in that way teachers can direct their instructional effort more efficiently to only use program components that actually result in good growth for students, and to use their instructional time to tailor the specific skills that students are in need of working on. CBM can help teachers not only to assist them in developing strong instructional programs but also in communicating specifically and efficiently about students’ academic development. So the information on CBM most typically is graphed across time, and those graphs can be shared with other teachers, can be shared with principals and with parents to help those individuals understand in a very concrete way the rates of development that students are experiencing. In addition, the CBM data can be aggregated across students to help teachers understand for themselves how well they’re doing in affecting growth for their classrooms of students.

Our training data consists of X and y values, so we can plot them on a graph; that's the easy part. Now what's next? How do we find that blue line?

First, let's talk about how to draw a straight line on the graph.

In math we have an equation called the linear equation: y = mX + b, where m is the slope and b is the intercept.

So we can draw a line by picking any values for m and b.

How do we get the m and b values, and how do we know the exact m and b values for the best-fit line?

Let's take a simple data set (a sine wave from -3 to 3). The first time, we take random values for m and b and draw a line, something like this.

We take the first X value (x1) from our data set and calculate the y value (y1).

That line does not fit the data well, so we need to change the m and b values to get the best-fit line.

How do we change the m and b values to get the best-fit line?

Either we can use an awesome algorithm called Gradient Descent (which I will cover in the next story, along with the math used in it),

or we can borrow direct formulas from statistics (they call this the Least Squares Method), which I will also cover in the next story if possible.

For now, let's treat this as a black box and assume that we are getting the m and b values. Every time the m and b values change we may get a different line, and finally we get the best-fit line.
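To make the black box a little less opaque, here is a minimal sketch of the least squares route in plain Python. It uses the standard closed-form formulas for one variable (slope = covariance of X and y divided by variance of X, intercept = mean of y minus slope times mean of X). The function name `fit_line` and the 61 evenly spaced sample points are assumptions made for illustration, not something taken from the original walkthrough.

```python
import math

def fit_line(xs, ys):
    """Return (m, b) that minimize the squared error of y = m*x + b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    b = y_mean - m * x_mean
    return m, b

# Sine-wave samples from -3 to 3 (the spacing of 0.1 is an assumption).
xs = [-3 + 0.1 * i for i in range(61)]
ys = [math.sin(x) for x in xs]

m, b = fit_line(xs, ys)
print("slope m =", m)
print("intercept b =", b)
```

Because the sample points are symmetric around zero and the sine wave is an odd function, the intercept comes out very close to zero and the line simply tilts upward through the middle of the wave.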

So what's next? Predicting new data, remember? We give new X values and get the predicted y values. How does it work?

Same as above: y = mX + b, where we now know the final m and b values.

This is called simple linear regression, as we have only one independent variable X. Let's say we want to predict housing price based on the size of the house:

X = Size (in sq ft), y = Price (in dollars)
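As a small sketch of the prediction step, the snippet below plugs a few house sizes into y = mX + b. The slope and intercept values here are made up purely for illustration (they are not fitted from any real data), and `predict_price` is a hypothetical helper name.

```python
def predict_price(size_sqft, m, b):
    """Apply y = m*X + b to a single house size."""
    return m * size_sqft + b

# Hypothetical values standing in for a fitted slope and intercept.
m = 150.0      # dollars per extra square foot (assumed)
b = 50_000.0   # base price in dollars (assumed)

for size in (800, 1200, 2000):
    print(size, "sq ft ->", predict_price(size, m, b), "dollars")
```

Once m and b are fixed, prediction is just one multiplication and one addition per new X value.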

What if we have more independent variables?

Let's say we want to predict the housing price not only by the size of the house but also by the number of bedrooms:

x1 = Size (in sq ft), x2 = N_rooms, and y = Price (in dollars)

The process is the same as above, but the equation changes a bit.

Note: Let's alias b and m as θ0 and θ1 (theta 0 and theta 1), respectively.

y = θ0 + θ1*X → b + mX → Simple LR → Single-variable LR

y = θ0 + θ1*x1 + θ2*x2 + ... + θn*xn → Multiple LR → Multi-variable LR
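The multi-variable equation can be sketched in a few lines of Python. The theta values below are invented for illustration only, and `predict` is a hypothetical helper; the point is just that θ0 is added once and every other θ is multiplied by its matching feature.

```python
def predict(theta, features):
    """Compute theta0 + theta1*x1 + ... + thetan*xn for one example."""
    if len(theta) != len(features) + 1:
        raise ValueError("theta needs one more entry than features (for theta0)")
    return theta[0] + sum(t * x for t, x in zip(theta[1:], features))

# Hypothetical parameters: base price, dollars per sq ft, dollars per bedroom.
theta = [50_000.0, 150.0, 10_000.0]
house = [1000, 3]  # x1 = size in sq ft, x2 = number of bedrooms

price = predict(theta, house)
print(price)
```

With a single feature the same function reduces to b + mX, which is the simple linear regression case.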

Part I of the script is for the initial fall meeting, and Part II is for subsequent follow-up meetings. Although many items are very similar, there are some important differences to be aware of before using the script at DAT meetings. The most important difference is that the initial meeting will focus mostly on planning, whereas the follow-up meetings involve much more evaluation and fine-tuning of strategies. In follow-up meetings, previous student data are available for comparing performance over time. Previous universal screening data are helpful in determining if there is overall improvement, especially in examining specific skills via item analysis or other methods. Also, there is an increased emphasis on evaluating past decisions at follow-up meetings. In addition to selecting new strategies, the team also discusses how well the strategies they planned at the previous meeting have been working for the students. The team can decide to continue with the existing strategies or to select new ones. Finally, follow-up meetings may include more detailed discussions about tier movement. As the year progresses, students will move between the tiers, in and out of various intervention groups.

Using a systematic team approach to RTI allows teachers and staff to all be involved in planning for every student’s academic performance. By sharing responsibility as a team, more educators are accountable for student progress and aware of the diversity of needs among students. The DAT model described by the script is very explicit and detailed for the purposes of keeping teams on task and focused on the data. Although the script may seem rigid, adherence to an established systematic model helps ensure implementation fidelity and, thus, improved outcomes for students.

## Data Analysis In The Big Data Environment

Big data is invaluable to today’s businesses, and by using different methods for data analysis, it’s possible to view your data in a way that can help you turn insight into positive action.

To inspire your efforts and put the importance of big data into context, here are some insights that you should know – facts that will help shape your big data analysis techniques.

• By 2023 the industry of big data is expected to be worth approximately $77 billion.
• […] of enterprises say that analyzing data is important for their business growth and digital transformation.
• Companies that exploit the full potential of their data can increase their operating margins by 60%.
• The financial impact of the artificial intelligence industry is expected to grow to as much as $40 billion by 2025.

Data analysis concepts may come in many forms, but fundamentally, any solid methodology will help to make your business more streamlined, cohesive, insightful, and successful than ever before.

This might not be exactly what the asker wanted (there's not much clear info on what type of details are required for each process id), but you can get some details of a task by its pid using the Bash command ps -p $PID (ps being short for process status). With default options, ps -p $PID returns:

• PID: echoes the process id
• TTY: the name of the controlling terminal (if any)
• TIME: how much CPU time the process has used since execution (e.g. 00:00:02)
• CMD: the command that called the process (e.g. java )

Here's one example that tells you a particular process PID's full command with arguments, user, group and memory usage (note how the multiple -o flags each take a pair, and how the command outputs with lots of whitespace padding):

Tip: for human-readable output in the console, make args the last option - it'll usually be the longest and might get cut short otherwise.
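Here is a sketch consistent with that description, wrapped in Python so it can be run against the current process. The specific columns and header names (MEMORY, GROUP, ARGS) are assumptions chosen for illustration, and not every ps implementation accepts all of them, so the hypothetical helper `describe_process` returns None when ps is missing or rejects the options.

```python
import os
import shutil
import subprocess

def describe_process(pid):
    """Return ps output for one PID, or None if ps is unavailable or rejects the options."""
    if shutil.which("ps") is None:
        return None
    # Each -o takes a pair of columns; only the last column of a pair is
    # renamed with "=HEADER". args goes last because it is usually the longest.
    cmd = ["ps", "-p", str(pid),
           "-o", "pid,vsz=MEMORY",
           "-o", "user,group=GROUP",
           "-o", "comm,args=ARGS"]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout if result.returncode == 0 else None

report = describe_process(os.getpid())
if report is not None:
    print(report)
```

The equivalent shell form would pass the same -p and -o arguments to ps directly, with $PID in place of the Python process id.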

## Math Problem Solving 101

Have you ever given your students a money word problem where someone buys an item from a store, but your students come up with an answer where the person that bought the item ends up with more money than he or she came in with?

Word problem solving is one of those things that many of our children struggle with. When used effectively, questioning and dramatization can be powerful tools for our students to use when solving these types of problems.

I came up with this approach after co-teaching a lesson with a 3rd grade teacher. Her kids were having extreme difficulty comprehending a word problem she presented. So we devised a lesson that would help students better understand problem solving.

The approach we took included the use of several literacy skills, like reading comprehension and writing. First, we started the lesson with a “think aloud” modeled by the teacher. We read and displayed the problem below but excluded ALL of the numbers. See the images below:

The purpose of reading the problem without the numbers is to get the students to understand what is actually happening in the problem. Typically some students focus solely on keywords when solving word problems, but I do not advise using this approach exclusively. With math problems, the context of the problem and actions in the problem determine how the child should go about solving it.

After reading the problem (without numbers) to the students, I asked the following questions:

• Can you describe what is happening in your own words?
• What is the main idea of the problem?
• How could you act this out?

### Make a Plan & Ask Questions:

After the students articulated what was happening in the problem, we made a plan to solve the problem. I used the following guiding questions:

• What information do we know?
• Sample answers include: We know that Kai has some goldfish. Kai donated or gave away some of the goldfish.
• Sample answers include: We need to know how many goldfish Kai has. We also need to know how many he gave away. We also need to know how many bowls there are.
• Sample answers include: We need to find out how many fish belong in each bowl.

The class discussed the answers to the questions above. As we discussed the questions above the responses were written out on a problem solving template.

As part of this process, we clarified student understanding of the problem and determined what we needed to find and do to solve the problem. Next, we walked the students through the process of showing their work using pictures. Lastly, we checked our answers by writing an equation that matched the pictures to finally solve the problem.

### Team Work Counts

After going through the process with the class, we decided to split the students into small groups of 3 and 4 to solve a math problem together. The groups were expected to use the same process that we used to solve the problem. It took a while but check out one of the final products below.

### Benefits to Using this Process:

• Students understand what the problem is asking them to do
• Students are required to think and communicate as a team
• Students avoid making errors that can come with only using keywords
• Students are required to record their math reasoning using the problem solving template
• After using this process a couple of times, students get used to explaining and justifying their answers
• You become the facilitator of the learning by asking more questions, thereby making students independent thinkers

### Things to Consider Include:

• This process is NOT quick. It requires TIME. You should not rush the process and expect to have it completed in 20 – 30 minutes in one day.
• This process is not a one-time lesson. Students may not get it the first time. It should be seen as a routine that can be used when solving word problems.

Be sure to let me know how this process works in your classroom in the comments below.

Performance-based assessment can work with the curriculum, instruction, or unit that you're teaching right now. How would you design a performance-based assessment for this content? Because PBA requires students to demonstrate their knowledge and skills with the concepts that they've learned, this assessment requires them to create a product or response, or to perform a specific set of tasks.

At Hampton High School, teachers calibrate their assessments against a rigor scale with the goal of high performance. They use the common Rigor, Relevance, and Relationships framework to demonstrate that the higher levels of rigor and relevance embody higher-level cognition and application. "What's the level of performance?" teachers will ask when designing assessments. "Is the performance that we want from kids short-term memory and fragmented applications, or should they demonstrate comprehensive understanding of big ideas?" This shifts the focus from content measures to student performance measures.

For example, a performance task in history would require students to produce a piece of writing rather than answering a series of multiple-choice questions about dates or events. The value of performance assessment is that it mimics the kind of work done in real-world contexts. So an authentic performance task in environmental science might require a student to investigate the impact of fertilizer on local groundwater and then report the results through a public service campaign (like a video, a radio announcement, or a presentation to a group).

Performance assessment draws on students’ higher-order thinking skills -- evaluating the reliability of information, synthesizing data to draw conclusions, or solving a problem with deductive or inductive reasoning. Performance tasks may require students to present supporting evidence in an argument, conduct a controlled experiment, solve a complex problem, or build a model. A performance task often has more than one acceptable solution, and teachers use rubrics as a key part of assessing student work.

### Math: Disaster Relief Mission

Hampton High School's pre-calculus teachers aimed to create a performance-based assessment that asked students to demonstrate their knowledge of concepts, and apply it to circumstances unfamiliar to them. They came up with Disaster Relief Mission, a simulation where students play the role of air traffic controllers and pilots responding to crisis situations around the country. In these situations, students have to figure out what math to use in order to rescue those in need.

In the Resources tab, you'll find all the math materials that Hampton teachers created for the Disaster Relief Mission project. These materials include:

• Project directions
• Missions
• Worksheets
• Rubrics to assess the project

### Prep Work

Disaster Relief Mission is a sophisticated example of performance assessment, developed and refined over the past three years by Hampton's teachers. The prep work involved in such a project does require some time, including coming up with the missions, setting up the gymnasium with the correct coordinates, and configuring all the technology (iPods, FaceTime, and a Compass App) used in this exam. Teachers also spend some time training students on how to use the technology so that it won't be an issue during the actual work. Students are also trained for the roles of both pilot and air traffic controller, in case teams need to be reconfigured on the day of the exam.

### Disaster Relief Mission PBA

Students are split into teams of three (one air traffic controller and two pilots) and given four disaster missions to solve. Each team is distributed across two locations (air traffic controllers in one room, pilots in the gymnasium), and all communicate via FaceTime.

The teachers set up ten missions in the gymnasium, each with different coordinates. However, students have only four problems to solve, allowing multiple teams to work in the gym at the same time but not on the same problem.

A sample disaster relief mission looked like this:

Air traffic controllers are responsible for determining the angle and distance that the pilots need to move to get them from one mission to another. They calculate these numbers and relay them to the pilots via FaceTime. If correct, the pilots in the gym reach the mission site and then have to figure out what math will help them complete the mission. For example, will their calculations require the Law of Sines, Law of Cosines, right triangle trigonometry, or bearings?
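To give a feel for the air traffic controller's calculation, here is a minimal Python sketch that turns two mission coordinates into a distance and an angle. The coordinates, the function name `polar_move`, and the convention of measuring the angle counterclockwise from the positive x-axis are all assumptions; the actual project materials may use compass bearings or a different grid.

```python
import math

def polar_move(start, target):
    """Return (distance, angle_deg) from start to target.

    The angle is measured counterclockwise from the positive x-axis
    and normalized to the range [0, 360).
    """
    dx = target[0] - start[0]
    dy = target[1] - start[1]
    distance = math.hypot(dx, dy)
    angle_deg = math.degrees(math.atan2(dy, dx)) % 360
    return distance, angle_deg

# Hypothetical mission sites on a gym-floor grid (units are arbitrary).
mission_a = (0, 0)
mission_b = (3, 4)

distance, angle = polar_move(mission_a, mission_b)
print("move", distance, "units at", round(angle, 2), "degrees")
```

Using atan2 rather than a plain arctangent of dy/dx keeps the angle in the correct quadrant, which matters when the next mission lies behind or below the current one.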

After students complete one mission, they restart the whole process for the next mission, until they complete all four. The whole PBA takes one class period to complete.

### Evaluation/Utilizing Rubrics

Teachers design a rubric to measure the performance of students. The rubric is given to students ahead of time, so that they're clear about what they will be assessed on. For Disaster Relief Mission, the rubric is designed so that each team member -- whether pilot or air traffic controller -- receives the same number of points on the exam. For a perfect score, a team receives 45 points for completing and solving all four missions. The rubric assesses the accuracy of how well students solve each mission, including:

• Looking at the accuracy of how polar coordinates were calculated
• Looking at the accuracy of math used in each mission, including all calculations (not just final answers)
• Supporting work, including maps that showed how the air traffic controllers determined the angles at which the plane would travel
• Neatness of the work
• How students collaborated and communicated as a team

If a team doesn't submit its calculations, for example, but has the correct answer, fewer points are given. If a team has a correct answer but the units of measure are missing, they're also given fewer points. The rubric allows teachers to grade across a spectrum, taking into consideration how accurate and complete the students' work is.