After the last lesson, we have a large 2000-point data set that we are trying to make sense of. So we computed the maximum, minimum, and average of all of the data points. Not a bad start.

Suppose this is all you did. For the data given, you'll get an average of 541.851, a maximum of 619, and a minimum of 462. Are you really done? Do you really understand the numbers? Or are you missing something: what does the *distribution* of the numbers look like?

What does this mean? Well, the first number in the data set is 557. As for the distribution, one may ask how many times 557 appears in the whole data set. What if 557 appeared 100 times, and all of the other numbers appeared only once? Or what if each number appeared 5 times? Get what we mean by the distribution of the numbers?
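To make the idea of "how many times does each number appear" concrete, here is a minimal sketch in Python. The short `data` list is a hypothetical stand-in for the lesson's 2000-point data set, and `Counter` is simply one convenient way to count occurrences, not necessarily how the lesson's own code does it.

```python
from collections import Counter

# Hypothetical stand-in for the lesson's data set (the real one has 2000 points).
data = [557, 541, 557, 462, 619, 541, 557]

# Counter maps each distinct value to the number of times it occurs.
occurrences = Counter(data)

# Show the counts in order of increasing value.
for value in sorted(occurrences):
    print(value, "appears", occurrences[value], "time(s)")
```

In this made-up sample, 557 appears three times while 462 and 619 appear once each, which is exactly the kind of information the average, maximum, and minimum cannot tell you.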

Typically, we don't look at how many times a single number appears (this is too detailed). Instead, we look at how many times numbers in a small range appear. Here's an example.

In this data, the maximum number is 619 and the minimum is 462. Suppose we wanted to look at the occurrence of 10 groups of numbers. We'd compute $$\Delta=\frac{619-462}{10}=15.7.$$ In other words, we'll look at groups of numbers $\Delta$ (or $15.7$) wide. This means we'll look for numbers in the range $462$ to $462+\Delta$, or $462$ to $477.7$. Next, we'll look for numbers in the range $477.7$ to $477.7+\Delta$, or $477.7$ to $493.4$. All told, we'll count the number of times a number falls in each of these ranges:

- 462.0 to 477.7
- 477.7 to 493.4
- 493.4 to 509.1
- 509.1 to 524.8
- 524.8 to 540.5
- 540.5 to 556.2
- 556.2 to 571.9
- 571.9 to 587.6
- 587.6 to 603.3
- 603.3 to 619.0
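The arithmetic behind this list can be checked with a few lines of Python. This is only a sketch: the names `minimum`, `maximum`, `bins`, `delta`, and `edges` are chosen here for illustration, and the values 462 and 619 are the ones quoted in the text.

```python
minimum, maximum = 462, 619   # values quoted in the text
bins = 10                     # number of groups we want

delta = (maximum - minimum) / bins

# Compute each edge directly as minimum + k * delta rather than adding
# delta over and over, so floating-point rounding error does not accumulate.
edges = [minimum + k * delta for k in range(bins + 1)]

# Pair each edge with the next one to get the (low, high) range of each group.
for low, high in zip(edges, edges[1:]):
    print(f"{low:.1f} to {high:.1f}")
```

Ten groups need eleven edges, which is why the list comprehension runs over `bins + 1` values. Formatting with one decimal place hides the tiny rounding noise that floating-point arithmetic can introduce.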

In this part of making a histogram, we'll compute the bin boundaries and display them to the screen.

Now set the `bins` variable to the number of bins you want to have and see if the bin boundaries come out right.

Here's a breakdown of the code thus far:

- Part 1: Sets up a few things.

- Part 2: Sets up things needed for the bins, including the number of bins, $\Delta$, and some arrays we'll need. `count` will be used when we actually make the histogram in the next part, and `bin_low` and `bin_high` will be the low and high boundaries of each bin. So, for example, for the 10 bins we proposed above, `bin_low[1]` will be equal to 462.0 and `bin_high[1]` to 477.7.

- Part 3: This is where we actually compute the bin boundaries. Here we count through the number of bins with variable `i`. The logic we use is that `bin_low[1]` should be 462 (or the minimum number in the data set) and `bin_high[1]` should be $462+\Delta$. With this, `bin_low[2]` should be the minimum + $\Delta$, `bin_low[3]` should be the minimum + $2\Delta$, and so on. (Notice the pattern is `bin_low[n]` $=\text{min}+(n-1)\Delta$.)

- Any `bin_high` value is just its corresponding `bin_low` + $\Delta$.
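The three parts described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the lesson's own code: the `data` list is a small hypothetical sample, and because Python lists start at index 0, index `i` here corresponds to the lesson's bin number `n = i + 1`, so the pattern becomes `bin_low[i] = minimum + i * delta`.

```python
# Part 1: set up a few things. The data here is a small hypothetical sample;
# the lesson's real data set has 2000 points.
data = [557, 462, 619, 541, 530, 588]
minimum = min(data)
maximum = max(data)

# Part 2: set up what the bins need.
bins = 10
delta = (maximum - minimum) / bins
count = [0] * bins        # filled in later, when the histogram is built
bin_low = [0.0] * bins    # low boundary of each bin
bin_high = [0.0] * bins   # high boundary of each bin

# Part 3: compute the boundaries. Index i is 0-based, so it plays the role
# of (n - 1) in the lesson's pattern bin_low[n] = min + (n - 1) * delta.
for i in range(bins):
    bin_low[i] = minimum + i * delta
    bin_high[i] = bin_low[i] + delta

# Display the boundaries, labeling bins from 1 as the lesson does.
for i in range(bins):
    print(f"bin {i + 1}: {bin_low[i]:.1f} to {bin_high[i]:.1f}")
```

Changing `bins` to another value and rerunning is a quick way to see whether the boundaries still start at the minimum and end at the maximum.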