If you've made it through the last few lessons, you're doing great. You know how to open a file, read from it, and even do a bit of data mining (meaning, figuring out things from the data).
It turns out that one of the big challenges in data science is dealing with data files and getting them into your code so you can analyze them. In the last few lessons, we used a file called
. It was a one-column data file of numbers. Easy, huh?
Suppose now we have a two-column data file, where each line is a pair of numbers separated by a comma. (These are usually called "CSV files." CSV stands for "comma-separated values.")
So here's the next challenge: dealing with a two-column data file. Let's start using a file called
, which has two columns. The first is the year, and the second is the CO$_2$ concentration in our atmosphere. Here are the first few lines of the file
At this point, see if you can simply display the file to the screen. This is not a bad thing to do even at a 'real' data mining job.
But here's the issue: what do you do with the two columns? You can't use
, because the file doesn't contain straight numbers. It's a CSV file in the format of
. Luckily, we have another function called
that just reads in a line from a file into a string
. It doesn't try to process it into a number. It just reads in what it sees, and gives it to you.
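To see why the distinction matters, here is a minimal sketch in Python, which may not be the language this lesson uses. The helper `try_number` and the sample line are hypothetical; the line is a made-up placeholder in "number,number" form, not data from the lesson's file. It shows that a line containing a comma cannot be converted directly to a number, but can always be kept as a string.

```python
# Placeholder line in "number,number" form; NOT real data from the lesson's file.
line = "1,10\n"


def try_number(text):
    """Return text converted to a float, or None if it is not a plain number."""
    try:
        return float(text)
    except ValueError:
        return None


# The comma means the line is not a plain number, so conversion fails.
as_number = try_number(line)

# Reading the line as a string keeps the text exactly as it appears,
# minus the trailing newline that we strip here.
as_string = line.rstrip("\n")

print(as_number)
print(as_string)
```

Keeping the line as a string loses nothing: the two values are still there in the text, ready to be pulled apart later.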
We'll show you how to work with a string of numbers in the next lesson. At this point, just see if you can display the data in
to the screen.
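As a rough illustration of the exercise, the following Python sketch reads a file one line at a time and prints each line as a plain string. Python is an assumption here, since the lesson's own language and function names are not shown. The function `show_file`, the file name `demo.csv`, and its contents are hypothetical placeholders so the example can run on its own.

```python
import os
import tempfile


def show_file(path):
    """Print each line of a text file as a plain string and return the lines."""
    lines = []
    with open(path) as f:
        for line in f:
            text = line.rstrip("\n")  # drop the newline; leave everything else alone
            print(text)
            lines.append(text)
    return lines


# Demo on a throwaway file. The name and values are placeholders,
# not the lesson's data file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w") as f:
        f.write("1,10\n2,20\n3,30\n")
    shown = show_file(demo_path)
```

No attempt is made to split the line or turn it into numbers; each line is printed exactly as it was read.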