### 1 Descriptive statistics

For the exercises below download the friendsAndCars.csv file and save it on your I:-drive. Make sure that you don't have any empty lines at the end of the file.

#### 1.1 Preprocessing

The friendsAndCars.csv file contains a relation between people and the cars they own. To apply descriptive statistics, first calculate frequencies from this relation, either:
• people and how many cars they own, or
• cars and how many people own these cars.

The following Python script counts how often each type of car is in the list:

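```python
from collections import Counter
import networkx as nx

# Assumption: each line of friendsAndCars.csv has the form "person,car".
# A directed graph keeps every edge oriented as (person, car).
F = nx.read_edgelist("friendsAndCars.csv", delimiter=",", create_using=nx.DiGraph)

# Count how often each car appears as the target of an edge.
counts = Counter(car for _, car in F.edges())
for car, count in sorted(counts.items()):
    print(car + "," + str(count))
```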
You can save this as countFreq.py and run it on the command line using:
```
python countFreq.py > CarsCounted.csv
```
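
The other frequency listed above, the number of cars per person, can be counted in much the same way. The following is a minimal sketch that assumes each line has the form "person,car"; the sample rows and the function name `count_cars_per_person` are invented for illustration and are not taken from friendsAndCars.csv.

```python
from collections import Counter
import csv
import io


def count_cars_per_person(lines):
    """Return a Counter mapping each person to the number of cars they own.

    Assumes each line has the form "person,car" (hypothetical format).
    Empty or incomplete lines are skipped.
    """
    rows = csv.reader(lines)
    return Counter(row[0].strip() for row in rows if len(row) >= 2)


# Made-up sample data standing in for friendsAndCars.csv.
sample = io.StringIO("Anna,Ford\nAnna,Volvo\nBen,Ford\n")
per_person = count_cars_per_person(sample)

# Print one "person,count" line per person.
for person, count in sorted(per_person.items()):
    print(person + "," + str(count))
```

To use it on the real file, pass an open file object instead of the `io.StringIO` sample.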

#### 1.2 Using Excel (or OpenOffice)

If you double click on CarsCounted.csv, it will open in Excel.

Excel provides the measures of central tendency and dispersion as built-in functions. For example, =AVERAGE(B1:B6) calculates the average (arithmetic mean) of the values in cells B1 to B6.

| measure | formula |
|---|---|
| mode | MODE() |
| median | MEDIAN() |
| mean | AVERAGE() |
| variance | VAR() |
| standard deviation | STDEV() |
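
To cross-check the spreadsheet results, the same measures are available in Python's standard `statistics` module. This is only a sketch with made-up counts, not values from the Cars data. `statistics.variance` and `statistics.stdev` compute the sample variance and sample standard deviation, which is also what Excel's VAR() and STDEV() compute.

```python
import statistics

# Made-up frequency counts, standing in for column B of CarsCounted.csv.
counts = [1, 2, 2, 3, 4, 6]

measures = {
    "mode": statistics.mode(counts),
    "median": statistics.median(counts),
    "mean": statistics.mean(counts),
    "variance": statistics.variance(counts),         # sample variance (divides by n - 1)
    "standard deviation": statistics.stdev(counts),  # square root of the sample variance
}

for name, value in measures.items():
    print(name, value)
```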

#### 1.3 Exercises

1) Calculate the measures in the table above for the Cars data.

2) How can you interpret the data: what is the central value? Is this a normal distribution?

3) Produce a chart (diagram) of the data. To do this, highlight the data and then select the chart wizard. You may want to create a label for each column first.