In two previous posts, we examined the annual rainfall data in Los Angeles (see Looking at LA Rainfall Data and LA Rainfall Time Plot). The data we examined in these two posts contain 132 years' worth of annual rainfall data collected at the Los Angeles Civic Center from 1877 to 2009 (data found in Los Angeles Almanac). These annual rainfall data represent an excellent opportunity to learn techniques from the body of data analysis methods grouped under the broad topic of descriptive statistics (i.e. using graphs and numerical summaries to answer questions or find meaning in data).

Here are two graphics presented in Looking at LA Rainfall Data.

**Figure 1**

**Figure 2**

These charts are called histograms and they look the same (i.e. have the same shape). But they present slightly different information. Figure 1 shows the frequency of annual rainfall. Figure 2 shows the relative frequency of rainfall.

For example, Figure 1 indicates that there were only 3 years (out of the last 132 years) with annual rainfall under 5 inches. At the other extreme, there were only 2 years with annual rainfall above 35 inches. So drought years did happen but not very often (only 3 out of 132 years), and extremely wet seasons were just as uncommon. Based on Figure 1, we see that in most years, annual rainfall ranges from 5 to about 25 inches. The most likely range is 10 to 15 inches (45 years out of the last 132 years). In Los Angeles, annual rainfall above 25 inches is rare (it happened in only 12 of the 132 years).

Figure 1 is all about count. It tells you how many of the data points are in a certain range (e.g. 45 years in between 10 to 15 inches). For this reason, it is called a frequency histogram. Figure 2 gives the same information in terms of proportions (or relative frequency). For example, looking at Figure 2, we see that about 34% of the time, annual rainfall is from 10 to 15 inches. Thus, Figure 2 is called a relative frequency histogram.
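The link between the two histograms is a single division: each relative frequency is the bin count divided by the total number of years. The sketch below is only an illustration of that arithmetic. It uses the handful of counts quoted in this post (not the full set of bins, which are not listed here), and the names `stated_counts` and `relative_frequency` are made up for the example.

```python
# Counts quoted in the text for Figure 1. Bins whose counts are not
# stated in the post are deliberately left out. Note that the
# "above 25 in" and "above 35 in" ranges overlap, so these entries
# are not meant to add up to the total.
TOTAL_YEARS = 132
stated_counts = {
    "under 5 in": 3,
    "10 to 15 in": 45,
    "above 25 in": 12,
    "above 35 in": 2,
}


def relative_frequency(count, total):
    """Convert a count into a proportion of all observations."""
    return count / total


relative = {
    label: relative_frequency(count, TOTAL_YEARS)
    for label, count in stated_counts.items()
}

for label, share in relative.items():
    print(f"{label}: {stated_counts[label]} years -> {share:.3f}")
```

The 10 to 15 inch bin works out to roughly 0.341, which matches the "about 34%" read off Figure 2.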

Keep in mind raw data usually are not informative until they are summarized. The first step in summarization should be a graph (if possible). After we have graphs, we can look at the data further using numerical calculation (i.e. using various numerical summaries such as mean, median, standard deviation, 5-number summary, etc). To see how this is done, see the previous post Looking at LA Rainfall Data.

What kind of information can we get from graphics such as Figure 1 and Figure 2 above? For example, we can tell what data points are most likely (e.g. annual rainfall of 10 to 15 inches). What data points are considered rare or unlikely? Where do most of the data points fall?

This last question deserves a closer look. Looking at Figure 2, we see that about 60% of the data are under 15 inches (0.023+0.242+0.341=0.606). So for close to 80 years out of the last 132 years, the annual rainfall was under 15 inches. About 81% of the data are 20 inches or less. So in the overwhelming majority of the years, the annual rainfall was 20 inches or less. Annual rainfall of more than 20 inches is relatively rare (it happened only about 20% of the time).
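The running total used here is a cumulative relative frequency. As a rough sketch, the snippet below adds up the three bar heights quoted from Figure 2 and converts the result back into a number of years. The variable names are illustrative, and only the three values given in the text are used.

```python
from itertools import accumulate

TOTAL_YEARS = 132

# Relative frequencies of the three lowest bins, as quoted from Figure 2:
# under 5 inches, 5 to 10 inches, 10 to 15 inches.
rel_freq = [0.023, 0.242, 0.341]

# Running totals: the share of years at or below each bin's upper edge.
cumulative = list(accumulate(rel_freq))

share_under_15 = cumulative[-1]
years_under_15 = share_under_15 * TOTAL_YEARS

print(f"Share of years under 15 inches: {share_under_15:.3f}")
print(f"Approximate number of years: {years_under_15:.0f}")
```

Multiplying the cumulative share by 132 gives a value just under 80, which is where the "close to 80 years" figure comes from.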

We have a name for the data situation we see in Figure 1 and Figure 2. The annual rainfall data in Los Angeles have a skewed right distribution. This is because most of the data points are on the left side of the histogram. Another way to see this is that the tallest bar in the histogram is the one at 10 to 15 inches. Note that the side to the right of the peak of the histogram is longer than the side to the left of the peak. In other words, when the right tail of the histogram is longer, it is a skewed right distribution. See the figure below.

**Figure 3**

Besides the look of the histogram, a skewed right distribution has another characteristic. The mean is typically larger than the median, because the few large values in the right tail pull the mean upward while the median stays put. For example, the mean of the annual rainfall data is 14.98 inches (essentially 15 inches). Yet the median is only 13.1 inches, almost two inches lower. When the mean and the median are far apart, it is a sign that we have a skewed distribution on hand. When the mean is a lot higher than the median, it is likely a skewed right distribution. When the opposite situation occurs (the mean is a lot lower than the median), it is likely a skewed left distribution. When the mean and median are roughly equal, it is likely a symmetric distribution.
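The mean-versus-median comparison can be sketched in a few lines. The `sample` list below is made up for illustration (it is not the LA rainfall data), and both the function `skew_direction` and its `tolerance` cutoff are assumptions of this sketch, not a standard statistical test. Treat it as a rule of thumb and always check it against the histogram.

```python
from statistics import mean, median

# Hypothetical right-skewed sample, invented for this example.
# It is NOT the LA rainfall data.
sample = [6, 8, 9, 10, 11, 12, 13, 14, 16, 19, 27, 38]


def skew_direction(mean_value, median_value, tolerance=0.5):
    """Rule-of-thumb label based on the gap between mean and median.

    The tolerance is an arbitrary cutoff chosen for this sketch.
    """
    gap = mean_value - median_value
    if gap > tolerance:
        return "skewed right"
    if gap < -tolerance:
        return "skewed left"
    return "roughly symmetric"


# The made-up sample: a few large values pull the mean above the median.
print(mean(sample), median(sample))
print(skew_direction(mean(sample), median(sample)))

# The summary values quoted in the text for the rainfall data.
print(skew_direction(14.98, 13.1))
```

Both the invented sample and the rainfall summary values come out as skewed right, since in each case the mean sits well above the median.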

## Is College Worth It?

Is college worth it? This was the question posed by the authors of the report called College Majors, Unemployment and Earnings, which was produced recently by The Center on Education and the Workforce. We do not plan on giving a detailed account of this report. Any interested reader can read the report here. Instead, we would like to look at two graphics in this report, which are reproduced below. These two graphics are very interesting and capture all the main points of the report. The data used in the report came from the American Community Survey for the years 2009 and 2010.

**Figure 1**

**Figure 2**

Figure 1 shows the unemployment rates by college major for three groups of college degree holders, namely the recent college graduates (shown with green marking), the experienced college graduates (blue marking) and the college graduates who hold graduate degrees (red marking). Figure 2 shows the median earnings by major for the same three groups of college graduates (using the same colored markings).

Figure 1 ranks the unemployment rates for recent college graduates from the highest to the lowest. You can see the green markings descend from 13.9% (architecture) to 5.4% (education and health). So this graphic shows clearly that the employment prospects of college graduates depend on their majors, which is one of the main points of the report.

The graphic in Figure 1 shows that all recent college graduates are having a hard time finding work. The unemployment rate for recent college graduates is 8.9% (not shown in Figure 1). The employment picture for recent architecture graduates is especially bleak, which is due to the collapse of the construction and home building industry in the recession. The unemployment rates for recent college graduates who majored in education and healthcare are relatively low, reflecting the reality that these fields are either stable or growing.

Everyone is feeling the pinch in this tough economic environment. Even the recent graduates in technical fields are experiencing higher than usual unemployment rates. For example, the unemployment rates for recent college graduates in engineering and science, though relatively low compared to architecture, are at 7.5% and 7.7%, respectively. For recent graduates in computers and mathematics, the unemployment rate is 8.2%, approaching the average rate of 8.9% for recent college graduates.

The experienced college graduates fare much better than recent graduates. It is much more likely for experienced college graduates to be working. Looking at Figure 1, another observation is that graduate degrees make a huge difference in employment prospects across all majors.

The graphic in Figure 2 suggests that earnings of college graduates also depend on the subjects they study, which is another main point of the report. The technical majors earn the most. For example, recent engineering graduates have median earnings of $55,000, while the median for arts majors is $30,000. Aside from the highly technical, business and healthcare majors, the median earnings of recent college graduates are in the low $30,000s (just look at the green markings in Figure 2).

Figure 2 also shows that people with graduate degrees have higher earnings across all majors. The premium in earnings for graduate degree holders is substantial and is found across the board. Though the graduate degree advantage is seen in all majors, it is especially pronounced among the technical fields (just look at the descending red markings in Figure 2).

So two of the main points are (1) the employment prospects of college graduates depend on their majors, and (2) the earning potential of college graduates also depends on the subjects they study. Is college worth it? The report is not trying to persuade college-bound high school seniors not to go to college. On the contrary, the authors of the report answer the question in the affirmative. The authors are merely providing the facts that all prospective college students should consider before they pick their majors. The two graphics shown above are an effective demonstration of the facts presented by the report. According to the authors, students “should do their homework before picking a major, because, when it comes to employment prospects and compensation, not all college degrees are created equal.”