# 1.5.2: Displaying Categorical Variables

## Graphs for Categorical Data

### E-Waste and Bar Graphs

We live in an age of unprecedented access to increasingly sophisticated and affordable personal technology. Cell phones, computers, and televisions now improve so rapidly that many people discard equipment that is still in working condition in order to make use of the latest technological breakthroughs. Much of that equipment ends up in landfills, where the chemicals from batteries and other electronics add toxins to the environment. In addition, approximately 80% of the electronics discarded in the United States is exported to third world countries, where it is disposed of under generally hazardous conditions by unprotected workers. The following table shows the tonnage of the most common types of electronic equipment discarded in the United States in 2005.

| Electronic Equipment | Thousands of Tons Discarded |
| --- | --- |
| Cathode Ray Tube (CRT) TV's | 7591.1 |
| CRT Monitors | 389.8 |
| Printers, Keyboards, Mice | 324.9 |
| Desktop Computers | 259.5 |
| Laptop Computers | 30.8 |
| Projection TV's | 132.8 |
| Cell Phones | 11.7 |
| LCD Monitors | 4.9 |

Figure: Electronics Discarded in the US (2005). Source: National Geographic, January 2008. Volume 213 No.1, pg 73.

The type of electronic equipment is a categorical variable, and therefore, this data can easily be represented using a bar graph.

### Creating a Bar Graph

Make a bar graph for the E-waste data.

While this looks very similar to a histogram, the bars in a bar graph usually are separated slightly. The graph is just a series of disjoint categories.
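As a rough illustration, a bar graph of the E-waste data can be sketched in plain text, with one separate bar per category. The function name `text_bar_graph` and the choice of 50 characters for the longest bar are assumptions made for this sketch; a plotting library would normally be used instead.

```python
# E-waste data from the table above (thousands of tons discarded, US, 2005).
E_WASTE = {
    "Cathode Ray Tube (CRT) TV's": 7591.1,
    "CRT Monitors": 389.8,
    "Printers, Keyboards, Mice": 324.9,
    "Desktop Computers": 259.5,
    "Laptop Computers": 30.8,
    "Projection TV's": 132.8,
    "Cell Phones": 11.7,
    "LCD Monitors": 4.9,
}


def text_bar_graph(data, width=50):
    """Return one text line per category, with bar length proportional to value.

    `width` (hypothetical parameter) is the length of the longest bar.
    """
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # Scale each value so that the largest one fills `width` characters.
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<{label_width}} | {bar} {value}")
    return lines


for line in text_bar_graph(E_WASTE):
    print(line)
```

Because CRT TVs dominate the data, the bars for the smallest categories round down to zero characters at this scale, which is why the value is printed next to each bar.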

Please note that discussions of shape, center, and spread have no meaning for a bar graph, and it is not, in fact, even appropriate to refer to this graph as a distribution. For example, some students misinterpret a graph like this by saying it is skewed right. If we rearranged the categories in a different order, the same data set could be made to look skewed left. Do not try to infer any of these concepts from a bar graph!
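The effect of reordering can be checked with a small sketch. It assumes the E-waste values from the table above and uses a hypothetical helper, `tallest_bar_position`, to report where the tallest bar lands for a given left-to-right order of the categories.

```python
# E-waste data from the table above (thousands of tons discarded, US, 2005).
E_WASTE = {
    "Cathode Ray Tube (CRT) TV's": 7591.1,
    "CRT Monitors": 389.8,
    "Printers, Keyboards, Mice": 324.9,
    "Desktop Computers": 259.5,
    "Laptop Computers": 30.8,
    "Projection TV's": 132.8,
    "Cell Phones": 11.7,
    "LCD Monitors": 4.9,
}


def tallest_bar_position(labels, data):
    """Return the 0-based position of the tallest bar for a given category order."""
    heights = [data[label] for label in labels]
    return heights.index(max(heights))


table_order = list(E_WASTE)          # categories in the order of the table
reversed_order = table_order[::-1]   # the same categories, listed right to left

# Same data, two equally valid orders: the tallest bar moves from one end to the other.
print(tallest_bar_position(table_order, E_WASTE))
print(tallest_bar_position(reversed_order, E_WASTE))
```

Since the order of categories is arbitrary, the apparent "shape" of the graph is a product of the ordering and not a property of the data.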

### Pie Graphs

Usually, data that can be represented in a bar graph can also be shown using a pie graph (also commonly called a circle graph or pie chart). In this representation, we convert the count into a percentage so we can show each category relative to the total. Each percentage is then converted into a proportionate sector of the circle. To make this conversion, simply multiply the percentage by 3.6, which represents 360 (the total number of degrees in a circle) divided by 100% (the total percentage available).
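The conversion described above can be sketched as follows, using the E-waste data. The function name `pie_sectors` is an assumption made for this illustration.

```python
def pie_sectors(data):
    """Map each category to (percentage of total, sector angle in degrees)."""
    total = sum(data.values())
    sectors = {}
    for label, value in data.items():
        percentage = value / total * 100
        angle = percentage * 3.6  # 360 degrees divided by 100 percent
        sectors[label] = (percentage, angle)
    return sectors


# E-waste data (thousands of tons discarded, US, 2005).
e_waste = {
    "Cathode Ray Tube (CRT) TV's": 7591.1,
    "CRT Monitors": 389.8,
    "Printers, Keyboards, Mice": 324.9,
    "Desktop Computers": 259.5,
    "Laptop Computers": 30.8,
    "Projection TV's": 132.8,
    "Cell Phones": 11.7,
    "LCD Monitors": 4.9,
}

for label, (pct, angle) in pie_sectors(e_waste).items():
    print(f"{label}: {pct:.1f}% -> {angle:.1f} degrees")
```

This sketch computes each angle from the unrounded percentage. Multiplying an already rounded percentage by 3.6 can give a slightly different value: for CRT monitors, 4.5 × 3.6 = 16.2, while the unrounded calculation gives about 16.0.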

Here is a table with the percentages and the approximate angle measure of each sector for the E-waste data:

| Electronic Equipment | Thousands of Tons Discarded | Percentage of Total Discarded | Angle Measure of Circle Sector |
| --- | --- | --- | --- |
| Cathode Ray Tube (CRT) TV's | 7591.1 | 86.8 | 312.5 |
| CRT Monitors | 389.8 | 4.5 | 16.2 |
| Printers, Keyboards, Mice | 324.9 | 3.7 | 13.4 |
| Desktop Computers | 259.5 | 3.0 | 10.7 |
| Laptop Computers | 30.8 | 0.4 | 1.3 |
| Projection TV's | 132.8 | 1.5 | 5.5 |
| Cell Phones | 11.7 | 0.1 | 0.5 |
| LCD Monitors | 4.9 | ∼0 | 0.2 |

And here is the completed pie graph:

### Deciding Which Kind of Graph to Use

A pie graph is not always the best way to display categorical data. In the following example, you will see a pie graph that is not necessarily very helpful for analyzing data.

Only some states in the USA have the death penalty. Executions carried out under the death penalty are considered legal executions. The following table gives the number of legal executions in 2005 in each state that carried out at least one.

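| State | Number of Legal Executions in 2005 |
| --- | --- |
| AL | 4 |
| AR | 1 |
| CA | 2 |
| CT | 1 |
| DE | 1 |
| FL | 1 |
| GA | 3 |
| IN | 5 |
| MD | 1 |
| MO | 5 |
| MS | 1 |
| NC | 5 |
| OH | 4 |
| OK | 4 |
| SC | 3 |
| TX | 19 |
| Total | 60 |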

### Creating a Pie Graph

Create a pie graph for the data on the number of legal executions per state in 2005.

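| State | Number of Legal Executions in 2005 | Percentage of Legal Executions in 2005 | Angle Measure of Circle Sector |
| --- | --- | --- | --- |
| AL | 4 | 6.67 | 24 |
| AR | 1 | 1.67 | 6 |
| CA | 2 | 3.33 | 12 |
| CT | 1 | 1.67 | 6 |
| DE | 1 | 1.67 | 6 |
| FL | 1 | 1.67 | 6 |
| GA | 3 | 5.00 | 18 |
| IN | 5 | 8.33 | 30 |
| MD | 1 | 1.67 | 6 |
| MO | 5 | 8.33 | 30 |
| MS | 1 | 1.67 | 6 |
| NC | 5 | 8.33 | 30 |
| OH | 4 | 6.67 | 24 |
| OK | 4 | 6.67 | 24 |
| SC | 3 | 5.00 | 18 |
| TX | 19 | 31.67 | 114 |
| Total | 60 | 100.00 | 360 |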

And here is the completed pie graph:

By looking at this pie graph, it is hard to make comparisons: there are too many categories, and it is hard to get a good sense of their relative sizes. This is a case where a bar graph is easier to analyze visually. Making a bar graph for this data is left for Example 1.

Pie graphs work better when you have fewer categories. Bar graphs work well with a moderate number of categories, when you want to quickly visualize the value of each category. If you had many categories, so that many values were repeated in the data, it would be better to use a histogram: you would be more interested in how many categories had the same value than in the value of each category.
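The idea of counting how many categories share the same value can be sketched with the executions data. This is a simple frequency tally, which is the information a histogram of these counts would display; the variable names are assumptions made for this illustration.

```python
from collections import Counter

# Number of legal executions in 2005, by state (from the table above).
EXECUTIONS_2005 = {
    "AL": 4, "AR": 1, "CA": 2, "CT": 1, "DE": 1, "FL": 1, "GA": 3, "IN": 5,
    "MD": 1, "MO": 5, "MS": 1, "NC": 5, "OH": 4, "OK": 4, "SC": 3, "TX": 19,
}

# Tally how many states share each execution count.
states_per_count = Counter(EXECUTIONS_2005.values())

for count in sorted(states_per_count):
    print(f"{count} executions: {states_per_count[count]} state(s)")
```

Here the focus shifts from individual states to the values themselves, for example how many states had exactly one execution.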

## Example

### Example 1

Make a bar graph of the data on legal executions for each state in 2005 (under "Creating a Pie Graph").

Start by labeling each state as a category along the horizontal axis. Then, draw a bar that has a height that represents the number of legal executions in that state.

In this graph, you can easily compare the number of executions for each state.

## Review

For 1-4, computer equipment contains many elements and chemicals that are either hazardous or potentially valuable when recycled. The following data set shows the contents of a typical desktop computer weighing approximately 27 kg. Some of the more hazardous substances, like mercury, have been included in the 'other' category because they occur in relatively small amounts, although those amounts are still dangerous and toxic.

| Material | Kilograms |
| --- | --- |
| Plastics | 6.21 |
| Aluminum | 3.83 |
| Iron | 5.54 |
| Copper | 2.12 |
| Tin | 0.27 |
| Zinc | 0.60 |
| Nickel | 0.23 |
| Barium | 0.05 |
| Other elements and chemicals | 6.44 |

Figure: Weight of materials that make up the total weight of a typical desktop computer.

1. Create a bar graph for this data.
2. Complete the chart below to show the approximate percentage of the total weight for each material.

| Material | Kilograms | Approximate Percentage of Total Weight |
| --- | --- | --- |
| Plastics | 6.21 | |
| Aluminum | 3.83 | |
| Iron | 5.54 | |
| Copper | 2.12 | |
| Tin | 0.27 | |
| Zinc | 0.60 | |
| Nickel | 0.23 | |
| Barium | 0.05 | |
| Other elements and chemicals | 6.44 | |

3. Create a circle graph for this data.
4. Which graph do you think makes a better visual representation of the data?

For 5-8, a statistics class of 30 students had the following grades:

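| Grade | Number of Students |
| --- | --- |
| A | 14 |
| B | 7 |
| C | 4 |
| D | 3 |
| F | 1 |

5. Create a bar graph for this data.
6. Complete the chart below to show the approximate percentage of the total number of grades.

| Grade | Number of Students | Approximate Percentage of Total Grades |
| --- | --- | --- |
| A | 14 | |
| B | 7 | |
| C | 4 | |
| D | 3 | |
| F | 1 | |

7. Create a pie graph for this data.
8. Which graph do you think makes a better visual representation of the data?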

For 9-13, the median income of persons age 25+ is given by highest level of education achieved.

| Highest Level of Education | Median Income of Persons age 25+ |
| --- | --- |
| High School | \$20,321 |
| High school graduate | \$26,505 |
| Some college | \$31,056 |
| Associate's degree | \$35,009 |
| Bachelor's degree or higher | \$49,303 |
| Bachelor's degree | \$43,143 |
| Master's degree | \$52,390 |
| Professional degree | \$82,473 |
| Doctorate degree | \$69,432 |

9. Create a bar graph for this data.
10. Complete the chart below to show the approximate percentage of the total median income.

| Highest Level of Education | Median Income of Persons age 25+ | Approximate Percentage of Total Median Income |
| --- | --- | --- |
| High School | \$20,321 | |
| High school graduate | \$26,505 | |
| Some college | \$31,056 | |
| Associate's degree | \$35,009 | |
| Bachelor's degree or higher | \$49,303 | |
| Bachelor's degree | \$43,143 | |
| Master's degree | \$52,390 | |
| Professional degree | \$82,473 | |
| Doctorate degree | \$69,432 | |

11. Create a pie graph for this data.
12. Which graph do you think makes a better visual representation of the data?
13. Why is the median used in this example, rather than another measure of center such as the mean, mode, or midrange?