Sometimes we need to compare two different variables with two different units, like the total grants awarded by an agency (*dollars*) compared to the number of different projects that they supported (*number of projects*).

Your poor viewer would read the regular axis on the left. They’d probably understand that just fine–Total Grants Awarded. But then they’d have to figure out which line corresponds to that axis. And the separate legend below the graph isn’t helping much because it requires zig-zagging eye movements (no worries–solution here). Their eyes would eventually shift over to the right side of the graph. Then they’d encounter another y axis. Another one? Wait, what? With different units? But I thought this graph was talking about grant dollars? Or is it about the number of projects? This is getting confusing. Okay, it’s not just the viewer confused here. It’s me. Every single time. I ~~used to~~ think these graphs were some kind of sick joke, or at the very least, had some type of virus that accidentally arranged two separate graphs right on top of each other.

Unless your viewer looks at graphs with double y axes every day (unlikely–be honest), this is the wrong graph type.

First things first. Present the first variable, Total Grants Awarded. Your viewer can skim the shape of the line and think about the implications of the graph.

Nudge that graph over to the side:

And add the second variable beside it.

In my remake, I’m presenting the data piecemeal. Think about the order in which you want viewers to make sense of your graph. First, I want viewers to see the variable on the left. Second, I want them to see the variable on the right.

The beauty of this revision is that I’ve got plenty of space for subtitles. I find interesting nuggets, like the minimum values, maximum values, averages, and totals, and add that background information in the form of a subtitle. I don’t hide the cool stuff in the paragraph above my graph or below my graph. I put the cool stuff in plain sight, right above the graph.

With my remake, I’ve got plenty of space to add a third variable if needed. First, my graph talks about the dollar amount given out. Then, my graph talks about how many projects were funded. The logical next step is to consider how the average project size has shifted over the years. That’s probably what my viewers would start wondering: *If the agency’s total funding has remained pretty constant over the years… but the agency is funding fewer grantees… then, the average project size must be increasing, right?* I try to anticipate what my viewers will be curious about and then I place the answers to their questions right alongside the other graphs. Not on the next page. Not on the next slide. On the same page as the other related information.
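The third variable is just the first divided by the second. Here’s a minimal Python sketch of that arithmetic. The yearly figures are invented for illustration (they are not the agency’s numbers), and `average_project_size` is a hypothetical helper name.

```python
# Invented figures for illustration only; these are not the agency's data.
grants_by_year = {
    2011: {"total_dollars": 10_000_000, "projects": 50},
    2012: {"total_dollars": 10_200_000, "projects": 40},
    2013: {"total_dollars": 9_900_000, "projects": 30},
}


def average_project_size(total_dollars, projects):
    """Average dollars per funded project for one year."""
    return total_dollars / projects


averages = {
    year: average_project_size(**row) for year, row in grants_by_year.items()
}

for year, avg in averages.items():
    print(year, f"${avg:,.0f}")
```

With roughly flat totals and a shrinking project count, the average climbs each year, which is the pattern the third graph would show.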

The original graph with the double y axis didn’t let me show the graph in the correct order. I couldn’t show the graph in *any* order, actually. Both the variables were smushed on top of each other on the same graph. Now, my story has a beginning, middle, and end.

Join me at an upcoming training session to learn additional storytelling techniques. *And bonus! Download my Excel file and practice adjusting my template with your own numbers.*


The first step is inserting a pivot table from scratch. Click on the cell in the upper left-hand corner of your tabular data. In this case, we would click on cell A5 because that cell is the upper left-most cell of this table. Then, go to the *Insert* tab and click *Pivot Table*.

Your pivot table will appear in a new sheet. To keep my workbook clutter-free, I give each sheet a descriptive name. Rename your new pivot table sheet (something easy like “pivot” is fine) by right-clicking on the sheet and clicking on *Rename Sheet.*

Now, let’s take a closer look at that pivot table that popped up in your new sheet.

In our original data sheet (named *Pivot Table Data* in my example), the columns are named *Employee*, *Gender*, *Age*, *Industry Experience*, and *State*.

Next, let’s check out the pivot table’s sheet (named *Pivot Table* in my example). Some of the important pivot table features appear along the right side of my screen. The boxes say *Pivot Table Fields*, *Filters*, *Columns*, *Rows*, and *Values*.

The *Pivot Table Fields* box, in the upper right, contains all the variables that we get to play around with. Notice how each of the columns of data from our Pivot Table Data sheet shows up here: *Employee*, *Gender*, *Age*, *Industry Experience*, and *State*.

Now, on to the fun part, dragging and dropping variables! This feature is what makes a pivot table a pivot table.

Let’s start with simple math: Figuring out how many males and how many females are listed in our spreadsheet.

Click on *Gender* in the *Pivot Table Fields* list and drag it downwards into the *Rows* box. The pivot table, located off to the left in the main spreadsheet area, will say *Row Labels*, *Female*, *Male*, and *Grand Total*. It’s starting to build a table for us, which will eventually contain tallies of males and females.

Then, drag *Gender* into the *Columns* box. Watch how the table *pivots*, or switches from rows to columns accordingly. Now, *Female* and *Male* are listed across the top of the table rather than down the side.

Let’s figure out how many males and females are in the dataset. We’ve got the outline of our table but we need to fill in the body of the table with the actual tallies.

Watch below as I drag *Employee* into the *Values* box. The pivot table on the left-hand side of my screen will automatically update to show that there are 6 female employees and 4 male employees in my spreadsheet.
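If you’d like to double-check the pivot table’s tallies outside of Excel, the same count can be sketched in a few lines of Python. This is only an illustration: the employee IDs are invented, and the 6/4 split simply mirrors the numbers above.

```python
from collections import Counter

# Hypothetical rows standing in for the Employee and Gender columns.
# The IDs are invented; the 6/4 split matches the tallies in the post.
employees = [
    ("E01", "Female"), ("E02", "Male"), ("E03", "Female"), ("E04", "Female"),
    ("E05", "Male"), ("E06", "Female"), ("E07", "Male"), ("E08", "Female"),
    ("E09", "Male"), ("E10", "Female"),
]


def tally_by_gender(rows):
    """Count employees per gender, like Gender as labels + Employee in Values."""
    return Counter(gender for _, gender in rows)


gender_counts = tally_by_gender(employees)

for gender in sorted(gender_counts):
    print(gender, gender_counts[gender])
print("Grand Total", sum(gender_counts.values()))
```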

On to the really, really fun part: cross tabulations! *Cross tabulations*, or *crosstabs* for short, are a fancy way of saying that pivot tables give us the ability to stack multiple variables on top of each other. Figuring out how many males and how many females are in our spreadsheet is a good starting point, but crosstabs let us dig even deeper into the information.

Watch as I drag the *State* variable from the *Pivot Table Fields* box into the *Columns* box. The *Gender* categories (*Female* and *Male*) are listed along the left side of my table and the *State* categories (*DC*, *MD*, and *VA*) are listed along the top of my table.

The inner body of the table shows how many people fall into each category. For example, there are 3 females who live in DC, 3 males who live in DC, 1 female who lives in Virginia, 1 male who lives in Virginia, and 2 females who live in Maryland.

The *Grand Total* section reminds us that we’re talking about 10 people altogether.
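The crosstab above can be approximated in plain Python by counting (gender, state) pairs. Treat this as a rough sketch rather than what Excel does internally. The rows are reconstructed from the tallies in this post, and `crosstab` is a hypothetical helper name.

```python
from collections import Counter

# One (gender, state) pair per employee, rebuilt from the counts in the post.
rows = (
    [("Female", "DC")] * 3
    + [("Male", "DC")] * 3
    + [("Female", "VA"), ("Male", "VA")]
    + [("Female", "MD")] * 2
)


def crosstab(pairs):
    """Return a Counter keyed by (row_label, column_label)."""
    return Counter(pairs)


cells = crosstab(rows)
genders = sorted({g for g, _ in rows})
states = sorted({s for _, s in rows})

# Print the body of the table plus a Grand Total column and row.
print("Gender".ljust(12) + "".join(s.rjust(6) for s in states) + "Total".rjust(8))
for g in genders:
    counts = [cells[(g, s)] for s in states]  # missing combinations count as 0
    print(g.ljust(12) + "".join(str(c).rjust(6) for c in counts)
          + str(sum(counts)).rjust(8))

col_totals = [sum(cells[(g, s)] for g in genders) for s in states]
print("Grand Total".ljust(12) + "".join(str(c).rjust(6) for c in col_totals)
      + str(sum(col_totals)).rjust(8))
```

Empty combinations (no males in Maryland, for instance) come back as 0 here, whereas a pivot table typically leaves that cell blank.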

Another way to cross tabulate your data is to double-stack two or more variables into a single box.

Before, we had *Gender* in *Rows* and *State* in *Columns*.

Now, I’ll drag *State* into the *Rows* box, right below *Gender*. Watch how the tallies in the pivot table on the left update themselves accordingly.

Within a single box, like the *Rows* box, you can also re-order the variables. You can have *Gender* on the top, or, drag *State* above *Gender*. There’s no single right answer here; practice dragging and dropping your variables into whichever order is most interesting and useful for you.
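Double-stacking and re-ordering amount to nesting one variable inside another. Below is a small Python sketch of that idea. It assumes the same reconstructed rows as the tallies in this post, and `stacked_tally` with its `outer_first` flag is a made-up helper, not an Excel feature.

```python
from collections import Counter

# One (gender, state) pair per employee, rebuilt from the counts in the post.
rows = (
    [("Female", "DC")] * 3
    + [("Male", "DC")] * 3
    + [("Female", "VA"), ("Male", "VA")]
    + [("Female", "MD")] * 2
)


def stacked_tally(pairs, outer_first=True):
    """Nest counts two levels deep; set outer_first=False to swap the order."""
    nested = {}
    for gender, state in pairs:
        outer, inner = (gender, state) if outer_first else (state, gender)
        nested.setdefault(outer, Counter())[inner] += 1
    return nested


by_gender = stacked_tally(rows)                     # Gender on top, State below
by_state = stacked_tally(rows, outer_first=False)   # State dragged above Gender

for outer, inner_counts in sorted(by_state.items()):
    print(outer, sum(inner_counts.values()))
    for inner, n in sorted(inner_counts.items()):
        print("   ", inner, n)
```

Either ordering holds the same ten people; only the grouping that gets the subtotal changes.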

We’ve explored the *Pivot Table Fields*, the *Columns*, the *Rows*, and the *Values*.

Now, let’s take a look at the remaining box, *Filters*. Just like its name implies, this option lets you sift out certain categories so that you can focus exclusively on the information that’s most helpful for you.

Watch as I drag the *State* variable into the *Filters* section. *State* now appears in the first row of my spreadsheet. Do you see the little arrow beside the word *All*? The arrow reminds us that we’ve just created a filter. When I click on that arrow, a drop-down menu appears. I can choose to only look at people who live in DC, Maryland, and/or Virginia.
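A filter just drops the rows you don’t care about before the tally happens. Here’s a minimal Python sketch under the same assumptions as the counts in this post. The helper name `tally_with_filter` is invented for the example.

```python
from collections import Counter

# One (gender, state) pair per employee, rebuilt from the counts in the post.
rows = (
    [("Female", "DC")] * 3
    + [("Male", "DC")] * 3
    + [("Female", "VA"), ("Male", "VA")]
    + [("Female", "MD")] * 2
)


def tally_with_filter(pairs, states):
    """Count genders, keeping only rows whose state is in `states`."""
    return Counter(gender for gender, state in pairs if state in states)


dc_only = tally_with_filter(rows, {"DC"})
md_and_va = tally_with_filter(rows, {"MD", "VA"})

print("DC only:", dict(dc_only))
print("MD and VA:", dict(md_and_va))
```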

We haven’t explored the employees’ ages yet. The other categories were simple: Female/Male, DC/MD/VA, and so on. But with *Age*, there are as many different possibilities as there are employees.

First, watch as I drag *Age* into *Rows* and *Employee* into *Values*. The tallies aren’t very helpful here. My resulting pivot table on the left side of the screen just tells me that there is 1 employee for each of the ages listed. For example, there’s 1 employee who’s 24, 1 employee who’s 28, 1 employee who’s 35, and so on.

We need to condense the 10 different ages into a small number of categories. Let’s pretend that we’re interested in dividing the employees into two groups, the younger half and the older half.

Select or highlight the first five ages (the younger half of the employees). Then, right-click and select *Group*. Those first five ages are now clustered together in a brand new category called *Group1*.

Now let’s do the same thing for the five employees who fall into the older half of the bigger group. Highlight or select those five ages (from 50 down to 65), right-click, and select *Group*. We’ve got two groups: *Group1* and *Group2*.

It’s pretty obvious to us what each group represents—the first group contains the younger employees and the second group contains the older employees. But, this grouping might not be so obvious if we email our spreadsheet to a coworker, classmate, or friend.

To stay organized, let’s give each of the groups a descriptive name. Removing the guesswork will keep everything neater in the long run.

Click on *Group1* and simply begin typing *Younger Half*. Then, click on *Group2* and type *Older Half*.

Finally, let’s take a moment to explore the toggle feature of our *Younger Half* and *Older Half* groups. Click on the plus and minus signs to expand or collapse the menu.
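The grouping steps above boil down to sorting the ages and splitting the list in the middle. Here’s a rough Python sketch of that idea. Only 24, 28, 35, 50, and 65 come from this post; the other five ages are invented so the list has ten entries, and `split_into_halves` is a hypothetical helper.

```python
# Ten ages: 24, 28, 35, 50, and 65 appear in the post; the rest are invented.
ages = [24, 28, 35, 41, 47, 50, 54, 58, 61, 65]


def split_into_halves(values):
    """Sort the values and split them into two named groups at the midpoint."""
    ordered = sorted(values)
    middle = len(ordered) // 2
    return {"Younger Half": ordered[:middle], "Older Half": ordered[middle:]}


groups = split_into_halves(ages)

for name, members in groups.items():
    print(name, len(members), members)
```

Naming the groups right in the dictionary keys plays the same role as typing over *Group1* and *Group2*: nobody has to guess what each bucket means.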

I hope you enjoyed this gift of knowledge!

*Want to learn more? Join me on August 4, 2016 as I lead a 90-minute webinar on pivot tables. In this hands-on session, you’ll practice using pivot tables with three different datasets. I’ll talk about the pros and cons of applying pivot tables to each dataset, so you’ll have a solid understanding of when to use ’em and when to stick with regular old formulas. And speaking of formulas, we’ll conclude by analyzing survey data with the =countif function, the function I demo’d in my most popular YouTube video to date.*


I intentionally separated the year-by-year bars from the total bar with a little extra space. I don’t want all the bars mushed together.

Easy to sketch on paper.

But easy in Excel?

Yes!

Here’s the default Excel graph.

Yuck.

Here’s the mostly edited version: reduced clutter; custom color; labels directly beside the data; reduced gap width.

The secret strategy for nudging one of the bars over to the right: *Add an empty column to your data table.*
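To make the empty-column idea concrete, here’s a small Python sketch that builds a data table with a blank column between the yearly values and the total, then writes it out as CSV text you could paste into a spreadsheet. The numbers and the "Year 1" style labels are invented, and `table_with_spacer` is a hypothetical helper.

```python
import csv
import io

# Invented yearly values for illustration only.
yearly = {"Year 1": 120, "Year 2": 135, "Year 3": 150}


def table_with_spacer(values):
    """Return (headers, row) with one empty column before the Total column."""
    headers = list(values) + ["", "Total"]
    row = list(values.values()) + ["", sum(values.values())]
    return headers, row


headers, row = table_with_spacer(yearly)

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(headers)
writer.writerow(row)
print(buffer.getvalue())
```

When a bar chart is built from a range like this, the blank column takes up a category slot with no bar in it, which is what leaves the extra gap before the total.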

Bonus: Download my spreadsheet. Apply my secret to your own projects. Post a comment and let me know how you’ve applied this technique!


A few weeks ago I was invited to speak at Chicago’s Harris Theater – definitely one of the coolest places I’ve ever explored in Chicago!

The attendees specialized in all different aspects of the performing arts – writing grants, collecting data to demonstrate how their organization is reaching outcomes, monitoring their group’s performance, and so on.

During the chart-choosing segment of the workshop, we thought about different ways of displaying fictional ticket sales data. In this example, I’m pretending that one of the performing arts groups is tracking how many tickets they’ve sold online, over the phone, and at their in-person box office for an upcoming show:

I write about chart-choosing and sketching a lot and wanted to share these ideas with you, too.

Sketching goes like this.

You grab your already-tallied data table, like the one shown above. You’ve already done a little number-crunching, simple stuff like sums and averages.

Then, you set your cell phone’s timer for 15 minutes.

*And you step away from your computer.*

Your job is to draw all the different versions of this dataset before you sit down to your computer. Draw, draw, draw. Aim for 5, 10, or 15 different types of graphs. The more you learn about data visualization, the more versions you’ll be able to draw. What would your dataset look like as a bar chart? As a stacked bar chart? A line graph? A pie chart? A tree map? I advise workshop participants to even draw the bad graphs, the really bad stuff, like 3D exploding pie charts, if it’s on their mind and taking up precious mental space. Get those thoughts out of your mind and onto the paper. Put a big **X** through the awful graphs if you need to.

Once your rough sketching is complete, take your drafts down the hall to your coworker. Think aloud. Talk about how *this graph emphasizes this one thing*, and *that graph highlights that other thing*. What’s the message your team is going for? Which graph matches that message the closest? Sometimes you know your message ahead of time; other times, you fine-tune your message during this sketching process.

And finally, I give you permission to return to the computer and make the most promising graph in your software program of choice. If you design graphs on your computer before sketching on paper, I guarantee that you’ll overlook a few options. You’ll be boxed-in by the software program’s limited chart gallery. Explore everything on paper first and figure out the software later.

Here’s what my sketches looked like. I’m starting with the most basic sketch: **a regular ol’ line graph** that just focuses on online ticket sales. When I draw, I often go through my data table methodically, often starting with just the first row of data (online sales) and peeking at the shape of those numbers. And what did I see? A tall, flat line.

Once I’ve got a handle on the first row in the table, I might add the second row, the third row, and so on, so that my brain can compare the categories to each other one at a time. **Here’s another regular ol’ line graph that shows all three ticket sales types together**. More contextual data = more background information available for decision-making thought processes.

Or, how about a **slope graph** for those audiences that don’t need to see all the peaks and valleys? Some people just want to see the big-picture, starting-and-ending points. The higher-ups, like donors and some supervisors, might fall into this category. I’m pretending that a supervisor knocked on my door and said, *Hey, how are we doing this year? And what about five years ago, when we launched that new sales strategy?* Slope graphs cut to the chase and make before/after comparisons easy.

If we’re aiming for big-picture findings, how about **a bar chart** that only displays the five-year sums? We could ignore the year-by-year numbers and only display the total sales numbers.

Returning to the multi-year version again… This fictional dataset is semi-spaghetti, meaning that the three lines started to intersect a little when they were all displayed in the same graph. Not so crowded that the criss-crossing gets in the way of interpreting the data, but, borderline. If your real dataset gets too zig-zaggy and criss-crossy, try breaking the single graph into three separate graphs with **a small multiples layout.**

Small multiples graphs let my brain interpret the graph piecemeal. I can check out the online sales and think about the implications of that pattern. Then, I shift my gaze a couple inches to the right and check out the phone sales. Finally, I shift my gaze to the right a bit more and examine the in-person box office sales. The layout guides my attention through the graph slowly, rather than overwhelming me by throwing all three lines on the page at once. I see the online, phone, and in-person patterns *both individually and as a whole.*

At this point in the sketching process, I began daydreaming about having a more interesting dataset and wishing that I would’ve included *goal* sales numbers alongside those *actual* ticket sales numbers. **A target line** might be dotted and/or in a lighter color to add much-needed context.

Or, maybe the viewers need to see part-to-whole patterns in a **stacked bar chart.** I transformed my table’s counts into percentages to see what proportion of tickets were sold online, over the phone, or in-person. The five-year total would be nudged to the right a bit.
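Turning counts into percentages is simple arithmetic: each channel’s tickets divided by that year’s total. Here’s a minimal Python sketch with invented ticket counts (the real table isn’t reproduced here). `to_percentages` and `add_total` are hypothetical helper names.

```python
# Invented ticket counts for illustration; not the table from the workshop.
tickets = {
    "Year 1": {"Online": 500, "Phone": 300, "In person": 200},
    "Year 2": {"Online": 600, "Phone": 250, "In person": 150},
}


def to_percentages(counts):
    """Convert one row of counts into percentages of that row's total."""
    total = sum(counts.values())
    return {channel: round(100 * n / total, 1) for channel, n in counts.items()}


def add_total(table):
    """Sum every row channel by channel to get the multi-year total row."""
    total = {}
    for counts in table.values():
        for channel, n in counts.items():
            total[channel] = total.get(channel, 0) + n
    return total


shares = {year: to_percentages(counts) for year, counts in tickets.items()}
shares["Total"] = to_percentages(add_total(tickets))

for label, row in shares.items():
    print(label, row)
```

Each row of `shares` adds up to 100, so every stacked bar ends up the same height and only the segments change.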

Finally, a sketch that’ll make the purists cringe, **a pie chart**. Don’t worry, I teach my workshop participants about alternatives to pie charts. I might use a pie chart when I want my fictional viewers to see the part-to-whole comparisons. I’d use a darker color to draw their eyes towards one slice and add a sentence or two beside the chart to make sure their attention stays focused on that same slice.

One dataset, many correct options.

*Did you come up with additional sketches?*


And here’s another thought-starter question to consider: **Do your viewers want to see the data presented as-is, or do they want you to cut to the chase and interpret the data?**

Sometimes we need to serve as unbiased collectors and disseminators of information. This is especially true for my workshop participants in research-y roles who publish their data in places like peer-reviewed journal articles or formal research reports with lengthy appendices.

Other times, we need to get a message across in our graphs — and fast! This is especially true for my workshop participants who are consultants or those who work in communications-y roles. Their viewers are busy, busy, busy. Their viewers are hoping that someone else — *you!* — will dig through mountains of data and uncover the handful of nuggets worth paying attention to.

It’s not that one visualization style is better or worse than the other. They’re apples and oranges. I want you to figure out *when* your viewers are expecting to see each style and then learn *how* to switch back and forth.

The as-is approach is the easy one. You create a graph. You clean up the default settings a little, especially those cruddy parts like borders or too-thick grid lines. You select colors from the viewers’ color palette. I’ve been doing a lot of design projects with USAID contractors lately; this blog post has USAID’s exact shades of blue and red.

The storytelling approach can seem like the harder one. But! It’s not impossible! This is just a newer style for most of us.

I want to make it easier for you.** Here are four design strategies you can use to tell a story in your graph:**

- **Descriptive titles**
- **Descriptive subtitles**
- **Annotations**
- **Saturation**

Use one technique or all four; it’s up to you. Let’s check out a few examples. The graphs on the left *present the data as-is* while the graphs on the right *interpret the data.*

This first bar chart uses a **descriptive title** and **saturation** to show how chocolate is the preferred ice cream flavor.

The **descriptive title** and **saturation** emphasize how Project A is performing particularly well.

A **descriptive title**, a **descriptive subtitle**, and an **annotation** explain how the agency is funding more studies to measure the effectiveness of their programming, which is due to their new policy. Annotations are call-out boxes that give viewers more background information about a specific data point or two, like why we’re seeing a sudden increase or decrease.

The as-is version on the left gives equal emphasis to both subgroups of students because the red and blue are both relatively dark colors. This is USAID’s exact shade of red (with data that is obviously not from USAID). The red is tricky because we’re accustomed to stoplight color-coding in which green means “good” and red means “caution!” or “bad!” In addition to the red and blue being equally saturated, we also have to be careful with the cultural connotations of using red here.

The interpreted version on the right uses **saturation** to highlight the percentage of students who qualify for free and reduced meals.

Finally, this dot plot uses a **descriptive subtitle** and **saturation** to draw viewers’ eyes towards the teachers’ survey responses.

*Let me know: Which approach do you follow most often? And who are your viewers? Their preferences drive every decision about how you’ll format your graph, after all. Are your viewers expecting you to present the data as-is, or do they prefer that you offer interpretations through titles, subtitles, annotations, or saturation?*
