Statistical Analysis with Excel For Dummies

by Joseph Schmuller

Paperback

$34.99 


Overview

Learn all of Excel's statistical tools

Test your hypotheses and draw conclusions

Use Excel to give meaning to your data

Use Excel to interpret stats

Statistical analysis with Excel is incredibly useful—and this book shows you that it can be easy, too! You'll discover how to use Excel's perfectly designed tools to analyze and understand data, predict trends, make decisions, and more. Tackle the technical aspects of Excel and start using them to interpret your data!

Inside...

  • Covers Excel 2016 for Windows® & Mac® users
  • Check out new Excel stuff
  • Make sense of worksheets
  • Create shortcuts
  • Tool around with analysis
  • Use Quick Statistics
  • Graph your data
  • Work with probability
  • Handle random variables

Product Details

ISBN-13: 9781119271154
Publisher: Wiley
Publication date: 07/25/2016
Series: For Dummies Books
Pages: 560
Sales rank: 1,108,081
Product dimensions: 7.40(w) x 9.20(h) x 1.10(d)

About the Author

Joseph Schmuller, PhD, is a Research Scholar at the University of North Florida. He is a former member of the American Statistical Association and has taught statistics at the undergraduate, honors undergraduate, and graduate levels.

Read an Excerpt

Statistical Analysis with Excel For Dummies


By Joseph Schmuller

JOHN WILEY & SONS

ISBN: 0-7645-7594-5


Chapter One

Evaluating Data in the Real World

In This Chapter

* Introducing statistical concepts

* Generalizing from samples to populations

* Getting into probability

* Making decisions

* Understanding important Excel fundamentals

The field of statistics is all about decision-making - decision-making based on groups of numbers. Statisticians constantly ask questions: What do the numbers tell us? What are the trends? What predictions can we make?

To answer these questions, statisticians have developed an impressive array of analytical tools. These tools help us to make sense of the mountains of data that are out there waiting for us to delve into, and to understand the numbers we generate in the course of our own work.

The Statistical (and Related) Notions You Just Have to Know

Because intensive calculation is often part and parcel of the statistician's toolset, many people have the misconception that statistics is about number crunching. Number crunching is just one small part of the path to sound decisions, however.

By shouldering the number-crunching load, software increases our speed of traveling down that path. Some software packages are specialized for statistical analysis and contain many of the tools that statisticians use. Although not marketed specifically as a statistical package, Excel provides a number of these tools, which is the reason I wrote this book.

I said that number crunching is a small part of the path to sound decisions. The most important part is the concepts statisticians work with, and that's what I'll talk about for most of the rest of this chapter.

After that, I'll tell you about some important Excel fundamentals.

Samples and populations

On election night, TV commentators routinely predict the outcome of elections before the polls close. Most of the time they're right. How do they do that?

The trick is to interview a sample of voters after they cast their ballots. Assuming the voters tell the truth about whom they voted for, and assuming the sample truly represents the population, network analysts use the sample data to generalize to the population of voters.

This is the job of a statistician - to use the findings from a sample to make a decision about the population from which the sample comes. But sometimes those decisions don't turn out the way the numbers predicted. Flawed pre-election polling led to the memorable picture of President Harry Truman holding up a copy of the Chicago Daily Tribune with the famous, but wrong, headline "Dewey Defeats Truman" after the 1948 election. Part of the statistician's job is to express how much confidence he or she has in the decision.

Another election-related example speaks to the idea of confidence in a decision. Pre-election polls (again, assuming a representative sample of voters) tell you the percentage of sampled voters who prefer each candidate. The polling organization adds how accurate it believes the polls are. When you hear a newscaster say something like "accurate to within 3 percent," you're hearing a judgment about confidence.

Here's another example. Suppose you've been assigned to find the average reading speed of all fifth-grade children in the U.S., but you haven't got the time or the money to test them all. What would you do?

Your best bet is to take a sample of fifth-graders, measure their reading speeds (in words per minute), and calculate the average of the reading speeds in the sample. You can then use the sample average as an estimate of the population average.
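Here's a minimal sketch of that calculation in Excel, assuming the sampled reading speeds (in words per minute) sit in cells A2:A26 - a hypothetical range:

    =AVERAGE(A2:A26)

The result is the sample average, and that one number becomes your estimate of the average reading speed of all fifth-graders.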

Estimating the population average is one kind of inference that statisticians make from sample data. I discuss inference in more detail in the section "Inferential Statistics."

REMEMBER

Now for some terminology you have to know: Characteristics of a population (like the population average) are called parameters, and characteristics of a sample (like the sample average) are called statistics. When you confine your field of view to samples, your statistics are descriptive. When you broaden your horizons and concern yourself with populations, your statistics are inferential.

REMEMBER

Now for a notation convention you have to know: Statisticians use Greek letters (μ, σ, ρ) to stand for parameters, and English letters (x̄, s, r) to stand for statistics. Figure 1-1 summarizes the relationship between populations and samples, and parameters and statistics.
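Excel's worksheet functions follow the same division of labor. As a quick sketch - assuming one sample of scores in A2:A26 and a second in B2:B26, both hypothetical ranges - each of these formulas calculates a statistic that estimates a parameter:

    =AVERAGE(A2:A26) calculates x̄, the sample estimate of μ
    =STDEV(A2:A26) calculates s, the sample estimate of σ
    =CORREL(A2:A26,B2:B26) calculates r, the sample estimate of ρ

(STDEV treats its data as a sample. Its sibling STDEVP treats the data as a whole population - a distinction Chapter 5 takes up.)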

Variables: Dependent and independent

Simply put, a variable is something that can take on more than one value. (Something that can have only one value is called a constant.) Some variables you might be familiar with are today's temperature, the Dow Jones Industrial Average, your age, and the value of the dollar against the euro.

Statisticians care about two kinds of variables, independent and dependent. Each kind of variable crops up in any study or experiment, and statisticians assess the relationship between them.

For example, imagine a new way of teaching reading that's intended to increase the reading speed of fifth-graders. Before putting this new method into schools, it would be a good idea to test it. To do that, a researcher would randomly assign a sample of fifth-grade students to one of two groups: One group receives instruction via the new method; the other receives instruction via traditional methods. Before and after both groups receive instruction, the researcher measures the reading speeds of all the children in this study. What happens next? I'll get to that in the upcoming section entitled "Inferential Statistics: Testing Hypotheses."

For now, understand that the independent variable here is Method of Instruction. The two possible values of this variable are New and Traditional. The dependent variable is reading speed.

REMEMBER

In general, the idea is to try to find out if changes in the independent variable are associated with changes in the dependent variable.

REMEMBER

In the examples that appear throughout the book, I'll show you how to use Excel to calculate various characteristics of groups of scores. I'd like you to bear in mind that each time I show you a group of scores, I'm really talking about the values of a dependent variable.

Types of data

Data come in four kinds. When you work with a variable, the way you work with it depends on what kind of data it is.

The first variety is called nominal data. If a number is a piece of nominal data, it's just a name. Its value doesn't signify anything. A good example is the number on an athlete's jersey. It's just a way of identifying the athlete and distinguishing him or her from teammates. The number doesn't indicate the athlete's level of skill.

Next comes ordinal data. Ordinal data are all about order, and numbers begin to take on meaning over and above just being identifiers. A higher number indicates the presence of more of a particular attribute than a lower number. One example is the Mohs scale. Used since 1822, it's a scale whose values are 1 through 10. Mineralogists use this scale to rate the hardness of substances. Diamond, rated at 10, is the hardest. Talc, rated at 1, is the softest. A substance that has a given rating can scratch any substance that has a lower rating.

What's missing from the Mohs scale (and from all ordinal data) is the idea of equal intervals and equal differences. The difference between a hardness of 10 and a hardness of 8 is not the same as the difference between a hardness of 6 and a hardness of 4.

Interval data provide equal differences. Fahrenheit temperature is a good example of interval data. The difference between 60 degrees and 70 degrees is the same as the difference between 80 degrees and 90 degrees.

Here's something that might surprise you about Fahrenheit temperatures: A temperature of 100 degrees is not twice as hot as a temperature of 50 degrees. For ratio statements (twice as much as, half as much as) to be valid, zero has to mean the complete absence of the attribute you're measuring. A temperature of 0 degrees F doesn't mean the absence of heat - it's just an arbitrary point on the Fahrenheit scale.

The last data type, ratio data, includes a meaningful zero point. For temperatures, the Kelvin scale gives us ratio data. A temperature of 100 kelvins is twice as hot as a temperature of 50 kelvins, because the Kelvin zero point is absolute zero, where all molecular motion (the basis of heat) stops. Another example is a ruler. Eight inches is twice as long as 4 inches. A length of zero means a complete absence of length.
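To see that with real numbers, let Excel's CONVERT function do the work. In this sketch, CONVERT changes degrees Fahrenheit into kelvins:

    =CONVERT(100,"F","K") returns about 310.93
    =CONVERT(50,"F","K") returns about 283.15

On the Kelvin scale, the one with a meaningful zero, 100 degrees F turns out to be only about 1.1 times as hot as 50 degrees F - nowhere near twice.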

REMEMBER

Any of these types can form the basis for an independent variable or a dependent variable. The analytical tools you use depend on the type of data you're dealing with.

A little probability

When statisticians make decisions, they express their confidence about those decisions in terms of probability. They can never be certain about what they decide. They can only tell you how probable their conclusions are.

So what is probability? The best way to attack this is with a few examples. If you toss a coin, what's the probability that it comes up heads? Intuitively, you know that if the coin is fair, you have a 50-50 chance of heads and a 50-50 chance of tails. In terms of the kinds of numbers associated with probability, that's 1/2.

How about rolling a die (one member of a pair of dice)? What's the probability that you roll a 3? Hmmm ... a die has six faces and one of them is 3, so that ought to be 1/6, right? Right.

Here's one more. You have a standard deck of playing cards. You select one card at random. What's the probability that it's a club? Well, a deck of cards has four suits, so that answer is 1/4.

I think you're getting the picture. If you want to know the probability that an event occurs, figure out how many ways that event can happen and divide by the total number of events that can happen. In each of the three examples, the event we were interested in (head, 3, or club) only happens one way.

Things can get a bit more complicated. When you toss a die, what's the probability that you'll roll a 3 or a 4? Now you're talking about two ways the event you're interested in can occur, so that's (1 + 1)/6 = 2/6 = 1/3. What about the probability of rolling an even number? That has to be 2, 4, or 6, and the probability is (1 + 1 + 1)/6 = 3/6 = 1/2.

On to another kind of probability question. Suppose you roll a die and toss a coin at the same time. What's the probability you roll a 3 and the coin comes up heads? Consider all the possible events that could occur when you roll a die and toss a coin at the same time. Your outcome could be a head and 1-6, or a tail and 1-6. That's a total of 12 possibilities. The head-and-3 combination can only happen one way. So, the answer is 1/12.

In general, the formula for the probability that a particular event occurs is

Pr(event) = Number of ways the event can occur/Total number of possible events
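You can put this formula directly to work in a worksheet. As a minimal sketch, suppose the six faces of a die, 1 through 6, are listed in cells A1:A6 (a hypothetical layout). Then

    =(COUNTIF(A1:A6,3)+COUNTIF(A1:A6,4))/COUNTA(A1:A6)

returns 1/3, the probability of rolling a 3 or a 4: the COUNTIFs tally the ways the event can occur, and COUNTA counts the total number of possible events.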

I began this section by saying that statisticians express their confidence about their decisions in terms of probability, which is really why I brought up this topic in the first place. This line of thinking leads us to conditional probability - the probability that an event occurs given that some other event occurs. For example, suppose I roll a die, take a look at it (so that you can't see it), and I tell you that I've rolled an even number. What's the probability that I've rolled a 2? Ordinarily, the probability of a 2 is 1/6, but I've narrowed the field. I've eliminated the three odd numbers (1, 3, and 5) as possibilities. In this case, only the three even numbers (2, 4, and 6) are possible, so now the probability of rolling a 2 is 1/3.
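Counting still does the job here - you just divide by the reduced sample space rather than the whole one. With the die faces in A1:A6 as before (again, a hypothetical layout), this sketch counts the 2s among the even faces:

    =COUNTIF(A1:A6,2)/SUMPRODUCT(--(MOD(A1:A6,2)=0))

The numerator finds the one 2; the denominator, a SUMPRODUCT trick for counting even numbers, finds the three even faces. The result is 1/3.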

Exactly how does conditional probability play into statistical analysis? Read on.

Inferential Statistics: Testing Hypotheses

In advance of doing a study, a statistician draws up a tentative explanation - a hypothesis - as to why the data might come out a certain way. After the study is complete and the sample data are all tabulated, he or she faces the essential decision a statistician has to make - whether or not to reject the hypothesis.

That decision is wrapped in a conditional probability question: What's the probability of obtaining the data, given that this hypothesis is correct? Statistical analysis provides tools to calculate the probability. If the probability turns out to be low, the statistician rejects the hypothesis.

Here's an example. Suppose you're interested in whether or not a particular coin is fair - whether it has an equal chance of coming up heads or tails. To study this issue, you'd take the coin and toss it a number of times - say 100. These 100 tosses make up your sample data. Starting from the hypothesis that the coin is fair, you'd expect that the data in your sample of 100 tosses would show 50 heads and 50 tails.

If it turns out to be 99 heads and one tail, you'd undoubtedly reject the fair coin hypothesis. Why? The conditional probability of getting 99 heads and one tail given a fair coin is very low. Wait a second. The coin could still be fair and you just happened to get a 99-1 split, right? Absolutely. In fact, you never really know. You have to gather the sample data (the results from 100 tosses) and make a decision. Your decision might be right, or it might not.
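Just how low is that conditional probability? Excel can tell you with BINOMDIST, a function Chapter 16 covers. Assuming a fair coin, this sketch gives the probability of exactly 99 heads in 100 tosses:

    =BINOMDIST(99,100,0.5,FALSE)

The answer is about 8 x 10^-29. With odds like that, rejecting the fair-coin hypothesis seems safe - but, as I said, you never really know.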

Juries face this all the time. They have to decide among competing hypotheses that explain the evidence in a trial. (Think of the evidence as data.) One hypothesis is that the defendant is guilty. The other is that the defendant is not guilty. Jury members have to consider the evidence and, in effect, answer a conditional probability question: What's the probability of the evidence given that the defendant is not guilty? The answer to this question determines the verdict.

Null and alternative hypotheses

Consider once again that coin-tossing study I just mentioned. The sample data are the results from the 100 tosses. Before tossing the coin, you might start with the hypothesis that the coin is a fair one, so that you expect an equal number of heads and tails. This starting point is called the null hypothesis. The statistical notation for the null hypothesis is H₀. According to this hypothesis, any heads-tails split in the data is consistent with a fair coin. Think of it as the idea that nothing in the results of the study is out of the ordinary.

An alternative hypothesis is possible - that the coin isn't a fair one, and it's loaded to produce an unequal number of heads and tails. This hypothesis says that any heads-tails split is consistent with an unfair coin. The alternative hypothesis is called, believe it or not, the alternative hypothesis. The statistical notation for the alternative hypothesis is H₁.

With the hypotheses in place, toss the coin 100 times and note the number of heads and tails. If the results are something like 90 heads and 10 tails, it's a good idea to reject H₀. If the results are around 50 heads and 50 tails, don't reject H₀.

Similar ideas apply to the reading-speed example I gave earlier. One sample of children receives reading instruction under a new method designed to increase reading speed, the other learns via a traditional method. Measure the children's reading speeds before and after instruction, and tabulate the improvement for each child. The null hypothesis, H₀, is that one method isn't different from the other. If the improvements are greater with the new method than with the traditional method - so much greater that it's unlikely that the methods aren't different from one another - reject H₀. If they're not, don't reject H₀.
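As a preview of where this is heading, Excel's TTEST function (which Chapter 11 covers) returns the conditional probability at the heart of this decision. Assuming the new-method improvements are in A2:A26 and the traditional-method improvements are in B2:B26 - hypothetical ranges - this sketch

    =TTEST(A2:A26,B2:B26,2,2)

returns the probability of a difference this large, given that H₀ is true. (The last two arguments specify a two-tailed test on two equal-variance samples; Chapter 11 explains the choices.) A very small result argues for rejecting H₀.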

REMEMBER

Notice that I didn't say "accept H₀." The way the logic works, you never accept a hypothesis. You either reject H₀ or don't reject H₀.

Notice also that in the coin-tossing example I said around 50 heads and 50 tails. What does around mean? Also, I said if it's 90-10, reject H₀. What about 85-15? 80-20? 70-30? Exactly how much different from 50-50 does the split have to be for you to reject H₀? In the reading-speed example, how much greater does the improvement have to be to reject H₀?

I won't answer these questions now. Statisticians have formulated decision rules for situations like this, and we'll explore those rules throughout the book.
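One Excel function hints at what such a rule looks like. CRITBINOM (discussed in Chapter 16) returns the smallest number of successes for which the cumulative binomial probability reaches a criterion you supply. In this sketch, with 100 tosses of a fair coin and a criterion of .95,

    =CRITBINOM(100,0.5,0.95)

returns 58: if the coin really is fair, you'd see 58 or fewer heads about 95 percent of the time. A split much more lopsided than that starts to look like grounds for rejecting H₀.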

Two types of error

Whenever you evaluate the data from a study and decide to reject H₀ or to not reject H₀, you can never be absolutely sure. You never really know what the true state of the world is. In the context of the coin-tossing example, that means you never know for certain if the coin is fair or not. All you can do is make a decision based on the sample data you gather. If you want to be certain about the coin, you'd have to have the data for the entire population of tosses - which means you'd have to keep tossing the coin until the end of time.

(Continues...)



Excerpted from Statistical Analysis with Excel For Dummies by Joseph Schmuller Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.

Table of Contents

Introduction
About This Book
What You Can Safely Skip
Foolish Assumptions
How This Book Is Organized
Icons Used in This Book
Where to Go from Here
Part I: Statistics and Excel: A Marriage Made in Heaven
Chapter 1: Evaluating Data in the Real World
The Statistical (and Related) Notions You Just Have to Know
Samples and populations
Variables: Dependent and independent
Types of data
A little probability
Inferential Statistics: Testing Hypotheses
Null and alternative hypotheses
Two types of error
Some Things About Excel You Absolutely Have to Know
Autofilling cells
Referencing cells
Chapter 2: Understanding Excel's Statistical Capabilities
Getting Started
Setting Up for Statistics
Worksheet functions
Those oldies but goodies
Array functions
Data analysis tools
Part II: Describing Data
Chapter 3: Show and Tell: Graphing Data
Why Use Graphs?
Some Fundamentals
Excel's Graphics Capabilities
The Chart Wizard
Becoming a Columnist
Stacking the columns
One more thing
Slicing the Pie
Pulling the slices apart
A word from the wise
Drawing the Line
Passing the Bar
The Plot Thickens
Chapter 4: Finding Your Center
Means: The Lore of Averages
Calculating the mean
AVERAGE and AVERAGEA
TRIMMEAN
Other means to an end
Medians: Caught in the Middle
Finding the median
MEDIAN
Statistics A La Mode
Finding the mode
MODE
Chapter 5: Deviating from the Average
Measuring Variation
Averaging squared deviations: Variance and how to calculate it
VARP and VARPA
Sample variance
VAR and VARA
Back to the Roots: Standard Deviation
Population standard deviation
STDEVP and STDEVPA
Sample standard deviation
STDEV and STDEVA
Related Functions
DEVSQ
Average deviation
AVEDEV
Chapter 6: Meeting Standards and Standings
Catching Some Zs
Characteristics of z-scores
Bonds vs. The Bambino
Exam scores
STANDARDIZE
Where Do You Stand?
RANK
LARGE and SMALL
PERCENTILE and PERCENTRANK
Data analysis tool: Rank and Percentile
Chapter 7: Summarizing It All
Counting Out
COUNT, COUNTA, COUNTBLANK, and COUNTIF
The Long and Short of It
MAX, MAXA, MIN, and MINA
Getting Esoteric
SKEW
KURT
Tuning in the Frequency
FREQUENCY
Data analysis tool: Histogram
Can You Give Me a Description?
Data analysis tool: Descriptive Statistics
Instant Statistics
Chapter 8: What's Normal?
Hitting the Curve
Digging deeper
Parameters of a normal distribution
NORMDIST
NORMINV
A Distinguished Member of the Family
NORMSDIST
NORMSINV
Part III: Drawing Conclusions from Data
Chapter 9: The Confidence Game: Estimation
What Is a Sampling Distribution?
An EXTREMELY Important Idea: The Central Limit Theorem
Simulating the Central Limit Theorem
The Limits of Confidence
Finding confidence limits for a mean
CONFIDENCE
Fit to a t
TINV
Chapter 10: One-Sample Hypothesis Testing
Hypotheses, Tests, and Errors
Hypothesis tests and sampling distributions
Catching Some Zs Again
ZTEST
t for One
TDIST
Testing a Variance
CHIDIST
CHIINV
Chapter 11: Two-Sample Hypothesis Testing
Hypotheses Built for Two
Sampling Distributions Revisited
Applying the Central Limit Theorem
Zs once more
Data analysis tool: z-Test: Two Sample for Means
t for Two
Like peas in a pod: Equal variances
Like p's and q's: Unequal variances
TTEST
Data analysis tools: t-Test: Two Sample
A Matched Set: Hypothesis Testing for Paired Samples
TTEST for matched samples
Data analysis tool: t-Test: Paired Two Sample for Means
Testing Two Variances
Using F in conjunction with t
FTEST
FDIST
FINV
Data analysis tool: F-Test Two-Sample for Variances
Chapter 12: Testing More Than Two Samples
Testing More Than Two
A thorny problem
A solution
Meaningful relationships
After the F-test
Data analysis tool: Anova: Single Factor
Comparing the means
Another Kind of Hypothesis, Another Kind of Test
Working with repeated measures ANOVA
Getting trendy
Data analysis tool: Anova: Two-Factor Without Replication
Analyzing trend
Chapter 13: Slightly More Complicated Testing
Cracking the Combinations
Breaking down the variances
Data analysis tool: Anova: Two-Factor Without Replication
Cracking the Combinations Again
Rows and columns
Interactions
The analysis
Data analysis tool: Anova: Two-Factor With Replication
Chapter 14: Regression: Linear and Multiple
The Plot of Scatter
Graphing Lines
Regression: What a Line!
Using regression for forecasting
Variation around the regression line
Testing hypotheses about regression
Worksheet Functions for Regression
SLOPE, INTERCEPT, and STEYX
FORECAST
Array function: TREND
Array function: LINEST
Data Analysis Tool: Regression
Tabled output
Graphic output
Juggling Many Relationships at Once: Multiple Regression
Excel Tools for Multiple Regression
TREND revisited
LINEST revisited
Regression data analysis tool revisited
Chapter 15: Correlation: The Rise and Fall of Relationships
Scatterplots Again
Understanding Correlation
Correlation and Regression
Testing Hypotheses About Correlation
Is a correlation coefficient greater than zero?
Do two correlation coefficients differ?
Worksheet Functions for Correlation
CORREL and PEARSON
RSQ
COVAR
Data Analysis Tool: Correlation
Tabled output
Data Analysis Tool: Covariance
Testing Hypotheses About Correlation
Worksheet Functions: FISHER, FISHERINV
Part IV: Working with Probability
Chapter 16: Introducing Probability
What Is Probability?
Experiments, trials, events, and sample spaces
Sample spaces and probability
Compound Events
Union and intersection
Intersection again
Conditional Probability
Working with the probabilities
The foundation of hypothesis testing
Large Sample Spaces
Permutations
Combinations
Worksheet Functions
FACT
PERMUT
COMBIN
Random Variables: Discrete and Continuous
Probability Distributions and Density Functions
The Binomial Distribution
Worksheet Functions
BINOMDIST
NEGBINOMDIST
Hypothesis Testing with the Binomial Distribution
CRITBINOM
More on hypothesis testing
The Hypergeometric Distribution
HYPGEOMDIST
Chapter 17: More on Probability
Beta
BETADIST
BETAINV
Poisson
POISSON
Gamma
GAMMADIST
GAMMAINV
Exponential
EXPONDIST
Chapter 18: A Career in Modeling
Modeling a Distribution
Plunging into the Poisson distribution
Using POISSON
Testing the model's fit
A word about CHITEST
Playing ball with a model
A Simulating Discussion
Taking a chance: The Monte Carlo method
Loading the dice
Simulating the Central Limit Theorem
Part V: The Part of Tens
Chapter 19: Ten Statistical and Graphical Tips and Traps
Significant Doesn't Always Mean Important
Trying to Not Reject a Null Hypothesis Has a Number of Implications
Regression Isn't Always Linear
Extrapolating Beyond a Sample Scatterplot Is a Bad Idea
Examine the Variability Around a Regression Line
A Sample Can Be Too Large
Consumers: Know Your Axes
Graphing a Categorical Variable as Though It's a Quantitative Variable Is Just Wrong
Whenever Appropriate, Include Variability in Your Graph
Be Careful When Relating Statistics-Book Concepts to Excel
Chapter 20: Ten (Or So) Things That Didn't Fit in Any Other Chapter
Some Forecasting
A moving experience
How to be a smoothie, exponentially
Graphing the Standard Error of the Mean
Probabilities and Distributions
PROB
WEIBULL
Drawing Samples
Testing Independence: The True Use of CHITEST
Logarithmica Esoterica
What is a logarithm?
What is e?