Statistical Analysis with R For Dummies

by Joseph Schmuller
  • ISBN13: 9781119337065
  • ISBN10: 1119337062

  • Format: Paperback
  • Copyright: 2017-03-20
  • Publisher: For Dummies

List Price: $29.99

Summary

Understanding the world of R programming and analysis has never been easier

Most guides to R, whether books or online, focus on R functions and procedures. But now, thanks to Statistical Analysis with R For Dummies, you have access to a trusted, easy-to-follow guide that focuses on the foundational statistical concepts that R addresses—as well as step-by-step guidance that shows you exactly how to implement them using R programming.

People are becoming more aware of R every day as major institutions are adopting it as a standard. Part of its appeal is that it's a free tool that's taking the place of costly statistical software packages that sometimes take an inordinate amount of time to learn. Plus, R enables a user to carry out complex statistical analyses by simply entering a few commands, making sophisticated analyses available and understandable to a wide audience. Statistical Analysis with R For Dummies enables you to perform these analyses and to fully understand their implications and results.

  • Gets you up to speed on the #1 analytics/data science software tool
  • Demonstrates how to easily find, download, and use cutting-edge community-reviewed methods in statistics and predictive modeling
  • Shows you how R offers intel from leading researchers in data science, free of charge
  • Provides information on using RStudio to work with R

Get ready to use R to crunch and analyze your data—the fast and easy way!

Author Biography

Joseph Schmuller, PhD, has taught undergraduate and graduate statistics, and has 25 years of IT experience. The author of four editions of Statistical Analysis with Excel For Dummies and three editions of Teach Yourself UML in 24 Hours (SAMS), he has created online coursework for Lynda.com and is a former Editor in Chief of PC AI magazine. He is a Research Scholar at the University of North Florida.

Table of Contents

Introduction

About This Book

Similarity with This Other For Dummies Book

What You Can Safely Skip

Foolish Assumptions

How This Book Is Organized

Part 1: Getting Started with Statistical Analysis with R

Part 2: Describing Data

Part 3: Drawing Conclusions from Data

Part 4: Working with Probability

Part 5: The Part of Tens

Online Appendix A: More on Probability

Online Appendix B: Non-Parametric Statistics

Online Appendix C: Ten Topics That Just Didn’t Fit in Any Other Chapter

Icons Used in This Book

Where to Go from Here

Part 1: Getting Started with Statistical Analysis with R

Chapter 1: Data, Statistics, and Decisions

The Statistical (and Related) Notions You Just Have to Know

Samples and populations

Variables: Dependent and independent

Types of data

A little probability

Inferential Statistics: Testing Hypotheses

Null and alternative hypotheses

Two types of error

Chapter 2: R: What It Does and How It Does It

Downloading R and RStudio

A Session with R

The working directory

So let’s get started, already

Missing data

R Functions

User-Defined Functions

Comments

R Structures

Vectors

Numerical vectors

Matrices

Factors

Lists

Lists and statistics

Data frames

Packages

More Packages

R Formulas

Reading and Writing

Spreadsheets

CSV files

Text files

Part 2: Describing Data

Chapter 3: Getting Graphic

Finding Patterns

Graphing a distribution

Bar-hopping

Slicing the pie

The plot of scatter

Of boxes and whiskers

Base R Graphics

Histograms

Adding graph features

Bar plots

Pie graphs

Dot charts

Bar plots revisited

Scatter plots

Box plots

Graduating to ggplot2

Histograms

Bar plots

Dot charts

Bar plots re-revisited

Scatter plots

Box plots

Wrapping Up

Chapter 4: Finding Your Center

Means: The Lure of Averages

The Average in R: mean()

What’s your condition?

Eliminate $-signs forth with()

Exploring the data

Outliers: The flaw of averages

Other means to an end

Medians: Caught in the Middle

The Median in R: median()

Statistics à la Mode

The Mode in R

Chapter 5: Deviating from the Average

Measuring Variation

Averaging squared deviations: Variance and how to calculate it

Sample variance

Variance in R

Back to the Roots: Standard Deviation

Population standard deviation

Sample standard deviation

Standard Deviation in R

Conditions, Conditions, Conditions

Chapter 6: Meeting Standards and Standings

Catching Some Z’s

Characteristics of z-scores

Bonds versus the Bambino

Exam scores

Standard Scores in R

Where Do You Stand?

Ranking in R

Tied scores

Nth smallest, Nth largest

Percentiles

Percent ranks

Summarizing

Chapter 7: Summarizing It All

How Many?

The High and the Low

Living in the Moments

A teachable moment

Back to descriptives

Skewness

Kurtosis

Tuning in the Frequency

Nominal variables: table() et al

Numerical variables: hist()

Numerical variables: stem()

Summarizing a Data Frame

Chapter 8: What’s Normal?

Hitting the Curve

Digging deeper

Parameters of a normal distribution

Working with Normal Distributions

Distributions in R

Normal density function

Cumulative density function

Quantiles of normal distributions

Random sampling

A Distinguished Member of the Family

Part 3: Drawing Conclusions from Data

Chapter 9: The Confidence Game: Estimation

Understanding Sampling Distributions

An EXTREMELY Important Idea: The Central Limit Theorem

(Approximately) Simulating the central limit theorem

Predictions of the central limit theorem

Confidence: It Has Its Limits!

Finding confidence limits for a mean

Fit to a t

Chapter 10: One-Sample Hypothesis Testing

Hypotheses, Tests, and Errors

Hypothesis Tests and Sampling Distributions

Catching Some Z’s Again

Z Testing in R

t for One

t Testing in R

Working with t-Distributions

Visualizing t-Distributions

Plotting t in base R graphics

Plotting t in ggplot2

One more thing about ggplot2

Testing a Variance

Testing in R

Working with Chi-Square Distributions

Visualizing Chi-Square Distributions

Plotting chi-square in base R graphics

Plotting chi-square in ggplot2

Chapter 11: Two-Sample Hypothesis Testing

Hypotheses Built for Two

Sampling Distributions Revisited

Applying the central limit theorem

Z’s once more

Z-testing for two samples in R

t for Two

Like Peas in a Pod: Equal Variances

t-Testing in R

Working with two vectors

Working with a data frame and a formula

Visualizing the results

Like p’s and q’s: Unequal variances

A Matched Set: Hypothesis Testing for Paired Samples

Paired Sample t-testing in R

Testing Two Variances

F-testing in R

F in conjunction with t

Working with F-Distributions

Visualizing F-Distributions

Chapter 12: Testing More than Two Samples

Testing More Than Two

A thorny problem

A solution

Meaningful relationships

ANOVA in R

Visualizing the results

After the ANOVA

Contrasts in R

Unplanned comparisons

Another Kind of Hypothesis, Another Kind of Test

Working with repeated measures ANOVA

Repeated measures ANOVA in R

Visualizing the results

Getting Trendy

Trend Analysis in R

Chapter 13: More Complicated Testing

Cracking the Combinations

Interactions

The analysis

Two-Way ANOVA in R

Visualizing the two-way results

Two Kinds of Variables at Once

Mixed ANOVA in R

Visualizing the Mixed ANOVA results

After the Analysis

Multivariate Analysis of Variance

MANOVA in R

Visualizing the MANOVA results

After the analysis

Chapter 14: Regression: Linear, Multiple, and the General Linear Model

The Plot of Scatter

Graphing Lines

Regression: What a Line!

Using regression for forecasting

Variation around the regression line

Testing hypotheses about regression

Linear Regression in R

Features of the linear model

Making predictions

Visualizing the scatter plot and regression line

Plotting the residuals

Juggling Many Relationships at Once: Multiple Regression

Multiple regression in R

Making predictions

Visualizing the 3D scatter plot and regression plane

ANOVA: Another Look

Analysis of Covariance: The Final Component of the GLM

But wait — there’s more

Chapter 15: Correlation: The Rise and Fall of Relationships

Scatter plots Again

Understanding Correlation

Correlation and Regression

Testing Hypotheses About Correlation

Is a correlation coefficient greater than zero?

Do two correlation coefficients differ?

Correlation in R

Calculating a correlation coefficient

Testing a correlation coefficient

Testing the difference between two correlation coefficients

Calculating a correlation matrix

Visualizing correlation matrices

Multiple Correlation

Multiple correlation in R

Adjusting R-squared

Partial Correlation

Partial Correlation in R

Semipartial Correlation

Semipartial Correlation in R

Chapter 16: Curvilinear Regression: When Relationships Get Complicated

What Is a Logarithm?

What Is e?

Power Regression

Exponential Regression

Logarithmic Regression

Polynomial Regression: A Higher Power

Which Model Should You Use?

Part 4: Working with Probability

Chapter 17: Introducing Probability

What Is Probability?

Experiments, trials, events, and sample spaces

Sample spaces and probability

Compound Events

Union and intersection

Intersection again

Conditional Probability

Working with the probabilities

The foundation of hypothesis testing

Large Sample Spaces

Permutations

Combinations

R Functions for Counting Rules

Random Variables: Discrete and Continuous

Probability Distributions and Density Functions

The Binomial Distribution

The Binomial and Negative Binomial in R

Binomial distribution

Negative binomial distribution

Hypothesis Testing with the Binomial Distribution

More on Hypothesis Testing: R versus Tradition

Chapter 18: Introducing Modeling

Modeling a Distribution

Plunging into the Poisson distribution

Modeling with the Poisson distribution

Testing the model’s fit

A word about chisq.test()

Playing ball with a model

A Simulating Discussion

Taking a chance: The Monte Carlo method

Loading the dice

Simulating the central limit theorem

Part 5: The Part of Tens

Chapter 19: Ten Tips for Excel Emigrés

Defining a Vector in R Is Like Naming a Range in Excel

Operating on Vectors Is Like Operating on Named Ranges

Sometimes Statistical Functions Work the Same Way

And Sometimes They Don’t

Contrast: Excel and R Work with Different Data Formats

Distribution Functions Are (Somewhat) Similar

A Data Frame Is (Something) Like a Multicolumn Named Range

The sapply() Function Is Like Dragging

Using edit() Is (Almost) Like Editing a Spreadsheet

Use the Clipboard to Import a Table from Excel into R

Chapter 20: Ten Valuable Online R Resources

Websites for R Users

R-bloggers

Microsoft R Application Network

Quick-R

RStudio Online Learning

Stack Overflow

Online Books and Documentation

R manuals

R documentation

RDocumentation

YOU CANalytics

The R Journal

Index
