An Introduction to Contemporary Educational Testing And Measurement | p. 1 |
Tests Are Only Tools: Their Usefulness Can Vary | p. 1 |
Why We Developed This Text: Enhancing Test Usefulness | p. 2 |
Technical Adequacy | p. 2 |
Test User Competency | p. 3 |
Matching the Test's Intended Purpose | p. 3 |
Matching Diverse Test-Takers to the Test | p. 5 |
Test Results and Diversity Considerations | p. 6 |
Tests Are Only Tools: A Video Beats a Photo | p. 6 |
Defining Some Test-Related Terms | p. 8 |
Tests, Assessments, and the Assessment Process | p. 8 |
Types of Tests/Assessments | p. 10 |
Recent Developments: Impact on Classroom Testing and Measurement | p. 13 |
Education Reform Meets Special Education Reform: NCLB and IDEIA | p. 14 |
The Impact on Regular Education Teachers of the IDEIA and NCLB | p. 15 |
Other Trends: Technology, Globalization, and International Competitiveness | p. 16 |
Competency Testing for Teachers | p. 17 |
Increased Interest from Professional Groups | p. 17 |
A Professional Association-Book Publisher Information Initiative | p. 18 |
Effects on the Classroom Teacher | p. 19 |
About the Text | p. 21 |
What if You're "No Good in Math" | p. 22 |
Summary | p. 22 |
For Discussion | p. 23 |
High-Stakes Testing | p. 25 |
Comparing NCLB and State High-Stakes Testing Programs | p. 25 |
High-Stakes Testing: A Nationwide Phenomenon | p. 27 |
High-Stakes Tests Are Only Tools | p. 28 |
Why Does High-Stakes Testing Matter? | p. 29 |
Promotion and Graduation Decisions Affect Students | p. 30 |
Principal and Teacher Incentives Are Linked to HST Performance | p. 32 |
Property Values, Business Decisions, and Politics and HST | p. 32 |
The Lake Wobegon Effect and HST | p. 32 |
The History of High-Stakes Testing | p. 33 |
Education Reform | p. 33 |
Standards-Based Reform | p. 33 |
Types of High-Stakes Tests | p. 36 |
Criterion-Referenced High-Stakes Tests | p. 36 |
Norm-Referenced High-Stakes Tests | p. 41 |
Benchmark Tests and High-Stakes Tests | p. 41 |
The High-Stakes Testing Backlash | p. 42 |
Is There Really a High-Stakes Testing Backlash? | p. 44 |
What Do National Organizations Say About High-Stakes Tests? | p. 45 |
AERA's Twelve Conditions for HST Programs | p. 46 |
How Can a Teacher Use the Twelve Conditions? | p. 48 |
Helping Students (and Yourself) Prepare for High-Stakes Tests | p. 49 |
Focus on the Task, Not Your Feelings About It | p. 49 |
Inform Students and Parents About the Importance of the Test | p. 50 |
Teach Test-Taking Skills as Part of Regular Instruction | p. 51 |
As the Test Day Approaches, Respond to Student Questions Openly and Directly | p. 53 |
Take Advantage of Whatever Preparation Materials Are Available | p. 53 |
Summary | p. 53 |
For Discussion | p. 55 |
Response-to-Intervention (RTI) and the Regular Classroom Teacher | p. 56 |
What Is RTI? | p. 56 |
What if You Have Not Heard of RTI Before? | p. 57 |
How New Is RTI? | p. 57 |
Do Regular Education Teachers Need to Know About RTI? | p. 57 |
An RTI Scenario | p. 58 |
How Important Is RTI to Regular Education Teachers? | p. 60 |
Can a Special Education Law Reform Regular Education? | p. 61 |
How Is RTI Supposed to Help Students and Schools? | p. 61 |
RTI Definitions, Components, and Implementation Approaches | p. 62 |
RTI Definitions | p. 62 |
RTI Components | p. 63 |
RTI Implementation Approaches | p. 68 |
How Widely Is RTI Being Implemented? | p. 71 |
Some Benefits of RTI | p. 72 |
RTI: The Promise and Some Controversies | p. 72 |
Technical Issues: Reliability, Validity, and Fairness | p. 72 |
Implementation Issues | p. 73 |
The Purpose Of Testing | p. 76 |
Testing, Accountability, and the Classroom Teacher | p. 77 |
Types of Educational Decisions | p. 79 |
A Pinch of Salt | p. 82 |
"Pinching" in the Classroom | p. 83 |
What to Measure | p. 84 |
How to Measure | p. 85 |
Written Tests | p. 86 |
Summary | p. 87 |
For Discussion | p. 87 |
Norm-Referenced and Criterion-Referenced Tests and Content Validity Evidence | p. 89 |
Defining Norm-Referenced and Criterion-Referenced Tests | p. 89 |
Comparing Norm-Referenced and Criterion-Referenced Tests | p. 93 |
Differences in the Construction of Norm-Referenced and Criterion-Referenced Tests | p. 94 |
Norm- and Criterion-Referenced Tests and Linguistic and Cultural Diversity | p. 95 |
Norm- and Criterion-Referenced Tests and Validity Evidence | p. 97 |
A Three-Stage Model of Classroom Measurement | p. 98 |
Why Objectives? Why Not Just Write Test Items? | p. 100 |
Where Do Goals Come From? | p. 101 |
Are There Different Kinds of Goals and Objectives? | p. 102 |
How Can Instructional Objectives Make a Teacher's Job Easier? | p. 106 |
Summary | p. 107 |
For Discussion | p. 108 |
Measuring Learning Outcomes | p. 110 |
Writing Instructional Objectives | p. 110 |
Identifying Learning Outcomes | p. 110 |
Identifying Observable and Directly Measurable Learning Outcomes | p. 111 |
Stating Conditions | p. 112 |
Stating Criterion Levels | p. 113 |
Keeping It Simple and Straightforward | p. 114 |
Matching Test Items to Instructional Objectives | p. 115 |
Taxonomy of Educational Objectives | p. 117 |
Cognitive Domain | p. 117 |
Affective Domain | p. 120 |
The Psychomotor Domain | p. 123 |
The Test Blueprint | p. 123 |
Content Outline | p. 125 |
Categories | p. 126 |
Number of Items | p. 126 |
Functions | p. 126 |
Summary | p. 128 |
For Practice | p. 128 |
Writing Objective Test Items | p. 130 |
Which Format? | p. 130 |
True-False Items | p. 132 |
Suggestions for Writing True-False Items | p. 134 |
Matching Items | p. 135 |
Faults Inherent in Matching Items | p. 135 |
Suggestions for Writing Matching Items | p. 138 |
Multiple-Choice Items | p. 139 |
Higher-Level Multiple-Choice Questions | p. 144 |
Suggestions for Writing Multiple-Choice Items | p. 147 |
Completion Items | p. 148 |
Suggestions for Writing Completion Items | p. 151 |
Gender and Racial Bias in Test Items | p. 151 |
Guidelines for Writing Test Items | p. 152 |
Advantages and Disadvantages of Different Objective Item Formats | p. 153 |
Summary | p. 155 |
For Practice | p. 156 |
Writing Essay Test Items | p. 157 |
What Is an Essay Item? | p. 158 |
Essay Items Should Measure Complex Cognitive Skills or Processes | p. 158 |
Essay Items: Extended or Restricted Response | p. 159 |
Examples of Restricted Response Essays | p. 161 |
Pros and Cons of Essay Items | p. 162 |
Advantages of the Essay Item | p. 163 |
Disadvantages of the Essay Item | p. 163 |
Suggestions for Writing Essay Items | p. 164 |
Scoring Essay Questions | p. 166 |
Scoring Extended Response and Higher-Level Questions | p. 168 |
General Essay Scoring Suggestions | p. 172 |
Assessing Knowledge Organization | p. 172 |
Open-Book Questions and Exams | p. 175 |
Some Open-Book Techniques | p. 178 |
Guidelines for Planning Essays, Knowledge Organization, and Open-Book Questions and Exams | p. 182 |
Summary | p. 183 |
For Practice | p. 184 |
Performance-Based Assessment | p. 185 |
Performance Tests: Direct Measures of Competence | p. 185 |
Performance Tests Can Assess Processes and Products | p. 186 |
Performance Tests Can Be Embedded in Lessons | p. 186 |
Performance Tests Can Assess Affective and Social Skills | p. 188 |
Developing Performance Tests for Your Learners | p. 189 |
Deciding What to Test | p. 190 |
Designing the Assessment Context | p. 192 |
Specifying the Scoring Rubrics | p. 195 |
Specifying Testing Constraints | p. 201 |
A Final Word | p. 202 |
Summary | p. 202 |
For Discussion and Practice | p. 203 |
Portfolio Assessment | p. 205 |
Ensuring Validity of the Portfolio | p. 206 |
Developing Portfolio Assessments | p. 207 |
Deciding on the Purposes for a Portfolio | p. 207 |
Identifying Cognitive Skills and Dispositions | p. 208 |
Deciding Who Will Plan the Portfolio | p. 208 |
Deciding Which Products to Put in the Portfolio and How Many Samples of Each Product | p. 208 |
Building the Portfolio Rubrics | p. 209 |
Developing a Procedure to Aggregate All Portfolio Ratings | p. 214 |
Determining the Logistics | p. 217 |
Summary | p. 220 |
For Practice | p. 221 |
Administering, Analyzing, And Improving The Test Or Assessment | p. 222 |
Assembling the Test | p. 222 |
Packaging the Test | p. 223 |
Reproducing the Test | p. 225 |
Administering the Test | p. 225 |
Scoring the Test | p. 227 |
Analyzing the Test | p. 227 |
Quantitative Item Analysis | p. 228 |
Qualitative Item Analysis | p. 234 |
Item Analysis Modifications for the Criterion-Referenced Test | p. 235 |
Debriefing | p. 240 |
Debriefing Guidelines | p. 241 |
The Process of Evaluating Classroom Achievement | p. 242 |
Summary | p. 243 |
For Practice | p. 245 |
Marks And Marking Systems | p. 246 |
What Is the Purpose of a Mark? | p. 246 |
Why Be Concerned About Marking? | p. 246 |
What Should a Mark Reflect? | p. 247 |
Marking Systems | p. 248 |
Types of Comparisons | p. 248 |
Types of Symbols | p. 253 |
Combining and Weighting the Components of a Mark | p. 254 |
Who Is the Better Teacher? | p. 255 |
Combining Grades into a Single Mark | p. 256 |
Practical Approaches to Equating Before Weighting in the Busy Classroom | p. 259 |
Front-End Equating | p. 260 |
Back-End Equating | p. 260 |
Summary | p. 263 |
For Practice | p. 264 |
Summarizing Data And Measures Of Central Tendency | p. 265 |
What Are Statistics? | p. 265 |
Why Use Statistics? | p. 266 |
Tabulating Frequency Data | p. 267 |
The List | p. 267 |
The Simple Frequency Distribution | p. 268 |
The Grouped Frequency Distribution | p. 268 |
Steps in Constructing a Grouped Frequency Distribution | p. 270 |
Graphing Data | p. 273 |
The Bar Graph, or Histogram | p. 274 |
The Frequency Polygon | p. 274 |
The Smooth Curve | p. 276 |
Measures of Central Tendency | p. 280 |
The Mean | p. 281 |
The Median | p. 282 |
The Mode | p. 287 |
The Measures of Central Tendency in Various Distributions | p. 289 |
Summary | p. 290 |
For Practice | p. 292 |
Variability, The Normal Distribution, And Converted Scores | p. 293 |
The Range | p. 293 |
The Semi-Interquartile Range (SIQR) | p. 294 |
The Standard Deviation | p. 295 |
The Deviation Score Method for Computing the Standard Deviation | p. 299 |
The Raw Score Method for Computing the Standard Deviation | p. 300 |
The Normal Distribution | p. 302 |
Properties of the Normal Distribution | p. 303 |
Converted Scores | p. 307 |
z-Scores | p. 309 |
T-Scores | p. 314 |
Summary | p. 315 |
For Practice | p. 315 |
Correlation | p. 317 |
The Correlation Coefficient | p. 318 |
Strength of a Correlation | p. 319 |
Direction of a Correlation | p. 319 |
Scatterplots | p. 320 |
Where Does r Come From? | p. 322 |
Causality | p. 323 |
Other Interpretive Cautions | p. 325 |
Summary | p. 327 |
For Practice | p. 328 |
Validity Evidence | p. 329 |
Why Evaluate Tests? | p. 329 |
Types of Validity Evidence | p. 329 |
Content Validity Evidence | p. 330 |
Criterion-Related Validity Evidence | p. 330 |
Construct Validity Evidence | p. 332 |
What Have We Been Saying? A Review | p. 333 |
Interpreting Validity Coefficients | p. 334 |
Content Validity Evidence | p. 334 |
Concurrent and Predictive Validity Evidence | p. 334 |
Summary | p. 339 |
For Practice | p. 340 |
Reliability | p. 341 |
Methods of Estimating Reliability | p. 341 |
Test-Retest or Stability | p. 341 |
Alternate Forms or Equivalence | p. 343 |
Internal Consistency | p. 343 |
Interpreting Reliability Coefficients | p. 346 |
Summary | p. 349 |
For Practice | p. 350 |
Error: What Is It? | p. 351 |
The Standard Error of Measurement | p. 353 |
Using the Standard Error of Measurement | p. 354 |
More Applications | p. 357 |
Standard Deviation or Standard Error of Measurement? | p. 359 |
Why All the Fuss About Error? | p. 360 |
Error Within Test-Takers | p. 360 |
Error Within the Test | p. 360 |
Error in Test Administration | p. 361 |
Error in Scoring | p. 361 |
Sources of Error Influencing Various Reliability Coefficients | p. 362 |
Test-Retest | p. 362 |
Alternate Forms | p. 362 |
Internal Consistency | p. 363 |
Band Interpretation | p. 364 |
Steps: Band Interpretation | p. 365 |
A Final Word | p. 369 |
Summary | p. 369 |
For Practice | p. 371 |
Standardized Tests | p. 372 |
What Is a Standardized Test? | p. 373 |
Do Test Stimuli, Administration, and Scoring Have to Be Standardized? | p. 374 |
Standardized Testing: Effects of Accommodations and Alternative Assessments | p. 374 |
Uses of Standardized Achievement Tests | p. 376 |
Will Performance and Portfolio Assessment Make Standardized Tests Obsolete? | p. 377 |
Administering Standardized Tests | p. 377 |
Types of Scores Offered for Standardized Achievement Tests | p. 379 |
Grade Equivalents | p. 379 |
Age Equivalents | p. 380 |
Percentile Ranks | p. 381 |
Standard Scores | p. 382 |
Interpreting Standardized Tests: Test and Student Factors | p. 384 |
Test-Related Factors | p. 384 |
Student-Related Factors | p. 390 |
Aptitude-Achievement Discrepancies | p. 395 |
Interpreting Standardized Tests: Parent-Teacher Conferences and Educational Decision Making | p. 398 |
An Example: Pressure to Change an Educational Placement | p. 399 |
A Second Example: Pressure from the Opposite Direction | p. 404 |
Interpreting Standardized Tests: Score Reports from Publishers | p. 407 |
The Press-On Label | p. 407 |
A Criterion-Referenced Skills Analysis or Mastery Report | p. 408 |
An Individual Performance Profile | p. 412 |
Other Publisher Reports and Services | p. 412 |
Summary | p. 413 |
For Practice | p. 415 |
Types of Standardized Tests | p. 417 |
Standardized Achievement Tests | p. 417 |
Achievement Test Batteries, or Survey Batteries | p. 418 |
Single-Subject Achievement Tests | p. 419 |
Diagnostic Achievement Tests | p. 420 |
Standardized Academic Aptitude Tests | p. 420 |
The History of Academic Aptitude Testing | p. 420 |
Stability of IQ Scores | p. 421 |
What Do IQ Tests Predict? | p. 422 |
Individually Administered Academic Aptitude Tests | p. 423 |
Group-Administered Academic Aptitude Tests | p. 424 |
Standardized Personality Assessment Instruments | p. 425 |
What Is Personality? | p. 425 |
Objective Personality Tests | p. 426 |
Projective Personality Tests | p. 427 |
Summary | p. 428 |
For Discussion | p. 429 |
In The Classroom: A Summary Dialogue | p. 430 |
High-Stakes Testing and NCLB | p. 435 |
Response-to-Intervention (RTI) | p. 436 |
Criterion-Referenced Versus Norm-Referenced Tests | p. 436 |
New Responsibilities for Teachers Under IDEIA | p. 437 |
Instructional Objectives | p. 437 |
The Test Blueprint | p. 438 |
Essay Items and the Essay Scoring Guides | p. 438 |
Reliability, Validity Evidence, and Test Statistics | p. 439 |
Grades And Marks | p. 441 |
Some Final Thoughts | p. 441 |
Math Skills Review | p. 443 |
Preparing For The Praxis II: Principles Of Learning And Teaching Assessment | p. 450 |
Determining The Median When There Are Multiple Tied Middle Scores | p. 460 |
Pearson Product-Moment Correlation | p. 462 |
Statistics And Measurement Texts | p. 464 |
Answers for Practice Questions | p. 465 |
Suggested Readings | p. 471 |
References | p. 475 |
Credits | p. 481 |
Index | p. 483 |
Table of Contents provided by Ingram. All Rights Reserved. |