Kernel equating (KE) is applied to the four major equating designs and to both chain equating and post-stratification equating for the non-equivalent groups with anchor test design. It will be an important reference for several groups: (a) statisticians, (b) practitioners, and (c) instructors in psychometric and measurement programs. The authors assume some familiarity with linear and equipercentile test equating, and with matrix algebra.
A must-have resource for researchers, practitioners, and advanced students interested or involved in psychometric testing. Over the past hundred years, psychometric testing has proved to be a valuable tool for measuring personality, mental ability, attitudes, and much more. The word ‘psychometrics’ can be translated as ‘mental measurement’; however, the implication that psychometrics as a field is confined to psychology is highly misleading. Scientists and practitioners from virtually every conceivable discipline now use and analyze data collected from questionnaires, scales, and tests developed from psychometric principles, and the field is vibrant with new and useful methods and approaches. This handbook brings together contributions from leading psychometricians in a diverse array of fields around the globe. Each contributor provides accessible and practical information about their specialist area in a three-step format covering historical and standard approaches, innovative issues and techniques, and practical guidance on how to apply the methods discussed. Throughout, real-world examples help to illustrate and clarify key aspects of the topics covered. The aim is to fill a gap for information about psychometric testing that is neither too basic nor too technical and specialized, and that will enable researchers, practitioners, and graduate students to expand their knowledge and skills in the area. The handbook provides comprehensive coverage of the field of psychometric testing, from designing a test through writing items to constructing and evaluating scales. It takes a practical approach, addressing real issues faced by practitioners and researchers. It provides basic and accessible mathematical and statistical foundations of all psychometric techniques discussed. It provides example software code to help readers implement the analyses discussed.
This proceedings book highlights the latest research and developments in psychometrics and statistics. Featuring contributions presented at the 82nd Annual Meeting of the Psychometric Society (IMPS), organized by the University of Zurich and held in Zurich, Switzerland from July 17 to 21, 2017, its 34 chapters address a diverse range of psychometric topics including item response theory, factor analysis, causal inference, Bayesian statistics, test equating, cognitive diagnostic models and multistage adaptive testing. The IMPS is one of the largest international meetings on quantitative measurement in psychology, education and the social sciences, attracting over 500 participants and 250 paper presentations from around the world every year. This book gathers the contributions of selected presenters, which were subsequently expanded and peer-reviewed.
Generalized Kernel Equating is a comprehensive guide for statisticians, psychometricians, and educational researchers aiming to master test score equating. This book introduces the Generalized Kernel Equating (GKE) framework, providing the necessary tools and methodologies for accurate and fair score comparisons. The book presents test score equating as a statistical problem and covers all commonly used data collection designs. It details the five steps of the GKE framework: presmoothing, estimating score probabilities, continuization, equating transformation, and evaluating the equating transformation. Various presmoothing strategies are explored, including log-linear models, item response theory (IRT) models, beta4 models, and discrete kernel estimators. The estimation of score probabilities when using IRT models is described, and Gaussian kernel continuization is extended to other kernels such as uniform, logistic, Epanechnikov, and adaptive kernels. Several bandwidth selection methods are described. The kernel equating transformation and variants of it are defined, and both equating-specific and statistical measures for evaluating equating transformations are included. Real data examples are provided, guiding readers through the GKE steps with detailed R code and explanations. Readers are equipped with advanced knowledge and practical skills for implementing test score equating methods.
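To illustrate the continuization and equating-transformation steps named above, here is a minimal Python sketch. It is not code from the book, which uses R. It assumes the score probabilities have already been estimated (presmoothing is skipped) and uses the standard Gaussian-kernel form, in which the continuized CDF is F_h(x) = sum_j r_j * Phi((x - a*x_j - (1 - a)*mu) / (a*h)) with a = sqrt(var / (var + h^2)), and the equated score is G_h^{-1}(F_h(x)). The function names, the example probabilities, and the bandwidth value 0.6 are all hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kernel_cdf(x, scores, probs, h):
    """Gaussian-kernel continuized CDF of a discrete score distribution."""
    mu = sum(p * s for s, p in zip(scores, probs))
    var = sum(p * (s - mu) ** 2 for s, p in zip(scores, probs))
    # The factor a keeps the mean and variance of the continuized
    # distribution equal to those of the discrete one.
    a = math.sqrt(var / (var + h * h))
    return sum(p * norm_cdf((x - a * s - (1.0 - a) * mu) / (a * h))
               for s, p in zip(scores, probs))

def kernel_equate(x, scores_x, probs_x, scores_y, probs_y, h_x, h_y, tol=1e-10):
    """Map score x on form X to the form Y scale by inverting G with bisection."""
    target = kernel_cdf(x, scores_x, probs_x, h_x)
    lo = min(scores_y) - 10.0 * (h_y + 1.0)
    hi = max(scores_y) + 10.0 * (h_y + 1.0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, scores_y, probs_y, h_y) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical score probabilities on a five-point scale (assumed already smoothed).
scores_x = [0, 1, 2, 3, 4]
probs_x = [0.1, 0.2, 0.4, 0.2, 0.1]
# Hypothetical form Y: same shape, every score shifted up by 3 points.
scores_y = [3, 4, 5, 6, 7]
probs_y = probs_x

print(kernel_equate(1.5, scores_x, probs_x, scores_y, probs_y, h_x=0.6, h_y=0.6))
```

Because form Y in this toy example is form X shifted by three points, the equated value of 1.5 should be very close to 4.5. In practice the bandwidths would be chosen by one of the selection methods the book describes rather than fixed by hand.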
This book provides an introduction to test equating, scaling and linking, including those concepts and practical issues that are critical for developers and all other testing professionals. In addition to statistical procedures, successful equating, scaling and linking involves many aspects of testing, including procedures to develop tests, to administer and score tests and to interpret scores earned on tests. Test equating methods are used with many standardized tests in education and psychology to ensure that scores from multiple test forms can be used interchangeably. Test scaling is the process of developing score scales that are used when scores on standardized tests are reported. In test linking, scores from two or more tests are related to one another. Linking has received much recent attention, due largely to investigations of linking similarly named tests from different test publishers or tests constructed for different purposes. In recent years, researchers from the education, psychology and statistics communities have contributed to the rapidly growing statistical and psychometric methodologies used in test equating, scaling and linking. In addition to the literature covered in previous editions, this new edition presents coverage of significant recent research. In order to assist researchers, advanced graduate students and testing professionals, examples are used frequently and conceptual issues are stressed. New material includes model determination in log-linear smoothing, in-depth presentation of chained linear and equipercentile equating, equating criteria, test scoring and a new section on scores for mixed-format tests. 
In the third edition, each chapter contains a reference list, rather than having a single reference list at the end of the volume. The themes of the third edition include:
* the purposes of equating, scaling and linking and their practical context
* data collection designs
* statistical methodology
* designing reasonable and useful equating, scaling, and linking studies
* importance of test development and quality control processes to equating
* equating error, and the underlying statistical assumptions for equating
By providing an introduction to test equating that both discusses the most frequently used equating methodologies and covers many of the practical issues involved, this volume expands upon the coverage of the first edition by providing a new chapter on test scaling and a second on test linking.
This book is open access under a CC BY-NC 2.5 license. This book describes the extensive contributions made toward the advancement of human assessment by scientists from one of the world’s leading research institutions, Educational Testing Service (ETS). The book’s four major sections detail research and development in measurement and statistics, education policy analysis and evaluation, scientific psychology, and validity. Many of the developments presented have become de facto standards in educational and psychological measurement, including in item response theory (IRT), linking and equating, differential item functioning (DIF), and educational surveys like the National Assessment of Educational Progress (NAEP), the Programme for International Student Assessment (PISA), the Progress in International Reading Literacy Study (PIRLS) and the Trends in International Mathematics and Science Study (TIMSS). In addition to its comprehensive coverage of contributions to the theory and methodology of educational and psychological measurement and statistics, the book gives significant attention to ETS work in cognitive, personality, developmental, and social psychology, and to education policy analysis and program evaluation. The chapter authors are long-standing experts who provide broad coverage and thoughtful insights that build upon decades of experience in research and best practices for measurement, evaluation, scientific psychology, and education policy analysis. Opening with a chapter on the genesis of ETS and closing with a synthesis of the enormously diverse set of contributions made over its 70-year history, the book is a useful resource for all interested in the improvement of human assessment.
This book describes how to use test equating methods in practice. The non-commercial software R is used throughout the book to illustrate how to perform different equating methods when score data are collected under different data collection designs, such as the equivalent groups design, single group design, counterbalanced design and non-equivalent groups with anchor test design. The R packages equate, kequate and SNSequate, among others, are used to illustrate the different methods in practice, while simulated and real data sets show how the methods are carried out in R. The book covers traditional equating methods, including mean and linear equating, frequency estimation equating and chain equating, as well as modern equating methods such as kernel equating, local equating and combinations of these. It also offers chapters on observed-score and true-score item response theory equating and discusses recent developments within the equating field. More specifically, it covers the issue of including covariates within the equating process, the use of different kernels and ways of selecting bandwidths in kernel equating, and the Bayesian nonparametric estimation of equating functions. It also illustrates how to evaluate equating in practice using simulation and different equating-specific measures such as the standard error of equating, percent relative error, difference that matters and others.
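As a brief illustration of the two simplest traditional methods mentioned above, the following Python sketch shows mean and linear equating under an equivalent groups design. It is not taken from the book, which works in R with packages such as equate. The function names and the score samples are hypothetical, and population standard deviations are assumed. Mean equating shifts a form X score by the difference in form means, while linear equating also rescales by the ratio of standard deviations: l_Y(x) = mu_Y + (sigma_Y / sigma_X) * (x - mu_X).

```python
from statistics import mean, pstdev

def mean_equate(x, scores_x, scores_y):
    """Mean equating: shift x by the difference between the form means."""
    return x - mean(scores_x) + mean(scores_y)

def linear_equate(x, scores_x, scores_y):
    """Linear equating: match both the mean and standard deviation of form Y."""
    mu_x, mu_y = mean(scores_x), mean(scores_y)
    sd_x, sd_y = pstdev(scores_x), pstdev(scores_y)
    return mu_y + (sd_y / sd_x) * (x - mu_x)

# Hypothetical observed scores from two equivalent groups.
form_x = [10, 12, 14, 16, 18]
form_y = [13, 16, 19, 22, 25]

print(mean_equate(14, form_x, form_y))
print(linear_equate(18, form_x, form_y))
```

With these made-up samples the form means are 14 and 19 and the standard deviation ratio is 1.5, so a form X score of 18 maps to about 25 on form Y under linear equating.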
This proceedings volume compiles and expands on selected and peer reviewed presentations given at the 81st Annual Meeting of the Psychometric Society (IMPS), organized by the University of North Carolina at Greensboro, and held in Asheville, North Carolina, July 11th to 17th, 2016. IMPS is one of the largest international meetings focusing on quantitative measurement in psychology, education, and the social sciences, both in terms of participants and number of presentations. The meeting built on the Psychometric Society's mission to share quantitative methods relevant to psychology, addressing a diverse set of psychometric topics including item response theory, factor analysis, structural equation modeling, time series analysis, mediation analysis, cognitive diagnostic models, and multi-level models. Selected presenters were invited to revise and expand their contributions and to have them peer reviewed and published in this proceedings volume. Previous volumes to showcase work from the Psychometric Society’s meetings are New Developments in Quantitative Psychology: Presentations from the 77th Annual Psychometric Society Meeting (Springer, 2013), Quantitative Psychology Research: The 78th Annual Meeting of the Psychometric Society (Springer, 2015), Quantitative Psychology Research: The 79th Annual Meeting of the Psychometric Society, Madison, Wisconsin, 2014 (Springer, 2015), and Quantitative Psychology Research: The 80th Annual Meeting of the Psychometric Society, Beijing, 2015 (Springer, 2016).