Kernel equating (KE) is applied to the four major equating designs and to both chain equating and post-stratification equating for the non-equivalent groups with anchor test design. The book will be an important reference for several groups: (a) statisticians, (b) practitioners, and (c) instructors in psychometric and measurement programs. The authors assume some familiarity with linear and equipercentile test equating, and with matrix algebra.
In recent years, many researchers in the psychology and statistics communities have paid increasing attention to test equating, both because issues of using multiple test forms have arisen and in response to criticisms of traditional testing techniques. This book provides a practically oriented introduction to test equating which both discusses the most frequently used equating methodologies and covers many of the practical issues involved. The main themes are: the purpose of equating; distinguishing between equating and related methodologies; the importance of test equating to test development and quality control; the differences between equating properties, equating designs, and equating methods; and equating error and the underlying statistical assumptions for equating. The authors are acknowledged experts in the field, and the book is based on numerous courses and seminars they have presented. As a result, educators, psychometricians, professionals in measurement, statisticians, and students coming to the subject for the first time as part of their graduate study will find this an invaluable text and reference.
This book describes how to use test equating methods in practice. The non-commercial software R is used throughout the book to illustrate how to perform different equating methods when score data are collected under different data collection designs, such as the equivalent groups design, single group design, counterbalanced design and non-equivalent groups with anchor test design. The R packages equate, kequate and SNSequate, among others, are used to illustrate the different methods in practice, while simulated and real data sets show how the methods are carried out in R. The book covers traditional equating methods, including mean and linear equating, frequency estimation equating and chain equating, as well as modern equating methods such as kernel equating, local equating and combinations of these. It also offers chapters on observed-score and true-score item response theory equating and discusses recent developments within the equating field. More specifically, it covers the issue of including covariates within the equating process, the use of different kernels and ways of selecting bandwidths in kernel equating, and the Bayesian nonparametric estimation of equating functions. It also illustrates how to evaluate equating in practice using simulation and different equating-specific measures such as the standard error of equating, percent relative error, difference that matters and others.
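As a rough illustration of the mean and linear equating methods named in this description, the following is a minimal sketch for an equivalent groups design. It is written in plain Python rather than with the R packages the book uses, and the function names and the small score vectors are hypothetical, chosen only for illustration.

```python
from statistics import mean, pstdev


def mean_equate(x, scores_x, scores_y):
    """Mean equating: shift a form X score by the difference in form means."""
    return x - mean(scores_x) + mean(scores_y)


def linear_equate(x, scores_x, scores_y):
    """Linear equating: match the mean and standard deviation of form Y.

    Assumes both groups are samples from the same population
    (equivalent groups design) and that form X scores are not all equal.
    """
    mu_x, mu_y = mean(scores_x), mean(scores_y)
    sd_x, sd_y = pstdev(scores_x), pstdev(scores_y)
    return mu_y + (sd_y / sd_x) * (x - mu_x)


# Hypothetical observed scores on two forms (illustration only).
form_x = [10, 12, 14, 16, 18]
form_y = [13, 16, 19, 22, 25]

print(mean_equate(16, form_x, form_y))
print(linear_equate(16, form_x, form_y))
```

With these made-up scores, form Y has a mean 5 points higher and a standard deviation 1.5 times larger than form X, so mean equating maps a score of 16 to 21 while linear equating maps it to about 22.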
This book provides an introduction to test equating, scaling and linking, including those concepts and practical issues that are critical for developers and all other testing professionals. In addition to statistical procedures, successful equating, scaling and linking involves many aspects of testing, including procedures to develop tests, to administer and score tests and to interpret scores earned on tests. Test equating methods are used with many standardized tests in education and psychology to ensure that scores from multiple test forms can be used interchangeably. Test scaling is the process of developing score scales that are used when scores on standardized tests are reported. In test linking, scores from two or more tests are related to one another. Linking has received much recent attention, due largely to investigations of linking similarly named tests from different test publishers or tests constructed for different purposes. In recent years, researchers from the education, psychology and statistics communities have contributed to the rapidly growing statistical and psychometric methodologies used in test equating, scaling and linking. In addition to the literature covered in previous editions, this new edition presents coverage of significant recent research. In order to assist researchers, advanced graduate students and testing professionals, examples are used frequently and conceptual issues are stressed. New material includes model determination in log-linear smoothing, in-depth presentation of chained linear and equipercentile equating, equating criteria, test scoring and a new section on scores for mixed-format tests. 
In the third edition, each chapter contains a reference list, rather than having a single reference list at the end of the volume. The themes of the third edition include: the purposes of equating, scaling and linking and their practical context; data collection designs; statistical methodology; designing reasonable and useful equating, scaling, and linking studies; the importance of test development and quality control processes to equating; and equating error and the underlying statistical assumptions for equating.
Generalized Kernel Equating is a comprehensive guide for statisticians, psychometricians, and educational researchers aiming to master test score equating. This book introduces the Generalized Kernel Equating (GKE) framework, providing the necessary tools and methodologies for accurate and fair score comparisons. The book presents test score equating as a statistical problem and covers all commonly used data collection designs. It details the five steps of the GKE framework: presmoothing, estimating score probabilities, continuization, equating transformation, and evaluating the equating transformation. Various presmoothing strategies are explored, including log-linear models, item response theory (IRT) models, beta4 models, and discrete kernel estimators. The estimation of score probabilities when using IRT models is described, and Gaussian kernel continuization is extended to other kernels such as uniform, logistic, Epanechnikov and adaptive kernels. Several bandwidth selection methods are described. The kernel equating transformation and variants of it are defined, and both equating-specific and statistical measures for evaluating equating transformations are included. Real data examples are provided that guide readers through the GKE steps with detailed R code and explanations. Readers are equipped with advanced knowledge and practical skills for implementing test score equating methods.
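The continuization and equating transformation steps listed above can be sketched as follows, using the standard Gaussian kernel form of the continuized distribution. This is a minimal plain-Python sketch under stated assumptions, not the book's R code: the function names, the fixed bandwidth of 0.6, and the score probabilities are all hypothetical. Presmoothing, bandwidth selection and evaluation are omitted, so the score probabilities are taken as given.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def continuized_cdf(x, points, probs, h):
    """Gaussian kernel continuization of a discrete score distribution.

    The shrinkage factor a keeps the mean and variance of the continuized
    distribution equal to those of the discrete one. Assumes variance > 0.
    """
    mu = sum(p * s for s, p in zip(points, probs))
    var = sum(p * (s - mu) ** 2 for s, p in zip(points, probs))
    a = sqrt(var / (var + h * h))
    return sum(
        p * normal_cdf((x - a * s - (1.0 - a) * mu) / (a * h))
        for s, p in zip(points, probs)
    )


def kernel_equate(x, x_points, r, y_points, s, h_x=0.6, h_y=0.6):
    """Equate x to the Y scale by inverting the continuized CDF of Y at F(x).

    The bandwidths h_x and h_y are fixed, hypothetical values here; in
    practice they would come from a bandwidth selection method.
    """
    target = continuized_cdf(x, x_points, r, h_x)
    span = max(y_points) - min(y_points) + 10.0 * h_y
    lo, hi = min(y_points) - span, max(y_points) + span
    for _ in range(200):  # bisection on the monotone continuized CDF of Y
        mid = 0.5 * (lo + hi)
        if continuized_cdf(mid, y_points, s, h_y) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical score probabilities; form Y is form X shifted up by 2 points.
x_points = [0, 1, 2, 3, 4]
y_points = [2, 3, 4, 5, 6]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]

print(kernel_equate(1.5, x_points, probs, y_points, probs))
```

Because the two made-up distributions differ only by a shift of 2 points and share a bandwidth, the equated value of 1.5 should be very close to 3.5, which gives a simple sanity check on the inversion.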
The goal of this book is to emphasize the formal statistical features of the practice of equating, linking, and scaling. The book views and discusses the quality of equating results from a statistical perspective (new models, robustness, fit, testing hypotheses, statistical monitoring) rather than focusing on policy and its implications, which, although very important, represent a different side of equating practice. The book contributes to establishing “equating” as a theoretical field, a view that has not often been offered before. The tradition in the practice of equating has been to present the knowledge and skills needed as a craft, which implies that the required skills could be acquired only through years of experience under the guidance of a knowledgeable practitioner. This book challenges this view by indicating how a good equating framework, a sound understanding of the assumptions that underlie the psychometric models, and the use of statistical tests and statistical process control tools can help the practitioner navigate the difficult decisions in choosing the final equating function. This book provides a valuable reference for several groups: (a) statisticians and psychometricians interested in the theory behind equating methods, in the use of model-based statistical methods for data smoothing, and in the evaluation of the equating results in applied work; (b) practitioners who need to equate tests, including those with these responsibilities in testing companies, state testing agencies, and school districts; and (c) instructors in psychometric, measurement, and psychology programs.
This proceedings book highlights the latest research and developments in psychometrics and statistics. Featuring contributions presented at the 82nd Annual Meeting of the Psychometric Society (IMPS), organized by the University of Zurich and held in Zurich, Switzerland from July 17 to 21, 2017, its 34 chapters address a diverse range of psychometric topics including item response theory, factor analysis, causal inference, Bayesian statistics, test equating, cognitive diagnostic models and multistage adaptive testing. The IMPS is one of the largest international meetings on quantitative measurement in psychology, education and the social sciences, attracting over 500 participants and 250 paper presentations from around the world every year. This book gathers the contributions of selected presenters, which were subsequently expanded and peer-reviewed.