New Directions in Exam Development

Wendy Myers, Certification Program Manager

Testing and measurement are evolutionary processes. The ways in which abstract constructs and performances are quantified are ever-dependent upon our knowledge of these constructs and the technology with which they can be measured. Standardized testing has come quite a long way from its traceable origin, China's civil service exam, circa 1500 B.C.

The Project Management Professional Examination has gone through its own evolution since its premiere administration in 1984. After a decade of phenomenal growth, increased recognition, and technical refinement, the membership of PMI realized that it could no longer leave the maintenance and development of the examination in the hands of volunteers. As manager of the certification program I am honored to lead the way as PMI raises the PMP certification program to world-class standards. While we at PMI should be proud of our PMP examination and the countless hours of volunteer work that have sustained it thus far, there is much to be done.

What follows is a broad and by no means definitive listing of issues that must be addressed in any attempt to improve the certification effort. Before I begin my explanations, let me clarify what I mean by “addressed.” The testing community recognizes that no test can ever be perfect. A testing program is therefore not judged solely on the product it delivers, the exam itself; even greater emphasis is placed on the procedures it follows and the standards it adopts. When I speak of “addressing issues” I mean just this: committing PMI to standards and practices that will, in the long run, establish its certification program as second to none.

Validity and Reliability

The establishment of an ongoing validity and reliability study, the results of which are to be kept in an examination manual, available to any interested party.

Validity is the be-all and end-all of testing. If the inferences drawn from an exam are inappropriate and without basis, the exam itself is meaningless. Over the years the PMP Certification Examination has been subjected to validity checks. However, with the tremendous increase in the number of examinees and administrations both domestically and internationally, the validation process has mushroomed into what could be considered a full-time job. My goal is to implement an ongoing validation program that continually measures the construct, content, and criterion-related validity of the exam. This will assure ourselves, our employers, and the industry that the PMP Examination is measuring what it is intended to measure: project management professionalism. Additionally, as PMI membership continues to flourish overseas, a whole other dimension will be added to this validation process. At the same time we will need to examine the reliability of the examination. Is our test measuring project management professionalism consistently? Are our administration procedures standardized and consistent with industry-recognized standards? I have already set the wheels in motion on some of these activities, including the framework for our test manual. The open availability of this manual will assure those who may question our methods that we have nothing to hide and that we stand by our exam.

Test Item Bank

The development of a test item bank, which will be subject to continuous scrutiny for bias, effectiveness, and measurement precision.

The PMP Certification Examination is in desperate need of an item bank that can be continuously updated and refined. The current examination is composed of 320 items, some of which are better than others. The existing item bank contains a couple hundred more. An initial item analysis will differentiate between well- and poor-performing items. By shortening the exam to 240 items—240 of the best performing items—we can improve our exam and add to our item bank the 80 items that we remove. Remember, longer does not necessarily mean better. A clean, well-designed 240-item examination is preferable to a longer examination that is just, well, long. The item reserve can then be examined, reworded and, in effect, recycled. The support of volunteers will then be crucial in the ongoing development of new test items.
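To make the idea of an initial item analysis concrete, here is a minimal sketch in Python of two classical item statistics: difficulty (the proportion of examinees answering correctly) and discrimination (the correlation between an item and the score on the rest of the test). The function names, the 0/1 response layout, and the rule of ranking items purely by discrimination are assumptions made for illustration; they are not a description of the procedure PMI will actually follow.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def item_statistics(responses):
    """responses: one row per examinee, each row a list of 0/1 item scores."""
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    stats = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        # Correlate with the score on the *other* items so the item
        # does not inflate its own discrimination value.
        rest = [total - score for total, score in zip(totals, item)]
        stats.append({
            "item": j,
            "difficulty": sum(item) / len(item),
            "discrimination": pearson(item, rest),
        })
    return stats


def select_best_items(responses, keep):
    """Split item indices into (kept, reserve), keeping the `keep` most discriminating."""
    ranked = sorted(item_statistics(responses),
                    key=lambda s: s["discrimination"], reverse=True)
    kept = sorted(s["item"] for s in ranked[:keep])
    reserve = sorted(s["item"] for s in ranked[keep:])
    return kept, reserve
```

Applied to a 320-item form with `keep=240`, the reserve list would hold the 80 items that return to the bank for review and rewording. In practice a selection would also have to respect the content outline of the exam, which this sketch ignores.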

Over the course of the next year, the PMP Certification Exam will be streamlined. Effective items will replace those that provide little information and items will be examined for bias. Effective items, for our purposes, will be those that discriminate well between masters of the PMBOK and nonmasters. Research has shown that measurement precision increases when items are targeted to discriminate at the cut point (Hambleton, Swaminathan, and Rogers, 1991). Bias, in measurement theory, is often misunderstood. A better term for bias is differential item functioning (DIF). The accepted definition of DIF is that “an item shows DIF if individuals having the same ability, but from different groups, do not have the same probability of getting the item right” (Hambleton, Swaminathan, and Rogers, 1991, p. 110). In our analysis of items, we will check for DIF and employ an item sensitivity review, following guidelines provided by the Educational Testing Service.
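The definition of DIF quoted above can be turned into a simple check. The sketch below uses the Mantel-Haenszel procedure, one widely used way of screening items for DIF; the article does not say which statistic PMI will adopt, so treat the choice of method, the function name, and the record layout as assumptions. Examinees are matched on a score (for instance, total test score), and within each matched stratum the odds of answering the item correctly are compared between a reference group and a focal group.

```python
from math import log


def mantel_haenszel_dif(records):
    """Screen one item for DIF.

    records: iterable of (group, matched_score, correct), where group is
    "reference" or "focal" and correct is True/False for this item.
    Returns (alpha, delta) or None if the odds ratio is undefined.
      alpha: common odds ratio; 1.0 means no DIF on this measure.
      delta: -2.35 * ln(alpha), the delta-scale form often reported;
             negative values indicate the item favors the reference group.
    """
    strata = {}
    for group, score, correct in records:
        cell = strata.setdefault(score, {"reference": [0, 0], "focal": [0, 0]})
        cell[group][0 if correct else 1] += 1  # index 0 = right, 1 = wrong

    numerator = 0.0
    denominator = 0.0
    for cell in strata.values():
        ref_right, ref_wrong = cell["reference"]
        foc_right, foc_wrong = cell["focal"]
        n = ref_right + ref_wrong + foc_right + foc_wrong
        numerator += ref_right * foc_wrong / n
        denominator += ref_wrong * foc_right / n

    if numerator == 0 or denominator == 0:
        return None
    alpha = numerator / denominator
    return alpha, -2.35 * log(alpha)
```

An item flagged this way is a candidate for review rather than proof of bias; the statistical flag and the sensitivity review complement each other.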

Analysis Techniques and Exam Development

The use of state-of-the-art statistical techniques for analysis purposes and exam development.

The PMP examination has already been subjected to analysis based on what is referred to as classical test theory (CTT). These procedures are currently used by most, if not all, major testing companies. However, most measurement statisticians also utilize item response theory (IRT) in their analysis of test items and in the development of examinations. The advent of computer technology in the field of measurement has allowed test developers to employ stronger models in the analysis of test items. Before the era of supercomputers and Pentium processors, most item analysis was limited to the CTT procedures mentioned above: correlations between test items and test-takers, and between test items and the entire test. Item response theory, a “family of mathematical descriptions of what happens when an examinee meets an item” (Wainer, 1990, p. 67), has enabled measurement specialists to analyze items and examinees independently. Today, most leading testing companies use IRT in their analysis of both test items and examinee performance. PMI will soon be joining the ranks of the leaders in the testing field by using IRT to analyze the PMP Certification Exam. Additionally, statistical methods such as factor analysis and discriminant analysis will be used to analyze the exam in ways previously unexplored.
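As an illustration of what one member of that “family of mathematical descriptions” looks like, the sketch below implements the two-parameter logistic (2PL) model in Python. The choice of the 2PL model is an assumption; no decision on a model for the PMP exam is stated here. In this model, `theta` is the examinee's ability, `b` is the item's difficulty, and `a` is its discrimination. Some presentations also include a scaling constant of 1.7 in the exponent, which is omitted in this sketch.

```python
from math import exp


def prob_correct(theta, a, b):
    """2PL model: probability that an examinee of ability theta answers correctly."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Information the item contributes at ability theta: a^2 * P * (1 - P)."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)
```

Under this model an item is most informative where `theta` equals `b`, which is one way to see why items whose difficulty sits near the cut point add the most measurement precision there.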

Testing and Measurement Technology

A commitment to keep our examination in step with the current flow of testing and measurement technology, including but not limited to computer technology.

Just as the advent of computers has allowed for the practical application of IRT models, the use of IRT models has ushered in the development of adaptive testing technologies, generally in the form of computerized adaptive examinations. Adaptive testing is the new frontier in measurement. It is currently in use at such places as the Educational Testing Service. In a computerized adaptive testing situation (the typical environment), the examinee is first presented with an item known to be of average difficulty. The response of the examinee then determines the level of difficulty for the following item. Items are presented until the computer can accurately pinpoint the examinee's ability level. The computer is of course programmed to identify ability levels on a continuum. The beauty of the adaptive test is its measurement precision and its length. By using known information, namely the item difficulty, the computer program does not present items to examinees that they will undoubtedly pass or fail. The exam is extremely efficient. Of course, in order to have such tremendous faith in test items, they must undergo close scrutiny. Therefore adaptive testing, though tremendously “hot” today, will not be used at PMI until the time is right. There are a number of issues related to adaptive testing that even the major testing companies have not yet resolved, including the effects of guessing and the ability to skip items. In the meantime, we at PMI will analyze and develop our items and benefit from the lessons learned by the major testing firms.
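The adaptive loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of any operational system: it assumes a one-parameter (Rasch) model, estimates ability as a posterior mean over a fixed grid with a standard normal prior, picks the unused item whose difficulty is closest to the current estimate, and stops once the posterior standard deviation falls below a target. The names `run_adaptive_test`, `answer`, `se_target`, and `max_items` are hypothetical.

```python
from math import exp, sqrt

GRID = [g / 10.0 for g in range(-40, 41)]  # candidate abilities from -4.0 to 4.0


def rasch_p(theta, b):
    """Rasch model: probability of a correct answer to an item of difficulty b."""
    return 1.0 / (1.0 + exp(-(theta - b)))


def estimate_ability(answered):
    """answered: list of (difficulty, correct). Returns (posterior mean, posterior SD)."""
    weights = []
    for theta in GRID:
        weight = exp(-0.5 * theta * theta)  # standard normal prior (unnormalized)
        for b, correct in answered:
            p = rasch_p(theta, b)
            weight *= p if correct else (1.0 - p)
        weights.append(weight)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, sqrt(var)


def run_adaptive_test(difficulties, answer, se_target=0.4, max_items=30):
    """Administer items adaptively; `answer(b)` returns True if the item is answered correctly."""
    remaining = list(difficulties)
    answered = []
    theta, se = 0.0, 1.0  # start at average ability, so the first item is of average difficulty
    while remaining and len(answered) < max_items and se > se_target:
        b = min(remaining, key=lambda d: abs(d - theta))  # best-targeted unused item
        remaining.remove(b)
        answered.append((b, answer(b)))
        theta, se = estimate_ability(answered)
    return theta, se, len(answered)
```

The sketch makes no allowance for guessing or for skipped items, the very issues noted above as unresolved, and it depends entirely on the item difficulties being well estimated beforehand.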

Continual Improvement

A commitment by the body of members of PMI to the continual process of defining project management.

In order for the certification program to adequately represent project management professionalism, our examination must reflect that which is state-of-the-art in the industry. It is here that the participation of the membership is vital. As the manager of the certification program, I will need the support of the membership-at-large in pointing out areas that should be included in or excluded from the examination; in developing items for the exam that I can review, edit, and pilot in subsequent examination administrations; and, by all means, in pointing out to their employers, colleagues, and peers the lengths to which PMI is going to professionalize itself, its examination, and the field of project management.

All future modifications and changes to the certification program are aimed at improving the quality of the program and hence its reputation. Major corporations will be thrilled with our goals—we have to let them know what we are doing!

Over the following weeks and months, PMI will need to choose between software companies, measurement contractors, and mathematical models, among other things. Each decision will reflect careful thought in terms of expense, benefit, and utility. Each decision will act as a stepping stone in the path that will lead PMI to 21st century testing practices. Rest assured that no decision will be made lightly, numerous people will be involved in the decision-making process, and the image and reputation of PMI and the PMP certification program will be the overriding concerns. ∎


References

Hambleton, R., Swaminathan, H., and Rogers, H. 1991. Fundamentals of Item Response Theory. Newbury Park: Sage Publications.

Wainer, H. 1990. Computerized Adaptive Testing: A Primer. Hillsdale: Lawrence Erlbaum Associates.

This material has been reproduced with the permission of the copyright owner. Unauthorized reproduction of this material is strictly prohibited. For permission to reproduce this material, please contact PMI.

PM Network • October 1995


