We hypothesize that our framework will be equally amenable to the construction of items in areas outside of Pediatrics, and specifically in Internal Medicine, and that respondents with EBM training will be sensitive both to differences in the methodological validity and to differences in the size of reported effects in clinical research evidence. We further hypothesize that the skills measured by these items are not domain-specific - that Pediatrics and Internal Medicine residents will perform equally well on items outside of and within their own specialty.

Numerical examples for construction of confidence intervals are given in  section.

This project’s goal is to construct a multidimensional program of outcome measures which can serve as a valid and reliable assessment of medical students’ non-cognitive attributes. To achieve this goal, the program will employ a systems approach in which faculty, students, and administrators utilize a variety of tools and methods to evaluate the non-cognitive aspects of student performance, as recommended by the AAMC in its clinical evaluation program. The non-cognitive attributes listed in the MSOP Report will serve as a basis for the assessment. These attributes will be operationally defined in behavioral terms by conducting structured interviews with full-time faculty physicians, community physicians who serve as voluntary faculty, students, patients, and medical school administrators. In the project, the new tool will be designed by formulating objectives for the non-cognitive attributes students should possess prior to graduation and developing the instruments and methods of assessment, including evaluation forms, paper cases and video "trigger" tapes, objective structured clinical examination (OSCE) stations, and a summary feedback form for the students. In the second year of the project, a pilot study of the new program will be done, and the data collected will be analyzed to determine its validity and reliability.

Let C be the confidence level (coverage probability) of a two-sided confidence interval (CI) constructed for an estimate.
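As a minimal sketch of how C enters the construction, assume the estimator is approximately normal with a known standard error (an assumption not stated in the text). Each tail then holds probability (1 - C)/2, so the critical value is the (1 + C)/2 quantile of the standard normal. The function name and the numbers are illustrative only.

```python
from statistics import NormalDist


def two_sided_ci(estimate, std_error, C=0.95):
    """Normal-approximation two-sided CI with coverage probability C."""
    if not 0.0 < C < 1.0:
        raise ValueError("C must lie strictly between 0 and 1")
    # Each tail holds (1 - C) / 2, so use the (1 + C) / 2 quantile.
    z = NormalDist().inv_cdf((1.0 + C) / 2.0)
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


# Hypothetical estimate and standard error, chosen only for illustration.
low, high = two_sided_ci(estimate=10.0, std_error=2.0, C=0.95)
print(round(low, 2), round(high, 2))
```

Raising C widens the interval, because the critical value grows as the tails shrink.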

In our previous project, we successfully developed and validated 12 items designed to measure the ability of Pediatrics residents and clerks to appropriately incorporate new evidence into clinical decision making in hypothetical vignettes. In this project, our objectives are (1) to use our established framework to develop additional items that span Pediatrics and Internal Medicine and that unconfound the validity of the evidence from the strength of the results it presents, (2) to establish the construct validity of these items in Pediatrics and Internal Medicine residents, and (3) to determine the degree of domain specificity in responding to these items.

Clerkship students at two medical schools will be invited to participate in 12 in-depth focus groups. The discussions will (and can, according to preliminary focus groups) elicit students' observations of peer professional behavior, their sense of responsibility for providing feedback, and the conditions under which they would feel such observations could be incorporated into acceptable peer feedback. Transcriptions of the focus groups will be subjected to qualitative analysis, using the principles of grounded theory. The reliability of coding will be assessed, and the opinions of student participants will be sought to verify the investigators' interpretations of the focus group material. Based on the analysis of these initial focus groups, one or more systems for observing, reporting, and using peer observations will be constructed in detail. Perceptions and opinions of these systems will be sought via a survey administered to all clerks at the two schools. Responses to the survey will be used to generate a final set of recommendations characterizing the most acceptable system(s) for peer observations of professional behaviors among students. A key next step -- ascertaining the reliability and the external validity of the system(s) -- would await future funding.

Sixteen new items will be constructed from a 2 x 2 x 2 x 2 (case domain x decision type x methodological validity x importance of results) factorial design. The case domain (Pediatrics or Internal Medicine) and decision type (diagnosis or therapy) factors define four basic cases. For each of these four basic cases, four variant items are created by manipulating the methodological validity of the evidence (powerful or weak) and the importance of the results (large effect size or small effect size).
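The crossing described above can be sketched by enumerating every combination of the four two-level factors. The dictionary keys are illustrative labels, not names used by the authors.

```python
from itertools import product

# Four factors with two levels each, as described in the text.
factors = {
    "case_domain": ["Pediatrics", "Internal Medicine"],
    "decision_type": ["diagnosis", "therapy"],
    "methodological_validity": ["powerful", "weak"],
    "importance_of_results": ["large effect size", "small effect size"],
}

# One item per cell of the 2 x 2 x 2 x 2 design.
items = [dict(zip(factors, levels)) for levels in product(*factors.values())]

# The first two factors alone define the basic cases.
basic_cases = {(item["case_domain"], item["decision_type"]) for item in items}

print(len(items), len(basic_cases))
```

Each basic case appears in four variants, one for each pairing of methodological validity and importance of results.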

In contrast to all the issues listed above, namely selection, information and confounding, which are biases, interaction is not a bias due to problems in study design or analysis, but reflects reality and its complexity. An example of this phenomenon is the following: exposure to radon is a risk factor for lung cancer, as is smoking. In addition, smoking and radon exposure have different effects on lung cancer risk depending on whether they act together or in isolation. Most of the occupational studies on this topic have been conducted among underground miners and at times have provided conflicting results. Overall, there seem to be arguments in favour of an interaction of smoking and radon exposure in producing lung cancer. This means that lung cancer risk is increased by exposure to radon, even in non-smokers, but that the size of the risk increase from radon is much greater among smokers than among non-smokers. In epidemiological terms, we say that the effect is multiplicative. In contrast to confounding, described above, interaction needs to be carefully analysed and described in the analysis rather than simply controlled, as it reflects what is happening at the biological level and is not merely a consequence of poor study design. Its explanation leads to a more valid interpretation of the findings from a study.
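To make the multiplicative pattern concrete, the sketch below compares the relative risk expected for joint exposure when excess risks simply add with the one expected when relative risks multiply. The relative risks are hypothetical values chosen only for illustration and are not estimates from any study.

```python
def expected_joint_rr(rr_a, rr_b):
    """Expected joint relative risk under additive and multiplicative models."""
    additive = rr_a + rr_b - 1.0    # excess risks add
    multiplicative = rr_a * rr_b    # relative risks multiply
    return additive, multiplicative


# Hypothetical relative risks versus unexposed non-smokers.
rr_radon_only = 2.0
rr_smoking_only = 10.0

additive, multiplicative = expected_joint_rr(rr_radon_only, rr_smoking_only)

# Risk increase attributable to radon, in units of the baseline risk.
increase_in_non_smokers = rr_radon_only - 1.0
increase_in_smokers = multiplicative - rr_smoking_only

print(additive, multiplicative)
print(increase_in_non_smokers, increase_in_smokers)
```

Under the multiplicative pattern, radon raises risk in both groups, but the absolute increase among smokers is several times the increase among non-smokers, which is the situation the paragraph describes.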

    In prediction by regression often one or more of the following constructions are of interest:

    Therefore, it is useful to understand how index numbers are constructed and how to interpret them.

    factorial validity because such validity evidence is gathered through factor analysis.

However, provided the cohort has a broad range of exposure experience, the nested case-control approach is very attractive. One gathers all the cases arising in the cohort over the follow-up period to form the case series, while only a sample of the non-cases is drawn for the control series. The researchers then, as in the traditional case-control design, gather detailed information on the exposure experience by interviewing cases and controls (or their close relatives), by scrutinizing the employers’ personnel rolls, by constructing a job exposure matrix, or by combining two or more of these approaches. The controls can either be matched to the cases or be treated as an independent series.
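The selection step for an independent (unmatched) control series can be sketched as follows. The cohort records, the `is_case` field, and the two-controls-per-case ratio are hypothetical, and the sketch is a simplification: it draws controls from those who remain non-cases at the end of follow-up and does not implement time-matched risk-set sampling.

```python
import random


def nested_case_control(cohort, controls_per_case=2, seed=0):
    """Keep every case; draw a random sample of non-cases as controls."""
    cases = [member for member in cohort if member["is_case"]]
    non_cases = [member for member in cohort if not member["is_case"]]
    # Never ask for more controls than there are non-cases.
    n_controls = min(len(non_cases), controls_per_case * len(cases))
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    controls = rng.sample(non_cases, n_controls)
    return cases, controls


# Hypothetical cohort of 1000 members, 20 of whom become cases.
cohort = [{"id": i, "is_case": i < 20} for i in range(1000)]
cases, controls = nested_case_control(cohort, controls_per_case=2)
print(len(cases), len(controls))
```

Detailed exposure information then needs to be collected only for the selected cases and controls rather than for the whole cohort.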

To complete the project successfully, [they] will use [their] skills evaluation tool to test OB-GYN residents at Madigan Army Medical Center, Oregon Health Sciences University, Harvard Medical School, and Pennsylvania State College of Medicine. Examiners from the University of Washington will be blinded as to resident level and prior performance. Examiners from the host institution will participate as well. Reliability and validity of the instrument will be established, and comparisons will be made between blinded and unblinded evaluators to establish objectivity. The intraoperative skills assessment tool will also be subjected to analysis of reliability and validity. In addition, faculty and residents will be surveyed about the usefulness of this instrument in providing prompt and constructive feedback.

Lifelong learning is required in medicine to stay abreast of scientific advances and rapid developments in the medical sciences and biomedical technology. Despite the importance of physicians' lifelong learning, no psychometrically sound instrument has been developed to assess it. The purpose of this project is to develop an operational tool for assessing physicians' lifelong learning habits, activities and professional outcomes. In particular, we plan to address several psychometric aspects of a lifelong learning scale such as face and content validities, construct validity (underlying components of the lifelong learning scale), criterion-related validity (convergent and discriminant validities), the internal consistency aspect of reliability (Cronbach's coefficient alpha), stability of the scores over time (test-retest reliability), and relationship with outcomes associated with scores of the lifelong learning scale.
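For reference, Cronbach's coefficient alpha for a k-item scale is k/(k - 1) times one minus the ratio of the summed item variances to the variance of the total score. The sketch below computes it with sample variances. The response matrix is hypothetical and is not data from the project.

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores: one row per respondent, each row holding k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose so that each tuple holds all responses to one item.
    item_columns = list(zip(*scores))
    sum_item_variances = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum_item_variances / total_variance)


# Hypothetical responses from five respondents to a three-item scale.
responses = [
    [3, 4, 3],
    [4, 5, 5],
    [2, 2, 3],
    [5, 5, 4],
    [1, 2, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items vary together, which is what the internal consistency aspect of reliability refers to.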

The cases and controls can be sampled and analysed either as independent series or as matched groups. Matching means that controls are selected for each case based on certain characteristics or attributes, to form pairs (or sets, if more than one control is chosen for each case). Matching is usually done on one or more factors such as age, vital status, smoking history, calendar time of case diagnosis, and the like. In our example, cases and controls are then matched on age and vital status. (Vital status is important, because patients themselves usually give a more accurate exposure history than close relatives, and symmetry is essential for validity reasons.) Today, the recommendation is to be restrictive with matching, because this procedure can introduce negative (effect-masking) confounding.

