Background: Presently, most medical educators rely exclusively on item difficulty and discrimination indices to investigate an item’s psychometric quality and functioning. We argue that “instructional familiarity” effects should also be a primary concern for anyone attempting to discern the quality and meaning of a set of test scores. Aim: This study had four primary objectives: (1) revisit Haladyna and Roid’s conceptualization of “instructional sensitivity” within the context of criterion-referenced assessments, (2) provide an overview of “instructional familiarity” and its importance, (3) reframe the concept for a modern audience concerned with medical school assessments, and (4) conduct an empirical evaluation of a medical school examination to investigate instructional effects on person and item measures. Subjects and Methods: A medical school course instructor rated the instructional familiarity (IF) of each mid-term examination item, and a series of psychometric analyses investigated the effects of IF on students’ scores and item statistics. The methodology is based primarily on a mixed-method, “action research” design for a medical school course focusing on endocrinology. Analyses used the Rasch measurement model and correlation analysis. Results: The methodology presented in this article discerned authentic learning better than traditional approaches that ignore valuable contextual information about students’ familiarity with exam items. Conclusions: The authors encourage other medical educators to adopt this straightforward methodology to increase the likelihood of making valid inferences about learning.