Objective: To identify current trends in the use of validity frameworks in surgical simulation, to provide an overview of the evidence behind the assessment of technical skills in all surgical specialties, and to present recommendations and guidelines for future validity studies. Summary of Background Data: Validity evidence for assessment tools used in the evaluation of surgical performance is of paramount importance to ensure valid and reliable assessment of skills. Methods: We systematically reviewed the literature by searching 5 databases (PubMed, EMBASE, Web of Science, PsycINFO, and the Cochrane Library) for studies published from January 1, 2008, to July 10, 2017. We included original studies evaluating simulation-based assessments of health professionals in surgical specialties and extracted data on surgical specialty, simulator modality, participant characteristics, and the validity framework used. Data were synthesized qualitatively. Results: We identified 498 studies with a total of 18,312 participants. Publications involving validity assessments in surgical simulation more than doubled from 2008 to 2010 (∼30studies/year) to 2014 to 2016 (∼70 to 90studies/year). Only 6.6% of the studies used the recommended contemporary validity framework (Messick). The majority of studies used outdated frameworks such as face validity. Significant differences were identified across surgical specialties. The evaluated assessment tools were mostly inanimate or virtual reality simulation models. Conclusion: An increasing number of studies have gathered validity evidence for simulation-based assessments in surgical specialties, but the use of outdated frameworks remains common. To address the current practice, this paper presents guidelines on how to use the contemporary validity framework when designing validity studies.