Background: The reporting and interpretation of data from clinical trials of proximal humeral fractures are hampered by the use of two partly incommensurable fracture classification systems: the Neer classification and the AO/OTA classification. Without concise definitions and a common 'fracture language', it remains difficult to interpret and generalize results, to conduct prognostic studies, and to reach consensus on treatment recommendations. We therefore compared the two classification systems using primary data from large clinical studies to assess how thoroughly each conveyed clinically important classification information. Methods: Classification data from each study were organized in a cross-table covering the 432 theoretically possible combinations of the 16 Neer categories and the 27 AO/OTA subgroups. The plausibility of all observed combinations was assessed and discussed by the authors until consensus was reached. Results: We analyzed 2530 observations from seven studies that provided primary data from both classification systems. Thirty-five percent (151 of 432) of the combinations were considered 'not plausible' and thirty-four percent (149 of 432) were considered 'problematic'. Conclusions: Clinically important information was lost within both classification systems. Most importantly, the varus/valgus distinction was not found within the Neer classification, and a clear definition of displacement was lacking in the AO/OTA classification. We encourage surgeons and researchers to report data from both classification systems to describe fracture patterns more thoroughly and to enable cross-checking of the coding. A suitable table for cross-checking of the coding is provided herein.
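
For readers who wish to reproduce the arithmetic behind the cross-table described in the Methods, the following is a minimal Python sketch and not the authors' actual procedure. The category labels (`Neer-1` to `Neer-16`, `11-A1.1` to `11-C3.3`), the function names `cross_table` and `share`, and the three example observations are illustrative assumptions; only the counts of 16 categories, 27 subgroups, 432 cells, and the 151 and 149 rated cells are taken from the text above.

```python
from collections import Counter
from itertools import product

# Placeholder labels (assumption): the abstract states only how many
# categories each system has, not how they are named.
NEER = [f"Neer-{i}" for i in range(1, 17)]                     # 16 categories
AO_OTA = [f"11-{t}{g}.{s}" for t in "ABC" for g in "123" for s in "123"]  # 27 subgroups


def cross_table(observations):
    """Count each (Neer, AO/OTA) pair over the full 16 x 27 grid."""
    valid = set(product(NEER, AO_OTA))
    for pair in observations:
        if pair not in valid:
            # An unknown label suggests a coding error worth checking.
            raise ValueError(f"unknown classification pair: {pair}")
    counts = Counter(observations)
    # Include every theoretically possible cell, even if never observed.
    return {cell: counts.get(cell, 0) for cell in product(NEER, AO_OTA)}


def share(n_cells, total=len(NEER) * len(AO_OTA)):
    """Percentage of the grid covered by n_cells, rounded to an integer."""
    return round(100 * n_cells / total)


# Made-up example observations, for illustration only.
observations = [
    ("Neer-1", "11-A1.1"),
    ("Neer-1", "11-A1.1"),
    ("Neer-8", "11-C2.3"),
]
table = cross_table(observations)
```

Here `len(table)` gives the 432 cells of the grid, and `share(151)` and `share(149)` reproduce the reported 35% and 34%. Rejecting unknown labels rather than silently dropping them is a design choice made for this sketch, in the spirit of the cross-checking the abstract recommends.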