Resistance Training/Periodization
Jacob Goodin
Associate Professor of Kinesiology
Point Loma Nazarene University
San Diego, California, United States
Ryan Nokes
Associate Professor of Kinesiology
Point Loma Nazarene University
San Diego, California, United States
Adam Whisler
Certified Strength & Conditioning Specialist
US Space Force
Colorado Springs, Colorado, United States
Purpose: Strength and conditioning professionals are increasingly expected to judge whether new technologies actually improve training outcomes and performance. One such technology is generative artificial intelligence (AI), such as OpenAI’s Chat Generative Pre-trained Transformer (ChatGPT), a large language model (LLM) with a text-based interface that lets users submit natural-language queries on a broad range of topics. This study investigates prompt-engineering techniques across three pre-trained GPT models and evaluates the resultant sport-specific programs against National Strength and Conditioning Association (NSCA) guidelines.
Methods: We tasked ChatGPT with designing a resistance training program tailored for a hypothetical male collegiate soccer player. Three prompts were engineered with increasing levels of granularity according to a needs analysis outlined by the NSCA. These were entered into three distinct ChatGPT platforms: version 3.5 (GPT3.5), version 4.0 (GPT4.0), and a custom publicly available strength and conditioning GPT (GPTS&C). The components of the nine resultant programs were assessed by three CSCS-certified individuals using Likert scales according to an 18-item rubric. Composite scores were analyzed across levels of model and prompt using a factorial ANOVA with Tukey post hoc analysis.
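The design described above (three models crossed with three prompts, giving nine programs) can be sketched as a two-way ANOVA with one composite score per cell. This is a minimal illustration and not the authors' analysis code: the abstract does not name the statistical software used, the function name `two_way_anova_no_replication` is hypothetical, and the scores in `scores` are invented placeholders rather than study data. It assumes one composite score per program, which is consistent with the error degrees of freedom of 4 reported in the Results. The Tukey post hoc step is omitted.

```python
from statistics import mean


def two_way_anova_no_replication(table):
    """Two-way ANOVA for a table with exactly one score per cell.

    table[i][j] is the composite score for model i and prompt j.
    With a single observation per cell, the model x prompt interaction
    cannot be separated from the residual, so it serves as the error term.
    """
    n_models, n_prompts = len(table), len(table[0])
    grand = mean(x for row in table for x in row)
    model_means = [mean(row) for row in table]
    prompt_means = [mean(table[i][j] for i in range(n_models))
                    for j in range(n_prompts)]

    # Sums of squares for each main effect and for the residual.
    ss_model = n_prompts * sum((m - grand) ** 2 for m in model_means)
    ss_prompt = n_models * sum((m - grand) ** 2 for m in prompt_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_model - ss_prompt

    df_model, df_prompt = n_models - 1, n_prompts - 1
    df_error = df_model * df_prompt
    ms_error = ss_error / df_error

    def effect(ss, df):
        return {
            "ss": ss,
            "df": df,
            "F": (ss / df) / ms_error,
            "eta_p2": ss / (ss + ss_error),  # partial eta squared
        }

    return {
        "model": effect(ss_model, df_model),
        "prompt": effect(ss_prompt, df_prompt),
        "ss_error": ss_error,
        "df_error": df_error,
    }


# Placeholder composite scores (NOT study data).
# Rows: three models; columns: Prompt 1, Prompt 2, Prompt 3.
scores = [
    [12.0, 20.0, 25.0],
    [30.0, 45.0, 60.0],
    [22.0, 48.0, 63.0],
]

result = two_way_anova_no_replication(scores)
for name in ("model", "prompt"):
    e = result[name]
    print(f"{name}: F({e['df']}, {result['df_error']}) = {e['F']:.2f}, "
          f"partial eta^2 = {e['eta_p2']:.2f}")
```

With a 3 × 3 table, each main effect has 2 degrees of freedom and the error term has 4, matching the F(2, 4) form reported in the Results.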
Results: The main effects of GPT Model (F(2, 4) = 10.35, p = 0.026, ηp² = 0.84) and Prompt (F(2, 4) = 12.56, p = 0.019, ηp² = 0.86) were significant. Post hoc analysis of the main effect of Model showed that the mean composite scores for GPT4.0 (M = 44.89, SD = 15.39) and GPTS&C (M = 44.33, SD = 23.24) were significantly higher than the mean for GPT3.5 (M = 20.00, SD = 8.69), with p-values of 0.035 and 0.038, respectively. For the main effect of Prompt, the mean composite score for Prompt 3 (M = 51.33, SD = 20.00) was significantly higher than that for Prompt 1 (M = 20.11, SD = 9.71), with a p-value of 0.016. No other significant effects were found.
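As a cross-check, the reported partial eta squared values can be recovered from the F statistics and their degrees of freedom using the standard identity below. The subscript labels (effect, error) are notation introduced here for illustration and do not come from the abstract.

```latex
\eta_p^2
  = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}
  = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}

\text{Model: } \frac{10.35 \times 2}{10.35 \times 2 + 4} = \frac{20.70}{24.70} \approx 0.84
\qquad
\text{Prompt: } \frac{12.56 \times 2}{12.56 \times 2 + 4} = \frac{25.12}{29.12} \approx 0.86
```

Both values agree with the reported effect sizes to two decimal places.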
Conclusion: GPT4.0 models and their derivatives are superior to GPT3.5 models for creating comprehensive, sport-specific strength and conditioning programs. However, even with high prompt granularity, these generative models do not meet the standards of an elite coach, so their output should be reviewed critically before it is implemented.
Practical Applications: AI offers advantages in timeliness, generalized knowledge, and cost, but its output must be critically analyzed before it is applied in strength and conditioning. Further, prompt engineering (including architecture, granularity, and specificity) and the LLM used play critical roles in the depth and value of the output. The authors suggest that generative AI be used as a dynamic learning tool, a supplementary aid to enhance idea generation, and a quick recall tool to support program creation. It should be integrated into the coaching framework to support, but not replace, the nuanced decision-making of human experts.
Acknowledgements: None