Freeman B, Freeman C. AI-Enhanced Simulation for Leadership Communication in Nurse Anesthesia Education: A Mixed-Methods Pilot Study. JNAE. Published online October 27, 2025. doi:10.63524/jnae.2845731

Abstract

Importance

Leadership communication is essential to safe and effective nurse anesthesia practice, as reflected in the Council on Accreditation graduate competencies.1 However, structured training in this domain remains limited in nurse anesthesia programs.

Objective

To evaluate the feasibility, perceived educational value, and preliminary outcomes of an artificial intelligence (AI)-enhanced simulation, grounded in Intelligent Tutoring Systems principles, designed to strengthen structured communication skills among doctoral nurse anesthesia students.

Design, Setting, and Participants

This mixed-methods pilot study embedded a single-session virtual simulation into a graduate nurse anesthesia leadership course at a U.S. university, where 94 doctoral nurse anesthesia students engaged in an interactive scenario with a hospital executive powered by ChatGPT-4.2

Main Outcomes and Measures

Pre- and post-simulation surveys assessed changes in confidence, preparedness, and perceived communication skill. Qualitative data from individual reflections (n = 78) and group debrief surveys (n = 12) were thematically analyzed to explore applied strategy use.

Results

Students demonstrated significant gains across all domains (P < .001; r = 0.26–0.40), with the greatest improvement in applying structured communication tools. Qualitative analysis reinforced these findings, with leadership growth, listening to adapt, and strategic reframing emerging as dominant themes.

Conclusion

AI-enhanced simulation, requiring minimal technology or training, is a feasible and valuable approach to leadership communication training in nurse anesthesia education, with potential for scalable curriculum development.

Introduction

Every interaction in healthcare has the potential to shape both patient outcomes and professional relationships. For Certified Registered Nurse Anesthetists (CRNAs), the ability to communicate with confidence and empathy is essential to navigating complex, hierarchical teams while caring for patients in their most vulnerable state. Nurse anesthesia programs dedicate extensive time to technical expertise, but leadership communication training is often limited, leaving students with few opportunities to rehearse and refine these skills before entering clinical practice. Addressing this gap requires training that is realistic, adaptable, and accessible to diverse learners.

Advances in educational technology facilitate new approaches to address these needs. Intelligent tutoring systems (ITS) have long supported personalized learning through rule-based logic and cognitive models, offering a scalable approach to skill development.3 Large language models now add generative and dialogic capabilities, accelerating the integration of artificial intelligence (AI) into education.3–5 When grounded in sound ITS design, these tools can create emotionally responsive agents that support realistic, adaptive communication training, an approach already leveraged in sectors such as business, aviation, and the military.6,7

Despite these capabilities, most AI applications for healthcare education target diagnostic reasoning or virtual patient encounters, with less attention to communication and leadership skill development.3,8 This lack of emphasis is especially consequential for CRNAs, whose role frequently necessitates high-stakes interdisciplinary conversations. When these skills are taught, practiced, and mastered, they can directly improve patient safety, timely escalation during emergencies, and effective advocacy for resources across interprofessional teams.9,10 A recent scoping review of AI-based communication training in healthcare found that effective communication is associated with higher patient satisfaction, yet identified only a small number of AI-based training systems, most of them narrowly focused and descriptive in nature.8 AI tools for teaching interprofessional communication remain rare, and many struggle with natural language generation, contextual responsiveness, and practical implementation demands.8,11–13 Together, these findings highlight both the promise and the current shortcomings of AI-enhanced communication training.

To explore this potential, the authors developed and pilot-tested “Pat Maxwell,” a ChatGPT-4 powered virtual hospital executive who simulates high-pressure leadership conversations.2 Built on ITS principles and emotionally adaptive design, the simulation responds in real time to the learner’s communication strategies.14 Unlike standardized patients, scripted simulations, or conventional AI chatbots, Pat adapts to the learner’s professionalism, empathy, and strategic clarity by adjusting tone, openness, and resistance accordingly.

This exploratory pilot study evaluated the implementation, acceptability, and early educational value of the AI-enhanced simulation within a doctoral nurse anesthesia leadership course. Specifically, we examined: (1) the feasibility of implementing AI-based communication simulation in a graduate curriculum; (2) student perceptions of realism, educational value, and relevance; (3) changes in self-reported confidence and preparedness; and (4) qualitative insights into how students applied structured communication strategies during the simulation. Findings from this pilot are intended to guide the development of future AI-enhanced simulations for leadership, advocacy, and communication training in nurse anesthesia education, while also informing broader applications across disciplines and for varied learning styles.

Methods

Study Design

A mixed-methods pilot design was used, combining pre- and post-simulation surveys with thematic analysis of individual reflections and group debrief surveys. The approach emphasized pragmatic implementation and learner experience over experimental control, consistent with the study’s exploratory objectives.

Participants and Setting

Participants were first-year nurse anesthesia residents enrolled in a doctoral leadership course at a single U.S. university in Summer 2025. The research team obtained Institutional Review Board approval prior to data collection. Recruitment occurred through an in-class announcement, and consent was obtained electronically via a pre-simulation survey. Of the 99 enrolled students, 94 consented to participate and completed the simulation as well as both the pre- and post-simulation surveys. Although the simulation was a required component of the coursework, participation in the study was voluntary and did not affect students’ course grades. Data were collected only from students who provided informed consent. Students were informed they could withdraw from the study at any time without penalty. Data were de-identified and stored in accordance with institutional protocols.

Learners worked in self-selected teams of 4–8 individuals for the simulation and debriefing, consistent with the structure used throughout the course. Team composition was recorded during the consent process. Two teams with non-consenting members were excluded from qualitative analysis, and one learner was absent on the day of the simulation.

The final dataset included:

  • 94 pre- and post-simulation surveys (unmatched)

  • 78 individual reflections

  • 12 group debrief surveys from fully consenting teams

Intervention

The simulation featured a 30-minute interaction with “Pat Maxwell,” a virtual hospital executive powered by ChatGPT-4. Integrated into the leadership course’s professional communication module, the experience was designed to move students beyond theoretical knowledge by providing deliberate practice in structured advocacy and communication techniques during a realistic leadership scenario.

Prior to the simulation, students received a one-hour lecture on 8 evidence-based communication strategies relevant to clinical leadership and professional negotiation: Situation, Background, Assessment, Recommendation (SBAR); Describe, Express, Suggest, Consequences (DESC); Ask–Tell–Ask; Advocacy & Inquiry; Refocusing on Shared Goals; Framing; key Team Strategies and Tools to Enhance Performance and Patient Safety (TeamSTEPPS) principles; and the Seven Steps for Difficult Conversations (see Appendix A for brief definitions). These strategic frameworks have been shown to streamline information delivery, improve escalation of concerns, and increase mutual understanding.9,15–17 Several, such as TeamSTEPPS and SBAR, originated as U.S. military initiatives and were subsequently adapted for healthcare settings, where they have contributed to gains in teamwork and patient safety.10 All frameworks were intentionally selected by the research team for alignment with course and program objectives. They were introduced through a lecture with brief case discussions, laying the groundwork for applied use during the simulation.

The simulation scenario mirrored common professional challenges, including organizational resistance, hierarchical dynamics, and emotional complexity. Each team was tasked with responding to a fictional hospital announcement proposing significant changes to CRNA staffing, framed as a contract renegotiation scenario. All teams received an identical, standardized email prompt from Pat Maxwell and were instructed to treat the simulation as a formal stakeholder meeting.

Simulation Design

Pat Maxwell, the AI-powered hospital executive, was designed as a socially and emotionally responsive conversational agent. To support psychological fidelity, Pat’s persona was deliberately constrained to reflect a risk-averse, cost-conscious healthcare leader. Guardrails were embedded to prevent character drift and ensure consistent behavior throughout the simulation, with a concealed scoring rubric guiding these adaptive responses. This goal-directed persona design aligns with emerging literature on emotionally grounded, instructionally aligned agents.4

The simulation incorporated key constructs from ITS, including domain knowledge (what is being assessed), tutoring knowledge (how the system responds), and student modeling (how the system adapts to learner behavior). These principles were operationalized through a concealed scoring rubric that interpreted learner tone, strategy, and professionalism to adjust Pat’s stance, tone, and resistance throughout the simulation.3,4,6,7 The rubric directed when to reward effective strategies or introduce resistance and was designed to discourage vague statements, flattery, references to communication frameworks without application, and stalling tactics. This structure kept Pat’s responses both instructionally consistent and strategically challenging, prompting precise, purposeful communication from learners.14,18 Although the rubric itself was not analyzed due to limited external validation, it played a critical role in shaping conversation flow and reinforcing the simulation’s learning objectives.

Consistent with ITS best practices, the agent did not provide direct corrective feedback during the conversation, preserving dialogue flow and learner agency.3,4,14,18 However, participants could request tailored, strategy-based feedback at the conclusion by typing “evaluation,” which triggered a summary generated from the same concealed rubric that guided Pat’s responses.
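The adaptive loop described above, in which a concealed rubric interprets each learner turn and adjusts the agent's stance, can be sketched in code. The study does not publish its prompt or rubric, so everything below is a hypothetical illustration: the names (`RubricScore`, `next_stance`), the 0–2 scoring scale, and the 0–5 resistance scale are all assumptions, not the actual implementation.

```python
# Hypothetical sketch of rubric-driven agent adaptation (ITS-style).
# All names and scales here are illustrative assumptions, not the
# study's actual scoring rubric or implementation.

from dataclasses import dataclass


@dataclass
class RubricScore:
    """Concealed per-turn assessment of a learner's message (each 0-2)."""
    professionalism: int
    strategy: int      # e.g., applied SBAR/DESC rather than merely naming it
    empathy: int


def next_stance(score: RubricScore, current_resistance: int) -> int:
    """Map a turn's rubric score to the agent's resistance level (0-5).

    Strong, purposeful communication lowers resistance; vague statements,
    flattery, or stalling raise it, matching the design goals described
    in the text. Mixed turns leave the stance unchanged.
    """
    total = score.professionalism + score.strategy + score.empathy
    if total >= 5:                      # reward effective strategy with openness
        return max(0, current_resistance - 1)
    if total <= 2:                      # push back on vague or evasive turns
        return min(5, current_resistance + 1)
    return current_resistance           # hold stance on a mixed turn


# Example: a strong turn softens a moderately resistant executive
stance = next_stance(RubricScore(2, 2, 1), current_resistance=3)
```

In a live system, the learner-facing behavior (tone, openness, counterarguments) would be regenerated from this internal state each turn, so the scoring itself stays invisible to the learner, consistent with the no-direct-feedback design described above.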

The simulation was designed in alignment with the Healthcare Simulation Standards of Best Practice™ from the International Nursing Association for Clinical Simulation and Learning (INACSL), emphasizing psychological fidelity, outcome alignment, and realism.19 These standards informed scenario authenticity, learner-centered debriefing, and clearly defined objectives, ensuring that both AI design and educational intent were integrated from the outset.

Simulation Procedures

Pre-briefing and Instructional Preparation

Students participated in a standardized pre-briefing session that facilitated psychological safety, clarified procedures, and introduced the context of the simulation.19 Each team received: (1) the scenario prompt; (2) a summary handout of structured communication strategies; (3) a professional bio of Pat Maxwell; and (4) background materials outlining the fictional hospital and proposed staffing model. Clarifying questions were permitted, but no coaching or tactical guidance was provided.

Team Preparation

Teams had 20 minutes to develop a communication strategy using course materials and publicly available sources. Generative AI tools outside of the simulation parameters were explicitly prohibited.

Simulation Session

Each team engaged in a 30-minute live negotiation with the AI agent via the ChatGPT interface, conducted in an instructor-monitored classroom over secure university Wi-Fi. Teams engaged with the simulation until consensus was reached or the time limit expired. The session was designed to reflect the time-pressured, collaborative nature of real-world professional interactions.

Debriefing and Reflection

Following the simulation, each team submitted their transcript log and participated in a facilitated group debriefing session. These discussions focused on communication strategies, team dynamics, and interpretation of Pat’s responses, in alignment with INACSL’s best practices for reflective debriefing.19

Data Collection and Instruments

Both quantitative and qualitative data were collected from multiple sources to capture learner perceptions and communication behaviors and evaluate the simulation’s educational impact.

Pre- and Post-Simulation Surveys

All consenting participants completed anonymous electronic surveys. Four core 5-point Likert items were administered before and after the simulation to assess: (1) confidence initiating difficult conversations, (2) understanding of communication tools, (3) ability to balance empathy and assertiveness, and (4) preparedness to advocate professionally. Three additional post-only items measured realism, educational value, and likelihood of recommending the simulation. Surveys were anonymous and not individually matched.

Open-Ended Survey Items

The same post-simulation survey included 3 open-response prompts that asked students to describe: (1) the most significant communication challenges they encountered, (2) strategies used during the simulation, and (3) how the experience might influence their future practice.

Individual Written Reflections

As a required course activity, each student submitted a 100–200 word reflection after the simulation. Prompts emphasized team performance, the AI agent’s behavior, and lessons learned about leadership communication. Only reflections from consenting students on fully consenting teams were analyzed (n = 78).

Group Debrief Surveys

After facilitated debriefing sessions, teams submitted brief narrative surveys describing their communication approach and team strategy. Twelve group surveys from fully consenting teams were analyzed.

Simulation Transcripts and AI Scores

Teams submitted transcripts of their real-time AI interactions. These transcripts, along with the AI-generated rubric scores, were reviewed by the instructional team to verify simulation fidelity. Because the AI rubric lacked established validity, automated scores were excluded from research analysis and treated only as instructional artifacts.

Data Analysis

Quantitative Analysis

Data were exported from electronic surveys and analyzed in Jamovi (version 2.7.4).20 Descriptive statistics (means, standard deviations, medians, and ranges) were calculated to summarize each item and guide interpretation. Assumptions of normality and homogeneity of variance were assessed prior to inferential testing. Shapiro–Wilk tests revealed non-normal distributions across all items (P < .001), and Levene’s test indicated unequal variances for 2 of the 4 pre- and post-simulation items. Accordingly, nonparametric methods were used throughout.

Because surveys were anonymous and lacked individual identifiers, pre- and post-simulation responses were treated as independent samples. Group comparisons were conducted using the Mann–Whitney U test. Effect sizes were calculated using rank biserial correlation, with significance defined as P < .05. All analyses were exploratory and not intended to support causal inference.
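The analysis above was performed in Jamovi; as a rough illustration of the same procedure, the unpaired Mann–Whitney U test and rank biserial effect size can be reproduced in Python with SciPy. The arrays below are synthetic Likert responses invented for the sketch, not the study's data.

```python
# Illustrative sketch of the unpaired nonparametric comparison described
# above, on synthetic 5-point Likert data (NOT the study's dataset).
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
pre = rng.integers(2, 5, size=94)    # synthetic pre-simulation responses
post = rng.integers(3, 6, size=94)   # synthetic post-simulation responses

# Anonymous surveys could not be matched, so groups are independent samples
u_stat, p_value = mannwhitneyu(pre, post, alternative="two-sided")

# Rank biserial correlation as the effect size: r = 1 - 2U / (n1 * n2);
# its magnitude is comparable to the r values reported in Table 1.
n1, n2 = len(pre), len(post)
r = 1 - (2 * u_stat) / (n1 * n2)
```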

Qualitative Analysis

Qualitative data sources (open-ended survey items, individual reflections, group debrief surveys, and transcripts) were analyzed thematically using a hybrid deductive–inductive approach. The analysis explored how students described their use of communication strategies during the simulation. It followed principles of framework analysis, including matrix-based coding, operational definitions, and team-based consensus building.21

Coding Procedures

Theme categories were initially derived deductively from the structured communication strategies emphasized in the course. Additional patterns emerged through iterative review of the qualitative data and were refined collaboratively to ensure instructional relevance and clarity. The analysis prioritized observable behaviors over abstract impressions to support consistency across reflections and group responses.

Six behavior-based themes were finalized through iterative team review. These themes represented practical communication behaviors aligned with course objectives and served as the coding framework:

  • Tone Adjustment – modifying tone, affect, or delivery style during the conversation

  • Strategic Reframing – pivoting or rephrasing in response to the agent’s stance

  • Leadership Growth – describing insight, confidence, or development in leadership skills

  • Professionalism Under Pressure – maintaining professionalism during moments of challenge

  • Listening to Adapt – adjusting responses based on cues or feedback from the agent

  • Team Dynamics – describing collaborative planning, shared strategy, or role division

Each source was independently coded by 2 raters using a binary scheme (1 = present; 0 = absent), with coding guided by operational definitions and clarified presence criteria.

Reliability

Interrater agreement was assessed using percent agreement. Discrepancies were resolved through consensus discussion. Final theme frequencies were derived from the adjudicated dataset.
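The percent-agreement calculation used here is simple enough to show directly. The binary codes below are made up for illustration (1 = theme present, 0 = absent); they are not the study's adjudicated dataset.

```python
# Minimal sketch of interrater percent agreement on binary theme codes
# (1 = theme present, 0 = absent). Example codes are invented.
rater_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
rater_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

agreements = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = 100 * agreements / len(rater_a)
# 8 of 10 codes match, i.e., 80% agreement; disagreements (here items
# 4 and 8) would go to consensus discussion, as described above.
```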

Findings

Quantitative Results

Pre/Post Comparison: Confidence and Preparedness

Analyses were based on survey responses from 94 consenting participants. Descriptive statistics are provided for interpretability, and group comparisons were analyzed using nonparametric tests. Post-simulation participants reported significantly higher scores on all 4 pre- and post-simulation items: confidence initiating difficult conversations, balancing assertiveness and empathy, preparedness to advocate professionally, and understanding structured communication tools (all P < .001; Table 1). Effect sizes ranged from small (r = 0.26) to moderate (r = 0.40), with the largest improvement observed in understanding structured communication tools.

Post-Simulation Evaluation: Realism and Educational Value

In addition to pre- and post-simulation comparisons, 3 post-only items assessed students’ perceptions of the simulation. Ratings were consistently high across all domains, with mean scores above 4.3 on the 5-point Likert scale (Table 1). Students indicated that the simulation was realistic, enhanced their understanding of evidence-based communication strategies, and was highly recommendable for future cohorts. These results suggest that the simulation was both educationally valuable and strongly endorsed for continued integration into the curriculum.

Table 1. Descriptive Statistics for Pre-, Post-, and Post-Only Survey Items (N = 94)

| Survey Item | Group | Mean (SD) | U | P | r |
|---|---|---|---|---|---|
| I feel confident initiating a difficult conversation with a healthcare superior.a | Pre | 3.53 (1.002) | 3009 | <.001 | 0.319 |
| | Post | 4.11 (0.754) | | | |
| I understand how to apply structured communication tools in a professional context.a | Pre | 3.62 (0.893) | 2641 | <.001 | 0.402 |
| | Post | 4.27 (0.625) | | | |
| I can effectively balance assertiveness and empathy in leadership conversations.a | Pre | 3.79 (0.815) | 3099 | <.001 | 0.299 |
| | Post | 4.21 (0.774) | | | |
| I feel prepared to advocate for myself and my role in a professional setting.a | Pre | 3.81 (0.871) | 3259 | <.001 | 0.262 |
| | Post | 4.21 (0.774) | | | |
| The simulation felt realistic and applicable to real-world leadership challenges.b | Post | 4.31 (0.672) | -- | -- | -- |
| The simulation helped me better understand evidence-based communication strategies.b | Post | 4.43 (0.613) | -- | -- | -- |
| I would recommend this simulation for future students.b | Post | 4.63 (0.622) | -- | -- | -- |

a. Measured pre- and post-simulation.
b. Measured post-simulation.
Abbreviations: U, Mann-Whitney U statistic; r, rank biserial correlation.

Qualitative Results

Ninety qualitative sources (78 individual reflections and 12 group debrief surveys) were analyzed using the 6 behavior-based themes. The analysis emphasized observable communication behaviors, adaptive decision-making, and reflections on leadership development. Patterns across responses revealed frequent use of strategic reframing, listening to adapt, and leadership growth. Students described modifying their approach in response to Pat’s resistance, cueing from tone or content, and developing confidence in navigating difficult conversations. While learners did not explicitly name communication frameworks, their language reflected practical uptake of strategies such as SBAR, DESC, and Ask–Tell–Ask.

Three themes emerged as most prevalent:

  • Leadership Growth (88.9%) captured student-reported increases in confidence, adaptability, and self-advocacy. One student wrote, “Sometimes, you have to be bold and stand up. They’re not always going to be kind.” Another noted, “I realized the importance of being clear, confident, and focused when speaking with people in authority.”

  • Listening to Adapt (76.7%) highlighted the importance of responding to Pat’s tone and cues: “We noticed it became apparent he did not want pleasantries but wanted blunt honesty.” Similarly, another explained, “We adapted to the answers he provided in order to best provide the answers he was looking for.”

  • Strategic Reframing (67.8%) described how students modified their language or rationale to align with institutional priorities such as safety, efficiency, and collaboration. One group described how they “were able to strategize differently in later responses, tailoring a solution for [Pat] rather than an argument against his ideas. After using this approach, he accepted [their] new model suggestion.”

In contrast, the remaining themes were observed less frequently but still provide important insight into how students managed delivery and group interaction.

  • Tone Adjustment and Team Dynamics each appeared in 40.0% of responses, suggesting moderate attention to delivery style and group strategy. One student noted, “Pat Maxwell caught me off guard when he did not even want to have any small talk from the beginning… this tells me he wants to go straight to the point,” illustrating how learners adjusted tone in response to perceived cues. At the team level, one group noted, “We went back and forth and ultimately had to compromise on what we were asking for so that we were all aligned in our response.” This highlights how collaborative adjustment shaped outcomes and reinforced the need for unified messaging under pressure.

  • Professionalism Under Pressure (21.1%) was the least frequent theme, highlighting instances in which students explicitly noted maintaining composure under tension. For example, one participant shared, “The AI negotiation taught me to stay calm and professional even when the conversation felt tense and intimidating.”

Overall, the simulation engaged students in the structured communication behaviors targeted by the instructional design. Patterns of adaptation, reframing, and growth highlight how participants navigated resistance and reflected on their development as emerging leaders. Table 2 and Figure 1 depict the frequency of theme presence across all qualitative sources.

Table 2. Frequency of Theme Presence Across Qualitative Sources (n = 90)

| Theme | Individual Reflections (n = 78) | Group Reflections (n = 12) | Total Sources (n = 90) | Percent of All Sources |
|---|---|---|---|---|
| Tone Adjustment | 32 (41.0%) | 4 (33.3%) | 36 | 40.0% |
| Strategic Reframing | 52 (66.7%) | 9 (75.0%) | 61 | 67.8% |
| Leadership Growth | 68 (87.2%) | 12 (100%) | 80 | 88.9% |
| Professionalism | 15 (19.2%) | 4 (33.3%) | 19 | 21.1% |
| Listening to Adapt | 61 (78.2%) | 8 (66.7%) | 69 | 76.7% |
| Team Dynamics | 32 (41.0%) | 4 (33.3%) | 36 | 40.0% |

Abbreviation: Professionalism, Professionalism Under Pressure.

Figure 1. Distribution of Theme Emergence in Qualitative Data

Thematic network of participant experiences during the AI-enhanced leadership simulation. Bubble size represents the percentage frequency with which each theme appeared in the qualitative data.
Abbreviation: PUP, Professionalism Under Pressure.

Discussion

This exploratory pilot study demonstrates that an AI-enhanced simulation grounded in ITS principles can feasibly support structured communication training in nurse anesthesia education. Quantitative results showed statistically significant improvements in students’ self-reported confidence and preparedness, while qualitative analysis highlighted students’ strategic adaptation, reframing, and responsiveness to resistance. Recent literature suggests that generative AI enhancements to ITS, such as dynamic scenario generation and real-time adaptability, can further personalize learning experiences.4,5

Survey findings revealed small-to-moderate, yet educationally meaningful, effect sizes across 4 domains: initiating difficult conversations, balancing assertiveness with empathy, advocating for one’s role, and applying structured communication tools (P < .001; r = 0.262–0.402). Students also rated the simulation as realistic and relevant to their future practice, reinforcing its acceptability and perceived value within the graduate curriculum. These gains align with Council on Accreditation of Nurse Anesthesia Educational Programs (COA) graduate competencies in interprofessional communication and leadership, suggesting that such simulations can address both curriculum objectives and recognized gaps in applied communication practice.1

Thematic analysis of student reflections supported and expanded these outcomes. Rather than simply naming techniques, students described adjusting tone, modifying framing, and interpreting feedback embedded in the AI agent’s responses, without realizing their behavior was being scored. Such adaptability mirrors the dynamic decision-making required in clinical environments, where providers must pivot strategies in response to situational cues and interpersonal dynamics. The AI-generated rubric scores, although excluded from formal analysis, provided educational artifacts suggesting that the most effective teams demonstrated professionalism, flexibility, and alignment with shared goals. These behaviors elicited more collaborative responses from the AI-driven character, advancing negotiation and modeling professional strategies for overcoming hierarchy and building trust. Collectively, the findings indicate that the simulation fostered both knowledge application and real-time strategic reasoning, a core objective of ITS-based instruction.6,14

The simulation design deliberately preserved psychological safety, realism, and instructional alignment. These principles are essential for trustworthy, pedagogically sound AI agents.22 Learners were briefed that the exercise was a no-fault learning experience, and the AI was programmed to reward effective communication, thereby maintaining trust in the process. Notably, the AI functioned in dual roles: role-player and tutor. While students were unaware of its internal scoring, these assessments shaped the AI’s responses: challenging timid learners to be more assertive or encouraging tact when interactions became too forceful. In this way, the AI’s adaptive feedback provided embedded formative assessment, as learners indicated they could assess their progress from the agent’s responses. This approach reflects scaffolding principles in ITS design and demonstrates how generative AI can deliver real-time, context-sensitive coaching at scale.23

From an educational perspective, this work contributes to a growing body of evidence supporting simulation as a tool for developing non-technical skills such as leadership, communication, and professional advocacy.8 While many ITS applications remain concentrated in procedural domains like science, technology, engineering, and mathematics education, this study illustrates their relevance to emotionally nuanced communication competencies such as persuasion, strategic reframing, and interprofessional dialogue. By simulating realistic hierarchies and embedding feedback into the AI responses, the system offered learners a psychologically safe space to practice assertiveness, empathy, and adaptability. As generative AI becomes more accessible in education, its potential to scale personalized, context-rich learning experiences may extend well beyond healthcare. These observations underscore the value of immersive simulation for practicing higher-order communication skills.

Business and Leadership Implications

Although the immediate focus of this simulation was communication skill development, its design also speaks to the broader business and leadership responsibilities inherent in anesthesia practice. CRNAs serve as both clinicians and organizational leaders, navigating contract negotiations, resource allocation, and interdisciplinary advocacy. These non-clinical responsibilities directly shape workforce stability, perioperative efficiency, and ultimately the safety and accessibility of patient care.

By embedding structured negotiation and advocacy exercises into the simulation, learners engaged in scenarios that mirror real-world administrative and contractual dynamics. Practicing how to balance assertiveness with diplomacy in a psychologically safe environment equips trainees with transferable skills for interacting with hospital administrators, negotiating service agreements, and advocating for policies that support safe staffing and equitable access to anesthesia services. Importantly, these leadership skills extend beyond individual career advancement; they contribute to healthier labor–management relationships, reduced professional burnout, and more sustainable anesthesia delivery models.

For healthcare systems, fostering this dual competency of clinical excellence and effective negotiation and advocacy offers tangible benefits. Providers trained in structured communication are more likely to build collaborative partnerships with administrators, mitigate conflict during contract discussions, and contribute to resource stewardship. For patients, these dynamics translate into safer, more efficient perioperative environments, improved continuity of anesthesia services, and care models that are both high quality and cost-effective.

Framed in this way, the simulation not only advances educational objectives for individual learners but also supports the resilience and adaptability of healthcare systems. By preparing anesthesia practitioners to assume leadership roles in both clinical and business domains, AI-driven simulations may help bridge the gap between professional advocacy and patient-centered outcomes, aligning the interests of providers, institutions, and the populations they serve.

Implementation and Future Directions

From an implementation standpoint, the simulation was integrated into an existing leadership course with minimal disruption. Students required no additional training, and the AI interface was reported as intuitive, supporting its feasibility for broader integration across graduate programs. Future research should explore longitudinal outcomes, skill transfer to clinical practice, and expanded applications of AI-driven simulation in professional education.

Limitations

Several limitations should be acknowledged. Pre- and post-simulation survey responses were collected anonymously, which prevented matched-pair analysis and required the data to be treated as independent samples. This limited control for individual baseline differences and reduced statistical power, though the observed effect sizes still suggest a meaningful early impact. The study also employed a single-group pre/post design without a control or comparison group, making it difficult to establish causality; it remains unclear whether the observed changes were driven by the simulation itself or by other concurrent learning experiences. To protect participant anonymity, demographic data (such as age or gender) were not collected, so potential differences in outcomes or thematic patterns across demographic groups could not be examined. Additionally, outcomes reflected short-term perceptions measured immediately after the simulation, leaving questions about long-term retention and the transfer of skills to clinical practice.

All measures were self-reported; although they demonstrated high internal consistency, self-reported measures are vulnerable to social desirability bias and cannot capture actual behavior change. The study was conducted with a single cohort of nurse anesthesia residents at one academic institution using a custom-developed AI simulation, which limits generalizability to other learner populations, contexts, or simulation platforms. Although the simulation produced formative feedback using an internal rubric, these automated scores were not externally validated and were therefore excluded from analysis; future research should evaluate the reliability and validity of AI-generated metrics alongside observer ratings or clinical assessments.

Conclusion

This pilot study demonstrates the feasibility and perceived educational value of AI-enhanced simulation for leadership communication training in nurse anesthesia education. Building on these results, future research should assess long-term skill retention and incorporate matched-pair designs to better capture individual growth. Comparative trials may help clarify the unique contributions of AI-based versus traditional simulation formats, while triangulated assessment methods, including observer ratings, AI-generated feedback, and learner reflection, could offer deeper insight into behavioral readiness.

By moving beyond procedural training, this approach strengthens communication, advocacy, adaptability, and leadership, which are identified in the COA’s graduate competencies as essential to safe practice and professional impact.1 AI-driven simulation also complements emerging modalities such as virtual reality, which provide scalable and immersive training opportunities. Together, these technologies can help students develop and apply critical competencies in personalized, context-rich environments that reflect the nuanced demands of modern healthcare.

Accepted: October 14, 2025 EDT

References

1. Council on Accreditation of Nurse Anesthesia Educational Programs. Standards for Accreditation of Nurse Anesthesia Programs—Practice Doctorate. Council on Accreditation of Nurse Anesthesia Educational Programs; 2024:17-20.

2. OpenAI. ChatGPT (version GPT-4). OpenAI. 2025. Accessed January 5, 2025. https://chat.openai.com

3. Lin CC, Huang AYQ, Lu OHT. Artificial intelligence in intelligent tutoring systems toward sustainable education: a systematic review. Smart Learn Environ. 2023;10:41. doi:10.1186/s40561-023-00260-y

4. Liu S, Guo X, Hu X, Zhao X. Advancing generative intelligent tutoring systems with GPT-4: design, evaluation, and a modular framework for future learning platforms. Electronics. 2024;13:4876. doi:10.3390/electronics13244876

5. Maity S, Deroy A. Generative AI and its impact on personalized intelligent tutoring systems. arXiv. Published online October 2024. doi:10.48550/arXiv.2410.10650

6. Sonkar S, Liu N, Mallick DB, Baraniuk RG. CLASS: a design framework for building intelligent tutoring systems based on learning science principles. arXiv. Published online May 2023. doi:10.48550/arXiv.2305.13272

7. Stamper J, Xiao R, Hou X. Enhancing LLM-based feedback: insights from intelligent tutoring systems and the learning sciences. In: Olney AM, Chounta IA, Liu Z, Santos OC, Bittencourt II, eds. Artificial Intelligence in Education. Vol 2150. Springer; 2024:32-43. doi:10.1007/978-3-031-64315-6_3

8. Stamer T, Steinhäuser J, Flägel K. Artificial intelligence supporting the training of communication skills in the education of health care professions: scoping review. J Med Internet Res. 2023;25:e43311. doi:10.2196/43311

9. Buljac-Samardzic M, Doekhie KD, van Wijngaarden JDH. Interventions to improve team effectiveness within health care: a systematic review of the past decade. Hum Resour Health. 2020;18(1):2. doi:10.1186/s12960-019-0411-3

10. Agency for Healthcare Research and Quality. TeamSTEPPS Module 2: Explanation of Mutual Support Key Concepts and Tools. June 2023. Accessed March 1, 2025. https://www.ahrq.gov/teamstepps-program/curriculum/mutual/tools/index.html

11. Hamilton A. Artificial intelligence and healthcare simulation: the shifting landscape of medical education. Cureus. 2024;16(5):e59747. doi:10.7759/cureus.59747

12. Liaw SY, Tan JZ, Lim S, et al. Artificial intelligence in virtual reality simulation for interprofessional communication training: mixed method study. Nurse Educ Today. 2023;122:105718. doi:10.1016/j.nedt.2023.105718

13. Merritt C, Glisson M, Dewan M, Klein M, Zackoff M. Implementation and evaluation of an artificial intelligence driven simulation to improve resident communication with primary care providers. Acad Pediatr. 2022;22:503-505. doi:10.1016/j.acap.2021.12.013

14. Woolf BP. Building Intelligent Interactive Tutors: Student-Centered Strategies for Revolutionizing E-Learning. Morgan Kaufmann; 2009. doi:10.1016/B978-0-12-373594-2.00006-X

15. Lo L, Rotteau L, Shojania K. Can SBAR be implemented with high fidelity and does it improve communication between healthcare workers? A systematic review. BMJ Open. 2021;11(12):e055247. doi:10.1136/bmjopen-2021-055247

16. Agency for Healthcare Research and Quality. DESC script. July 2023. Accessed March 1, 2025. https://www.ahrq.gov/teamstepps-program/curriculum/mutual/tools/desc.html

17. Castillo AY, Chan JD, Lynch JB, Bryson-Cahn C. How to disagree better: utilizing advocacy-inquiry techniques to improve communication and spur behavior change. Antimicrob Steward Healthc Epidemiol. 2023;3(1):e201. doi:10.1017/ash.2023.457

18. Park M, Kim S, Lee S, Kwon S, Kim K. Empowering personalized learning through a conversation-based tutoring system with student modeling. In: CHI Conference on Human Factors in Computing Systems Extended Abstracts. ACM; 2024:1-10. doi:10.1145/3613905.3651122

19. INACSL Standards Committee. Healthcare Simulation Standards of Best Practice™: Simulation Design. Clin Simul Nurs. 2021;58:14-21. doi:10.1016/j.ecns.2021.08.009

20. The jamovi project. jamovi. Published online 2023. Accessed August 21, 2025. https://www.jamovi.org

21. Gale NK, Heath G, Cameron E, Rashid S, Redwood S. Using the framework method for the analysis of qualitative data in multi-disciplinary health research. Int J Nurs Stud. 2013;50(4):564-574. doi:10.1016/j.ijnurstu.2012.09.009

22. Masters K, Benjamin J, Agrawal A, MacNeill H, Pillow MT, Mehta N. Twelve tips on creating and using custom GPTs to enhance health professions education. Med Teach. 2024;46(6):752-756. doi:10.1080/0142159X.2024.2305365

23. As’ad M. Intelligent tutoring systems, generative artificial intelligence (AI), and healthcare agents: a proof of concept and dual-layer approach. Cureus. 2024;16(9):e69710. doi:10.7759/cureus.69710

24. Shapiro J, Robins L, Galowitz P, Gallagher TH, Bell S. Disclosure coaching: an Ask-Tell-Ask model to support clinicians in disclosure conversations. J Patient Saf. 2021;17(8):e1364-e1370. doi:10.1097/PTS.0000000000000491

25. Zipkin DA, Umscheid CA, Keating NL, et al. Evidence-based risk communication: a systematic review. Ann Intern Med. 2014;161(4):270-280. doi:10.7326/M14-0295

26. Azar JM, Johnson CS, Frame AM, Perkins SM, Cottingham AH, Litzelman DK. Evaluation of interprofessional relational coordination and patients’ perception of care in outpatient oncology teams. J Interprof Care. 2017;31(2):273-276. doi:10.1080/13561820.2016.1248815

27. Bailey S. Seven steps for having difficult conversations. Am Nurse J. 2021;16(4):14-16. Accessed July 27, 2025. https://www.myamericannurse.com/seven-steps-for-having-difficult-conversations

Appendix A: Evidence-Based Communication Frameworks

Each of the following frameworks was deliberately selected to align with course objectives emphasizing structured communication, leadership, and interprofessional collaboration.

SBAR (Situation, Background, Assessment, Recommendation): A standardized communication tool used to structure information exchange, particularly in clinical handoffs or when escalating concerns.9,15

DESC Script (Describe, Express, Specify, Consequences): A structured conflict resolution tool that promotes assertive, respectful communication by guiding individuals to describe the situation, express concerns, suggest alternatives, and state potential consequences.16

Advocacy & Inquiry: A strategy for raising concerns while simultaneously inviting others’ perspectives, blending assertion with curiosity to support collaborative problem-solving.17

Ask–Tell–Ask: A closed-loop communication strategy that checks in before and after delivering information to promote mutual understanding and engagement.9,24

Framing (Risk Framing): A communication technique that emphasizes certain aspects of risk or benefit (e.g., using absolute vs. relative risk, or frequencies vs. percentages) to present data more clearly, support shared decision-making, and encourage value-aligned choices.25

Shared Goals: A relational coordination tool that ensures decisions, actions, and responsibilities are guided by a shared objective rather than individual or departmental priorities. This alignment promotes collaboration, reduces fragmentation, and supports more efficient, adaptive teamwork.26

TeamSTEPPS (Team Strategies and Tools to Enhance Performance and Patient Safety): An evidence-based teamwork system developed by the U.S. Department of Defense and AHRQ to improve communication, situational awareness, and mutual support among healthcare professionals. Along with SBAR and DESC, CUS was a specific aspect of TeamSTEPPS emphasized in the course. CUS enables healthcare team members to escalate safety concerns through a structured three-step process: stating their Concern, expressing why they are Uncomfortable, and identifying the Safety issue.10,15

Seven Steps for Difficult Conversations: A structured, communication-focused framework designed to help professionals navigate emotionally charged or high-stakes interactions. The steps include: (1) self-reflection, (2) setting a time and place, (3) raising the issue directly, (4) seeking to understand, (5) collaborating on solutions, (6) recognizing when to pause or reschedule, and (7) engaging in self-care. This approach emphasizes empathy, clarity, and problem-solving to promote respectful, productive dialogue.27