• Media type: E-article
  • Title: Performance of ChatGPT on basic healthcare leadership and management questions
  • Contributors: Leutz-Schmidt, Patricia; Grözinger, Martin; Kauczor, Hans-Ulrich; Jang, Hyungseok; Sedaghat, Sam
  • Published: Springer Science and Business Media LLC, 2024
  • Published in: Health and Technology, 14 (2024) 6, pp. 1161-1166
  • Language: English
  • DOI: 10.1007/s12553-024-00897-w
  • ISSN: 2190-7188; 2190-7196
  • Description: Abstract
    Purpose: ChatGPT is an LLM-based chatbot introduced in 2022. This study investigates the performance of ChatGPT-3.5 and ChatGPT-4 on basic healthcare leadership and management questions.
    Methods: ChatGPT-3.5 and -4 (OpenAI, San Francisco, CA, USA) generated answers to 24 pre-selected questions on three different areas of management and leadership in medical practice: group 1, accessing management/leadership training; group 2, management/leadership basics; group 3, department management/leadership. Three readers independently evaluated the answers provided by the two versions of ChatGPT. Three 4-point scores were developed to assess the quality of the responses: 1) overall quality score (OQS), 2) understandability score (US), and 3) implementability score (IS). The mean quality score (MQS) was calculated from these three scores.
    Results: Interrater agreement was good for ChatGPT-4 (72%) and moderate for ChatGPT-3.5 (56%). Across all questions, the MQS reached a mean of 3.42 (SD: 0.64) with ChatGPT-3.5 and 3.75 (SD: 0.47) with ChatGPT-4. ChatGPT-4 showed significantly higher MQS scores than ChatGPT-3.5 on group 2 and group 3 questions (p = 0.039 and p < 0.001, respectively). Significant differences between ChatGPT-3.5 and ChatGPT-4 regarding OQS, US, and IS were also seen for group 3 questions, with significance reaching p < 0.001, and regarding OQS for question groups 1 and 2 (p = 0.035 each). 87.5% of the answers provided by ChatGPT-4 (21 of 24 answers) were considered superior to the answers provided by ChatGPT-3.5 for the same questions. Neither ChatGPT-3.5 nor ChatGPT-4 offered any inaccurate answers.
    Conclusion: ChatGPT-3.5 and ChatGPT-4 performed well on basic healthcare leadership and management questions, with ChatGPT-4 being superior.