| Article Type | Original Article (full article published in SSCI, AHCI, SCI, or SCI-Exp journals) | ||
| Journal Name | MEDICINA-LITHUANIA (Q1) | ||
| Journal ISSN | 1010-660X (WoS journal, Scopus journal) | ||
| Indexed In | SCI-Expanded | ||
| Article Language | English | Publication Date | 07-2025 |
| Volume / Issue / Pages | 61 / 8 / 1–28 | DOI | 10.3390/medicina61081342 |
| Article Link | https://doi.org/10.3390/medicina61081342 | ||
| UAK Research Areas | Anatomy | ||
| Abstract |
| Background and Objectives: General-purpose multimodal large language models (LLMs) are increasingly used for medical image interpretation despite lacking clinical validation. This study evaluates the diagnostic reliability of ChatGPT-4o and Claude 2 in photographic assessment of adolescent idiopathic scoliosis (AIS) against radiological standards. It examines two questions: whether families can derive reliable preliminary assessments from LLMs through analysis of clinical photographs, and whether LLMs exhibit cognitive fidelity in their visuospatial reasoning for AIS assessment. Materials and Methods: A prospective, STARD-compliant diagnostic accuracy study analyzed 97 adolescents (74 with AIS and 23 with postural asymmetry). Standardized clinical photographs (nine views per patient) were assessed by two LLMs and two orthopedic residents against reference radiological measurements. Primary outcomes included diagnostic accuracy (sensitivity/specificity), Cobb angle concordance (Lin’s CCC), inter-rater reliability (Cohen’s κ), and measurement agreement (Bland–Altman LoA). Results: The LLMs exhibited hazardous diagnostic inaccuracy: ChatGPT misclassified all non-AIS cases (specificity 0% [95% CI: 0.0–14.8]), while Claude 2 generated 78.3% false positives. Systematic measurement errors exceeded clinical tolerance: ChatGPT overestimated thoracic curves by +10.74° (LoA: −21.45° to +42.92°), exceeding tolerance by >800%. Both LLMs showed inverse biomechanical concordance in thoracolumbar curves (CCC ≤ −0.106). Inter-rater reliability fell below random chance (ChatGPT κ = −0.039 … |
| Keywords |
| adolescent | scoliosis | artificial intelligence | neural networks | diagnostic errors | clinical competence | photography |
| Citation Counts | |
| Web of Science | 2 |
| Google Scholar | 4 |
| Journal Name | Medicina-Lithuania |
| Publisher | Multidisciplinary Digital Publishing Institute (MDPI) |
| Open Access | Yes |
| ISSN | 1010-660X |
| CiteScore | 4.1 |
| SJR | 0.710 |
| SNIP | 0.997 |
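
The headline figures in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is not the authors' analysis code. It assumes that the specificity interval is an exact (Clopper-Pearson) binomial interval, that the Bland–Altman limits are bias ± 1.96 SD, and that the 78.3% false-positive figure refers to the 23 non-AIS cases. All function names are hypothetical.

```python
import math

def exact_upper_limit_zero_events(n, alpha=0.05):
    """Upper Clopper-Pearson limit when 0 of n cases are classified correctly.

    For zero successes the exact interval has a closed form:
    lower = 0, upper = 1 - (alpha / 2) ** (1 / n).
    """
    return 1.0 - (alpha / 2.0) ** (1.0 / n)

def bias_and_sd_from_loa(lower, upper, z=1.96):
    """Recover bias and SD of differences from limits of agreement.

    Assumes the limits were computed as bias +/- z * SD.
    """
    bias = (lower + upper) / 2.0
    sd = (upper - lower) / (2.0 * z)
    return bias, sd

# 23 non-AIS adolescents, none classified correctly by ChatGPT (specificity 0%).
n_non_ais = 23
upper_pct = 100.0 * exact_upper_limit_zero_events(n_non_ais)
print(f"Specificity 95% CI upper limit: {upper_pct:.1f}%")

# Reported thoracic limits of agreement for ChatGPT, in degrees.
bias, sd = bias_and_sd_from_loa(-21.45, 42.92)
print(f"Implied bias: {bias:.3f} deg, implied SD: {sd:.2f} deg")

# Assumption: 78.3% false positives corresponds to 18 of the 23 non-AIS cases.
false_positive_pct = 100.0 * 18 / n_non_ais
print(f"18/23 as a percentage: {false_positive_pct:.1f}%")
```

Under these assumptions the upper limit comes out at 14.8%, matching the reported interval, and the midpoint of the limits of agreement (10.735°) agrees with the reported +10.74° bias to within rounding.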