AUTHOR=Moskatel Leon S., Zhang Niushen TITLE=The utility of ChatGPT in the assessment of literature on the prevention of migraine: an observational, qualitative study JOURNAL=Frontiers in Neurology VOLUME=14 YEAR=2023 URL=https://www.frontiersin.org/journals/neurology/articles/10.3389/fneur.2023.1225223 DOI=10.3389/fneur.2023.1225223 ISSN=1664-2295 ABSTRACT=Background

It is not known how large language models such as ChatGPT can be applied to assess the efficacy of medications, including for the prevention of migraine, or how such models might support their assessments with existing medical evidence.

Methods

We queried ChatGPT-3.5 on the efficacy of 47 medications for the prevention of migraine and then asked it to provide citations in support of its assessments. ChatGPT’s evaluations were then compared with each medication’s FDA approval status for this indication and with the American Academy of Neurology (AAN) 2012 evidence-based guidelines for the prevention of migraine. The citations ChatGPT generated for these evaluations were then assessed to determine whether they were real papers and whether they were relevant to the query.

Results

ChatGPT affirmed that the 14 medications that have either received FDA approval for the prevention of migraine or carry AAN Grade A/B evidence were effective for migraine prevention. Its assessments of the other 33 medications were unreliable, including suggesting possible efficacy for four medications that have never been used for the prevention of migraine. Critically, only 33/115 (29%) of the papers ChatGPT cited were real; 76/115 (66%) were “hallucinated,” nonexistent papers, and 6/115 (5%) shared the titles of real papers but had fabricated citation details.

Conclusion

While ChatGPT produced tailored answers on the efficacy of the queried medications, its results were unreliable and inaccurate because of the overwhelming proportion of “hallucinated” articles it generated and cited.