The Articulate Medical Intelligence Explorer (AMIE), an artificial-intelligence system, has been trained to conduct medical conversations. In consultations with "patients" played by trained actors, it performed as well as or better than human doctors.
AMIE, built on a large language model (LLM) developed by Google, diagnosed respiratory and cardiovascular conditions, among other illnesses, more accurately than general practitioners. It also extracted a comparable amount of information from the conversations while showing greater empathy, according to a preprint study by Tao Tu and colleagues at Google Research and Google DeepMind.
Creating the Algorithm
The conversation between doctor and patient, known as history-taking or anamnesis, is vital for diagnosing and treating illness. AI systems that can conduct such conversations could make medical care more accessible and more effective.
The developers faced a shortage of medical conversation data. In the first phase, they therefore fine-tuned the base LLM on real-world datasets such as electronic health records and transcripts of medical conversations.
To train the model further, the researchers had the LLM play roles against itself: a person with a specific ailment and an empathetic doctor taking the history. The model also acted as a critical colleague, evaluating the doctor's side of the interaction and giving feedback.
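The three-role self-play loop described above can be pictured schematically. The sketch below is hypothetical: `call_llm` is a placeholder standing in for any chat-completion API, and none of the prompts or structure reflects Google's actual training setup.

```python
def call_llm(system_prompt: str, transcript: list[str]) -> str:
    """Stub standing in for a real LLM call (e.g. a chat-completion API)."""
    role = system_prompt.split(":")[0]
    return f"[{role} reply #{len(transcript)}]"

def self_play_round(condition: str, turns: int = 3) -> dict:
    """One simulated consultation: patient and doctor alternate,
    then a critic instance reviews the whole dialogue."""
    transcript: list[str] = []
    for _ in range(turns):
        # The model role-plays a patient with a specific ailment...
        transcript.append(call_llm(f"patient: you have {condition}", transcript))
        # ...and an empathetic doctor taking the medical history.
        transcript.append(call_llm("doctor: take a history with empathy", transcript))
    # A third instance acts as the critical colleague giving feedback.
    feedback = call_llm("critic: evaluate the doctor's interaction", transcript)
    return {"transcript": transcript, "feedback": feedback}

result = self_play_round("asthma")
print(len(result["transcript"]))  # 6 turns: 3 patient, 3 doctor
```

In a real system, the critic's feedback would be fed back into further fine-tuning rounds, which is what lets the model improve without large volumes of recorded doctor-patient dialogue.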
Successful Testing
In the testing phase, 20 participants (actors simulating symptoms) held online text-based consultations with AMIE and with 20 general practitioners. Participants did not know whether they were talking to a human or a bot. AMIE matched or exceeded the doctors' diagnostic accuracy in all six medical specialties tested and outperformed them on 24 of 26 conversation-quality criteria.
Alan Karthikesalingam, a clinical researcher at Google Health, noted that general practitioners may not be used to text-based chat, which could have affected their performance. He also observed that doctors may tire more quickly than a bot when writing long, structured responses.
Challenges Ahead
After the successful pilot, researchers plan more studies to identify biases and ensure consistent results. The Google team is also addressing ethical requirements for tests with real patients.
Privacy is a crucial concern. Daniel Ting from Duke-NUS Medical School stressed the need for transparency about data storage and analysis.
Ensuring Good Care
Despite the chatbot's potential, the authors stress that it is not ready for clinical care. Adam Rodman, MD, of Harvard Medical School, emphasized that the tool should not replace interaction with a doctor: medicine is about more than information; it is about human relationships.