July 1, 2025 | Jacee Robison, RN, MS
Our team recently participated in a generative artificial intelligence (AI) experiment where we set out to explore the capabilities of generative AI models in mapping ICD-10-CM and SNOMED-CT codes to high-level categories. Our goal was to determine whether these advanced AI models could accurately and reliably perform this task, which our team otherwise performs manually.
We utilized three generative AI models, ChatGPT 4, Perplexity, and Gemini Advanced, to see if they could map ICD-10-CM and SNOMED-CT codes to broader categories. We prompted each model to find the concepts within the terminologies that related to our broad categories. One example of a prompt used was, “Return all SNOMED-CT codes including their descriptions related to the following term: Chest trauma.” We also used several follow-up prompts to see what results were generated or if further codes were revealed.
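For readers who want to reproduce this kind of session programmatically rather than through a chat interface, the pattern above can be sketched as a simple loop that sends an initial prompt followed by follow-up prompts, carrying the transcript forward as context. This is an illustrative sketch only: the first prompt is the example quoted above, while the follow-up prompt strings and the `send_to_model` function are hypothetical stand-ins for whichever model API is used.

```python
# Illustrative sketch of the prompting pattern described above.
# `send_to_model` is a hypothetical stand-in for a real chat API call;
# the follow-up prompts are examples, not the exact ones we used.

CATEGORY = "Chest trauma"

prompts = [
    # Initial prompt, quoted from the experiment:
    f"Return all SNOMED-CT codes including their descriptions "
    f"related to the following term: {CATEGORY}.",
    # Hypothetical follow-up prompts probing for further codes:
    "Are there any additional related codes you did not include above?",
    "List only codes that appear in the current SNOMED-CT release.",
]

def run_session(send_to_model, prompts):
    """Send each prompt in turn, keeping the transcript as context so
    follow-up prompts can build on earlier answers."""
    transcript = []
    for prompt in prompts:
        reply = send_to_model(transcript, prompt)
        transcript.append((prompt, reply))
    return transcript
```

In practice, keeping the full transcript as context is what lets a follow-up prompt like "any additional codes?" surface concepts the model omitted on the first pass.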
Our findings were a mix of promising results and notable challenges.
Our experiment with ChatGPT 4, Gemini Advanced and Perplexity in mapping ICD-10-CM and SNOMED-CT codes revealed both the potential and the current limitations of generative AI in the standard terminology space. While the models showed some promise, their unreliability underscores the need for further training and refinement. One of the key takeaways from our experiment is the importance of training and fine-tuning AI models for specific tasks. In our case, we did not train the models, which likely contributed to the unreliable results. We believe that incorporating a retrieval-augmented generation (RAG) approach could enhance the models' performance by providing them with more relevant context and data.
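The RAG idea mentioned above can be sketched in a few lines: instead of asking a model to produce codes from memory, candidate concepts are first retrieved from the terminology itself and embedded in the prompt, so the model only selects among verified codes. The concept subset, the word-overlap scoring, and the prompt wording below are all illustrative assumptions; a real system would query a full SNOMED-CT release and use a proper retriever.

```python
# Minimal sketch of a retrieval-augmented generation (RAG) workflow for
# terminology mapping. The concept list and scoring are illustrative;
# a real system would search a full SNOMED-CT release index.

# Hypothetical subset of SNOMED-CT concepts (code -> description).
SNOMED_SUBSET = {
    "262525000": "Chest injury",
    "302242004": "Fracture of rib",
    "125605004": "Fracture of bone",
    "233604007": "Pneumonia",
}

def retrieve(query: str, concepts: dict, top_k: int = 3) -> list:
    """Rank concepts by simple word overlap with the query."""
    query_words = set(query.lower().split())
    scored = []
    for code, desc in concepts.items():
        overlap = len(query_words & set(desc.lower().split()))
        if overlap:
            scored.append((overlap, code, desc))
    scored.sort(reverse=True)  # highest overlap first
    return [(code, desc) for _, code, desc in scored[:top_k]]

def build_prompt(category: str, concepts: dict) -> str:
    """Embed retrieved concepts in the prompt so the model maps from
    verified codes instead of generating codes from memory."""
    hits = retrieve(category, concepts)
    context = "\n".join(f"{code} | {desc}" for code, desc in hits)
    return (
        f"Candidate SNOMED-CT concepts:\n{context}\n\n"
        f"Which of these belong to the category '{category}'?"
    )
```

Because every code in the prompt comes from the terminology lookup, the model cannot hallucinate codes that do not exist, which addresses the unreliability we observed.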
As we continue to explore these technologies, we remain optimistic about their future applications in assisting with standard terminology mapping.
Jacee Robison, RN, MS is a nurse informaticist at Solventum.