October 18, 2024 | Nathan Brake
A few years ago, I was at an indoor rock climbing gym with some friends and was introduced to another climber who happened to be a physician. He had this boxy thing clipped to his hip, and I ignorantly asked, “What is that thing?” The answer came: “It’s a pager.” Being a kid from the early ‘90s, I can only vaguely remember the days when my dad had a pager, and my mom carried around a cell phone that was the size of a brick. So, I wondered, do hospitals still use pagers?
I love a good podcast, and I’ve been a longtime listener of Planet Money by NPR. The hosts excel at explaining economic issues in terms that non-economists can understand. Late last year, they released an insightful episode about the lingering use of pagers in hospital systems. Pagers have historically been the technology of choice for contacting on-call physicians, initially out of necessity, due to unreliable cell networks and the lack of other options. However, despite the widespread adoption of the smartphone and the broad availability of Wi-Fi and strong cellular reception, the pager largely remains a mainstay of hospital communication.
On one hand, it makes sense that the pager is still being used. Sometimes it’s easier to keep using old technology than to learn something new. But the Planet Money episode explains that the pager also provides something a smartphone doesn’t: an appropriate level of friction in communication. The limitations of the pager are among its greatest strengths, since it takes some effort to send a page and get connected with the physician being paged. A recent research publication found that secure messaging platforms (e.g., Microsoft Teams, Epic Secure Chat, Voalte) were widely used but were more disruptive and increased multitasking compared with pagers.
Writing about communication technology in Amusing Ourselves to Death, Neil Postman argues that “the medium is the metaphor.” In other words, the type of technology used in communication defines the kind of information that will be conveyed and highlighted.
Since the release of ChatGPT in late 2022, there has been an explosion of interest in, and adoption of, generative artificial intelligence (AI) technologies in the health care space. Health care has historically been slow to adopt new technology, but so far that has not been the case with generative AI. In particular, it’s being used to automatically summarize physician-patient conversations into clinical notes. The time and cost savings made possible through this automatic summarization appeal to physicians and hospital systems alike.
Although generative AI has already provided, and will continue to provide, relief from the physician documentation burden, it’s important to consider the ways this new medium shapes the content we communicate. Just as pagers enforce a certain style of communication because of their technical limitations, large language models (LLMs) produce a certain style of communication because of their technical strengths.
As of early 2024, LLMs such as ChatGPT and GPT-4 are trained using a technique called Reinforcement Learning from Human Feedback (RLHF). The method involves training a reward model as a proxy for human preferences, which is then used to further align the LLM’s outputs with what humans prefer. Although truthfulness is a priority, the training signal is human preference: even when the model doesn’t get the answer right, it will still generally output something believable. Similarly, an LLM’s focus on predicting the most likely next word can lead to outputs that sound reasonable but are not factually accurate or fail to capture the intended meaning.
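To make that concrete, here is a minimal sketch of the pairwise preference loss commonly used to train an RLHF reward model. The linear scoring head and the stand-in embeddings are illustrative simplifications, not any vendor’s actual implementation; the point is simply that the objective rewards whichever response human labelers preferred, and nothing in the loss measures factual accuracy.

```python
import torch
import torch.nn.functional as F

# Toy reward model: maps a (pre-computed) response embedding to a scalar score.
# In practice the reward model is a full transformer; a linear head stands in here.
reward_head = torch.nn.Linear(768, 1)
optimizer = torch.optim.Adam(reward_head.parameters(), lr=1e-4)

def preference_loss(chosen_emb, rejected_emb):
    """Pairwise (Bradley-Terry) loss: push the score of the human-preferred
    response above the rejected one. Note that truthfulness never appears;
    only which response the human labeler preferred."""
    r_chosen = reward_head(chosen_emb)
    r_rejected = reward_head(rejected_emb)
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# One illustrative update on random stand-in embeddings.
chosen, rejected = torch.randn(8, 768), torch.randn(8, 768)
loss = preference_loss(chosen, rejected)
loss.backward()
optimizer.step()
```

A reward model trained this way is then used to fine-tune the LLM itself, so “what a person would prefer to read” is the quantity being optimized, not “what is true.”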
If you use ChatGPT for any amount of time, you quickly see that this is the case. With this in mind, how does it affect the way we should read clinical notes generated with this technology? The “medium” of current generative AI systems is defined and controlled by their ability to communicate believable content.
This prioritization of believability is what makes the applications built to deliver AI technology to physicians so important. The risk that false but believable AI-generated content will be created calls for software solutions that compensate for this vulnerability. As trust in and reliance on generative AI grow, guardrails and AI-powered quality analysis systems become increasingly important defenses against reasonable-sounding but false content making its way into our clinical documentation.
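As one illustration of what such a guardrail might look like, here is a deliberately simple sketch that flags sentences in a generated note with little lexical overlap with the source conversation. The function names and the overlap threshold are hypothetical choices of mine, and real systems use far stronger methods (entailment models, claim verification); but the shape is the same: check generated content against its source before a human signs it.

```python
import re

def sentences(text: str) -> list[str]:
    # Naive sentence splitter; adequate for a sketch.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def flag_unsupported(note: str, transcript: str, threshold: float = 0.5) -> list[str]:
    """Flag note sentences whose words barely appear in the transcript --
    a toy proxy for 'is this claim grounded in the conversation?'"""
    source = tokens(transcript)
    flagged = []
    for sent in sentences(note):
        words = tokens(sent)
        overlap = len(words & source) / max(len(words), 1)
        if overlap < threshold:
            flagged.append(sent)
    return flagged

transcript = "Patient reports mild headache for two days. No fever. Taking ibuprofen."
note = "Patient reports a mild headache for two days. Prescribed amoxicillin 500 mg."
print(flag_unsupported(note, transcript))  # flags the fabricated prescription
```

A check like this never blocks a note on its own; it surfaces suspect sentences so the physician’s review effort goes where the believability risk is highest.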
Nathan Brake is a machine learning engineer and researcher at Solventum.