Episode 45: ChatGPT and AI in Healthcare with Dr. David Rhew



About This Episode

David Rhew, Microsoft’s Chief Medical Officer and Vice President of Healthcare, discusses ChatGPT’s uses in healthcare.

Featured Guest: David C. Rhew, MD

Dr. David C. Rhew is the Global Chief Medical Officer and Vice President of Healthcare for Microsoft. He previously served as Chief Medical Officer and Vice President for Samsung; Senior Vice President and CMO at Zynx Health Incorporated; clinician/researcher in the Division of Infectious Diseases at the VA Greater Los Angeles Healthcare System; and Associate Clinical Professor of Medicine at UCLA. Dr. Rhew has served on the National Quality Forum’s Executive Committee for Consensus Standards and Approval. He received his medical degree from Northwestern University and completed internal medicine residency at Cedars-Sinai Medical Center. He completed fellowships in health services research at Cedars-Sinai and infectious diseases at the University of California, Los Angeles. He co-holds six U.S. technology patents that enable authoring, mapping, and integration of clinical decision support into the electronic health record. David is currently an Adjunct Professor at the Stanford University School of Medicine.


Episode Transcript:

Dr. Maddux: Microsoft has partnered with OpenAI to develop an artificial intelligence tool called ChatGPT, a large language model that uses natural language processing to communicate. Because ChatGPT learns from extensive data sets by interacting with clinicians in the healthcare setting, there are opportunities for the tool to be an asset in the clinical setting.

David Rhew, Chief Medical Officer and Vice President of Healthcare for Microsoft, joins us today to discuss the possibilities for ChatGPT in healthcare. Welcome, David.

David C. Rhew, M.D.: Thanks, Frank. Pleasure to be here.

Dr. Maddux: Give us a little overview of ChatGPT, just what it is and how you and Microsoft got involved with it.

David C. Rhew, M.D.: I think it's probably a worthwhile opportunity to start with just artificial intelligence, because I think we oftentimes associate them as one and the same, and maybe some people believe that AI just started. Well actually, AI has been around at least since the 1950s. At least that's when the term was coined, and it really was about a computer acting like a human. What we've learned is that there are ways to make it better and more likely to be able to perform human-like tasks. So, one of those elements is if it can learn from not only its own mistakes, but from other algorithms that are running in parallel; we refer to that as machine learning. And AI and machine learning have been the primary mechanism where we've been trying to develop algorithms and understand how these technologies can be used to transform information into something that's actionable. Now we have also recognized that a lot of these concepts that we oftentimes tend to rely upon can get clustered together, where we might have certain weighting relative to the likelihood of certain concepts being associated with one another, and that has created this concept of neural networks. When we think about deep learning, it really is tapping into those neural networks. Now generative AI is a subset of all of that. Generative AI is an opportunity where we go from node to node or word to word, learning or anticipating what that next word or concept is. Now in the process of doing that, what we're doing is specifically tapping into some of that knowledge that's inherent within those neural networks. And it's been extraordinary because with this type of capability, when we started going to very large networks such as the internet, we realized that there was a tremendous amount of information out there that we could tap into and very readily be able to learn from.
Now it also has the downside of the fact that there's a lot of information on the internet that is false or it's misleading and in fact the whole concept of moving from node to node based on propensity or likelihood can give you a false impression but it sure sounds relevant, it sure sounds like it's very convincing so we have found that is one of the downsides. But this type of capability is something that really wasn't something that individuals around the world could take advantage of until we put a chat feature in front of it. So that's how you get the chat with the GPT because it's really our interface to be able to tap into this amazing capability. And now anyone, consumer, a clinician, an administrator, anyone around the world can be able to tap into this. Now what we're learning is that it's more than just simply taking a look at the internet. We may be able to create secure environments where this type of information could be, we could run prompts within there. So, enterprise organizations can do this safely, securely, and then we also now have recognized that we have an opportunity to be able to populate or ground that information on documents that we want to make sure are very specific for that. So, when we're looking to interrogate that system, we can then put prompts that will specifically say, well, we'll look at this document first and make sure that we're focused on these particular ways that we want to address it. And that's given us an ability to become much more specific and narrowly focused on what we're trying to do. So, it's extraordinary because we have now a capability that we didn't have before, and it's converted a lot of this manual work that we typically associate with clinical care and even patient care to be able to understand how that new technology can alleviate that and allow us to become much more effective and efficient.

Dr. Maddux: One of the big questions comes up when we talk about healthcare. There are lots of administrative activities where we might see generative AI assisting. But when we get into direct patient care activities in support of the clinicians in the field, the issue of trust comes up. To what degree can we find a way to make the generative AI systems trustworthy, in not hallucinating, not actually developing sort of false premises that are articulated incredibly well? I've had a lot of questions from colleagues about that, and I'm curious about your perspective.

David C. Rhew, M.D.: I'll first talk about AI in general and developing a trustworthy framework and then I'll get specifically into the challenges and how we might be able to mitigate the chances of error and hallucinations or confabulations. Just in general, when we think of artificial intelligence, there are some incredible benefits, but there are also some limitations. And one of them is the fact that, based on the size of the data sets, the diversity of the data sets, we may be relying on very narrowly focused data, which is not generalizable. So one may not trust that this information is something that I would want to apply within my own patient setting. The key is to understand which data you're looking at and to be able to then understand if there are ways that we might be able to change the focus of the data sets, perhaps maybe create larger, more diverse data sets that are representative of the populations. Transparency is very important in that regard. We need to understand what's there and be able to understand how it's working. Now, as we drill down into generative AI, the challenge with generative AI is that, as I mentioned before, it may go down different paths, and those paths may not be accurate or may not lead to the right answers, which can be a problem if you're dealing with situations where you want to get a high level of accuracy and a high level of reliability. So what we have recognized is that the use case, selecting the use case, is very important, like one of the most important things that we can do at the very beginning. Decide on a high-impact, low-risk use case, so that even if there were an error, the chances of it doing any type of harm would be mitigated.
And the second piece would be to put that human in the middle, because ultimately what we're trying to do is we're trying to help an individual in the task that they normally would be doing, and as we do that, it would allow us to be able to make sure that we have actually done this in a way that's safe and reliable. And then the last would be to make sure that it's done with a program or a structure such that we're assessing whether or not it actually is working as designed, that we're not over-relying on it or under-relying on things, and that it's actually having the impact that we desired. So, all of these things have to be done to ensure that level of trust within the system, to make sure that we're not missing out. Now, I want to give you an example because I talked about the importance of the use case. I’ll give you an example in which the model itself is identical, but the use cases are different, and you'll find that based on the use case, you'll have a different impression in terms of how impactful it is. So this one would be, let's imagine that we are using ChatGPT to come up with a differential diagnosis for a specific patient. Okay, so you've entered the patient information and you're trying to say, okay, what's the most likely diagnosis? So if we were to use this in a clinical setting just as a clinician relying on it, that might be problematic because you may come up with the wrong answer. Now let's imagine another scenario. Let's imagine a patient has actually been bounced around from clinician to clinician. They've tried everything. They can't figure out what's going on. The doctors have said they can't figure it out. This is something that they just don't know what's going on. They run it by ChatGPT. They put it into the creative mode. They come up with the fact that this is a tethered cord syndrome or this is an enzyme deficiency. These are actually both real examples. And they find, well, you know what? I never thought of that.
Let me check it out. And sure enough, it's that diagnosis. These are really great examples because what you're now finding is that the same model in a different use case can have a completely different perspective in terms of how one could be applying it in a safe and effective manner. So use case is really important, but it's also important that we put that human in the middle and that we have the ability to do the QA on it.

Dr. Maddux: We've seen ChatGPT and GPT-4 develop. Talk a little bit about the data sets that are used to train these models, how the ability to expand sort of the healthcare indication is emphasized in some of the more current versions, and what you anticipate these tools will develop into.

David C. Rhew, M.D.: GPT has also undergone transformation, adding more and more parameters, which has increased the capabilities and also the accuracy and the reliability of the results. So, I'd say, you know, we'll continue to see versions. Right now we are on GPT-4, but anticipate, you know, at some point a GPT-4.5 or a GPT-5 and so on. What we see today, the capabilities will only get better. That's an important thing to know. We can't simply look at a study and say, oh, it didn't work well, or it worked great. The reality is it may end up being different as we start going into different versions. Now, what we're doing is we're actually adding more and more to the internet, which is the additional model that we're basing it off of. Now, as we start getting into healthcare-specific components, we may find that there are different models that we need to start looking at. Some may be very specifically focused on particular areas where we will be having it learn with that as the foundation, but what we will find is that those will have different sets of capabilities. You know, we're heading down the path where very soon we will have multiple different types of large language models, some very specific for use cases and some very large and general, and I think we're probably going to see an opportunity to be able to take advantage of both of those.

Dr. Maddux: We know in our public society today there are plenty of biases. For example, we have a variety of biases in the way we do randomized controlled trials. We have biases that narrow the data sets on what's important. How do these large language models in generative AI address the fact that they're learning off a biased data set, in many cases to start with, to begin to overcome some of the bias that's inherent in the information that exists sort of publicly today?

David C. Rhew, M.D.: That is a fantastic question and one that we are all grappling with. We're trying to figure out first and foremost, is there bias in the results? And in fact, sometimes we don't even realize that there's inherent bias in the things that we do every single day. As clinicians, we rely on these rules, these calculators, these algorithms, not recognizing that they may have been based on very limited data sets and may be presenting some bias. If you look at the internet, the internet has biases in it, and these could be subtle, implicit biases that we don't even realize are going on, assuming that there are particular gender roles associated with particular areas. So even when we put the names of he, she, these are the types of things that we see and associate. Ethnicity is oftentimes getting associated. These are the things that we have to look at when we start identifying these outputs, and we have to start figuring out how we can adjust and fix those. So I think this will be an ongoing challenge, because implicit in everything that we do is this understanding that the knowledge and the data is going to be where it's based on, and if there's, you know, some level of selection bias in terms of where we've captured the data, then it will be difficult for us to generalize. And we're going to need to take a lens, both the clinical lens, but we may need ethicists looking at this, people that have a different perspective. If you come from a vulnerable population or a minority population, your point of view is going to be different. And so as you look at this, then you'll say, wait a minute, that's not the way that we think about it within my community. And that's the type of opportunity for us to be able to start thinking about how we can better address this. It's not just going to be about a single data scientist trying to sort this out.
It’s going to be groups of people with diverse experiences and points of view, all looking at the outputs and trying to figure out how we can address them as we start building these systems and making them more scalable.

Dr. Maddux: We know some countries have limited use and accessibility to generative AI compared to the public. We know that regulatory environments for privacy are very different, for example, in the United States than they are in Europe, with GDPR versus HIPAA. How do you expect the regulatory environment to evolve, and is the technology advancing faster than society's ability to figure out how it wants to regulate this activity?

David C. Rhew, M.D.: For sure. We are definitely seeing the pace of innovation and technological advancements surpass our ability to be able to understand how it can be applied in society. And that is something that we will continue to be challenged with. We saw this with the internet, and we saw this with social media, and we saw this with a variety of other areas. But in the area of AI, what makes this particularly important is that we have an opportunity to solve some very pressing and urgent issues in healthcare. And so rather than treating this as sort of a, well, you know what, let's just pause or stop, what we have to recognize is that this needs to continue with a sense of urgency to address some of these critical issues. So like, I'll give you an example. In the United States, and probably globally, we are faced with a clinician workforce crisis. Clinicians are burned out. They are leaving the practice. We're talking about doctors, nurses, pharmacists, and without clinicians, we have no healthcare. It is an existential threat to the entire healthcare industry if we do not find ways to alleviate the excessive administrative burden that is placed on our clinicians. And at the same time, we now have a technology that can transform that entire process, remove so much of that. It's not just about removing paperwork. It is about helping clinicians be able to work and do the things that they want to do and ultimately stay in the practice. That is a critical piece to what we're trying to accomplish here. There are other pressing issues. Access to affordable healthcare. That starts with knowledge and empowering individuals with that knowledge. We now have an ability with these types of tools to be able to do that. We have an ability to be able to remove a lot of that administrative waste and excessive cost in the system, so that it is not too expensive for economies.
And so if we think about what's crushing economies, it's the rising cost of healthcare. And we have now an ability to remove or at least try to decrease some of that. There are so many major problems that can be addressed with this technology. And if we choose not to leverage it and just sort of put a pause on it, keep in mind who's pausing. It's the responsible, good actors. You know who's not pausing? The bad actors, who will then ultimately continue to propagate its use in areas that will ultimately impact all of us. What's important is that we're having this discussion. But it's very important that we also recognize that this is an incredibly important, powerful tool for good. And we need to make sure that we put the guardrails around it to make sure it's actually done correctly. But don't take it away from the folks that are trying to solve some of these biggest problems.

Dr. Maddux: I recall back in the early, early days of the internet, as it went from being a private internet to the public internet. At that time, you had tools that remind me of the fact that this language model is text-based right now. But we went through, remember, the Minnesota Gopher system and the FTP system, and then Tim Berners-Lee put the specification out for the World Wide Web and you suddenly had a multimedia environment that developed over time. What is the environment that this large language model will develop into? Will it include the ability not only to articulate language, which it does unbelievably well, but to articulate expression and other kinds of things that are other dimensions of communication? What do you anticipate this will lead to?

David C. Rhew, M.D.: We are heading down a path of multimodal data sets in AI with a set of capabilities that, if we combine them together, we can do so many amazing things. What you just shared is a great example of how we may be able to start thinking about these different tools to be used. Today, generative AI, we oftentimes associate that with ChatGPT, large language models. But when we think about foundation models, we're talking about images, other sources of information. We have mechanisms today to be able to apply, I'll call it standard, traditional AI/ML to convert unstructured forms of data into structured forms, whether it's voice, text, images. That information combined with the type of information that we know from text, that may be from clinical progress notes, it could be from published reports. This is now an incredible opportunity to be able to combine these different data sets and truly understand what's going on. I think what we have been dealing with since the history of technology in healthcare has been mechanisms where individuals have to enter this information into a system in a structured way for it to be usable. Now we have an ability to be able to take a look at all the other information that's out there that's been entered in through whatever mechanism. It could be a text message; it could have been a clinician just typing away at some kind of a paragraph in terms of what's happening. All of that now can be structured and actually applied. Voice is an incredibly powerful tool. We've barely scraped the surface in terms of what we can do with voice, not just in terms of the words that are spoken, but also the intonations, understanding whether the person actually is representing a medical condition behind that voice, the sentiment behind it, the emotions. That's all critical as well. This is an exciting time. We've got more tools available to us to be able to do more things than we've ever had in our entire lifetime.
And that's why many people believe this is a pivotal moment in health care, because we're actually now starting to realize that the data sets that we have had in so many ways sort of been siloed are now being opened up and being interpreted in a way that we can actually make sense of it.

Dr. Maddux: Our company is involved in the care of people with advanced kidney disease predominantly, and as this disease progresses towards its life-threatening stage, it becomes progressively more important to engage those patients in good health behaviors, good choices, and consistency in their ability to receive their health care. Does anything come to your mind, as you think about it in kidney care from the use case standpoint, that we should be thinking about particularly?

David C. Rhew, M.D.: We'll use ChatGPT as an example. So what's ChatGPT really good at? It's really good at taking information, converting it into something that people can understand in their language, at their grade level, and with empathy. So immediately we've got now a new tool that we can use to be able to help better communicate to individuals. Oftentimes we assume, oh, they must speak English because I speak English. Well, not everyone speaks English. They must speak at this grade level. I assume that everyone understands these terms. Not necessarily so. We've got this amazing tool. We also have an ability to start looking at all the other factors that we know are important. Your zip code or your postcode, that has a huge impact in terms of your ability to be able to take better care of yourself. There may be issues around transportation, housing, food insecurity, and things of that sort. We know that these are major challenges, but yet at the same time, when we talk about interventions, we assume that everyone can still, they'll figure it out themselves. What we can do is we actually can start looking through all the data. There's the ability to use text analytics to be able to extract information and then convert that and look for social determinants of health. I mean, this is actually something that large regulatory organizations are asking clinicians to be able to abstract starting in 2024 on every single patient encounter, or at least have that documented. We have now technology that can do that. We can do that behind the scenes. We can do it concurrently, leveraging other healthcare providers or other members of the team. So we have now an ability to be able to look more holistically at the patient and understand how we may be able to take these tools and apply them in such a way that we can better communicate and hopefully address some of those issues.
Because they may say, yeah, I understand what you're trying to do, but it doesn't help me because I can't afford the medicine. Well, if we can somehow match those individuals with programs that support them and deliver it at the time, then we may be able to close some of those gaps. We're starting to realize that part of it is these communication tools, but the other part is how do we enable individuals to be able to be connected to the programs that will allow them to be able to then be successful. And that's another exciting part of what the technology can do.

Dr. Maddux: Yes, so much of what we do with our patients is try to make sure they understand what their options are for therapies and how to treat themselves, and I can clearly envision an interaction with a chatbot that creates an opportunity for us to essentially perform a discrete choice survey on that patient, based on what their needs and wants in life are, without them really having the typical question-and-answer survey, but just through conversation. We think there are quite a few use cases that are different from hard diagnoses, but actually in interacting with the people we interact with. We operate our company in 150 different countries. That's a lot of languages. It's a lot of places where, although English is the official language of the company, much of our company natively speaks another language. I've been fascinated with the stories that I've read that the chatbots, ChatGPT and others, that were trained on certain languages suddenly began to understand other languages over time as they, you know, trained through the literature and through what's available. How does that actually happen? Does anyone understand what actually happened in the ability for these models to learn languages that they weren't actually trained on?

David C. Rhew, M.D.: There are so many mysteries behind ChatGPT. If you were to ask a lot of the experts in the area, and I don't consider myself an expert, I just consider myself an observer of what's been going on, but the folks that actually have been spending a lot of time, they can't even figure out why and how it's doing these things. I think partly it's because it's such a large network, and what's happening is, as you provide specific prompts, you head down specific paths, and sometimes, even with different prompts, you head down a different path, and so you may end up with a different result. And when you head down paths of different languages and start trying to make those connections, the chat knows what your last response was, so it actually can rely on the last prompts and some of the things it's learned. It can then base it off of that, so it ends up kind of going down different paths that we hadn't anticipated. That's what's really exciting about this, that, you know, there is an amazing tool. Now there's an element here that we feel uncomfortable with, because we like to be very certain and 100% know that we're going to have the exact same result every single time. But when you go in and talk about what humans do regularly, we talk about the need for us to be able to have a highly reliable, consistent result that's very accurate. But we also as humans are in this creative mode where we're trying to think about these other ways and things that we haven't thought of. And that's one of the really interesting things about ChatGPT, that we can actually change what are referred to as the temperature settings. And we can change it such that it can be more in the creative mode versus more in sort of the instructive, reliable, accurate mode. And they're both important.
And obviously we usually try to set it somewhere in the middle, but at certain times it is important for us to be able to kind of think out of the box and to be able to move into different directions and start thinking about things and then building off of that.
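
The temperature setting mentioned above can be illustrated with a small sketch. This is a simplified, generic illustration of temperature-scaled sampling, not ChatGPT's actual implementation; the candidate words, their scores, and the function names are hypothetical and chosen only for the example. Dividing the model's raw scores by a low temperature concentrates probability on the top choice (the "reliable" mode), while a high temperature spreads probability toward less likely choices (the "creative" mode).

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_word(words, logits, temperature, rng=random):
    """Pick one candidate word at random, weighted by the tempered probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(words, weights=probs, k=1)[0]


# Hypothetical candidates and scores, invented for illustration only.
words = ["infection", "dehydration", "tethered cord syndrome"]
logits = [2.0, 1.0, 0.1]

low = softmax_with_temperature(logits, 0.2)   # nearly always the top-scoring word
high = softmax_with_temperature(logits, 2.0)  # rarer words get a real chance

for word, p_low, p_high in zip(words, low, high):
    print(f"{word:24s} low-temp p={p_low:.3f}  high-temp p={p_high:.3f}")
```

In this toy setup, the rare candidate is almost never chosen at the low setting and is chosen a meaningful fraction of the time at the high setting, which is the trade-off between consistency and out-of-the-box suggestions described in the answer above.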

Dr. Maddux: I've had the chance to take some of the things I've had to write to our organization and many of the providers we work with and have ChatGPT translate that into a wide variety of languages, and then have somebody native from there tell me whether this is a good translation or not, because we work with them in both English and their native language. And the models are unbelievably well-versed in language. At their very base level, their ability to translate is remarkably good and almost immediate. And so, it's been quite interesting to sort of test that out in the real world. What do you think we as a multinational public company and organization should be thinking about with regard to our adoption of these technologies in the coming months and years?

David C. Rhew, M.D.: I would say the very first thing is that I think organizations have to realize this is not just an incremental change, just yet another technology. This is a transformative technology, much like when the internet came. If one were to say, well, I'm not really gonna care about the internet because we do things our own way, in all likelihood, that business is gonna be severely disrupted. It's time to embrace this and understand how you can apply it within your organization, and not just for one use case, but really across multiple different categories. You know, everything from supply chain, administrative workloads, billing, to the actual clinical care, patient engagement, research, you name it, across all of those different categories and more. So that would be the first thing. The second is that with artificial intelligence, we do recognize that it carries with it a sense of responsibility. That means we have to start thinking about how to build in processes, governance processes that allow us to be able to address those issues we talked about earlier. The fact that we don’t want to propagate bias, the fact that we want to make it, you know, explainable, transparent. We want to make sure it's working as designed, that, you know, we have actually not created harm, that we don't have unintended consequences, that we don't see the models drifting based on different inputs that are coming in. So, we need to actually look at how do we set up a process. This is not just going to be on the person who built the, tested the original model, which may have, they may have done all the things right. But as soon as it goes out in full deployment, that's when we start needing to think about do we have a process to be able to actually look into this. And that is going to require a focus for the senior leadership to adequately resource for artificial intelligence. 
Artificial intelligence, in some cases we are seeing now, is across the board part of their strategic plan. It needs to be treated like that, and not just as a software upgrade. And so I think that will probably be one of the biggest changes that organizations have. Now they're starting to look at how responsible AI can be applied, not just as a nice thing to have, it's important, but actually implemented within their own organization so that they are making sure that everything they say they're gonna do, they are doing.

Dr. Maddux: I've been here talking to Dr. David Rhew, Chief Medical Officer for Microsoft, about the remarkable transformations that ChatGPT and generative AI have made in society and fields of medicine. David, thank you so much for joining us today.

David C. Rhew, M.D.: It's been my pleasure. Thank you very much for having me.