Interview with Jean-Emmanuel Bibault, oncologist and researcher specialising in artificial intelligence at the Georges Pompidou European Hospital. His research focuses on the use of AI in medicine to predict the evolution of cancers and the effectiveness of treatments.
1. The use of AI for medical imaging is one of the most promising solutions for improving diagnosis and care in general, but some are calling for caution regarding the results of such new equipment. What is the state of knowledge and what are the existing limitations?
Indeed, AI, and more specifically what is known as Deep Learning, is quite effective at image analysis, particularly in medical imaging such as CT scans or MRIs.
There are several aspects to consider. Firstly, do existing algorithms work well for image-based diagnosis? In reality, we know they work very well, provided they are properly trained. However, there is a risk of training them inappropriately, for instance by using data sets that are not representative of a population or that come from a minority population. Secondly, with regard to the use of these algorithms – provided they have been properly trained – it will also be necessary to validate their use in medical studies, much as we do for the validation of medication. As such, medical studies will probably be needed to evaluate the use of AI in the analysis of medical images.
A recent example concerns the use of AI for mammography screening. A study carried out in Sweden on 80,000 patients showed that a reading by one doctor combined with a reading by an AI can be just as effective as the double reading of mammograms by two doctors, as is currently done in France and generally in the western world. This does not lead to any failure to detect breast cancer, and it saves up to 44% of medical reading time. This means that more screenings can be carried out, at a lower cost and more quickly. So there seems to be a potential benefit, yet it needs to be assessed through rigorous medical studies.
Then there is education and teaching in the use of these techniques, because we know that even high-performance algorithms can be used inappropriately. A recent study showed that doctors changed their diagnosis when they saw that the AI had reached a different one; having lost confidence in the AI, they then deliberately gave the opposite of its diagnosis. So the question is not only about having high-performance algorithms, but also about knowing how to use them correctly and placing the right amount of confidence in them. It is therefore essential to teach doctors about the use of algorithms and quality criteria, so that they know how to use these algorithms properly, but also how to identify potentially defective ones.
The final aspect concerns the democratisation of access to high-performance algorithms, wherever we are. Personally, I tend to think that if we use algorithms intelligently, we will be able to deploy them everywhere in France and throughout the world at the touch of a button. This would enable us to offer an equivalent quality of medical service everywhere, much more easily than we can deploy doctors. For instance, a one-millimetre difference in tumour delineation before radiotherapy can reduce the chances of recovery by 10%, which is considerable. Such problems could be avoided by using high-quality algorithms. So, as far as access to care is concerned, I think that if we act wisely, this will help democratise and improve the quality of care, rather than the opposite.
2. How do you respond to those who think that AI will reduce health inequalities? And how might it increase them?
There is potential for improvement, but there are also risks to consider. I can give you an example: a study published in 2017 by Stanford University, in the journal Nature, compared the performance of a Deep Learning algorithm at recognising melanomas in skin photos against that of 21 expert dermatologists.
It turned out that the algorithm performed better than the dermatologists on the database used to train it. However, it was then realised that this database contained very few, if any, images of dark skin. As a result, the algorithm did not work well at all on darker skin tones. This highlights the importance of the representativeness of the data used to train AI algorithms. If we are not careful in this respect, we run the risk of creating algorithms that are not adapted to certain populations, without even realising it. It is a significant problem to consider.
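This kind of gap only becomes visible when performance is measured per subgroup rather than on the whole test set. As a purely illustrative sketch (not from the study itself; the field names and data are hypothetical), one can compute accuracy separately for each skin-tone group, so that a weakness on an under-represented group shows up instead of being hidden in the overall average:

```python
# Illustrative sketch: per-subgroup accuracy. A high overall score can
# mask poor performance on a small, under-represented subgroup.
from collections import defaultdict

def accuracy_by_subgroup(records):
    """records: iterable of dicts with 'subgroup', 'label', 'prediction'."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        total[r["subgroup"]] += 1
        if r["prediction"] == r["label"]:
            correct[r["subgroup"]] += 1
    return {g: correct[g] / total[g] for g in total}

# Hypothetical results: 95 images of light skin, only 5 of dark skin.
records = (
    [{"subgroup": "light", "label": 1, "prediction": 1}] * 90
    + [{"subgroup": "light", "label": 0, "prediction": 1}] * 5
    + [{"subgroup": "dark", "label": 1, "prediction": 0}] * 4
    + [{"subgroup": "dark", "label": 0, "prediction": 0}] * 1
)
print(accuracy_by_subgroup(records))
# Overall accuracy is 91%, yet accuracy on the "dark" subgroup is only 20%.
```

The same stratified view can be applied to any metric (sensitivity, specificity) and to any demographic attribute available in the data set.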
One way of avoiding this kind of bias is to add an interpretability component to algorithms. This means using techniques that make it possible to understand why an algorithm has made a particular diagnosis, thus avoiding the “black box” effect. We already have techniques for doing this, and it would make it easier to detect surprising, biased or problematic results, as opposed to a situation where the algorithm operates in an opaque way. This is a risk reduction aspect.
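One such technique (chosen here as an example; the interview does not name a specific method) is occlusion sensitivity: a patch is slid over the image and the drop in the model's score is recorded, so that large drops mark the regions the model actually relied on. The `model` below is a hypothetical stand-in scoring function, just to make the mechanics concrete:

```python
# Illustrative sketch of occlusion sensitivity, a simple interpretability
# method. The "model" is a toy stand-in that only looks at the top-left
# 4x4 corner of the image, so the heat map should highlight that region.
import numpy as np

def model(image):
    # Toy scoring function: responds only to brightness in the corner.
    return image[:4, :4].mean()

def occlusion_map(image, model, patch=4, fill=0.0):
    h, w = image.shape
    baseline = model(image)
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = fill
            # Score drop = how much this region contributed to the output.
            heat[i // patch, j // patch] = baseline - model(occluded)
    return heat

image = np.ones((8, 8))
heat = occlusion_map(image, model)
print(heat)  # only the top-left cell carries any importance here
```

In practice, a clinician reviewing such a map could notice, for instance, that a melanoma classifier is reacting to a ruler or a skin marking in the photo rather than to the lesion itself.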
In terms of potential benefits, one well-known example concerns diabetic retinopathy and its automated diagnosis using Deep Learning. Google recently conducted a clinical trial in India to rigorously validate a machine capable of diagnosing and assessing diabetic retinopathy in an automated fashion. This could have a positive impact, particularly in a country like India, where the number of ophthalmologists is limited, but where there are many cases of diabetic retinopathy due to the high number of diabetics. Having a diagnosis is a crucial step, although treatment and follow-up are also essential. AI could help to fill medical deserts, including possibly in France.
However, I feel it is important to add, for political decision-makers, that we must not underestimate the need for medical and paramedical staff by directing all investment towards AI tools. The approach should not be based on a simple comparison of performance between AI and healthcare professionals, but rather on the ambition to do better. It is possible that in the future an AI could match a doctor in terms of diagnosis, but our aim must be to ensure better quality of care and optimal management for all.
We are already facing a shortage of doctors and healthcare workers, which is jeopardising our ability to provide high-quality care. In the future, we will need both AI and healthcare professionals to maintain and even strengthen the quality of our healthcare system.
3. What is your perspective as a doctor and researcher on the regulatory issues surrounding AI in healthcare?
I think there is a risk of over-regulation in France and Europe. There is no doubt that AI tools will continue to be developed, mainly in the United States and China. These tools will have the capacity to capture considerable economic value, especially if they demonstrate improved medical performance.
However, it is essential to understand that this will lead to a new dimension of economic appropriation, something that has already given rise to much discussion, particularly in the context of AI. Medicine is no exception, and if high-performance tools are developed, our patients will legitimately demand to be able to benefit from them.
Beyond the economic dimension, there are also public health concerns, particularly with regard to bias. The daily use of tools developed in the United States or China could mean that these tools are better suited to treating American populations than European populations. This could indirectly lead to a loss of opportunity for European patients if we were to use only American tools. I believe it is a priority to promote the development of AI tools in Europe using European data sets.
Regulation is necessary to protect patients, but it must not be so excessive that it slows down innovation. The GDPR is an example of this: it has already been criticised for the obstacles it creates for research by imposing long and complicated procedures for accessing data.
Ultimately, I think it’s a question of striking a balance between regulation that protects patients and overly strict regulation that hinders innovation.
4. The French Bioethics Law stipulates that patients must be informed about the use of AI-enabled medical devices and that developers of AI in healthcare must ensure that the algorithms can be explained to users. What do you think?
In medicine, I think it’s important to be able to explain things properly to patients. However, the idea of informing patients about the use of AI seems to me to be out of touch, if not absurd.
Let’s imagine a cancer patient who goes to a medical establishment for radiotherapy. At some point, AI is used in the treatment preparation process. If we decide to inform the patient of this use, it raises questions about the implications of this information. If the patient refuses, what can we do? Suggest that they go elsewhere? This is not a solution, because medical processes are often standardised, with workflows established and implemented in a similar way in all facilities.
On the other hand, it also raises a very pertinent question about tolerance for errors. At present, it is difficult to determine precise statistics, but human medical errors remain one of the main causes of death in the western world. When we talk about integrating AI into healthcare, we immediately seem to focus on specific cases of potential error, which is surprising. This touches on how we perceive medical errors and how we react when some form of automation is involved in healthcare, broadly defined. Of course, if mistakes are made, responsibility must be established, whether or not AI was involved.
At the end of the day, it is likely that errors attributable to systems exploiting AI, directly or indirectly, will be far less frequent than those caused by humans. I think it would be wiser to take a more epidemiological and statistical approach to assessing the impact of AI on medicine.