An artificial intelligence (AI) model has been developed that accurately predicts the disease risk associated with human gene variants.
The sequencing of the human genome has revealed substantial genetic variation. However, for 98 percent of gene variants, the clinical implications are unknown. An international team of researchers from the University of Oxford and Harvard University in Boston, Massachusetts, built an AI tool named EVE (evolutionary model of variant effect), which uses advanced machine learning technologies to predict the disease risk of variations in human genes. EVE classified more than 256,000 human gene variants of previously unknown clinical significance as either disease-causing or benign (not harmful).
'Increasingly, people have access to sequencing their genomes, but making sense of the data is not always straightforward,' said study senior author Dr Debora Marks, an associate professor at the Blavatnik Institute at Harvard Medical School. 'We believe our approach can be used as an added tool in current clinical assessments and offers a powerful new way to reduce uncertainty and clarify decision-making, particularly in the clinical setting.'
In the study, published in Nature, the scientists first trained EVE on 250 million protein sequences from 140,000 non-human species, searching for patterns of genetic variation across evolution from which to estimate the likelihood that each variant is disease-causing or benign. Sequences that are conserved across evolution are more likely to be essential to function, so alterations to them could be linked to disease. Variation that evolution has tolerated, by contrast, is more likely to be benign.
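To make the conservation idea concrete, here is a deliberately simplified sketch in Python. It is not EVE, whose internals this article does not describe. It assumes a tiny made-up alignment of protein sequences and scores a variant by how often the mutant amino acid appears at that position compared with the original one. The function names, the pseudocount and the toy sequences are all illustrative assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_frequencies(alignment, pseudocount=1.0):
    """Per-position amino acid frequencies, smoothed so nothing is zero."""
    freqs = []
    for pos in range(len(alignment[0])):
        counts = Counter(seq[pos] for seq in alignment if seq[pos] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        freqs.append({aa: (counts.get(aa, 0) + pseudocount) / total
                      for aa in AMINO_ACIDS})
    return freqs

def variant_score(freqs, pos, wild_type, mutant):
    """Log-ratio of mutant to wild-type frequency; lower means less tolerated."""
    return math.log(freqs[pos][mutant] / freqs[pos][wild_type])

# Hypothetical aligned sequences from five species (toy data, not real proteins).
toy_alignment = ["MKVLA", "MKILA", "MKVLS", "MRVLA", "MKLLA"]
freqs = column_frequencies(toy_alignment)

# Position 0 is identical in every species; position 2 varies between V, I and L.
conserved_hit = variant_score(freqs, 0, "M", "W")  # about -1.79
variable_hit = variant_score(freqs, 2, "V", "I")   # about -0.69
print(conserved_hit, variable_hit)
```

The change at the fully conserved position scores lower than the change at the variable one, which is the intuition in the paragraph above. A per-position count like this ignores interactions between positions, which a model trained on millions of sequences would be expected to capture.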
The scientists then applied EVE to 3219 human genes known to be associated with disease to check whether its predictions were accurate, and found the results to be remarkably accurate. To test EVE further, the researchers compared its predictions against results from clinical experiments on well-studied mutations in five genes, including BRCA1, and found that the predictions were in keeping with the known clinical data.
'Our results turned out to be far better than we expected,' Dr Marks said. 'It seems that by simply training a model to fit the distribution of sequences across evolution we extract information which enables us to make unexpectedly precise predictions about disease risk arising from a given genetic variant.'
EVE outperformed other prediction methods, including computational tools and gold-standard high-throughput screening. Traditional machine learning models use data that are labelled according to their disease association. However, labelling can introduce bias and result in unreliable models. Therefore, in developing EVE, the researchers used an unsupervised machine learning approach, trained on unlabelled data, that provides a continuous score of disease likelihood along with a certainty score for each prediction.
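One way to picture how unlabelled scores can yield both a disease likelihood and a certainty measure is the toy Python sketch below. It is an illustration under stated assumptions, not the method reported in the study: it fits two bell curves to made-up variant scores without any labels, treats the lower-scoring group as disease-causing (a convention chosen for this example), and uses the entropy of the resulting probability as an uncertainty measure. All names and numbers are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_gaussians(scores, iterations=100):
    """Fit a two-component 1D Gaussian mixture with EM; no labels are used."""
    means = [min(scores), max(scores)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: how strongly each score belongs to each component.
        resp = []
        for x in scores:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate each component from its weighted scores.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, scores)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, scores)) / nk
            variances[k] = max(var, 1e-3)  # floor avoids a collapsed component
            weights[k] = nk / len(scores)
    return weights, means, variances

def disease_probability(x, weights, means, variances):
    """Continuous score: probability of the lower-mean (assumed harmful) group."""
    low = 0 if means[0] < means[1] else 1
    p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
    return p[low] / sum(p)

def binary_entropy(p):
    """Uncertainty: 0 when the call is clear-cut, highest when p is 0.5."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

# Made-up variant scores: lower means less tolerated by evolution (toy data).
toy_scores = [-9.0, -8.5, -8.0, -7.5, -7.0, -6.0,
              -4.0, -3.0, -2.5, -2.0, -1.5, -1.0]
weights, means, variances = fit_two_gaussians(toy_scores)

for score in (-9.0, -5.0, -1.0):
    prob = disease_probability(score, weights, means, variances)
    print(score, round(prob, 3), round(binary_entropy(prob), 3))
```

A score deep in either group gets a probability near 1 or 0 and low uncertainty, while a score midway between the groups gets a probability near 0.5 and high uncertainty. That is the kind of case a clinician would want flagged rather than forced into a category.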
The researchers stress that EVE should not be used as a diagnostic test. Inaccurate interpretations could lead to misdiagnosis or false assurance that the patient is disease-free. Dr Yarin Gal from the University of Oxford, who co-led the research, added: 'What we hope this approach will do is generate powerful data that can empower the clinicians on the frontlines to make the right diagnostic, prognostic, and treatment decisions.'