Wow, what a lot of people! I recently attended the Festival of Genomics and Biodata in London and was impressed by the sheer number of attendees. It was my first time at the conference, which provides a crossroads for the entire genomics ecosystem. What I experienced inspired me to share my thoughts here. I hope they will be interesting for people working in genomics, reading DNA and working out how data might inform clinical practice and patient care.
It's well known that the practice of medicine is changing, with pharmaceutical companies producing treatments targeted towards specific patients and their particular diseases. The key to this lies in a wide range of disciplines within medicine, but particularly in the inner workings of cells: the proteins they make and the molecules they produce.
Cells, DNA and the panomic approach
My visit to the festival reaffirmed my belief that progress requires as broad a range of data as possible – everything we can get from a cell, from the DNA itself, to the manner in which genes are switched on, to the products the cell makes. This is called the panomic approach; that is, all the types of omics. When combined with lower-cost, more flexible devices it could open the door to improved, practical clinical medicine.
After my short train ride from Cambridge to King's Cross and a quiet walk along the Regent's Canal, I was beginning to feel quite relaxed. Then I turned the final corner to be met by an enormous crowd of people. I have genuinely not seen so many people since before the pandemic. I spent the first hour in a state of shock, but then memories of crowded floors and full lecture theatres kicked in and we were away. I wasn't the only one – nearly every conversation kicked off with a version of "I haven't seen so many people for years". And so many conversations, so much energy. Spring had come early to North London.
Long read sequencing becoming the norm
The conference drew attendees from right across the spectrum, including industry leaders, top academics and students, and closed with a talk by Brian Cox (I'll get to that later). A major theme that ran through several debates was that long read sequencing is becoming the norm. Here, more of the DNA is read in one go: each individual read is typically 10,000 base pairs or longer, and this now comes with higher accuracy.
Long read sequencing previously had limitations in throughput, cost and accuracy. These limitations are being overcome by Oxford Nanopore and PacBio technologies, with a Nature Biotechnology article (Jain et al. 2018) demonstrating a quoted average read length of up to 100 kb for ultra-long Oxford Nanopore reads with ~99% accuracy. As a result, the range of applications for long reads in genomics is broadening significantly.
It was obvious that much of the data presented at the conference was partly or wholly long read sequencing data. This significant adoption is driving other key players such as Illumina, historically better known for multiple shorter reads, to offer long read sequencing as well. I was impressed by how their chemistry updates and latest generation devices were increasing read lengths while also improving accuracy.
Maybe, if the end result is a suitably assembled sequence, it matters less to users which underlying technology produced it. In a sense, researchers and clinicians are becoming increasingly agnostic about the underlying approach and are concentrating more on what they can do with the data.
Genomics, the complete set of DNA and genes
A second theme that ran through the conference was that combining different data sets is providing increasing clinical efficacy. These range from genomics (the complete set of DNA and genes) through epigenomics (the manner in which the environment can control the expression of genes), transcriptomics (the RNA produced in a cell) and proteomics (the proteins produced using instructions in the RNA) to metabolomics (small molecules produced by the cells). From a clinical standpoint this makes sense – the more varied the clinical inputs the better – and indeed this extends even further, from the genomic data to the physical presentation and history of a patient.
A true multiomic or panomic approach, where the full range of information described above is measured from a cell, is likely to further drive an agnostic approach to the source of genomic data. It is unlikely that one company can provide everything, and even if it can, a lab is likely to have a range of equipment. This is going to further increase the importance of data aggregators, which pull all the data together, and of the means by which this incredibly data-rich and potentially overwhelming information is presented clearly to researchers and clinicians. As clinical utilisation and uptake increase, the human side of data understanding and presentation is likely to become as important as the detailed technology underpinning the data acquisition.
In a way the overwhelming feeling of thousands of people crowding into a conference was symbolic of the multiple data sources being presented to clinicians. What you have to do is take a deep breath, have a coffee, and plan how to take it all in to get the outcome you need. If you do all this, then finally you might get to the end and the calming voice of Brian Cox reminding us of the excitement of the scientific journey we are all on.
At CC we specialise both in helping clinicians make use of this complex data to improve patient care, through our teams of life science and artificial intelligence experts, and in developing the complex engineering to measure it. Please reach out to me via LinkedIn or by email if you think we might be able to help.