Saphetor is a precision medicine startup founded by its CEO, Dr Andreas Massouras, with whom we were delighted to talk about cutting-edge clinical data-driven diagnostics, powerful databases for genetic research, and the fascinating world of the genome and the information that lies within.
So how does DNA play a role in clinical diagnostics? Dr Massouras confirmed in our interview that our genome is not entirely deterministic – meaning that factors other than genes may come into play in the manifestation of a disease – but genetic data provides crucial information if interpreted correctly. The kind of DNA-sequence analysis done by Saphetor provides valuable input for clinical decisions, helping doctors identify their patients’ diseases accurately and prescribe the appropriate treatment. Saphetor provides its services to diagnostic laboratories, whether within hospitals or serving independent institutions and practitioners, by handling all the computational aspects to yield readily interpretable, classified data. Like many others in the new age of healthcare, Dr Massouras is convinced that this diagnostic method will be part of the standard of care in the near future.
To interpret a genome and find the key data in a sea of information, Saphetor has devised powerful algorithms that yield variants with disease-causing probabilities and comprehensive genome interpretation annotations. Nucleotides are the molecular building blocks of genes – the units of an organism’s information. Differences in the sequence of nucleotides, called “variants”, may cause or contribute to disease. Identifying disease-causing variants allows doctors to diagnose the precise condition among many that have similar symptoms, or to screen a patient for known genetic diseases.
“Every person has about 5 million differences in their DNA to most other people and to the human reference genome, so if you look at your DNA and mine there will be 5 million differences.”
A patient’s data is first obtained from Next Generation Sequencing (NGS), which yields a raw sequence of the patient’s genome. A two-step analysis then starts in order to make sense of all this information. The first step is to find the variations, meaning the differences between the patient’s sequence and the human reference genome. However, an extensive list of variants does not yet give us the information we are looking for; it still needs to be interpreted. The second step is therefore to sort the variants into categories automatically. All available information about the patient’s variants is cross-referenced with data from decades of research in order to help find accurately which of the 5 million variations cause or contribute to the disease.
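The two steps described above can be illustrated with a deliberately tiny sketch. It is not Saphetor's algorithm: it assumes two already-aligned sequences of equal length, handles single-letter substitutions only, and uses a made-up knowledge base (`KNOWLEDGE_BASE`) with invented labels. The names `Variant`, `find_variants` and `classify` are hypothetical, chosen only for this illustration.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    position: int  # 1-based position in the reference sequence
    ref: str       # nucleotide in the reference
    alt: str       # nucleotide observed in the patient


def find_variants(reference: str, sample: str) -> list[Variant]:
    """Step 1: list every position where the sample differs from the reference."""
    if len(reference) != len(sample):
        raise ValueError("this toy example only handles equal-length sequences")
    return [
        Variant(pos, r, s)
        for pos, (r, s) in enumerate(zip(reference, sample), start=1)
        if r != s
    ]


# Hypothetical knowledge base standing in for "decades of research".
KNOWLEDGE_BASE = {
    Variant(3, "G", "T"): "pathogenic",
    Variant(7, "A", "C"): "benign",
}


def classify(variants: list[Variant], knowledge_base: dict) -> dict:
    """Step 2: put each variant in a category by cross-referencing known data."""
    return {v: knowledge_base.get(v, "uncertain significance") for v in variants}


reference = "ACGTTGAC"
sample = "ACTTTGCA"

variants = find_variants(reference, sample)
categories = classify(variants, KNOWLEDGE_BASE)
for variant, label in categories.items():
    print(variant.position, variant.ref, ">", variant.alt, label)
```

A real pipeline works on billions of short reads rather than two clean strings, and also has to deal with insertions, deletions and sequencing errors; the sketch only shows how "find the differences" and "categorise them" are separate steps.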
If this weren’t complex enough, we have to keep in mind that genes do not usually work alone, and a variant in one gene may have a knock-on effect on others. A number of “driver mutations” are required to cause cancer – a single mutation is not normally sufficient. For genetic conditions, there are often different variants, inherited from each parent separately, which cause a disease only in combination. Further, NGS analysis can be used to identify, say, a disease-causing variant in one gene as well as a variant in a different gene which helps the physician give the right prescription. Some genetic variants are linked only to the efficacy of drugs and not to the condition, but identifying them may also be very important. Combinations of variants are therefore usually considered.
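The idea of variants that matter only in combination can be sketched in a few lines. The example below flags genes that carry at least one variant inherited from each parent. It is purely illustrative: the gene names, the `(gene, origin)` data layout and the function `genes_hit_on_both_copies` are assumptions made for this sketch, and a real analysis would also weigh how damaging each variant is.

```python
from collections import defaultdict


def genes_hit_on_both_copies(variants: list[tuple[str, str]]) -> list[str]:
    """Return genes with at least one maternal AND one paternal variant."""
    origins_by_gene = defaultdict(set)
    for gene, origin in variants:
        origins_by_gene[gene].add(origin)
    return sorted(
        gene
        for gene, origins in origins_by_gene.items()
        if {"maternal", "paternal"} <= origins
    )


# Hypothetical patient data: (gene, parent the variant was inherited from).
patient_variants = [
    ("GENE_A", "maternal"),
    ("GENE_A", "paternal"),
    ("GENE_B", "maternal"),
    ("GENE_C", "paternal"),
    ("GENE_C", "paternal"),
]

candidates = genes_hit_on_both_copies(patient_variants)
print(candidates)  # ['GENE_A']
```

Only GENE_A is reported, because it is the only gene in this made-up data set with a variant from each parent.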
Integrating all of this information is the challenge that Saphetor is addressing, with algorithms that quickly pinpoint which variants are most likely to make the patient sick. Rather than working through the full list of 5 million potential variants, the clinician making the diagnosis can concentrate on the most probable variants yielded by Saphetor and make a quick yet accurate decision on treating the patient.
"What are the major challenges in properly integrating big data into medical research?"
“The main one is that there is a ton of information. Integrating it is a challenge in itself because data comes from a lot of different sources and is sometimes not in a format that one can readily use. So the first challenge is to bring all this data into a format that makes it very useful. But then, once it is integrated, the service provided by Saphetor is not just dumping lots of data onto a user. The next challenge is presenting it in a way that the most important parts are automatically highlighted to make the job of the user easier,” said Dr Massouras.
To provide such services, designing and sustaining a technical infrastructure that can accommodate all this data securely and appropriately is key. Saphetor, founded in 2014, is a young firm, and we were very interested in the technical set-up of such a venture. Saphetor is not the first organization to store and process a lot of data: CERN, for example, handles massive amounts of data, while many others use the Cloud as their infrastructure. However, Saphetor chose not to use Cloud technology for privacy reasons, because data of such a sensitive nature is best kept in a secure and private location. Although building the infrastructure was a challenge, Dr Massouras humbly acknowledged how Saphetor benefited from the expertise of others who had done this before.
To give an idea of the scale, if genomic medicine does become part of the standard of care as predicted, it will potentially represent petabytes of data (a petabyte is 10^15 bytes) – comparable to what CERN is now storing. Of course, Saphetor is not at that stage yet, and Dr Massouras indicated that the most important thing for Saphetor is to build a scalable infrastructure. Scalability is the capacity to grow to accommodate more and more data without impacting the performance or the accessibility of the existing data. Design is key to ensuring scalability, but so is maintenance: the databases must be kept up to date with the newest findings that come from various sources. As mentioned before, Saphetor’s mission is to assist clinical decisions and work towards a standard of accurate diagnostics driven by genetic data.
Saphetor grows and strives to reach that goal every day. The company works closely with its customers to constantly improve its offering, through a feedback loop that feeds its clients’ and partners’ medical needs, and information updates from the research world, into the development of its algorithms. It has a multidisciplinary team, which includes programmers, geneticists and molecular biologists. Their combined expertise is crucial in correctly developing both the data integration algorithms and the variant classification algorithms.
“Our job will never be done, because it is a continuous improvement.”
They also have a very good understanding of the advancement of medical care in the region around Lake Geneva in Switzerland, which is starting to be popularly called the “Health Valley”. Fruitful partnerships have already been signed with HUG and CHUV(*), the two academic hospitals in the region. For example, HUG’s department of medical genetics uses Saphetor’s services to crunch large amounts of data, diagnose patients with rare medical conditions and continue to research genetic correlations between certain variants and the occurrence of a sickness. More specifically, they address developmental and neurological diseases, and they are expanding the menu of conditions that they can diagnose. Rare diseases themselves affect 6% of the population in most developed countries, and research is now also expanding into neurological and psychiatric diseases such as epilepsy, schizophrenia, and many more, showing the increasing scope of this technology’s applications. Their colleagues in Geneva are at the top of their field: they know exactly where the state of the art is and they are ready to apply any new research finding immediately in clinical practice.
Of course, traditional symptomatic diagnostics will still have their place in the healthcare industry of the future. It depends on the patient’s condition: to what extent it is genetically caused, and to what extent it has to be diagnosed with other methods. There is a large range of diseases, and for each of them a new balance will be found once genetic data-driven diagnostics are included.
The usefulness of this method has been clearly demonstrated and nothing will stop it from becoming an integral part of medicine – it is now beyond doubt that genetic analysis is essential. The global limiting factor is the rate of adoption of the technology. As of now, this rate seems to be accelerating. Over the last decade, there were a few over-inflated expectations, but now this technology is mature, and so are the attitudes and expectations. Genetics is not the answer to everything in medicine, but it has become clear that it is part of the answer to a large range of health-related questions. The only remaining question is: how fast is it going to become ubiquitous?
Saphetor has also created a free knowledge tool called VarSome, which contains a number of databases for human genomic variants. It is continuously updated by researchers all over the world, and has accumulated 18 billion items of variant and gene annotation in only a few months. This powerful, fast new research tool is available at varsome.com.
*HUG = Hôpitaux Universitaires de Genève (Geneva University Hospitals)
*CHUV = Centre Hospitalier Universitaire Vaudois (Vaud region University Hospital Center)
By Katya Guez
Katya Guez is an editorial intern at WSPC, and is currently pursuing her Life Sciences and Bioengineering degree at the Swiss Federal Institute of Technology in Lausanne.