Published online: 18 August 2017
School of Chemistry and Molecular Biosciences (SCMB), University of Queensland, St Lucia, Qld, Australia
Tel: +61 7 3365 8549
The occurrence of highly resistant bacterial pathogens has risen in recent years, causing immense strain on the healthcare industry. Hospital-acquired infections are arguably of most concern, as bacterial outbreaks in clinical settings provide an ideal environment for proliferation among vulnerable populations. Understanding these outbreaks beyond what can be determined with traditional clinical diagnostics and implementing these new techniques routinely in the hospital environment has now become a major focus. This brief review will discuss the three main whole genome sequence techniques available today, and how they are being used to further discriminate bacterial outbreaks in nosocomial settings.
In recent decades, society has witnessed a rapid and alarming increase in highly resistant pathogens causing human disease, to the extent that the World Health Organization (WHO) has labelled antimicrobial resistance (AMR) as ‘a serious threat to global public health’1. The challenge of AMR is arguably greatest in healthcare facilities, which present a unique environment for pathogens to proliferate and infect those most vulnerable. Once within the hospital, these pathogens can be readily spread due to the close proximity of patients and the mass of shared vectors in the environment (including bathrooms, wards, healthcare workers, trolleys, etc.). With limited treatment options, patient outcomes during these nosocomial outbreaks decline, leading to increased mortality and a substantial economic burden2–4. Traditional techniques employed in routine hospital diagnostics focus mainly on phenotypic and gene-centric analyses, which are useful for inexpensive and rapid surveillance in the preliminary stages of an outbreak. However, these techniques are unable to render fine-scale analyses capable of detecting transmission pathways and otherwise unexpected epidemiological connections. As such, new high-resolution technologies, such as whole genome sequencing (WGS), are now being used in response to outbreak investigations. This article will discuss the three main WGS technologies available today and briefly review their existing and future impact in clinical settings.
Many hospitals worldwide already appreciate the power of WGS analysis, having integrated it into several published investigations5–7. The advantages of WGS are most accessible through the use of short-read sequencing as exemplified by the Illumina platform. In addition to being both high-throughput and cost-effective, analysis tools specific for short-read sequencing are well established, making it a useful and reliable research tool. One of the main advantages of WGS is the ability to detect single nucleotide variants (SNV). Detection of SNV can allow prediction of phenotypic changes, such as enhanced resistance to antibiotics (for example, via the loss of a functional outer membrane porin caused by indels or point mutations8–10). When coupled with metadata, SNV data can also be used to predict transmission pathways by inferring transmission directionality via the accumulation of SNV over time. Short-read data are also routinely used to determine presence or absence of genes using read mapping or short-read assembly techniques.
Carbapenem-resistant Enterobacteriaceae (CRE) are among the most prevalent clinically relevant organisms, designated as an urgent threat by the Centers for Disease Control and Prevention11. Snitkin et al.12 were amongst the first to apply WGS to identify transmission pathways in a CRE outbreak that could not be resolved using traditional epidemiological investigations alone. Recently, we have used WGS to investigate a suspected CRE outbreak in a Brisbane hospital13. Using Illumina short-read sequencing, we were able to determine transmission of a carbapenemase-producing Enterobacter cloacae sequence type (ST) 90 strain within the intensive care unit (ICU) between three patients over a three-month period. Of the 10 isolates analysed, we detected only 4 SNV overall, indicative of direct transmission. Patients 1 and 3 were found to have identical isolates at the core genome level, despite not being in the ICU at the same time. This suggested a probable environmental source of the infection, rather than direct patient-to-patient transmission. This was supported by comparison of the isolates to publicly available genomes, which identified a near-identical E. cloacae from 2013 (isolated in the same ward and hospital). This E. cloacae differed by only one SNV from the 2015 isolates, suggesting a reservoir in the hospital environment since at least 2013. Despite efforts to identify the environmental source in the hospital through extensive screening of all possible environmental vectors (excluding healthcare workers), a source was not found. Overall, WGS was able to accurately determine the relationship of E. cloacae over the course of the outbreak, providing unambiguous evidence of environmental cross-transmission over at least two years, prompting further surveillance and re-imposing infection control standards.
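The kind of SNV comparison described above can be illustrated with a minimal Python sketch. It assumes the isolates have already been reduced to equal-length aligned core-genome sequences; the isolate names, the toy sequences and the helper functions `snv_distance` and `pairwise_snv` are hypothetical and are not taken from the study or from any published pipeline.

```python
from itertools import combinations


def snv_distance(seq_a, seq_b, ignore="N-"):
    """Count positions where two aligned sequences differ.

    Positions where either sequence has an ambiguous base or a gap
    (characters listed in `ignore`) are skipped rather than counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_snv(alignment):
    """Return {(name_x, name_y): SNV count} for every pair of isolates."""
    return {
        (x, y): snv_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Toy data only: four made-up 10 bp "core genomes".
toy_alignment = {
    "patient1": "ACGTACGTAC",
    "patient2": "ACGTACGAAC",
    "patient3": "ACGTACGTAC",
    "archived": "ACGTTCGTAC",
}

distances = pairwise_snv(toy_alignment)
for pair, count in sorted(distances.items()):
    print(pair, count)
```

In this toy data set, a distance of zero between two isolates mirrors the situation of identical core genomes, while small non-zero distances correspond to closely related isolates. Real analyses call variants from read mapping against a reference and account for recombination and sequencing error, none of which is modelled here.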
The most significant drawback of using short-read sequencing is the inability to accurately characterise mobile genetic elements (MGE) such as insertion sequences (IS), genomic islands and plasmids. MGE are often associated with important elements such as virulence factors and antibiotic resistance genes14–16, but are also known to comprise numerous repetitive regions, the main culprit being IS17. These repetitive regions cannot be traversed by short reads, causing ‘collapsed repeats’ in final assemblies that ultimately impede the contextualisation of important genomic regions18–20. While tools exist to patch together these assemblies, such as Bandage21, increasing the sequence read length to span these repetitive regions is the only unambiguous solution for resolving MGE22.
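The reason repeats defeat short reads can be shown with a toy Python sketch. It assumes a made-up 49 bp "genome" containing two identical copies of an IS-like repeat and uses exact string matching in place of a real read aligner; the sequences and the helper `find_all` are hypothetical and purely illustrative.

```python
def find_all(genome, read):
    """Return every start position at which `read` occurs exactly in `genome`."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)
    return positions


# Toy genome: unique flank, repeat copy 1, unique spacer, repeat copy 2, unique flank.
repeat = "GGATCCTTAAGG"  # stands in for an insertion sequence (12 bp)
genome = "ACGTTGCA" + repeat + "TTGACCAGT" + repeat + "CATGCAAT"

# A "short read" that lies entirely inside the first repeat copy.
short_read = genome[10:18]

# A "long read" covering the first repeat copy plus 4 bp of unique
# sequence on either side.
long_read = genome[4:24]

short_hits = find_all(genome, short_read)  # matches inside both repeat copies
long_hits = find_all(genome, long_read)    # anchored by the unique flanks

print("short read placements:", short_hits)
print("long read placements:", long_hits)
```

The short read fits equally well in either repeat copy, so an assembler cannot tell which copy it came from, whereas the long read is anchored by unique sequence on both sides and has a single placement.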
Currently, the most established long-read sequencing technology is Pacific Biosciences (PacBio) Single Molecule Real-Time (SMRT) sequencing. This technology can now routinely provide complete bacterial chromosomes and plasmids23, allowing contextualisation of important genomic regions, tracking of plasmids and completion of high-quality reference genomes24. One of the most progressive examples of PacBio integration into clinical settings comes from Mount Sinai, New York, where Sullivan et al.25 successively sequenced 137 methicillin-resistant Staphylococcus aureus (MRSA) isolates using PacBio to allow high-precision surveillance and outbreak control. In our investigation of the 2015 carbapenemase-producing E. cloacae, we used PacBio SMRT sequencing to completely characterise a large ~330 kb IncHI2 plasmid carrying a complex ~55 kb multidrug resistant (MDR) region that contained the carbapenemase gene blaIMP4. Once characterised, we were able to undertake broader epidemiological surveillance for this plasmid, resulting in its identification in E. cloacae patient isolates (and an Escherichia coli) from other South-East Queensland hospitals. Comparison of our plasmid to publicly available data also found a remarkably similar plasmid carried by a Salmonella enterica isolate from a cat26. Similarly, Sheppard et al.27 were able to use a combination of long- and short-read sequencing to track blaKPC positive Enterobacteriaceae in a single hospital over 5 years, ultimately showing the promiscuity of blaKPC within several different hosts and in several different plasmids. These studies highlight the importance of tracking not only outbreaks focused on clonal transmission, but also MGE transmission between bacterial strains or species in hospital settings (Figure 1).
While this technology is effective at producing complete genomes, it does come at a price. A single bacterial genome with PacBio SMRT sequencing can cost 20 times that of an Illumina sequence28. Additionally, the amount of DNA required, the time needed for the library preparation (Figure 2), as well as the relatively low-throughput system, make routine sequencing with PacBio less attractive than short-read sequencing due to the time required per sample. Ultimately, to implement long-read sequencing into the clinical setting, the turn-around time from sample isolation to analysis results needs to be within a workable timeframe.
Oxford Nanopore MinION sequencing is a relatively new sequencing platform designed to couple long-read sequencing with a rapid turn-around time. Its portability and capacity to perform real-time analysis during sequencing has made it an attractive companion for outbreak investigations in remote regions, as was the case with the ZIBRA project29. Clinically, MinION sequencing has been used to retrospectively analyse S. enterica isolates in relation to a European-wide outbreak30, which showed that the outbreak strain could be identified from less than 2 hours of sequencing, despite not having the resolution to elucidate transmission pathways. Viral pathogens have also been identified from clinical metagenomic samples, with a projected turn-around time of less than 6 hours31. While this new technology is promising, some hurdles remain before MinION sequencing can be used routinely. These include the lower base level accuracy, as well as the current difficulty in sequencing more than 12 isolates in parallel (as compared to Illumina)28,32.
Overall, WGS technology has advanced considerably in the last decade, with an equivalently rapid decline in price33–35. While still being more expensive than traditional diagnostics, if successful in preventing further transmission during an outbreak this cost becomes negligible compared to the cost of continued patient treatment and repeated infection control. WGS also retrospectively provides a wealth of information, allowing the hospital to progressively catalogue past and ongoing infection occurrences, ultimately providing a highly detailed epidemiological map of pathogen movement in the hospital and from the community that can quickly be referred to in the case of new infections. Implementation of WGS in response to an outbreak guarantees an in-depth high-resolution analysis that cannot be determined using traditional phenotypic and genotypic methods alone. Currently, all three sequencing technologies presented in this article can be used in a complementary manner to produce a comprehensive outbreak understanding and provide ongoing genomic surveillance of the strain or element. Routine implementation of WGS in healthcare settings will undoubtedly become widespread in the near future, aiding clinicians, patients, infection control and researchers alike.
Leah Roberts is currently a third-year PhD candidate completing her studies at the University of Queensland, Brisbane, Australia. Under the supervision of Associate Professor Scott A Beatson and Professor Mark A Schembri, her research interests have mainly focused on the use of whole genome sequencing and its application in clinical settings. Using a range of sequencing technologies such as Illumina, PacBio and Nanopore, Leah has analysed a variety of bacterial outbreaks, including Acinetobacter baumannii, Klebsiella pneumoniae and Enterobacter cloacae. She has presented her work at both domestic and international conferences, and has won several awards including the 2016 ASM BD award for Queensland, as well as the 2017 Applied Bioinformatics and Public Health Microbiology (ABPHM) student poster prize.