Additional info on IGV¶
Reviewing Read Level Support¶
In the Interpretation Browser users can click on the “Variant” hyperlink (chromosome:position) and view the read level support for their variant in the BAM through IGV.js.
The user is redirected to log in again with AD to the OpenCGA application, which displays the selected alignment and variant tracks.
In addition, users can download a batch script from IGV.js that facilitates viewing the read level support in the desktop version of IGV.
For further details of how to use IGV in the Interpretation Browser, please refer to the existing 100,000 Genomes Project Genomics England Guide to Clinical Reporting for Rare Disease.
Visualising CNV breakpoints¶
Refining CNV breakpoints in IGV.js¶
Copy number variants in the GMS are detected using DRAGEN CNV in the Rare Disease pipeline and Canvas for somatic variants in the Cancer pipeline. Structural variants are detected using Manta for both Rare Disease and Cancer. See the Rare Disease Analysis and Cancer analysis guides for further information. The following provides guidance for visualising and refining CNV breakpoints in IGV for both rare disease and cancer cases. Either a local installation of IGV or the web-based IGV.js application provided by Genomics England can be used.
Copy number variants (CNVs) detected by the DRAGEN CNV or by Canvas variant callers are based on coverage and therefore the breakpoints may be imprecise. Variant calls provided by Manta are based on split reads and improperly paired reads and can be used to help refine breakpoints.
Below is an example of how to refine the CNV breakpoints in the sample LPxxxxxxx-DNA_xxx, CNV Canvas:LOSS:chr6:78257398:78326483. It is a common non-pathogenic CNV chosen for illustration only.
How to visualise breakpoints in IGV.js¶
CNV breakpoints can be visualised in the local IGV Desktop viewer or in IGV.js. To visualise rare disease CNVs, users can click on the genomic coordinates in the tabular display on the Interpretation Portal. In cancer cases, the list of structural variants can be found in the HTML reports. Please note that in the supplementary analysis, only Manta calls can be visualised in the IGV browser, but coverage can be visualised in the same way as described for rare disease cases below.
To visualise CNVs using coverage profiles, users can view the coverage bigwig file (users can do this via IGV.js by unticking all boxes except for the BIGWIG file in "Available files to show").
Example: https://igv.genomicsengland.nhs.uk/?samples=1000000032:LPxxxxxxx-DNA_xxx&region=chr6:78257398-78326483 (zoom out to see the surrounding regions)
CNV calls are frequently too large to display individual reads for the whole region. To focus on the breakpoint regions, zoom into each breakpoint and tick both the BAM file and the bigwig file to show. Shown below are the reads at the breakpoints of the above CNV.
In the example above, it is clear that at both breakpoints the drop in coverage coincides with around half of the reads having many mismatches starting from the same position, and therefore the breakpoint can be precisely pinpointed. Duplication breakpoints can be visualised in a similar manner, and an increase in coverage would be expected.
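To zoom into each breakpoint, it can help to derive a small window around the start and end of the call. The following is a minimal sketch only: it assumes the CNV identifier follows the caller:type:chromosome:start:end pattern of the example above, and the function names and the 500 bp flank are hypothetical choices made for illustration, not part of any Genomics England tool.

```python
def parse_cnv_id(cnv_id):
    """Split an identifier such as 'Canvas:LOSS:chr6:78257398:78326483'.

    Assumes the five colon-separated fields seen in the example CNV.
    """
    caller, cnv_type, chrom, start, end = cnv_id.split(":")
    return caller, cnv_type, chrom, int(start), int(end)


def breakpoint_windows(chrom, start, end, flank=500):
    """Return one region string per breakpoint, padded by `flank` bases.

    The flank size is an illustrative assumption; adjust it to taste.
    """
    windows = []
    for position in (start, end):
        window_start = max(1, position - flank)  # never go below base 1
        windows.append(f"{chrom}:{window_start}-{position + flank}")
    return windows


caller, cnv_type, chrom, start, end = parse_cnv_id(
    "Canvas:LOSS:chr6:78257398:78326483"
)
for region in breakpoint_windows(chrom, start, end):
    print(region)
```

Each printed region can be pasted into the locus box of IGV or IGV.js to inspect the reads at that breakpoint.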
Please note that the local installation of the IGV viewer has more functionality than IGV.js: it allows users to colour improperly paired reads and to review information associated with individual reads in the BAM by hovering over them. N.B. Some users have noted that visualising BigWig and BAM files in their local copies of IGV is slow and have found that IGV.js is sufficient for basic visualisation. See the Appendices below for details on how to download a batch script from IGV.js for local IGV Desktop use.
Limitations¶
Breakpoint refinement depends on the presence of split and/or improperly paired reads, and these reads need to map to unique regions of the genome. Therefore, split and improperly paired reads are not available when CNV breakpoints fall in large regions of poor mappability (e.g. large segmental duplications), which is frequently the case for recurrent pathogenic CNVs.
In the absence of split reads, to visualise breakpoints, users need to use IGV viewer rather than IGV.js to be able to colour improperly paired reads. Such calls are likely to be called as imprecise by Manta.
BAM and VCF visualisation¶
BAM and VCF visualisation is provided by a tool called IGV.js. This works in a similar way to the IGV desktop application.
- To open IGV.js, go to the following link: https://igv.genomicsengland.nhs.uk/
- You may be prompted to sign in via Azure (if you are not already signed in to the Interpretation Portal or another instance of IGV.js).
- You can then build the query to stream the section of BAM or VCF you are interested in as follows:
  https://igv.genomicsengland.nhs.uk/?samples=STUDYID1:LPxxxxxxx-DNA_xxx; STUDYID2:LPxxxxxxx-DNA_xxx&region=chr13:32,929,163-32,929,227
- Samples are allocated to studies depending on their type and the reference genome used. They can be found in the following studies:
- Once you have generated the query for your sample from the LP barcode, you will be given the option to choose the files you can visualise.
- Select the files you would like to see and click on the Show Tracks button on the top right-hand side of the page.
- Your BAM file will then be viewable on the screen and you will be able to browse to different regions of the genome.
- To allow the user to see how long they have left in the session, there is a clock in the top left-hand corner. The token will last for 30 minutes unless you hit the refresh button to allow another login session.
- The user also has the option to open the BAM file in IGV desktop by using the download batch script button.
- Save the batch script locally on your computer.
- Open IGV desktop on your local machine and, under Tools, select Run Batch Script and open the downloaded batch script.
- This will load your BAM or VCF into IGV desktop for you to view.
- Note: If your session has expired in OpenCGA, you will not be able to load the file in IGV desktop.
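The query described in the steps above can also be assembled programmatically. The sketch below is illustrative only: the function name is hypothetical, and it assumes that multiple STUDYID:sample pairs are joined with a semicolon and that the region is passed through unchanged, as the example query suggests. It only builds the string; signing in via Azure still happens in the browser.

```python
IGV_BASE_URL = "https://igv.genomicsengland.nhs.uk/"


def build_igv_url(samples, region, base_url=IGV_BASE_URL):
    """Build an IGV.js query URL.

    samples: list of (study_id, sample_id) tuples, e.g.
             [("1000000032", "LPxxxxxxx-DNA_xxx")]
    region:  locus string such as "chr6:78257398-78326483"

    The ';' separator between samples is an assumption based on the
    multi-sample example in this guide.
    """
    if not samples:
        raise ValueError("at least one (study_id, sample_id) pair is required")
    sample_param = ";".join(f"{study}:{sample}" for study, sample in samples)
    return f"{base_url}?samples={sample_param}&region={region}"


print(build_igv_url([("1000000032", "LPxxxxxxx-DNA_xxx")],
                    "chr6:78257398-78326483"))
```

The placeholder study ID and LP barcode need to be replaced with the real values for your sample before the link will resolve.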
File types available to view¶
File Type | Description |
---|---|
Short tandem repeat genotypes detected by ExpansionHunter | |
Copy number variants detected by DRAGEN CNV | |
Genotypes of approximately 500,000 SNPs used for Genomic and Data Checks | |
Genotypes of SNPs used by the Sample Matching Service | |
Small variants detected by the DRAGEN small variant caller after normalisation | |
candidateSmallIndels.vcf.gz | Subset of the candidateSV.vcf.gz file containing only simple insertion and deletion variants of size 50 or less. |
Structural Variants detected by Manta, joint called for family members where available | |
candidateSV.vcf.gz | Unscored SV and indel candidates detected by Manta. Includes low quality and small (<50bp) variants. |
Genome coverage file | |
Intermediate file from DRAGEN CNV. This file can be used to review dropout regions for which CNV signals are not extracted from the alignments for inclusion in CNV calling. CNV events may span these intervals if there is sufficient signal in flanking regions. | |
Genome alignment (CRAM format) generated by the DRAGEN aligner |
Querying OpenCGA¶
N.B. The NHS Genomic Laboratory Hubs (GLHs) are responsible for any genomic data downloaded from Genomics England to an NHS GLH computer or file system. Downloading BAM files is not permitted, as this could cause issues with your network speed. Please only use the download functionality for VCFs.
- Navigate to the OpenCGA webservices: https://apps.genomicsengland.nhs.uk/opencga/webservices
- First you will need to generate your session ID. This is done using the login endpoint, found by expanding the Users section.
- “Patient Choice” (aka Consent) is taken from patients when a test is ordered in TOMS.
- A record of the patient choice decisions is visible for both Rare Disease and Cancer cases in the GMS Interpretation Portal.
- Please see the diagram below for a simplified workflow of consent options in TOMS.
- Enter your user ID in the user box, type {"password":"yourpassword"} in the body section, then click “Try it out!”
- You will see your session ID (sid) generated in the box below; this will be needed to get your files later.
- To build the URL for the download of the VCF you will need your session ID, the name of the sample you wish to download and the study ID. For example:
  https://apps.genomicsengland.nhs.uk/opencga/webservices/rest/v1/files/LPxxxxxxx-DNA_xxx.vcf.gz/download?study=10000000xx&sid=YOUR_SESSION_ID_HERE
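The download URL can be put together from its three parts as in the following minimal sketch. The function name is hypothetical, the URL layout is copied from the example above rather than from the OpenCGA API reference, and the check on the file extension simply mirrors the note that only VCF downloads are permitted. The code builds the string only and makes no network request.

```python
from urllib.parse import quote, urlencode

OPENCGA_REST_BASE = (
    "https://apps.genomicsengland.nhs.uk/opencga/webservices/rest/v1"
)


def build_vcf_download_url(file_name, study_id, session_id,
                           base=OPENCGA_REST_BASE):
    """Assemble a download URL following the pattern shown in this guide.

    file_name:  e.g. "LPxxxxxxx-DNA_xxx.vcf.gz"
    study_id:   the study the sample belongs to
    session_id: the sid returned by the login endpoint
    """
    if not file_name.endswith(".vcf.gz"):
        # Reflects the guidance above: BAM downloads are not permitted.
        raise ValueError("only .vcf.gz files should be downloaded")
    query = urlencode({"study": study_id, "sid": session_id})
    return f"{base}/files/{quote(file_name)}/download?{query}"


print(build_vcf_download_url("LPxxxxxxx-DNA_xxx.vcf.gz",
                             "10000000xx",
                             "YOUR_SESSION_ID_HERE"))
```

Because the session ID is part of the URL, treat the resulting link as a credential and avoid sharing or logging it.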
Querying OpenCGA via the command line¶
For users wanting more functionality, there is an opencga.sh command-line tool for querying the database.
For more information please see:
http://docs.opencb.org/display/opencga/opencga.sh
Client libraries to REST web services¶
There are also Python and R client libraries, developed by OpenCGA, for querying all the data. For more information please see:
http://docs.opencb.org/display/opencga/RESTful+Web+Services+and+Client