Home - CPI

Change Log

Custom Exomiser reports

Mon, 10 Feb 2020 03:38:00 +0000

Users can now run Exomiser with parameters of their choosing. For example, users can set the allele frequency, variant type, inheritance mode, and model organisms. To use Exomiser, choose Reports -> Exomiser from the menu. Results will be emailed to you when the analysis completes.

gnomAD annotated search: De novo & Mendelian inheritance filters

Tue, 04 Feb 2020 21:47:00 +0000

Similar to the pipeline annotated search, the gnomAD annotated search now supports de novo and Mendelian inheritance filters.

v2.0 and v2.1 dropped from CPI28 release

Thu, 30 Jan 2020 01:46:00 +0000

As of Jan 29, 2020 (CPI28 release), datasets generated from pipeline versions v2.0 and v2.1 have been dropped from the database in favor of the newer pipeline versions v2.3 and v2.38. Users who would still like to access the older pipeline versions can do so on request, or can request to have their data reanalyzed using a newer version of the pipeline.

Truncated alignment (BAM) files for offline viewing

Thu, 30 Jan 2020 01:13:00 +0000

We've introduced a more efficient way of downloading alignment files: instead of downloading the entire BAM, you download only the region around the variant of interest. We've added a packaged download of all the needed files (VCF, BAM, indexes) for offline viewing. The original BAM is truncated to include 1000 bp before the start coordinate and 1000 bp after the end coordinate of the variant. These files can be found in the 'Datasets' section under the heading "Variant alignment files". Rather than waiting hours or days, you can download in seconds or minutes.
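
The truncation window described above can be illustrated with a short calculation. This is only a sketch: the function names are hypothetical, coordinates are assumed to be 1-based and inclusive, and the `chrom:start-end` region string is simply a common way of writing genomic intervals, not necessarily what the portal uses internally.

```python
def truncation_window(start, end, flank=1000):
    """Return the (start, end) interval kept around a variant.

    Assumes 1-based inclusive coordinates; the window start is
    clamped to 1 so it never runs off the beginning of a chromosome.
    """
    if start > end:
        raise ValueError("start must not be greater than end")
    return max(1, start - flank), end + flank


def region_string(chrom, start, end, flank=1000):
    """Format the truncation window as a 'chrom:start-end' region."""
    lo, hi = truncation_window(start, end, flank)
    return f"{chrom}:{lo}-{hi}"


# Example: a variant spanning positions 5000-5010 keeps 4000-6010.
print(region_string("chr1", 5000, 5010))
```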

gnomAD annotated search

Wed, 13 Nov 2019 21:50:00 +0000

We've introduced an exciting new way to browse and filter variants using annotations provided by gnomAD for exomes. The existing search, which is based on the pipeline provided annotations, is unchanged and can still be used in the usual way. The new search looks at the same data, but views it in a different way.

There are several differences between the two search mechanisms. The primary difference is that the pipeline annotated search is never updated and represents the annotations that were current at the time the samples were analyzed. The gnomAD annotated search, on the other hand, is updated quarterly with every new release of gnomAD across all previously analyzed patients. By using the latest frequencies and gene names, we can go back and review cases that were previously unsolved.

Another difference is what each row represents. The pipeline annotated search displays ONE variant linked to ONE transcript. The gnomAD annotated search displays transcripts (not variants), which means one variant can have MULTIPLE transcripts. As a result, the search results may show multiple rows for the same variant, each representing a different transcript.

The other difference is the tabbed search results. The pipeline annotated search currently shows a single tab for all search results. The gnomAD annotated search splits the results into two tabs. The first tab displays variants that have a link to a gnomAD record. The second tab lists variants that have no link to gnomAD.

By default, the gnomAD annotated search assumes the assembly build is GRCh38. If users need to go back and look at previously analyzed patients, they should remember to switch the assembly to GRCh37.

Some new filters were introduced, including LoF (loss of function), Homozygous count, Canonical, and gnomAD consequence.

Structural variants are not supported in the gnomAD annotated search. To search SVs, please continue to use the pipeline annotated search. The new gnomAD search is not meant to replace the pipeline search, but to complement it with fresh information for the variants that gnomAD covers, which are far fewer than those covered by the pipeline. It is highly recommended to use both. For a demonstration video please follow this link: https://youtu.be/iGRiP8hmet8
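
To make the row-per-transcript layout and the two result tabs more concrete, here is a minimal sketch. The `TranscriptRow` class, its field names, and the sample identifiers are all hypothetical and chosen only for illustration; they do not reflect the portal's actual data model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TranscriptRow:
    variant: str               # hypothetical variant key
    transcript: str            # hypothetical transcript identifier
    gnomad_id: Optional[str]   # None when no gnomAD record is linked


def split_into_tabs(
    rows: List[TranscriptRow],
) -> Tuple[List[TranscriptRow], List[TranscriptRow]]:
    """Split rows into (linked to gnomAD, not linked to gnomAD)."""
    linked = [r for r in rows if r.gnomad_id is not None]
    unlinked = [r for r in rows if r.gnomad_id is None]
    return linked, unlinked


# One variant appears twice because it maps to two transcripts.
rows = [
    TranscriptRow("var1", "transcript_A", "gnomad_rec_1"),
    TranscriptRow("var1", "transcript_B", "gnomad_rec_1"),
    TranscriptRow("var2", "transcript_C", None),
]
linked_tab, unlinked_tab = split_into_tabs(rows)
```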

Exomiser analysis available

Wed, 04 Sep 2019 04:41:00 +0000

What is Exomiser? Taken from the website: "The Exomiser is a tool that finds potential disease-causing variants from whole-exome/genome data. Starting from a VCF file and a set of phenotypes encoded using the Human Phenotype Ontology (HPO) it will annotate, filter and prioritise likely causative variants. The program does this based on user-defined criteria such as a variant's predicted pathogenicity, frequency of occurrence in a population and also how closely the given phenotype matches the known phenotype of diseased genes from human and model organism data."

Our tests indicate that Exomiser ranks the correct gene in the top spot (#1) 52% of the time, in the top 5 78% of the time, and in the top 10 91% of the time.

How to use it: download the attachment, unzip/decompress it, and open the file ending with HTML or TSV. The genes are listed from highest ranking downwards.

Currently we use some default settings in Exomiser, but a sample can be re-analyzed using more specific criteria to produce better results. What's important is that the 'clinical diagnosis' is captured in our Patient database so that Exomiser can do the phenotype+genotype analysis. Exomiser results are available for download through the web portal in the 'datasets' section for newly imported results, provided that the clinical diagnosis is available.

GRCh38 assembly now supported

Thu, 18 Jul 2019 00:27:00 +0000

We have finally made the switch to the latest GRCh38 assembly for bioinformatics analysis. The search now includes the option of choosing between assembly versions GRCh38 and the older GRCh37, so users can still browse our database for both assemblies. We have compared the two assemblies for some previously analyzed samples and have noticed differences in pathogenicity scores and in the inclusion/exclusion of variants.

Data storage and compression using CRAM

Thu, 22 Nov 2018 21:43:00 +0000

As our database expands in scale, managing and storing large genomic sequence data has become a challenge, particularly with the large BAM files. As sequencing costs fall, we are anticipating a tsunami of data as researchers switch from exome to whole genome sequencing. In preparation, we've taken early steps to further compress our BAMs using CRAM compression (lossless). Our database system manages its disk space autonomously: if usage of our allocated disk space reaches a threshold of 80%, it automatically converts the oldest BAMs to CRAMs and archives them to tape storage, making way for newer datasets. Our testing has shown that the CRAM compression format saves roughly 30% in storage and will provide significant cost savings. Users who wish to access an archived BAM can click a button in our web interface and the system will automatically restore the CRAM from tape and convert it back to BAM.
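
The threshold-based archiving policy can be sketched roughly as follows. This is not the system's actual implementation: the `BamFile` class and its fields are hypothetical, and the stopping rule (archive oldest first until usage drops below the threshold) is an assumption, since the entry above only states when archiving starts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BamFile:
    name: str
    size_gb: float
    age_days: int


def select_for_archive(
    files: List[BamFile],
    used_gb: float,
    capacity_gb: float,
    threshold: float = 0.80,
) -> List[BamFile]:
    """Pick the oldest BAMs to archive while usage is at or above threshold.

    Assumption: archiving a BAM to tape frees its full size on disk,
    and selection stops as soon as usage falls below the threshold.
    """
    selected = []
    # Oldest files first.
    for f in sorted(files, key=lambda f: f.age_days, reverse=True):
        if used_gb / capacity_gb < threshold:
            break
        selected.append(f)
        used_gb -= f.size_gb
    return selected
```

With 850 GB used out of 1000 GB, the sketch keeps selecting the oldest files until usage falls under 800 GB, and selects nothing when usage is already below the threshold.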

Search filter: Clinical diagnosis and provisional variants

Fri, 12 Oct 2018 03:12:00 +0000

We've added two new search filters for Clinical diagnosis and Provisional variants. By combining these two filters, we can find all patients that fall under the same disease category and look for provisional variants they may have in common. This is useful in a research context where we may be able to find a pattern of variants that may be influential in disease pathogenesis. Both the Clinical diagnosis and Provisional variant information are pulled in from our related Patient Database. Therefore, for these filters to be useful, the values must first be specified in the Patient Database prior to use in the search.

Predictive relatedness, sex and ancestry reports

Fri, 31 Aug 2018 04:23:00 +0000

We've recently added a new step in our pipeline that uses the genetic data to predict the degree of relatedness, sex and ancestry. This is particularly useful as a quality check to spot potential sample mix-ups, poor DNA quality, contamination, errors in the patient details provided, etc. In the event of a possible error, users are automatically notified by email, with the reports attached for further investigation. We are currently running the reports retrospectively for all of our previous datasets and have already found some data entry errors. In such cases, we may want to rerun the pipeline analysis, as such errors can affect the variant prioritization. These reports are also available as downloads in the 'datasets' section.

Prediction filtering can be separated by logical OR/AND

Fri, 31 Aug 2018 04:14:00 +0000

Previously, when combining filters on predictions and scores, the search always joined the filters with a logical AND. We've now changed this to allow users to specify the logical operator (AND/OR) between the prediction and score filters. For example, in a single query you can ask the database for all the variants that have a PolyPhen prediction of 'probably damaging' OR a ClinVar prediction of 'pathogenic'. Previously, you would have had to run separate searches for PolyPhen and ClinVar.
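
The difference between the two operators can be shown with a small sketch. The dictionary keys (`polyphen`, `clinvar`) and the `matches` helper are hypothetical names used for illustration and are not the portal's query interface.

```python
def matches(variant, filters, operator="AND"):
    """Check a variant against prediction filters joined by AND or OR.

    `variant` and `filters` are plain dicts mapping a (hypothetical)
    prediction source name to a prediction label.
    """
    checks = [variant.get(source) == wanted for source, wanted in filters.items()]
    if operator == "AND":
        return all(checks)
    if operator == "OR":
        return any(checks)
    raise ValueError("operator must be 'AND' or 'OR'")


filters = {"polyphen": "probably damaging", "clinvar": "pathogenic"}
variant = {"polyphen": "probably damaging", "clinvar": "benign"}

# The variant satisfies only one of the two filters,
# so it is returned by an OR query but not by an AND query.
print(matches(variant, filters, "OR"), matches(variant, filters, "AND"))
```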

Search profiles - New filters

Mon, 23 Jul 2018 03:24:00 +0000

Previously, only gene lists were supported in the Search Profile feature. We've recently added support for storing lists of mutation types (exon, splice, missense, nonsense), ExAC frequency, and gnomAD frequency.

Improved structural variation prioritisation

Tue, 26 Jun 2018 04:59:00 +0000

Matt Field has made some significant improvements to the prioritisation of structural variants (SVs), and we've updated our database to reflect those changes. They include a combined report for both SV callers, prioritisation of SVs where exons are most likely to be impacted, a max length filter applied to most SV types, and a flag for whether the event is novel or known. These changes dramatically reduced the number of high priority SVs from >3000 to around 90, alongside 449 medium priority SVs. Please note that we do not retrospectively re-analyse and update the SV reports for previous records; the changes only affect new data.

Handling control samples

Tue, 26 Jun 2018 04:44:00 +0000

We don't necessarily want to see variants from our control samples in the database, but at the same time we still want to be able to download the VCFs and do SNP validation to ensure we don't have sample mix-ups. We've created a separate page listing the control samples and their corresponding VCFs for download.

Automated archiving of BAM files

Tue, 26 Jun 2018 04:41:00 +0000

Keeping BAM files available for download is a real challenge, and we face constant pressure to free up disk space as more projects come onboard. We've come up with a way to automatically archive BAM files that are older than 1 year to tape storage without any human intervention.

Health reports: GWAS

Tue, 26 Jun 2018 04:20:00 +0000

In addition to ClinVar and SNPedia, we've recently added the GWAS Catalog to the health reports, based on the rsNumbers for a patient. GWAS is particularly useful in a research context: it compares variant frequencies in the affected population against a control (healthy) population using statistical analysis to establish a hypothetical link between variants and disease traits. In the health report under GWAS, we've added the following columns: disease traits, studies, risk allele, initial sample size, replication sample size, p-value and risk allele frequency. False positives (a false association between variant and disease) are not uncommon in GWAS due to uncontrolled biases, so it's important to consider whether any replication studies were done to give more confidence in the hypothesized association.

Health reports: ClinVar & SNPedia

Tue, 22 May 2018 04:05:00 +0000

We've added a new feature where users can generate health reports, downloadable in Excel format, from multiple data sources including ClinVar and SNPedia, based on the patient's rsNumbers/variants and genotype. The health reports indicate the patient's risk factor associated with a particular disease/trait. A report can take up to 20 mins to generate, and an email is sent with the health report attached. Magnitude is a subjective measure of interest ranging between 0 and 10. The higher the number, the more significant. A magnitude score of 2 or higher is probably worth investigating. A magnitude score of 4 or higher is definitely worth investigating. More info at: https://www.snpedia.com/index.php/Magnitude.
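
The magnitude guidance above amounts to two cut-offs, which can be written as a small helper. The function name and the label returned for scores below 2 are assumptions made for illustration; only the thresholds of 2 and 4 and the 0-10 range come from the entry above.

```python
def magnitude_advice(magnitude):
    """Map a magnitude score (0-10) to the guidance given above."""
    if not 0 <= magnitude <= 10:
        raise ValueError("magnitude is expected to be between 0 and 10")
    if magnitude >= 4:
        return "definitely worth investigating"
    if magnitude >= 2:
        return "probably worth investigating"
    # Label for low scores is an assumption, not stated in the entry.
    return "below the suggested thresholds"


print(magnitude_advice(2.5))
```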

Excluding variants from search based on Patient study codes

Tue, 17 Apr 2018 01:48:00 +0000

When doing our own variant analysis, we often look for variants that are shared between affected individuals, and we already provide this capability through the 'shared' filter. We recently added a new filter that takes this search one step further by removing variants found in unaffected individuals (usually from the same family). There is a new textbox called 'Exclude variants' where users can add patient study codes; variants found in these individuals are excluded from the variants found in the other individuals, all in a single search operation. Keep in mind that each person carries thousands of variants, so filtering in this way can be quite slow if no other filters are applied. We recommend that users apply as many filters as possible to narrow the search before using this functionality.
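
Conceptually, combining the 'shared' filter with 'Exclude variants' is a set intersection followed by a set difference. The sketch below assumes variants are held in memory as sets keyed by patient study code; the function name, study codes, and variant keys are hypothetical, and the real search runs against the database rather than in Python.

```python
def shared_minus_excluded(variants_by_patient, affected, excluded):
    """Variants shared by all affected patients, minus any variant
    seen in an excluded (for example unaffected) patient."""
    if not affected:
        return set()
    shared = set.intersection(*(set(variants_by_patient[p]) for p in affected))
    for patient in excluded:
        shared -= set(variants_by_patient[patient])
    return shared


# Hypothetical study codes and variant keys.
variants_by_patient = {
    "P1": {"varA", "varB", "varC"},
    "P2": {"varA", "varB", "varD"},
    "P3": {"varB", "varE"},  # unaffected relative
}
candidates = shared_minus_excluded(variants_by_patient, ["P1", "P2"], ["P3"])
```

Here P1 and P2 share varA and varB, and varB is removed because the unaffected relative P3 also carries it.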

gnomAD ethnic frequencies exportable

Tue, 17 Apr 2018 01:15:00 +0000

We've added a new option for users to export gnomAD ethnic frequencies to Excel. The export includes South Asian, East Asian, African American, Jewish, non-Finnish European, Finnish and other minor allele frequencies (MAF). It's optional because we don't actually store the gnomAD frequencies in our database and have to fetch them from elsewhere, which makes the export slower, especially when exporting thousands of variants. It's best to filter as much as you can before enabling this option.

Affected statuses: Database vs Pipeline

Tue, 03 Apr 2018 22:55:00 +0000

We recently introduced a new filter called 'Pipeline affected status'. This is not to be confused with the other 'Affected status' or 'Disease status' filter, which is taken from our Patient Database. The 'Pipeline affected status' differs in that you can reconfigure the pipeline to use a different affected status from the one set in the database, in order to produce different cohort reports. This is useful in cases where the affected status applies to multiple phenotypes or diagnoses and you want to do repeated cohort analysis under different conditions.

Phenotype to Genotype based variant searching

Thu, 15 Feb 2018 04:22:00 +0000

Expand your variant search based on known phenotype-genotype relationships. This filter only works if you have specified patient IDs in the filters. The phenotypes collected from the specified patients are used to query OMIM for gene relationships. A new tab called 'Phenotype-Genotype' is displayed in the results, showing the relationships between phenotypes and genes. This only works well for patients that have a good number of phenotypes captured in our databases.

RS number filter

Thu, 15 Feb 2018 04:21:00 +0000

Users can now search by rsNumbers in our search filters.

Variants from cohort reports are now included

Thu, 15 Feb 2018 04:20:00 +0000

Previously, only the variants from the SNV, INDEL and SV reports were included in our database. We've recently rebuilt our database to include all variants found in the cohort report, even the questionable ones of poor quality, because the pipeline pedigree analysis found some suggestion of an inheritance pattern for them. This means there are more variants for you to browse than there were before.

Gene interactions - Genes don't work in isolation, and your gene lists shouldn't either

Wed, 20 Dec 2017 03:56:00 +0000

Genes don't work in isolation, and your gene lists shouldn't either. Researchers will often have a list of known genes to look for when prioritizing variants based on the patient's clinical diagnosis. But what should you do if no candidate variants can be found based solely on your gene list? There are many approaches, but one option is to expand the gene list based on known gene interactions and pathways. We rely on the highly curated database BioGRID to expand the gene list to include the network of genes known to interact, either directly or through protein-to-protein interactions. To use this new feature, tick the new 'Gene interactions' checkbox to expand your gene-based search in this way.
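
The idea of expanding a gene list through known interactions can be sketched as follows. This is a simplified illustration only: the interaction pairs are placeholders rather than real BioGRID records, the function name is hypothetical, and the sketch assumes a single hop (direct interaction partners only), which may differ from how the portal builds the network.

```python
def expand_gene_list(genes, interactions):
    """Add every gene that directly interacts with a gene in the list.

    `interactions` is an iterable of (gene_a, gene_b) pairs, treated
    as undirected. Only one hop from the original list is followed.
    """
    seed = set(genes)
    expanded = set(seed)
    for gene_a, gene_b in interactions:
        if gene_a in seed:
            expanded.add(gene_b)
        if gene_b in seed:
            expanded.add(gene_a)
    return expanded


# Placeholder gene names, not real interaction data.
interactions = [
    ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_D"),
    ("GENE_B", "GENE_E"),
]
print(sorted(expand_gene_list(["GENE_A"], interactions)))
```

Starting from GENE_A alone, the expanded list gains GENE_B but not GENE_E, because GENE_E is two hops away.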

Search profiles

Wed, 20 Dec 2017 03:47:00 +0000

Users can now create their own search profiles to store commonly used search filters, so the same options don't have to be chosen again for every search. One example is to include your gene lists in a search profile. Search profiles are associated with the user only and are not shared.
