Latest improvements for CLC Genomics Workbench


CLC Genomics Workbench 8.5.4

Release date: 2017-03-22

All changes in this release have also been fixed on the CLC Genomics Workbench 10.x and 9.5.x lines at time of writing, with the exception of the one release note marked with an asterisk. That issue was fixed for CLC Genomics Workbench 10.0 and will be fixed in a future release from the CLC Genomics Workbench 9.5.x line.

Improvements

  • All NCBI server communication is now encrypted (uses HTTPS).
  • Updated the URL to use for links to UniProt databases.
  • Updated BLAST executables to be compatible with macOS Sierra. This change only affects Mac users.

Bug fixes

  • For the Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools:
    • Fixed an issue where the count and read count could be reported as marginally higher than they actually were in a small minority of cases. For the affected variants, this could then also result in variant frequencies being reported that were slightly higher than they should have been, in some cases above 100%. Variants affected by this issue are a small subset of variants where the variant affected overlapped another potential variant and where only the affected variant was then reported. This change could lead to a small decrease in the number of variants reported compared to earlier versions of the CLC software, due to a variant no longer passing the count or read count filtering constraints. The impact of this change is expected to be low. For example, in our tests, for a particular analysis that reported 250,000 variants, 30 fewer were reported with the same parameters and filters applied after this fix was implemented.
    • Fixed an issue where the coverage of a longer variant that contained another variant was reported for both the longer variant and the contained variant. The coverage for the contained variant is now reported correctly.
    • Fixed a bug where count, read count, and forward- and reverse read count could be incorrect for variants found in overlapping regions of a pair of reads and where the variant was originally identified as being adjacent to one or more other variants.
    • Fixed an issue affecting coverage calculation for SNVs without immediately adjacent variants when using paired read data: if the second read of a pair containing the variant did not meet the requirements of the quality filter, neither the first nor second read of that pair contributed to the coverage calculated for the variant.
    • Fixed an issue where for a SNV without immediate neighboring variants, overlapping reads of a pair that had conflicting base calls for that variant position contributed to the values calculated for coverage, read coverage, and read count of that variant.
    • Fixed an issue where the forward and/or reverse count for a longer variant, supported by paired reads with both children having the same direction, could be too low. The forward count and reverse count are now reported correctly.
  • Fixed an issue with the InDels and Structural Variants tool where an incorrect insertion could be called when the optimal alignment of a read's unaligned end around the breakpoint included a gap in the insertion sequence.
  • Fixed an issue where, when searching for both read1 and read2 in a broken pair, the Find Broken Pair Mates tool reported that the mate of read1 was itself. The tool now correctly shows that the mate of read1 is read2.
  • Fixed a rare issue where, after the right-click context menu option "Delete selection" was used, some annotations could go missing on sequences that had more than 1000 annotations of a given type before the deletion.

  • Fixed a bug in the Manage Enzymes wizard that prevented a user from cancelling the action if "Save as new enzyme list" was enabled.
  • Fixed a problem with the identification of the correct sequence types from MLST schemes in cases where the schemes contained blank characters. This issue affected Workbenches with the CLC MLST Module or CLC Microbial Genomics Module installed.*
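
As a rough illustration of how an overstated count could push a reported variant frequency above 100%, the sketch below assumes that frequency is calculated as count divided by coverage. The function name and the numbers are hypothetical and are not taken from the CLC software.

```python
def variant_frequency(count: int, coverage: int) -> float:
    """Return the variant frequency as a percentage of the coverage."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return 100.0 * count / coverage


# Hypothetical position: 40 reads cover it and all of them support the variant.
correct = variant_frequency(count=40, coverage=40)

# If the count is overstated by one, it exceeds the coverage
# and the frequency rises above 100%.
inflated = variant_frequency(count=41, coverage=40)

print(correct, inflated)
```

With a count filter applied, the same correction can also move a borderline variant below the threshold, which is consistent with the small drop in reported variants described above.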


CLC Genomics Workbench 8.5.3

Release date: 2016-06-16

Bug fixes

  • Fixed an issue with the RNA-Seq Analysis tool that could arise when the "Genomes annotated with genes and transcripts" option was chosen: If two or more genes had the same name, and a transcript could be assigned to each from the mRNA track, then the value in the "Transcripts annotated" column in the GE track and in the TE track was 0. Furthermore, all counts for such genes were reported as zero, even when there were reads mapping to them.
  • Fixed an issue where the Motif Search tool incorrectly reported all match accuracies as either 0% or 100%.
  • Fixed a bug that made the help for tree side panel settings inaccessible when the workbench was run in limited (evaluation) mode.

Advanced notice

  • From the autumn 2016 release, only 64 bit versions of the CLC Genomics Server, CLC Genomics Workbench, Biomedical Genomics Workbench, CLC Bioinformatics Database and CLC Assembly Cell will be made available. 32 bit versions of these will be discontinued from that time.
  • The Probabilistic Variant Detection (legacy) and Quality-based Variant Detection (legacy) tools will be removed from the Workbench in early 2017.
  • The tools in the Expression Profiling by Tags section of the Toolbox will be removed in early 2017. Tools affected: Extract and Count Tags, Create Virtual Tag List, and Annotate Tag Experiment.


CLC Genomics Workbench 8.5.2

Release date: 2016-04-07

Improvement

  • Performance optimization for sizing phylogenetic trees by metadata.

Bug fixes

  • Fixed an issue with handling dates when importing metadata from Excel format files using the Metadata Table Editor.
  • Create Detailed Mapping Report tool: the detailed mapping report statistics table now shows previously missing values for regions with partial coverage. For fully covered regions these values cannot be calculated, and empty strings are replaced with coverage minimum, average and standard deviation. Numeric sorting is retained by inserting NaN values instead of empty strings where calculations cannot be made.
  • Fixed an error when running Merge Overlapping Pairs on extremely short reads.
  • Fixed a bug that was causing missing report text lines.
  • Added support, when importing SAM/BAM files, for SAM records that declare a CIGAR string leaving no residues for aligning.
  • The Download Pfam Database tool has been updated to download version 29.
  • Fixed a frame offset bug that occurred when translating reverse complemented CDS regions into protein sequences.
  • Fixed an issue where Workflows were not able to remove intermediate data from permission enabled locations unless the top folder was writable.
  • The "Metadata Role Override" parameter that was visible when creating Workflows has been removed.
  • Fixed an issue that caused the 'Use random codon' parameter in the Reverse Translate tool to report a null-error.
  • Fixed threads being leaked in Map Reads to Reference when caching of indexed reference sequences was used.
  • Fixed an issue where Map Reads to Reference would, in rare circumstances, report a persistence error.
  • When the InDels and Structural Variants tool is added to a workflow, the "P-value Threshold" parameter did not show up in the Select settings wizard step under "Significance of unaligned ends breakpoints". This has been fixed.
  • BED Export: when exporting block list entries (such as connected exons from mRNA tracks), positions were absolute. This has been fixed: positions are now relative to the 'chromStart' position.
  • Fixed a bug where the download buttons in the BLAST result table view failed for nucleotide sequences.
  • Fixed an issue with renewing a borrowed license.
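
The BED Export fix above concerns block positions being written relative to 'chromStart', as the BED format expects. The following is a minimal sketch of that conversion, assuming 0-based, half-open exon intervals. The function name and the coordinates are hypothetical and do not reflect the Workbench's internal code.

```python
def bed12_blocks(exons):
    """Build BED block fields from absolute, 0-based half-open exon intervals."""
    exons = sorted(exons)
    chrom_start = exons[0][0]   # start of the first block
    chrom_end = exons[-1][1]    # end of the last block
    block_sizes = [end - start for start, end in exons]
    # Block starts are offsets from chromStart, not absolute positions.
    block_starts = [start - chrom_start for start, _ in exons]
    return chrom_start, chrom_end, block_sizes, block_starts


# Hypothetical transcript with three connected exons.
exons = [(1000, 1200), (1500, 1650), (2000, 2300)]
chrom_start, chrom_end, sizes, starts = bed12_blocks(exons)
print(chrom_start, chrom_end, sizes, starts)
```

Note that the first block start is always 0 under this convention, which is a quick sanity check for exported files.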


CLC Genomics Workbench 8.5.1

Release date: 2015-10-15

Bug fixes

  • Fixed a bug where the "Search for sequences at NCBI" tool would fail to download nucleotide sequences with the error message "The following sequences were not downloaded correctly: ...".
  • Fixed a problem with the BLAST at NCBI step of the Create Protein Report tool.
  • Fixed an issue leading to an error during VCF export where the data involved had originally been imported from VCF files and the values in the QUAL field were integers.
  • Export of floating-point (decimal) numbers to VCF format was previously dependent on the specified locale. This has been fixed so that the decimal separator is now always a point.
  • When doing automatic association of metadata, the log now shows which metadata rows were not associated with any data.
  • Fixed a bug that prevented the metadata manual information from being accessed from within the Workbench.
  • Fixed a bug where doing automatic association using a metadata table stored on a CLC Server would fail.
  • Automatic association of metadata now handles association based on a prefix of data names rather than exact matching to the whole data name.
  • A metadata table no longer needs a key column for its rows to be manually associated with data elements.
  • An option to override metadata roles previously visible in the configuration of Workflow outputs was removed.
  • Fixed an issue that caused locked parameters to be overwritten by a previously entered value during workflow execution.
  • Fixed an error happening when a Workbench Data Location was pointing at a file on the system instead of a folder. It will now appear as unavailable in the Workbench Navigation area.
  • Enabled tooltips for all parameters when configuring and executing workflows.
  • The login process from a Workbench to a CLC Server must now complete before a clc:// URL is opened.
  • Fixed a problem on Macs where the Workbench was not recognized as a custom protocol handler for clc:// URLs.
  • Resolved a rarely occurring exception that could be triggered by switching editor view with a double click.
  • Fixed a problem where after import of a large volume of data, using the "Show results" option in the process tab resulted in an error.
  • Fixed an error that occurred when pressing the Print button in the Help dialog (Mac OS X only).
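
To illustrate the prefix-based automatic association of metadata mentioned above, together with the logging of rows that were not associated with any data, here is a minimal sketch. The function name, the sample names, and the choice to prefer the longest matching key are assumptions made for this example. The notes do not describe how the Workbench resolves competing prefixes.

```python
def associate_by_prefix(metadata_keys, data_names):
    """Map each metadata key to the data names that start with that key."""
    associations = {key: [] for key in metadata_keys}
    # Assumption: try longer keys first, so "S10" wins over "S1" for "S10_reads".
    ordered = sorted(metadata_keys, key=len, reverse=True)
    for name in data_names:
        for key in ordered:
            if name.startswith(key):
                associations[key].append(name)
                break
    # Rows that matched nothing, as a log could report them.
    unassociated = [key for key, names in associations.items() if not names]
    return associations, unassociated


keys = ["S1", "S10", "S2"]
names = ["S1_reads", "S10_reads", "S10_mapping"]
associations, unassociated = associate_by_prefix(keys, names)
print(associations)
print(unassociated)
```

With exact matching, none of these data names would have been associated, because each name carries a suffix beyond the key.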

Changes

  • In the output from the Trio Analysis tool, the inheritance option "Accumulative" has been renamed to "Recessive".


CLC Genomics Workbench 8.5

Release date: 2015-09-08

New features and improvements

  • The Sequencing QC report now contains the total number of reads in the summary.
  • Numerical comparison operators >= and <= have been added to the filter tool for tables.
  • Quality scores (QUAL) are now calculated and added as annotations for variants. These values are included in VCF exports.
  • Batching on selected elements is now possible: it used to be restricted to selected folders.
  • The Search for Sequences at NCBI tool now has an option to search the EST database.
  • Improved memory management when handling large report elements.
  • Improved use of multiple cores when running the Create Detailed Mapping Report.
  • Improved use of multiple cores in the InDels and Structural Variants tool.
  • The output of the Reverse Complement Sequence now gets the suffix "-RC" attached to the name of the input. It used to be "-1".
  • The Hierarchical Clustering of Samples tool can now be executed as part of a workflow and can be executed on a CLC Genomics Server.
  • The fastq exporter can now export sequences up to 500Kbp. The limit used to be 32Kbp.
  • Tooltips on leaves of phylogenetic trees now display a description of the attached sequence.
  • Numbers are no longer appended to the names of Workflow elements when creating a copy of a Workflow using "Open Copy of Workflow".
  • Metadata Management. Keep track of input files and import meta information for your samples.

Changes

  • The tool "ChIP-Seq Analysis" has been renamed to "Transcription Factor ChIP-Seq".

Bug fixes

  • Fixed a SOLiD NGS importer bug where import of very low quality, colorspace-encoded, paired-end sequence reads in fastq format could lead to paired sequence lists where the wrong reads were marked as pairs.
  • Fixed an issue with the Map Reads to Contigs tool that could be extremely slow when included in workflows with multiple inputs.
  • Fixed a bug in the Annotate and Merge Counts tool where the Feature ID of mature 3' small RNAs in the "grouped on mature" tables was incorrect if the input data type was an Experiment.
  • Fixed an issue where some filtering operations, such as "doesn't contain", did not act correctly when filtering table cells that contained multiple pieces of information.
  • Fixed the automatically generated link to the COSMIC website, which previously led to a retired page.
  • Fixed an issue where annotations that spanned the ends of a circular sequence would be incorrectly placed in the Circular Sequence View.
  • Fixed a bug that caused the workbench to freeze if certain sequences were displayed in circular view with radial rendering of labels.
  • Fixed an issue whereby Create Box Plot and Principal Component Analysis could sometimes be run with illegal arguments, leading to an error message.
  • Fixed a bug in the Predict Secondary Structure tool when the option to calculate the partition function was selected for long molecules (>1000 nucleotides).
  • Fixed errors that prevented the side panel options of the gel view of a sequence list from being correctly applied and stored.
  • The list of Illumina adapters sequences has been removed from the Genomics Workbench.
  • Fixed an issue where one could not zoom in after zooming out fully on very large workflows.
  • Fixed an issue that prevented a root folder on Windows drives from being used as a File Location.
  • Fixed an issue where updating an existing installation on Windows would result in the .vmoptions file being deleted, which makes the Workbench run with the default Java configuration.
  • Fixed exported reports having the wrong author in certain situations.


CLC Genomics Workbench 8.0.3

Release date: 2015-08-13

Bug fixes

  • Fixed a read mapper bug that caused some reads to be incorrectly reported as unmapped when global alignment was selected.
  • Fixed an issue with the sort order for paired reads in SAM/BAM exports in high coverage regions.
  • Fixed a SOLiD NGS importer bug where import of very low quality, colorspace-encoded, paired-end sequence reads in fastq format could lead to paired sequence lists where the wrong reads were marked as pairs.
  • Fixed an issue where the Local Realignment tool when run with RNA-seq mapping could occasionally report a match that did not meet internal requirements as a valid match. This had a downstream effect when variant calling tools were run, and then failed upon encountering such a position. This issue has also been addressed in this release.
  • Fixed an issue where the Basic Variant Detection, Fixed Ploidy Variant Detection and Low Frequency Variant Detection tools would stop with an error when encountering a place in a read mapping containing a match that did not meet internal requirements of a valid match.
  • Selecting an entry in a Blast results table could highlight the wrong alignment in the Blast editor, if the table had been filtered or sorted.
  • Fixed an issue where an error would arise when using the Design Primers editor and clicking on an annotation on the sequence.
  • Fixed a bug that caused the mapper to enter an infinite loop if a reference of length 0 was used.
  • Fixed a rare bug that sometimes made the read mapper halt prematurely when several seeds were identified at the same reference position.
  • Fixed a rare issue where the Workbench would display an error message when installing a 3rd party licensed plugin.
  • Fixed an issue where an error would arise in some view types when a region of a sequence had been selected and then the "Zoom to selection" tool was used.

Improvements

  • The Illumina import now shows the file name on top of the process bar during the import.
  • In the "SAM/BAM Mapping Files" import tool, any inconsistencies between the reference sequences in the BAM file and the reference sequences in the CLC software that are provided for the import of the BAM file are now highlighted in red in the "References in files" table.


CLC Genomics Workbench 8.0.2

Release date: 2015-06-15

Bug fixes

  • Fixed an issue with running BLAST at NCBI where an NCBI-generated error about their CPU usage limit being exceeded was not being reported transparently and a result of "no hits" was being reported instead.
  • Added a workaround for a Java issue that occasionally resulted in the Workbench displaying an uninformative error and requiring a restart to continue working.
  • The InDels and Structural Variants tool can now better detect variants when using target regions near the edge of the regions.
  • Fixed a rare error in the Create Statistics for Target Regions tool. The error resulted in a failure when a target region only included the very last nucleotide of a chromosome.
  • The relative read direction filter in the Low Frequency Variant Detection tool is less strict on variants with large coverage.
  • The variant callers could enter an infinite loop for certain inputs. This fix adds a check that was unfortunately missing in the previous fix for this problem.
  • Fixed a bug in which Local Realignment could produce an illegal read mapping. This only happened for RNA data.
  • The variant caller will now fail if it encounters an illegal RNA read mapping. If the variant caller fails with such a message, and if it was run on locally realigned data, then we suggest re-running the local realignment to avoid the error.
  • The Reverse Translate tool ignored any genetic code specified in the codon frequency tables. All reverse translation would thus default to the standard genetic code.
  • Fixed wrong display of "Supported format" when exporting elements from either the Folder Editor or the Local Search Editor.
  • Fixed an issue where the wrong file could potentially be saved when editing a file found via the Local Search Editor.
  • Plots inside reports are now shown with their saved side panel settings.
  • Fixed saving different line colors in plots through the side panel.
  • Side panel option to show legends for a plot with more than 10 samples is now enabled.
  • Fixed an issue that led to an error when rendering plots for empty data sets.
  • Fixed text inside variant boxes in the track view sometimes having a small font size.
  • When installing a workflow with bundled data, it is no longer possible to select a read-only folder for storing the data.
  • Transcriptomics experiment and sample tables can now be sorted, even with large numbers of rows.
  • Fixed an issue where the "Empty Recycle Bin" option was sometimes incorrectly unavailable.
  • A fix was applied to avoid an exception in circumstances when the cleanup of downloaded files from BLAST failed.


CLC Genomics Workbench 8.0.1

Release date: 2015-04-16

New features and improvements

  • The former plugin "Duplicate Mapped Reads Removal" is now integrated under the name "Remove Duplicate Mapped Reads" and can be found in the NGS Core toolbox. Users who had previously installed this plugin need to uninstall it.
  • Link Variants to 3D Protein Structure now generates full biomolecules: even if the PDB template only contains a single subunit, the full multimer can be generated from symmetry information. This makes it possible to locate variants on the interface between protein chains.
  • The filtering option in the Create Track from Experiment tool only considered the predicted fold-changes in the positive direction, so features that were reduced in expression were filtered out. This has now been fixed.
  • BLAST has been upgraded to BLAST+ 2.2.30 that includes a number of improvements and bug fixes. A full list of BLAST+ 2.2.30 changes can be viewed at http://www.ncbi.nlm.nih.gov/books/NBK131777.
  • Transcriptomics experiment and sample tables can now be sorted, even with large numbers of rows.
  • Particular annotation types (columns) can now be specified for export in Excel, HTML and tab delimited formats.
  • Added column to output of "Annotate and Merge Counts" indicating 3' or 5' direction when using "grouping on mature" parameter.
  • Increased the performance for gzip export.
  • The results of BLAST searches now include a new view, the Blast Hit Table.

Bug fixes

  • Fixed an issue with running BLAST searches at the NCBI where an NCBI-generated error about their CPU usage limit being exceeded was not being reported transparently and a result of "no hits" was being reported instead.
  • Fixed an error that in rare cases would result in a division by zero error message when selecting rows in the Annotation Table view.
  • Fixed an error that made it impossible to add an annotation via the Annotation Table view if the table is empty.
  • Fixed rare problem where a track list of reads tracks and graph tracks would break.
  • Fixed an error affecting the "Cut Sequence Before/After Selection" tool in the Cloning editor.
  • Fixed a bug where a left-click quickly followed by right-click was interpreted as double-click on OS X (in the persistence search result list, in the toolbox tree, and in the workflow editor).
  • Fixed an error that occurred when running the Create Sequencing QC Report tool and requesting quality analysis reporting.
  • Fixed an error that prevented the import of adapters from csv format.
  • Fixed the SOLiD NGS importer to correctly import basespace encoded sequences in fastq files. It is still assumed that sequences originate from colorspace.
  • It is now possible to filter tables based on content in the 'Link Variants to 3D Protein Structure' column.
  • Fixed a rare error that caused the Amino Acid Change tool to crash if a CDS feature was less than 3 bases long.
  • Fixes and updates for automated genome downloads (Zea mays, C. elegans).
  • Fixed a bug in the probabilistic variant caller that caused it to fail for certain input.


CLC Genomics Workbench 8.0

Release date: 2015-02-24

New features and improvements

  • New tools:
    • Create Track from Experiment. This tool makes it possible to convert Experiments to Tracks. In the Experiment, the results of the statistical analysis are annotated on the experiment as additional columns. It can be advantageous to visualize the results of the statistical analysis as tracks.
    • Link Variants to 3D Protein Structure makes it possible to visualize amino acid changes on 3D protein structures. After running the tool on a variant table, variants can be visualized on 3D structures. 3D models are automatically built using structural templates from the PDB. The new tool can be found under 'Resequencing Analysis | Functional Consequences | Link Variants to 3D Protein Structure'.
  • The Map Reads to Reference tool now supports both linear gap cost parameters and affine gap cost parameters. The addition of affine gap cost support allows you to get more accurate results for reads with stretches of insertions or deletions.  
  • The read mapper used in the RNA-Seq Analysis tool has been upgraded to use the new read mapper described above. This upgrade enables you to run RNA-seq Analysis with as little as 6 GB RAM and at the same time improves your end results. However, you cannot yet use affine gap cost parameters in your RNA-Seq analysis.
  • Performance of the Merge Read Mappings tool has been improved, especially in situations where the number of reference sequences is very large, such as when merging reads mapped against de novo assembly results.
  • The tool Amino Acid Changes has been expanded with an extra output that makes it possible to visualize amino acid changes in track format. The amino acid color schemes can be changed in the Side Panel under "Track layout" and "Amino acids track".
  • Chromosome bands/cytogenetic ideograms can now be downloaded to the Workbench via the Download function. The ideogram can be added to track lists to get a better overview of the data.
  • Tracks:
    • Improved resource management: Makes it more efficient to work with tracks involving large numbers of reference sequences. This typically applies to situations where a reference genome is not available, such as when tracks are based on de-novo assembly results.
    • Consistent output when enriching variant tracks and annotation tracks with extra table columns. Output tracks from these tools now have the same number of added table columns and the columns will always be in the same order. Previously, if an added column had empty values for all variant rows, it would have been removed from the final table, resulting in varying number and relative order of additional columns when multiple samples were processed with the same tools/workflows. All columns are retained now, facilitating downstream processing of exported tables, and providing immediate visual reference as to which enrichment/annotation tools have been applied, even if they did not produce any results for a particular sample.
    • Tables for variant tracks and annotation tracks can now sort and filter columns with cells containing multiple numbers.
    • Improved the track viewer for variant tracks to show the sequence alteration on the rendered variant.
    • Improved performance of creating variant tracks and annotation tracks.
    • Graph tracks now show negative values filled upwards to y=0 (as expected).
  • Workflows:
    • When installing a workflow in the workflow manager, the newly installed workflow is automatically selected.
    • The "Run" button in the workflow editor does not require a saved workflow anymore to be enabled.
    • In the execution wizard of a workflow the "Reset to default" button is now active.
    • All icons in the workflow editor are now on the left side.
    • Introduction of snippets: Parts of workflows can now be saved as a snippet and reused in other workflows.
    • Installed workflows: It is now possible to create a copy of an installed workflow and open the copy in the view area by clicking once and then right-clicking on the installed workflow in the toolbox. This brings up the option "Open Copy of Workflow".
  • MA plots, scatter plots and histograms can now accept expression tracks as input.
  • An extra optional output called "Create coverage graph", that shows the coverage in each position of the targets, has been added to the tool Create Statistics for Target Regions.
  • Increased decimals for numbers when exporting table to CSV, tab delimited text, and Excel.
  • Improved reporting of errors related to low disk space.
  • New features for the 3D molecule viewer:
    • Align to Existing Sequence makes it possible to connect a 3D protein chain to a sequence, sequence list, or an existing alignment.
    • Transfer Annotations makes it possible to create atom groups from sequence annotations (and vice versa) for connected sequences.
    • Improved layout of the property viewer.
    • Improved PDB import of water molecules, DNA/RNA, and saccharides.
    • When importing PDB files, the resulting Molecule Project now contains citation information (PDB ID and primary reference), which can be found in the 'Show History' view.
  • Batching: Processes tab and analysis execution logs now display batch names in addition to analysis names for enhanced clarity.
  • The External Application Client Plugin is now available directly from the Workbench Plugin Manager.
  • Multiple target region tracks can now be specified for the InDels and Structural Variants tool.
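
To show why affine gap costs can favor a single stretch of inserted or deleted bases, as noted for the Map Reads to Reference tool above, the sketch below compares the two cost models. The cost values are hypothetical, and the convention used here (the first gap position pays the open cost and each further position pays the extend cost) is an assumption. The read mapper's exact scoring is not described in these notes.

```python
def linear_gap_cost(length: int, cost_per_position: int) -> int:
    """Every gap position costs the same amount."""
    return length * cost_per_position


def affine_gap_cost(length: int, open_cost: int, extend_cost: int) -> int:
    """The first gap position pays open_cost; each further one pays extend_cost."""
    if length == 0:
        return 0
    return open_cost + (length - 1) * extend_cost


# One contiguous 5-base deletion versus five separate 1-base deletions.
linear_one_gap = linear_gap_cost(5, cost_per_position=3)
linear_five_gaps = 5 * linear_gap_cost(1, cost_per_position=3)
affine_one_gap = affine_gap_cost(5, open_cost=6, extend_cost=1)
affine_five_gaps = 5 * affine_gap_cost(1, open_cost=6, extend_cost=1)

print(linear_one_gap, linear_five_gaps)
print(affine_one_gap, affine_five_gaps)
```

Under the linear model both layouts cost the same, so the aligner has no reason to prefer one long indel. Under the affine model the contiguous gap is much cheaper than the scattered ones.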

Bug fixes

  • Fixed an error resulting in billions of reads being silently dropped when producing large read mappings against large counts of reference sequences. The error involves a read count overflow and the dropping of at least 2 billion reads per failure instance.
  • Fixed display problem in read mappings showing too many hidden insertions (as vertical black lines) in certain overlapping paired reads.
  • Fixed problem with links and text in tables that were being cut off when succeeding a link.
  • Restriction site analysis: The values in the "Cut position(s)" column of the restriction site analysis table now behave like numbers instead of text, meaning sorting and filtering work.
  • The tool Identify Graph Threshold Areas can now use negative values to define its threshold.
  • Workflows:
    • In the workflow editor the "Reset to default" now always reverts to the right names.
    • In the workflow editor the validation is now correctly triggered when changing the configuration of an input element.
    • The workflow editor can now open workflows in which the graphical view of the workflow is corrupt.
    • Fixed an exception which could occur during workflow migration.
    • Data with the same name can now be bundled multiple times in a workflow installer.
    • Previously when a plugin contained custom actions and a workflow, the workflow could not be installed. This has been fixed.
    • Fixed problem with unlocked output names that previously could not be configured during execution of a workflow.
    • A workflow with configured data from a server is now automatically validated when connected to the server (when opened in the editor). Previously the workflow had to be closed and reopened first.
    • The original workflow file included in a workflow installer can now be exported directly without having to restart the workbench in advance.
  • A problem with saved table settings that sometimes did not work has been fixed. The bug fix includes a more robust/generic way of saving table settings with different columns. To fix this problem, existing saved table settings should first be loaded on an object where it works (i.e. has the same columns as when it was saved); and then the table settings should be saved with the old name to overwrite the settings.
  • Fixed an error that could cause batch processing to open all results rather than saving them.
  • Fixed problem with import of BED files using external applications.
  • SAM/BAM import will no longer fail for alignments with POS = 0, but instead import them as though they were unmapped.
  • Fixed problem going back in the wizard for the "Find Binding Site and Create Fragments" tool.
  • Fixed error occurring when removing an unsaved reads track from a track list.
  • Metadata for phylogenetic trees: A bug has been fixed with import of metadata containing column names with colons.
  • Fixed error when showing protein translations of annotations shorter than 3 bases.
  • Search for PDB Structures at NCBI has been fixed to correctly show PDB deposit date and organism type.
  • Fixed a bug in the Mapping Coverage exporter.
  • Fixed the reads track read-amount indicators (the numbers between the reads track and the box with the track's name and number of reads), which sometimes wrongly showed 0.
  • Small RNA Analysis -> Annotate and Merge Counts: When you choose to create a “grouped on mature” output, the small RNAs are grouped by both the 5’ and the 3’ mature sequences separately in the “grouped on mature” output. The column heading has therefore been changed to show "Mature" instead of "Mature 5' ".
  • When using the RNA-Seq Analysis tool with the "One reference sequence per transcript" option, the "Maximum number of hits for a read" option was sometimes not taken into account for multi-hit reads. This has been fixed.
  • Two problems with the F1 help have been fixed: 1) When pressing F1 in a workflow tool wizard, more than one help window appeared, and 2) help was not always shown when pressing the F1 key in tool wizards.
  • Fixed a bug that in some cases caused an error when annotating read sequence lists with the GFF/GTF/GVF annotation tool.
  • Amino Acid Change tool: In cases where an mRNA track does not overlap all annotations in the CDS track, "Coding Region Changes" were not added to variants overlapping a CDS but not overlapping an mRNA annotation. This has been fixed.
  • Variant callers and the "Amino Acid Changes" tool: "Coding Region Changes" were not added to variants overlapping an mRNA annotation but not a CDS annotation. This has been fixed.
  • Fixed an error that in rare cases would prevent creation of tracks from reference sequences.
  • Hypergeometric test on annotations: Fixed a rare error that occurred for some data sets containing annotations of the form: '1234 // abc'.
  • Fixed a bug in the QC report creation step of the ChIP-seq analysis.
  • Fixed a bug for color space reads in the RNA-Seq Analysis tool that caused only exon-exon matches to be reported.
  • Fixed an issue where base space and color space versions of the same reads in an XSQ file were incorrectly imported into the same sequence list, resulting in each read appearing twice.
  • The alignment editor view and alignment primer design view now have independent settings.
  • Fixed an issue with mapping of paired-end reads, where these were erroneously reported as broken pairs when the fragment size derived from the alignments of the two ends of the pair was longer than the reference sequence.
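
As an illustration of the SAM/BAM import note in the list above: in the SAM format, POS is a 1-based coordinate, and 0 is used for reads without a mapping position. The following Python sketch shows one way such records could be classified. It is not the Workbench importer; the function name `is_effectively_unmapped` and the example records are hypothetical.

```python
# Illustrative sketch only: not the CLC importer, just the rule described above.
def is_effectively_unmapped(sam_line: str) -> bool:
    """Return True if a SAM alignment line should be treated as unmapped."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2: bitwise FLAG
    pos = int(fields[3])   # column 4: 1-based leftmost mapping position
    # 0x4 is the "segment unmapped" FLAG bit; POS = 0 means "no coordinate".
    return bool(flag & 0x4) or pos == 0


# Hypothetical example records (tab-separated mandatory SAM fields).
mapped = "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII"
no_pos = "r2\t0\tchr1\t0\t0\t4M\t*\t0\t0\tACGT\tIIII"

print(is_effectively_unmapped(mapped))  # False
print(is_effectively_unmapped(no_pos))  # True
```

Treating POS = 0 as unmapped, rather than rejecting the file, keeps the read available in the imported data without assigning it a coordinate.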

Changes

  • Contigs coming from the de novo assembler will now have underscores in their names rather than spaces.

Plugin updates and bug fixes

  • The TRANSFAC Plugin has been updated and now has two modes of operation: "Classic" and "Genomic". The Classic mode is the legacy mode taking sequences as input and annotating these sequences. The new Genomic mode takes regions on a genome (an annotations track) as input. In both modes it is now possible to specify global thresholds of similarity score which can be used to filter the annotations included in the output.
  • A bug has been fixed with import of metadata containing column names with colons in the Metadata Import plugin.

Compatibility

  • This release can be used with CLC Genomics Server 7.0.
  • This release uses the read mapper and de novo assembler corresponding to CLC Assembly Cell 4.3.


CLC Genomics Workbench 7.5.5

Release date: 2015-08-18

Bug fixes

  • Fixed a read mapper bug that caused some reads to be incorrectly reported as unmapped when global alignment was selected.
  • Fixed a bug that caused the mapper to enter an infinite loop if a reference of length 0 was used.
  • Fixed a rare bug that sometimes made the read mapper halt prematurely when several seeds were identified at the same reference position.
  • Fixed a SOLiD NGS importer bug where import of very low quality, colorspace encoded paired-end sequence reads in fastq format could lead to paired sequence lists where the wrong reads are marked as pairs.
  • Fixed sort order for paired reads in SAM/BAM exports in high coverage regions.
  • The analysis/workflow execution system now handles search algorithms specially so that search results are not modified. This eliminates a host of concurrency issues.
  • Fixed an issue where selecting an entry in a Blast results table could highlight the wrong alignment in the Blast editor, if the table had been filtered or sorted.
  • Minor improvements in persistence.
 


CLC Genomics Workbench 7.5.4

Release date: 2015-06-18

Bug fixes

  • Fixed an issue that caused the Reverse Translate tool to ignore the genetic code specified in the codon frequency tables such that the reverse translation used the standard genetic code.
  • Fixed an issue introduced by a fix in Genomics Workbench 7.5.3 that restricted the use of the Create Statistics for Target Regions tool on tracks containing a larger number of nucleotides (>2147483647 bp) than could be supported for coverage table output. This check is no longer applied if coverage table output is not requested.
  • Fixed bug in which Local Realignment could produce an illegal read mapping. This only happened for RNA-data.
  • The variant caller will now fail if it encounters an illegal RNA read mapping. If the variant caller fails with such a message, and if it was run on locally realigned data, then we suggest re-running the local realignment to avoid the error.
  • Read-only folders are no longer offered as potential locations to save data bundled with a Workflow.
  • Side panel option to show legends for a plot with more than 10 samples is now enabled.
  • Fixed saving different line colors in plots through the side panel.
  • Plots inside reports are now shown with their saved side panel settings.
  • The automated paired distance estimate can no longer exceed the maximum distance accepted by the read mapper (100,000 bp).
  • Fixed an error that occurred when hovering the mouse cursor over the edge of a read mapping.
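
The figure 2147483647 quoted in the Create Statistics for Target Regions note above equals 2^31 - 1, the largest value of a signed 32-bit integer. The Python sketch below illustrates the behavior described in that note: the size check is applied only when coverage table output is requested. The function and parameter names are hypothetical and do not reflect the Workbench's internal code.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the limit quoted in the release note


def check_track_size(total_bp: int, coverage_table_requested: bool) -> None:
    """Reject oversized tracks only when a coverage table is requested."""
    if coverage_table_requested and total_bp > INT32_MAX:
        raise ValueError(
            f"track has {total_bp} bp; coverage table output supports "
            f"at most {INT32_MAX} bp"
        )


# Hypothetical 3 Gbp track: accepted without a coverage table ...
check_track_size(3_000_000_000, coverage_table_requested=False)

# ... but rejected when coverage table output is requested.
try:
    check_track_size(3_000_000_000, coverage_table_requested=True)
except ValueError as err:
    print("rejected:", err)
```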


CLC Genomics Workbench 7.5.3

Release date: 2015-04-23

Bug fixes

  • The filtering option in the Create Track from Experiment tool only considered the predicted fold-changes in the positive direction, so features that were reduced in expression were filtered out. This has now been fixed.
  • When using the RNA-Seq Analysis tool with the "One reference sequence per transcript" option, the "Maximum number of hits for a read" option was sometimes not taken into account for multi-hit reads. This has been fixed.
  • Fixed an issue with mapping of paired-end reads, where these were erroneously reported as broken pairs when the fragment size derived from the alignments of the two ends of the pair was longer than the reference sequence.
  • Fixed an error affecting the "Cut Sequence Before/After Selection" tool in the Cloning editor.
  • Fixed an issue with running blast searches at the NCBI where an NCBI-generated error about their CPU usage limit being exceeded was not being reported transparently and a result of "no hits" was being reported instead.
  • Fixed a bug in the probabilistic variant caller that caused it to fail for certain input.


CLC Genomics Workbench 7.5.2

Release date: 2015-02-17

Bug fixes

  • Fixed an error resulting in billions of reads being silently dropped when producing large read mappings against large counts of reference sequences. The error involves a read count overflow and the dropping of at least 2 billion reads per failure instance.
  • Fixed error when removing an unsaved reads track from a track list.
  • Fixed display problem showing too many hidden insertions in certain overlapping paired reads.
  • Metadata for phylogenetic trees: A bug has been fixed with import of metadata containing column names with colons.
  • Fixed a bug in the Mapping Coverage exporter.
  • Fixed the read count indicators on reads tracks (the numbers between the reads track and the box with the track's name and number of reads), which sometimes wrongly showed 0.
  • Small RNA Analysis -> Annotate and Merge Counts: When you choose to create a “grouped on mature” output, the small RNAs are grouped by both the 5’ and the 3’ mature sequences separately in the “grouped on mature” output. The column heading has therefore been changed to show "Mature" instead of "Mature 5'".
  • Two problems with the F1 help have been fixed: 1) pressing F1 in a workflow tool wizard opened more than one help window, and 2) there were problems showing help when pressing the F1 key in tool wizards.
  • Amino Acid Change tool: In cases where an mRNA track does not overlap all annotations in the CDS track, "Coding Region Changes" were not added to variants that overlap a CDS but not an mRNA annotation. This has been fixed.
  • The Low Frequency Variant caller could end up in an infinite loop in certain corner cases. This is now fixed.
  • Fixed "Export Graphics" default save-as directory.
  • Fixed problem with import of BED files using external applications.
  • Hypergeometric test on annotations: Fixed a rare error that occurred for some datasets containing annotations of the form: '1234 // abc'.
  • Fixed a bug in the QC report creation step of the ChIP-seq analysis.
  • Fixed error when showing protein translations of annotations shorter than 3 bases.
  • Fixed a bug for color space reads in the RNA-Seq Analysis tool that caused only exon-exon matches to be reported.
  • Fixed problem going back in the wizard for the "Find Binding Site and Create Fragments" tool.
  • Fixed a bug that in some cases caused an error when annotating read sequence lists with the GFF/GTF/GVF annotation tool.
  • Fixed an issue where base space and color space versions of the same reads in an XSQ file were incorrectly imported into the same sequence list, resulting in each read appearing twice.

Plugin updates and bug fixes

  • A bug has been fixed with import of metadata containing column names with colons in the Metadata Import plugin.

Compatibility

  • This release can be used with CLC Genomics Server 6.5.3.
  • This release uses the read mapper and de novo assembler corresponding to CLC Assembly Cell 4.3.


CLC Genomics Workbench 7.5.1

Release date: 2014-10-28

New features and improvements

  • "Filter Annotations on Name" can now insert names to filter on from significantly bigger files. The file size limit has been increased from 10 KB to 20 MB.
  • RNA-Seq Analysis: The ENSEMBL gene id of each gene, where available, has been added as an additional column to the gene expression track output.
  • Improved performance of the ChIP-seq Analysis tool for genomes with a large number of chromosomes.
  • It is now possible to run a workflow without an optional input.

Bug fixes

  • A bug has been fixed in the Set Up Experiment tool. Exon-related expression values can now only be selected when present in the individual samples.
  • When creating a subset of a paired experiment, the sub-experiment no longer appeared as being paired. This bug has been fixed and sub-experiments created in previous versions should recover the pairing information when accessed with this version of the workbench.
  • Pfam filtering bug fixed. Previously, Pfam only reported the first domain of each type in a query and as a consequence many domains were missed. We recommend that users whose research depends on Pfam annotations re-run the tool on their data.
  • The AAC tool did not annotate variants in 3' UTR with their DNA-level change using the HGVS c.xxx format. This affects any analysis done with Gx 7.5 or earlier based on ENSEMBL CDS tracks from older versions. The AAC analysis should be redone using Gx 7.5.1 for correct annotation. Important: Please also check the description in the Gx 7.5 release notes of a bug fix in the translation of CDS annotations to protein sequences that was wrong in cases where the reading frame was not +1 or -1 in CDS annotations imported from ENSEMBL.
  • Fixed problem importing VCF files using the AO and RO genotype field.
  • Fixed problem importing certain VCF files.
  • Fixed a bug in the 'Maximum Likelihood Phylogeny' tool that failed when generating bootstrap values for certain input alignments.
  • Fixed problem with scrolling to the relevant files when selecting objects as parameters in tool wizards.
  • The Blast text results have been improved so they show the correct query and subject positions regardless of strand.
  • Fixed a problem that prevented BLAST operations when choosing to run these on the CLC Server.
  • Fixed problem with import of read mappings with supplementary alignments. When importing read mappings with supplementary alignments, supplementary alignments are not imported. Previously import of such read mappings caused import errors.
  • Fixed rare problem with coverage that could occur in zoomed out reads tracks containing wrapped paired reads.
  • Fixed rare error when sorting experiment tables.
  • Fixed a bug in the Annotate and Merge Counts tool that in rare cases resulted in incorrect sorting and crash.


CLC Genomics Workbench 7.5

Release date: 2014-08-28

New features and improvements

  • New tools:
    • New variant callers (Resequencing analysis):
      • Three new tools for detecting variants are available in the "Variant Detectors" toolbox under "Resequencing Analysis": Basic Variant Detection, Fixed Ploidy Variant Detection and Low Frequency Variant Detection. The Basic Variant Detection and Fixed Ploidy Variant Detection tools are similar in nature to the Quality-based and Probabilistic Variant Detection tools respectively. The main difference is that all filters previously employed in either the Quality-based or Probabilistic Variant Detector are now available in all three variant callers, in addition to a new filter: the relative read direction filter. The Low Frequency Variant Detection tool is a new statistics-based tool for detecting low frequency variants, e.g. in mixed tissue cancer or mixed population samples. The Quality-based and Probabilistic Variant Detection tools have been moved to the "Legacy tools" folder in the toolbox, and will eventually be retired. Please note that any benchmarking done for your own purpose using these tools should be repeated when you switch to the new variant callers. We recommend that you read the Special notes upgrading to Genomics Workbench 7.5 for further information.
    • Improved read mapper and a tool for downsampling (NGS Core Tools):
      • Memory usage reduced for the read mapper, enabling mapping against human genomes on a modern notebook.
      • Caching of reference index files improves the speed when the same reference is used repeatedly for read mapping.
      • The new "Sample Reads" tool can be used to downsample large sets of reads for all types of NGS analysis.
    • New ChIP-Seq tools (Epigenomics Analysis):
      • The ChIP-Seq Analysis tool found in the toolbox under "Epigenomics Analysis" has been replaced with the plugin "Peak Shape ChIP-Seq Analysis" (that has been renamed to "ChIP-Seq Analysis"). The old "ChIP-Seq Analysis" tool has been renamed to "ChIP-Seq Analysis (legacy)" and moved to the new "Legacy tools" folder in the toolbox. The new ChIP-Seq Analysis tool uses a new approach to identify genomic regions with significantly enriched read coverage and a read distribution with a characteristic shape. The parametrization of the algorithm is done automatically by learning the characteristic shape of the signal from the data, making the algorithm intuitive and easily understandable.
      • The "Annotate with Nearby Gene Information" tool can be used to annotate ChIP-seq peaks with the nearest gene upstream and downstream, based on the start position of the gene. The resulting annotations are provided in the same format as in the legacy ChIP-seq Analysis.
  • A new folder called “Legacy tools” has been added to the toolbox. The "ChIP-Seq Analysis (legacy)" tool has been moved to this folder along with the Probabilistic Variant Detection and Quality-based Variant Detection tools.
  • Workflows:
    • The input information is now shown in the preview dialog and also exported to all formats.
    • It is now possible to edit the workflow input name by right-clicking on the input name in the workflow.
    • Tools with object parameters now accept multiple inputs. This applies to e.g. Trio Analysis that now can be run in a workflow using child, mother, and father as input in the same workflow.
    • Workflows as such can have multiple inputs (though this will disable the batch functionality).
    • Data can now be directly bundled with a workflow installation. This means that reference data can be packed and shared together with a workflow (only recommended for small data).
    • A workflow input can be pre-configured. If kept unlocked, it can be used to give a default when executing the workflow.
    • A text field has been added to the side panel, where you can search for elements in the workflow. A found element will be centered and highlighted.
    • A new editor was added to the workflow to make it easier to check the configuration. The new editor can be accessed from the lower left corner of the View Area and lists all configuration parameters.
    • Workflows can be packaged with a plugin and will get installed simultaneously with the plugin.
    • Workflows installed on the server now have an overlay icon in the workbench, to make them easily distinguishable from workflows installed in the workbench.
    • The execution of a workflow in the workbench and on the server has been unified to have the same behavior regarding logs, intermediate results and output naming.
    • Locked settings in the workflow wizard are now again hidden per default when executing the workflow, to give a cleaner, simpler look to the configuration. When expanding, all parameters are displayed.
    • One tool can now receive input from two different sources: 1) a reads track that holds the data to be analyzed (in this case, reads that are to be locally realigned), and 2) a parameter that can have different functions depending on the tool it is connected to. For example, an InDel track can be used as a guidance track for the local realignment; in other situations the parameter track could be used for annotation or could provide a reference sequence.
    • New workflow-enabled tools:
      • Create sequence statistics.
      • NGS Importers.
  • Protein Analysis, Pfam domain search:
    • Pfam Domain Search now uses HMMER3 and the latest Pfam database that can be downloaded with the new tool "Download Pfam Database".
    • Searching multiple sequences is significantly faster.
    • New filters are available in the improved Pfam Domain Search tool to enable generation of the same results as the online tool.
  • 3D Molecule Viewer:
    • Protein Structure Alignment - high quality structural alignment of whole protein chains or selected regions of a protein, available from the Side Panel of Molecule Projects.
    • Project Tree improvements - new ways of selecting nearby atoms. Improved visualization of intermolecular bonds. Atom groups are now stored on the Molecule Project and can be renamed. Labels on custom atom groups now show residue names if applicable.
    • New molecule color scheme where only the carbon atom color is varied.
    • 3D view state - all 3D visualization settings (including custom atom groups) can now be stored on a molecule project and shared with others.
    • Molecule Preparation. Many improvements, including better handling of partial charges and more recognized chemical groups.
  • The option "Fix maximum of coverage graph" has been added to the Side Panel for reads tracks. This allows direct comparison of the coverage of the individual reads tracks.
  • Local Search enabled from the menu bar now includes filtering on "Path".
  • Advanced filtering on tables now includes the option to filter on a space-, comma- or semicolon-delimited list of terms.
  • Zoom tools redesign: The “Zoom to selection” feature is now also available for sequences, sequence lists, alignments and read mappings.
  • The tracks info panel, with track names in the left side of the track, now wraps information instead of showing a scroll bar.
  • Saving/applying side-panel settings for tables now works for different tables that share some columns.
  • Graph Tracks can now be exported to Wiggle file; the span option is now supported in the Wiggle import.
  • "Filter Based on Overlap" and "Annotate with Overlap Information" now accept Expression tracks as input.
  • SAM/BAM import. It is now possible to choose to ignore unmapped reads when importing SAM/BAM files.
  • Fisher's Exact Test. Added options for correction of p-value for multiple testing; Bonferroni correction and False Discovery Rate (FDR).
  • Speedup: newly created expression tracks will display graph faster.
  • Copy operations can now be stopped.
  • Output from "Reverse Translate" can now be a Sequence List.
  • Import of Example Data and imports done through dragging files into the workbench and dropping them in the Navigation Area will no longer block the user interface while executing. Instead, the import happens as a background process that can be monitored and controlled via the Processes tab in the lower left corner.
  • CLC Workbenches now support high resolution displays, such as Apple Retina displays, for all data shown in the View Area (including tooltips).
  • Small improvements of the de novo assembler speed.
  • Improved error message in the Empirical Analysis of DGE tool in case of invalid expression values in experiments (occurs rarely).
  • More informative naming of coding region translations produced by the tool Translate to Protein. The name for a coding region translation consists of the name of the input sequence followed by the annotation type and finally the annotation name.
  • Genetic codes: The list of NCBI translation tables has been expanded to include translation table 25 "Candidate Division SR1 and Gracilibacteria" (See: http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c).
  • Improved error messages due to low disk space.
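
The multiple-testing corrections mentioned in the Fisher's Exact Test note above can be sketched in a few lines of Python. Bonferroni correction multiplies each p-value by the number of tests. For FDR, the sketch assumes the Benjamini-Hochberg procedure; the release note does not state which FDR method the tool uses, and the function names and example p-values are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """FDR-adjusted p-values, assuming the Benjamini-Hochberg procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four tests.
p = [0.01, 0.04, 0.03, 0.005]
print([round(x, 4) for x in bonferroni(p)])
print([round(x, 4) for x in benjamini_hochberg(p)])
```

Bonferroni controls the chance of any false positive and is the more conservative of the two; FDR methods control the expected proportion of false positives among the reported results.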

Bug fixes

  • Translation of CDS annotations to protein sequences was wrong in cases where the reading frame was not +1 or -1 in CDS annotations imported from ENSEMBL. This error affected the Translate to Protein tool, translation functionality in sequence viewers and their context menus, as well as the Amino Acid Consequences (AAC) variant annotation tool. We highly recommend redoing the AAC analysis for correct variant annotation, as CDS tracks typically are created from ENSEMBL data.
  • A bug in the Fisher Exact Test tool that in some cases caused incorrect counting has been fixed. The Fisher Exact Test algorithm now checks if a case variant also exists in a control variant as a different type (e.g. an SNV variant can exist as part of an MNV variant). Note that variants only found in the control tracks are no longer included in the output.
  • The right-click menu on certain annotations in tracks was not working when viewing a single track. This has been fixed.
  • Icons in the workflow editor are now scaled consistently when zooming in or out.
  • Several issues with the validation display in the workflow editor have been fixed.
  • A bug has been fixed in the workflow configuration wizard. Previously the input was not taken into account when deciding which parameters were enabled.
  • Fixed problem where the "space" key did not trigger "Find Conflict" in the stand-alone read mapping editor.
  • Fixed stand-alone read mappings not showing mismatches and insertions in the overflow graph.
  • Fixed a bug in the de novo assembler and legacy read mapper which could cause a crash due to a collision of temporary file names.
  • Fixed a bug which caused the de novo assembler to crash in rare cases on systems running Windows. Tools depending on read mapping might also have been affected by this.
  • NGS import tools now work when run via CLC Server.
  • 'Replace input sequences with result' in Cloning Editor no longer fails.
  • A bug has been fixed in the Local Realignment tool. The bug materializes in extremely rare cases when applying the variant callers on locally realigned RNA-seq mappings with spliced reads. On these mappings, local realignment could generate invalid spliced reads (after local realignment, you could have spliced reads with segments that overlapped).
  • Fixed minor annotation track rendering problem on tracks having peaks with large amounts of overlapping features.
  • Fixed bug which caused "Merge Annotation Tracks" to output wrong "Origin tracks" annotations in some situations.
  • Fixed bug causing occasional error when the same track list element was opened more than once.
  • Fixed "Export Graphics" default save-as directory.
  • Multi-sequence BLAST search results (BLAST tables) are now exportable as plain text.
  • Due to a change in the COSMIC format CLC Genomics Workbench could not import COSMIC data. This has been fixed. Through Import->Tracks we now support the following COSMIC databases in tsv format, which can be manually downloaded from the COSMIC ftp site (http://cancer.sanger.ac.uk/cancergenome/projects/cosmic/download):
    • Complete COSMIC data
    • Complete mutation data
    • All Mutation in census genes

Changes

  • To improve the stability of workflows: If a variant caller finds no variants, an empty track is produced, rather than no output.
  • Due to the upgrade to Java 7, Windows Server 2003 and Mac OS X 10.5.8 and 10.6 are no longer supported by Oracle. Hence, the system requirements are now: Linux, Windows Vista, Windows 7, Windows 8 or Windows Server 2008, or Mac OS X 10.7 or later.
  • As of June 2014, COSMIC download requires registration. This means that COSMIC is no longer part of the resources that can be downloaded with the Download Genome Data Tool. You can still register at the COSMIC website, download the file to your computer, and use the Import Tracks tool to import the data.
  • In Experiments, the column previously named "Total intron reads" is now named "Total intron-exon reads", and the column previously named "Unique intron reads" is now named "Unique intron-exon reads". These new headings will appear in Experiment tables for both newly and previously created Experiments.

Compatibility

  • This release can be used with CLC Genomics Server 6.5.
  • This release uses the read mapper and de novo assembler corresponding to CLC Assembly Cell 4.3.


CLC Genomics Workbench 7.0.4

Release date: 2014-05-14

Bug fixes

  • Fixed a bug in RNA-Seq Analysis regarding the calculation of RPKM. This error was introduced with the new RNA-Seq tool in CLC Genomics Workbench 7.0. When calculating RPKM, the total number of gene reads was used instead of total exon reads. This only has a significant impact when many intron reads are mapped to a gene. Users whose analyses are based on RPKM values calculated with CLC Genomics Workbench 7.0 - 7.0.3 should refer to our public notification about this issue for further details, including how to determine whether re-running RNA-seq analyses will be necessary and a work-around if this is not possible. The Legacy RNA-Seq plugin is not affected by this bug.
  • Fixed a bug in the Filter against Control Reads tool which meant that variants that are of type "Replacement" and which also introduce an insertion were not properly removed by the filter, even if there were reads supporting them. We recommend that all customers who have relied on this tool for processing data in CLC Genomics Workbench 7.0.X run the tool again in version 7.0.4.
  • Fixed error that caused selections in views not to be centered in the middle of the view.
  • Fixed bug that caused a crash in the Reassemble Contigs tool
  • Fixed bug that made the Workbench crash when viewing tables under certain circumstances
  • Fixed problem with "Find" on stand-alone read mappings with a circular reference and sequence lists containing circular sequences.
  • Fixed bug that sometimes caused the workbench to crash when running "Local Realignment" on mappings generated with other mappers and imported as BAM files.
  • Fixed a problem where some parts of a workflow were not executed if there were multiple branches in the workflow
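
To illustrate the RPKM fix in the list above, the following Python sketch uses the standard definition of RPKM (reads per kilobase of exon model per million mapped reads) and compares the value obtained from exon reads with the value obtained when intron reads are also counted. The function name and all numbers are hypothetical, and the sketch assumes that "gene reads" are exon reads plus intron reads; it is not the Workbench's implementation.

```python
def rpkm(read_count: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads."""
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical gene: 900 reads in exons, 100 reads in introns.
exon_reads, intron_reads = 900, 100
exon_length_bp = 2_000
total_mapped_reads = 10_000_000

# Intended calculation: only exon reads are counted.
correct = rpkm(exon_reads, exon_length_bp, total_mapped_reads)

# Calculation affected by the bug: all gene reads (exon + intron) are counted.
inflated = rpkm(exon_reads + intron_reads, exon_length_bp, total_mapped_reads)

print(correct, inflated)  # 45.0 50.0
```

The two values differ only by the share of intron reads, which is why the impact is small for genes with few intron reads.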

Changes

  • Users running RNA-seq analyses with only gene annotations can now choose whether to calculate the RPKM for these genes (i.e. genes without transcripts) or not.


CLC Genomics Workbench 7.0.3

Release date: 2014-03-20

Bug fixes

  • Fixed problem with the amino acid changes tool that reported all variants within coding regions as non-synonymous. This error was introduced with Genomics Workbench 7.0.2


CLC Genomics Workbench 7.0.2

Release date: 2014-03-18

Bug fixes

  • Fixed problems causing an error when trying to uninstall plugins.
  • Fixed issue with pausing and resuming running processes


CLC Genomics Workbench 7.0.1

Release date: 2014-03-14

New features and improvements

  • Improved parameter specification for RNA-seq Analysis
  • It is now possible to perform both batched and non-batched import of VCF files without genotype information
  • Statistical Analysis: Improved reporting of invalid input to the tools "On Gaussian Data" and "On Proportions"
  • Fasta export:
    • Fasta export with trimming is now much faster and consumes less memory
    • Fasta export now reports progress while executing
    • When the "Remove trimmed regions" option is set, the Fasta export will ignore sequences in which all nucleotides are covered by a Trim annotation
  • Translate to Protein (Batch Process):
    • The wizard now has options for specifying whether to translate the coding regions or extract translations from the annotations
    • The log has been made more detailed and informative
    • If the result is just a single protein sequence, the output will be just that, otherwise all sequences are output as a list
    • If the tool estimates that the number of protein sequences to be produced is greater than 1,000,000, it will create protein sequences without history, and it will not copy the common name, latin name, and taxonomy fields
  • The PDB importer has improved support for custom residues

Changes

  • When importing a VCF file: if multiple count tags are present in the file, the tags are prioritized in the following order: 1) CLCAD2, 2) AD, 3) AO
  • In the "Amino Acid Changes" tool, the description of coding region changes at the DNA level now complies with HGVS recommended nomenclature with regard to variants in untranslated regions. Examples: "c.-4A>C" describes a SNV four bases upstream of the start codon, while "c.*4A>C" describes a SNV four bases downstream of the stop codon
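
The HGVS convention for untranslated regions described above can be sketched as follows. Positions upstream of the coding sequence are counted backwards from the first base of the start codon and prefixed with "-"; positions downstream are counted from the last base of the stop codon and prefixed with "*". The Python functions below are a simplified illustration that works on spliced transcript coordinates and ignores introns; the function names and coordinates are hypothetical.

```python
def hgvs_c_position(pos: int, cds_start: int, cds_end: int) -> str:
    """Convert a 1-based transcript position to an HGVS c. position string.

    cds_start is the first base of the start codon and cds_end the last
    base of the stop codon, both 1-based transcript coordinates.
    """
    if pos < cds_start:
        return f"-{cds_start - pos}"   # 5' UTR: counted back from c.1
    if pos > cds_end:
        return f"*{pos - cds_end}"     # 3' UTR: counted from the stop codon
    return str(pos - cds_start + 1)    # coding sequence: c.1 is the A of ATG


def hgvs_snv(pos: int, ref: str, alt: str, cds_start: int, cds_end: int) -> str:
    """Format a single nucleotide variant at the DNA (coding) level."""
    return f"c.{hgvs_c_position(pos, cds_start, cds_end)}{ref}>{alt}"


# Hypothetical transcript with the coding sequence at positions 101..400.
print(hgvs_snv(97, "A", "C", 101, 400))   # c.-4A>C
print(hgvs_snv(404, "A", "C", 101, 400))  # c.*4A>C
print(hgvs_snv(101, "A", "G", 101, 400))  # c.1A>G
```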

Bug fixes

  • Fixed problems where the workbench window would not position itself correctly on startup
  • After annotating variants with the tool "Annotate from Known Variants" a small fraction of the MNVs disappeared. This has now been fixed
  • The tool "Restriction Site Analysis" ignored the selected number of cut sites in GWB 7.0. This has been fixed
  • Fixed table filter being invisible for vertically split tables
  • Fixed problem where "InDels and Structural Variation" would throw ArrayIndexOutOfBounds exceptions for certain data
  • Fixed missing icon for "Metadata Import" in the phylogenetic tree table
  • Fixed "Filter Based on Overlap" accepting expression tracks as inputs but not knowing how to handle them
  • Fixed typo in "RNA-seq Analysis" that was visible in workflows
  • Fixed error in mapping long reads as part of de-novo assembly, Read Mapping Legacy plugin, RNA-Seq Legacy plugin, and Transcript Discovery plugin
  • Multi BLAST results table: the missing "Description (E-value)" field is displayed again in the table output
  • A rare error has been fixed in the Secondary Peak Calling tool
  • An issue with workflow connections not being displayed properly has been fixed
  • Fixed a bug that in certain cases made the De Novo Assembly fail
  • Fixed a bug that in certain cases made the RNA-Seq Analysis fail
  • Fixed a bug that made access to data impossible because of a failed rename operation
  • Fixed a bug that made PDB import fail on workstations with Turkish Locale settings
  • A problem importing Ensembl version 75 files has been addressed. If you have previously imported Ensembl version 75 files, please see the FAQ entry for full details of what to do
  • Fixed a bug in labeling of phylogenetic trees. Newly created trees were labeled with saved general tree settings and subtree texts


CLC Genomics Workbench 7.0

Release date: 2014-02-11

New features and improvements

  • New beta plugin available: memory-efficient read mapper that significantly reduces the memory requirements for read mapping.
  • RNA-Seq on tracks: A substantial update of the popular RNA-Seq Analysis tool together with new statistical tools for analysis of differential expression form a great improvement for all users working with RNA-Seq.
    • The output of the RNA-Seq Analysis is based on tracks and includes tracks with the read mapping, expression values and fusion genes. Tracks from different samples can be shown in one track list, enabling richer visual comparison across samples.
    • The gene-level and transcript-level expression results are now output as two different tracks and can be used together for visualization. Downstream analysis can be performed on either.
    • A new column "Relative RPKM" on the transcript-level expression track can be used to see the relative expression of alternative transcripts for a gene.
    • Experiments based on the new expression tracks can be used for browsing the track list with read mappings and annotations.
    • It is now possible to map the reads against the full genome as well as gene regions.
    • The new read mapping algorithm introduced with CLC Genomics Workbench 6.5 is now also used for RNA-Seq. This means that mapping is faster, but for some data sets it will also require more memory. For a human data set using the latest annotation sets (obtained through the Download Reference Genome Data tool), there is a minimum requirement of 16 GB of RAM, and we recommend 24 GB of RAM. If this causes problems, it is still possible to use the old RNA-Seq Analysis tool, which is available as a plugin.
    • The wizard has been redesigned to make use of tracks and includes a more explicit way of controlling what reference annotations should be used (if any).
    • If you have an annotated reference sequence that you have used for RNA-Seq, you can convert it to tracks using Convert to Tracks.
    • The fusion genes table has been changed into an annotation track which can be used to browse the read mappings in a track list.
    • Variant tracks can be annotated with expression values from expression tracks.
    • New RNA-Seq tutorials available at http://clcbio.com/tutorials
  • New statistical test based on EdgeR:
    • The tools available for statistical analysis of differential expression have been extended to also include the 'Exact Test' (developed by Robinson and Smyth and implemented in the EdgeR Bioconductor package). The test is applicable to comparisons of pairs of groups and implicitly performs TMM normalization.
  • New functionality for phylogenetic trees (was previously part of a beta plugin)
    • Greatly enhanced viewer for visualizing and working with phylogenetic trees. The viewer allows the user to rapidly create high-quality, publication-ready figures of the trees.
    • Large trees are made easy to explore using different zoom functionalities and a small minimap of the entire tree. The viewer also comes with two alternative tree layouts, namely circular layouts and radial layouts, which are great for visualizing very large trees.
    • Supports importing, editing and visualization of metadata associated with nodes in phylogenetic trees.
    • Tool to reconstruct phylogenetic trees based on k-mers. This approach avoids the computationally intensive step of constructing a multiple alignment of the input sequences. The k-mer based reconstruction tool is especially useful for whole genome phylogenetic reconstruction where the genomes are closely related.
    • Tool performing a statistical evaluation of different substitution models to be used with maximum likelihood tree construction. The output of this tool is a report that lists the recommended settings to be used when constructing phylogenetic trees based on maximum likelihood.
    • Added an option for using the Kimura 80 substitution model when creating trees with distance based methods.
    • Distance-based tree reconstruction methods can now reconstruct trees from protein alignments using the Jukes-Cantor substitution model or the Kimura protein ML distance estimate.
    • A user defined start tree can now be supplied to the ML inference tool.
  • Complete redesign of the graphical user interface including:
    • New tool bar graphics
    • New product logos and colors, including splash screen
    • New background graphics on canvas and in dialogs
    • Tool bar has been re-organized
    • New tab design that aligns the look and feel across platforms. This is particularly important to Mac users, as split screen used to take up a lot of screen space.
    • New zoom tools. Easy adjustment of zoom speed for improved zooming of huge sequences.
    • Detachable Side Panel editors. A Side Panel editor can be dragged from the main workbench window and dropped outside the workbench, making a separate window e.g. on a second screen, if available.
  • New concept for Side Panels and Views:
    • Support for multiple screens: views can be moved to a different screen by dragging the tab of the view
    • Side Panels now consist of palettes that can be organized in groups, and the order can be customized
    • Palettes can be detached and placed anywhere on the screen
    • Navigation Area and Toolbox can be minimized to allow more pixels for displaying data
  • Zoom tools redesign:
    • The zoom tools have been re-organized and placed closer to the data
    • A zoom slider shows the present zoom level and can be used to adjust zoom
      • Detailed zoom is a new feature that allows zooming in and out in very small increments by dragging the zoom slider and moving the cursor above it. An expert feature for e.g. large tracks.
    • Zoom to selection button now available for track views
  • Copying data in the Navigation Area runs much faster and uses less memory than before. This is a great improvement which also kicks in when moving data between a CLC Genomics Server and a Workbench.
  • Tracks:
    • The speed of the Annotate with known variants and Filter against Known Variants tools has been greatly improved when using a large reference database like dbSNP.
    • Table filtering of tracks: it is now possible to use "overlaps" and "doesn't overlap" when filtering on the region column. This allows for quicker inspection if any of the variants or annotations overlap a particular position.
    • Tooltips on variant tracks in track lists now include the number of variants in the track.
    • The Identify Graph Threshold Areas tool is now capable of identifying intervals with higher-than-average reads. This is obtained by setting a “window-size” parameter in the "Identify Graph Threshold Areas" wizard that specifies the width of the window around every position that is used to calculate an average value for that position.
    • Previously, when importing variants from VCF files and from UCSC, a small number of variants were ignored: because they contained reference bases at the ends, they were not proper replacements or MNVs. These variants are now trimmed and properly imported. This also affects the Download Reference Genome Data tool.
    • When selecting Import -> Track it is now possible to import several files into a single track in one step with the newly added batch mode function. Please be aware that this is not possible if you work with VCF files without genotype information.
  • Workflows:
    • Bulk configuration of elements is now possible. This makes it possible to set the same reference data for multiple elements at once.
    • Workflows can be added inside a workflow. The inner workflow is "unfolded" into the single elements.
    • Parameters can now be renamed in the editor by the creator during configuration of the elements.
    • During workflow installation the wizard now allows editing of the name of the installed workflow.
    • Workflows with invalid/unknown elements are laid out more neatly and consistently.
    • The sidepanel now has an option to display rulers in the editor to better indicate the size of a workflow (particularly when exporting)
    • Fit Width now fits the entire workflow in the editor by zooming out.
    • The sidepanel has a new section "Minimap" which shows an outline of the whole workflow. It can be used to navigate the workflow in the view and also supports zooming
    • One can change the design of the workflow editor via the sidepanel (removed the old designs in the preferences)
    • Better validation when configuring parameters in workflows
    • If a tool receives inputs from at least two tools, the inputs can now be ordered via the context menu on the connections or the input part of the target element.
    • The name of an output in the workflow can be set by configuring the output element
    • Parameters of a workflow run can now be exported to various formats via the wizard
    • It is now possible to reset a reference parameter. Before, this was only possible by removing the whole element and adding it again.
    • In the workbench the installed workflows are now sorted alphabetically.
    • The graphics export of a workflow now knows about the scale and one can now export the whole workflow or only the current view.
    • A cpw file can now be dragged into the workflow manager and will be installed.
    • Further speed improvements on working with larger workflows in the editor
    • New tools that are now workflow-enabled:
  • 3D structure viewer:
    • Property viewer - a new tab in the Side Panel. Shows detailed information about the atom under the mouse pointer. If multiple atoms are selected (Ctrl-click), the distance (two atoms selected), angle (three atoms selected), or dihedral angle (four atoms selected) formed by the selected atoms is shown in the Property viewer. If a molecule is selected in the Project Tree, meta-data relating to the molecule is shown in the Property viewer.
    • Issues List. Issues related to the molecule structures and their chemistry representation are listed in the Issues List view on Molecule Projects. If seen in split-view with the 3D viewer, a selected issue will zoom to the atoms involved in the issue and select them in the 3D view.
    • General improvements to the PDB importer
    • Double-clicking an entry in the Project Tree will zoom-to-fit on the molecule or atom group.
    • When selecting atoms (by mouse clicking or from sequence), the atom context (whole residue or molecule) will also be shown in the 3D view. From context menu on 'Current' selection, an atom group can be generated either from the exact selection or from the selection plus the context (whole residues or molecules).
  • Create statistics for target regions:
  • Amino acid changes:
    • There are two new columns reporting amino acid changes for the longest transcript. Previously, amino acid changes would be reported for all transcripts, and this information is still available, but many users prefer just to use the longest transcript, and this information is now available in two new columns: one for the change on the protein level, and one for the change on the coding DNA level.
    • Variants up- and downstream of the coding regions are now annotated with a coding DNA position as long as they are inside the transcript. In order for this to be reported, the amino acid changes tool has to be supplied with an mRNA track, which will be used to determine whether the variant is included in the transcript.
  • Extract consensus sequence:
    • Is now able to copy annotations from both existing consensus sequence and the reference sequence.
    • When extracting consensus sequence from a mapping, conflict and low coverage annotations now include the position on the reference.
    • New "Ambiguous base" annotations are added (when "Add annotations" is enabled in output options) for ambiguous consensus-symbols.
    • Removals caused by filtering are now explicitly annotated as "Removal" and not erroneously as "Deletion". “Deletion” is now reserved for annotation of areas where the reads agree that a gap exists although the gap is not found in the reference sequence. The new Removals are named "Removed by filter" and qualified by a "Cause of removal=filtering" qualifier to make them distinguishable from removals caused by low coverage.
    • Quality scores are now always computed for the consensus sequence when quality scores are available in the input read-mapping no matter the state of the "Use Quality Score" input parameter.
  • Read mappings can now be exported to a tabular file including detailed per-base information on coverage and nucleotide composition including insertions and deletions.
  • Trim annotations can be used to trim off sequences when exporting to fasta.
  • Secondary peak calling has been improved: it now only detects peaks that have a distinct peak shape, and only peaks that fall within the same interval as the top peak are called. In addition, trim annotations are taken into account so that no peaks are called within trimmed regions. This greatly reduces false positive calls. Finally, the annotations now include information about the secondary peak's fraction of the maximum peak height.
  • Advanced table filter now includes an option to search for "starts with" in addition to just "contains"
  • Limitations on export of Excel 2010 files (xlsx) are removed:
    • Multiple tables can be exported to one xlsx file
    • Reports can be exported to xlsx
    • Hyperlinks are preserved in xlsx files
  • SignalP prediction has been updated to be server-, batch- and workflow enabled.
  • Folder contents view: subfolders will display how many items they have
  • Policy settings now also control the use of the "Download Reference Genome" tool (using the online_search key)
  • Assemble Sequences tools now accept sequence lists as input.
  • REBASE restriction enzyme list updated to version 310.
  • InDels and Structural Variants: We now require higher similarity for short alignments than for longer alignments for accepted matches.
  • Quality-based and Probabilistic Variant Detection: The homopolymer filter has been improved so that longer variants no longer are lost when it is applied.
  • ChIP-Seq tool now reports if the background distribution cannot be calculated properly.
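
The trimming of imported variants mentioned under Tracks above (variants whose alleles carry reference bases at the ends) can be illustrated with a short Python sketch. The function name trim_variant, the 1-based position convention, and the choice to trim the shared suffix before the shared prefix are assumptions made for illustration only; these notes do not describe how the Workbench importer is implemented.

```python
def trim_variant(pos, ref, alt):
    """Strip bases shared by ref and alt from both ends of a variant.

    pos is the 1-based reference position of the first base of ref.
    Returns (pos, ref, alt) with the shared flanking bases removed.
    Hypothetical helper, not the Workbench's own code.
    """
    # Trim the shared suffix first; this never changes the start position.
    while ref and alt and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Trim the shared prefix, moving the start position one base per removal.
    while ref and alt and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


if __name__ == "__main__":
    # A "replacement" padded with reference bases collapses to a SNV.
    print(trim_variant(100, "ACGT", "ATGT"))    # (101, 'C', 'T')
    # Padded two-base change collapses to an MNV.
    print(trim_variant(5, "ACGTA", "ATTTA"))    # (6, 'CG', 'TT')
```

With this convention, an allele pair such as AT / ATG at position 10 reduces to an empty reference allele and an inserted G, that is, a plain insertion.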

Bug fixes

  • Workflow:
    • In the editor the "Fit Width" zooming was active, but behaved as "100%" zooming. Therefore the "100%" zooming is now active instead of the "FitWidth"
    • Added or connected elements are now placed near where you connect or add them, even when zoomed near or far.
    • It was possible to Undo the action of adding a workflow output, but it was not possible to Redo afterwards.
    • The right-hand icons in an element now scale with zooming.
    • The log of a workflow run on the server now contains the same information as when run in the workbench.
    • When configuring elements in the editor, the "Reset to CLC Standards" button is now functional and will reset the parameters to their default values. When configuring during installation or execution the button is disabled.
    • The log of a server executed workflow now states when the workflow has been cancelled.
    • A workflow with elements which provide additional inputs could not be batched.
  • Crash when adding data to experiments (e.g. by running any of the statistical analysis tools) has been fixed.
  • Track visualization: various bug fixes of track visualization.
  • Various bugs in the extract consensus sequence tool have been fixed.
  • Tracks with many "chromosomes" took up extra disk space. These are now more compressed.
  • In Reads Tracks, if no quality score information is available, all residues are given the worst value (0) rather than the highest (64), which was the case before.
  • Fixed crash when creating a detailed mapping report.
  • Read mapping and variant detection wizards were unresponsive when using input with many reference sequences.
  • When translating to protein, ambiguous nucleotides potentially resulting in stop codons were not translated properly, and only the codons resulting in an amino acid were represented in the protein. Now the stop codons are also represented by an X in the protein sequence.
  • A problem with filtering and sorting the BLAST output table has been solved. Some of the columns were treated as text instead of numeric.
  • Restriction maps, histogram data, and primer tables could not be exported to csv or similar. This has been fixed.
  • When setting up an experiment, samples in groups are now ordered according to how they appear when selected as inputs (in the same order).
  • When using the “Find Open Reading Frame” tool, the input sequence was reported incorrectly in the ORF table. This has been fixed.
  • Fixed problems with Excel export that failed when special characters were used in the name.
  • Some of the tooltips associated with table column headers did not match the right column header. This has been fixed.
  • A bug in the reported number of "perfect mapped" and "non-perfect mapped" reads has been corrected.
  • InDels and Structural Variants:
    • Fixed problem with detection of deletions in GWB versions 6.5.1 and 6.5.2
    • The reported allele sequence for the special kind of insertions that are tandem duplications was wrong. This has been fixed.
    • Fixed a bug in the code that should make sure that the similarity required for a mapping of an unaligned end is applied only to the aligned part of the unaligned end and not to the full unaligned end. This is important for insertions, where the part of the unaligned end that is inserted should not be considered.
    •  A corner-case in the InDels and Structural Variation tool for inputs containing long unaligned ends has been fixed (previously, when running the InDel and Structural Variants tool, errors occurred for some read mapping data sets).
    • Fixed problem where "Indels and Structural Variation" would break on certain inputs.
  • Extract Consensus Sequence: When using ambiguity codes for conflict resolution the filters (noise threshold and minimum nucleotide count) were always applied globally and not only in the presence of a conflict. Now the filters are only applied when there is an actual conflict.
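
The translation fix above (ambiguous codons that could be a stop codon are now shown as X) can be sketched in Python. This is a minimal illustration, assuming the standard genetic code and the rule "if the possible codons disagree, emit X"; the names translate and translate_codon are hypothetical and the Workbench's exact rules are not given in these notes.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon):
    """Translate one codon; return 'X' if its expansions disagree."""
    expansions = product(*(IUPAC[base] for base in codon.upper()))
    residues = {CODON_TABLE["".join(c)] for c in expansions}
    # A single possible outcome (amino acid or stop) is reported as is.
    # Mixed outcomes, including "amino acid or stop", become X (assumption).
    return residues.pop() if len(residues) == 1 else "X"


def translate(sequence):
    """Translate a nucleotide sequence in frame 1, ignoring a partial last codon."""
    usable = len(sequence) - len(sequence) % 3
    return "".join(translate_codon(sequence[i:i + 3]) for i in range(0, usable, 3))


if __name__ == "__main__":
    # TGR is TGA (stop) or TGG (W), so it is reported as X.
    print(translate("ATGTGRTAA"))  # MX*
```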

Changes

  • The two tools in the "Multiplexing" folder in the toolbox category NGS Core Tools have been changed:
    • "Process Tagged Sequences" has been renamed to "Demultiplex Reads" and is now directly in the NGS Core Tools folder.
    • "Sort Sequences by Name" has been moved to the Sequencing Data Analysis folder.
  • The De novo assembly legacy plugin has been discontinued and is no longer available for this release.
  • Motif search: annotations created by the motif search are of type "Motif" instead of "Region"
  • The “Download Genome” tool found under “Download” in the toolbar has been renamed to “Download Reference Genome Data” to make clear that the “Download Reference Genome Data” can be used to download e.g. annotations and variant data as well as reference genomes.
  • The “Fasta” importer found in the toolbar under “Import” has been renamed to “Fasta Read Files” to stress that this importer preferably should be used to import read files rather than reference sequences. The reason for this is that the “Fasta read files” import option allows the read names to be included, whereas the descriptions from the fasta files are ignored. Hence, as the standard import function keeps the descriptions in addition to the read names, we recommend using the Standard Importer for import of other fasta format data (such as reference sequences).
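
The distinction between names and descriptions in fasta files, which drives the importer recommendation above, can be shown with a small Python sketch. It assumes the common convention that the name is the text up to the first whitespace of the header line and the description is the remainder; split_fasta_header is a hypothetical helper and not part of the Workbench.

```python
def split_fasta_header(header_line):
    """Split a fasta header line into (name, description).

    Assumes the name ends at the first whitespace character.
    """
    text = header_line.lstrip(">").strip()
    parts = text.split(None, 1)
    name = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return name, description


if __name__ == "__main__":
    print(split_fasta_header(">chr1 Homo sapiens chromosome 1"))
    # ('chr1', 'Homo sapiens chromosome 1')
```

An importer that keeps only the first element retains read names but discards the descriptive text, which matters more for reference sequences than for reads.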

Compatibility

  • This release can be used with CLC Genomics Server 6.0
  • This release uses the read mapper and de novo assembler that correspond to CLC Assembly Cell 4.2.1


CLC Genomics Workbench 6.5.2

Release date: 2014-01-23

Bug fixes

  • Fixed: Error message when running analysis on experiments (statistical tests, clustering etc.)
  • Fixed: track lists would sometimes be rendered empty when scrolling tracks with insertions.
  • Fixed problem of unresponsive dialogs when running analysis with many reference sequences.
  • Fixed a problem in track lists that caused the view to crash when there is an insertion at the very end of a chromosome.
  • The folder used by the Workbench for storing log and settings files on Windows 8 has been updated to follow the conventions for Windows 8, which are identical to those for Windows 7. Any existing settings will be copied to the new location automatically.
  • Fixed various problems related to launching the Workbench through Java Webstart.
  • Fixed: Opening a search view for searching sequences at NCBI would sometimes fail.
  • Fixed: The Target Regions Statistics tool did not handle annotations covering the starting point of circular reference sequences properly. If you are using the tool with annotations spanning across the starting point of a circular reference, we recommend re-running the analysis.
  • In BAM files created by BWA, non-specific reads are now recognized as such during import. Previously, they were treated as unique reads.
  • Improved stability of Probabilistic variant detection on huge data sets.
  • Fixed various stability and performance problems of Maximum likelihood phylogeny.
  • Fixed problem that caused a crash with extract consensus sequence tool with certain parameter configurations and with read mappings with no reads.
  • Fixed crash of detailed mapping report tool with certain data sets.
  • Fixed error in importing SOLiD XSQ files.
  • Fixed an error when importing BAM files, including problems regarding download of reference sequences.
  • Fixed a read mapper error occurring under special circumstances when excluding regions of a reference when mapping reads.
  • Fixed a bug in the Assemble Sequences tool causing some data sets to produce inferior contigs.
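
For the Target Regions Statistics fix above, the special case is an annotation that spans the starting point of a circular reference. A minimal Python sketch of how such a region can be split into linear pieces follows. The 1-based inclusive coordinates and the convention that start > end signals an origin-spanning annotation are assumptions for illustration; both function names are hypothetical.

```python
def split_circular_interval(start, end, length):
    """Return the linear (start, end) pieces of a region on a circular sequence.

    Coordinates are 1-based and inclusive. A region with start > end is
    taken to wrap around the origin (assumed convention).
    """
    if not (1 <= start <= length and 1 <= end <= length):
        raise ValueError("coordinates must lie within 1..length")
    if start <= end:
        return [(start, end)]
    # Wraps around: one piece up to the last base, one piece from the first base.
    return [(start, length), (1, end)]


def covered_length(start, end, length):
    """Number of reference positions covered by the region."""
    return sum(e - s + 1 for s, e in split_circular_interval(start, end, length))


if __name__ == "__main__":
    print(split_circular_interval(95, 5, 100))  # [(95, 100), (1, 5)]
    print(covered_length(95, 5, 100))           # 11
```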

Compatibility

  • This release is based on CLC Assembly Cell 4.2.1
  • This release can be used with CLC Genomics Server 5.5.X.


CLC Genomics Workbench 6.5.1

Release date: 2013-10-21

New features

  • VCF export allows you to enforce diploid reporting of the variants. This will enable the VCF files to be parsed with other software relying on each line to report two alleles. As part of this, the CLCAD field is replaced with CLCAD2 (read more in the user manual). If you use VCF export in workflows, please see this special note.
  • Heat maps: It is now possible to show a legend on a heat map.

Changes

  • Variant comparison tools are workflow-enabled
  • When importing Genbank nucleotide sequences, the Workbench will determine whether it is DNA or RNA based on the sequence rather than the description in the file.
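
Deciding between DNA and RNA from the sequence itself, as described for Genbank import above, could look roughly like the following Python sketch. The rule used here (U present and T absent means RNA, everything else is treated as DNA) is an assumption for illustration; the Workbench's actual criterion is not stated in these notes, and guess_nucleotide_type is a hypothetical name.

```python
def guess_nucleotide_type(sequence):
    """Guess 'DNA' or 'RNA' from the residues in a nucleotide sequence.

    Assumed rule: RNA if the sequence contains U and no T, otherwise DNA.
    """
    residues = set(sequence.upper())
    if "U" in residues and "T" not in residues:
        return "RNA"
    return "DNA"


if __name__ == "__main__":
    print(guess_nucleotide_type("AUGGCU"))  # RNA
    print(guess_nucleotide_type("ATGGCT"))  # DNA
```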

Bug fixes

  • Fixed: An important issue with the interpretation of ensembl-style gtf files when using the Download Genomes functionality or the Import Tracks functionality of the Genomics Workbench. This issue only affects version 6.5 of the Genomics Workbench. If you have downloaded gene annotations using Download Genomes or have chosen to import ensembl-style gtf annotation files using the tool Import | Tracks using version 6.5 of the Genomics Workbench, then we highly recommend that you delete the annotation tracks you have generated, and perform the download or import again.  Annotations from earlier versions of the Workbench are not affected by this issue.
  • Fixed inconsistencies when importing variant files from UCSC, affecting variants on the negative strand where the allele sequence is longer than one base. This affects dbSNP tracks downloaded using the Download Genome tool, and we highly recommend that you delete any variant tracks imported or downloaded from UCSC, and perform the import or download again.
  • Fixed: Filter Against Control Reads was using only the first control reads track, if multiple ones were selected. The issue affected both 6.0 and 6.5 versions. If you used multiple control read tracks simultaneously to filter variants, we strongly recommend that you redo the analysis.
  • Fixed: Trimming sequencing data for vector contamination from UniVec failed
  • Fixed: It was not possible to proceed in the ChIP-Seq Analysis wizard without a control sample.
  • Fixed: GFF export failed.
  • SAM/BAM import: reads mapped to reference sequences that were not provided during import are no longer included in the list of unmapped reads. Instead they are in a separate list.
  • Improvement of information displayed in license dialogs
  • Fixed: in track views, coloring reads on quality score did not affect unaligned ends.
  • Fixed: in track views, coloring reads on quality score did not work properly for paired reads.
  • Improvements and fixes to the Indel and Structural Variation tool:
    • Improved the detection of insertions and deletions from self-mapping evidence particularly relevant for amplicon data
    • Fixed: a bug which caused some variants to be called as 'replacements' that should be called as 'insertions' or 'deletions'
    • Fixed: a bug which caused structural variations to go undetected for long unaligned ends
  • Fixed: In the trim dialog, it was not possible to de-select an adapter list without resetting all settings to default.
  • Fixed: In Trio Analysis, homozygous variants on chr Y and MT and male X were wrongly marked as de novo mutations when not found in the father. The parameters for Trio Analysis have been changed as part of this.
  • Fixed: In Trio Analysis, variant tracks with "unknown" zygosity values would be accepted, creating misleading results. Read more in the FAQ.
  • Fixed: SAM and BAM export now supports direct gzip and zip compression of the files.
  • Fixed: Copying information from the Folder Contents view did not work.
  • Fixed: Local Realignment failed on certain data sets
  • Fixed: out of memory error when performing bootstrapping with ML tree construction methods.
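
The UCSC import fix above concerns negative-strand variants with alleles longer than one base. One plausible reading, offered here as an assumption rather than a description of the actual bug, is that alleles given on the negative strand must be reverse-complemented to be expressed on the forward strand, and that for a single base the complement and the reverse complement coincide, so only longer alleles can come out wrong. A Python sketch:

```python
# Complement table for unambiguous DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(allele):
    """Complement each base without changing the order."""
    return allele.translate(COMPLEMENT)


def reverse_complement(allele):
    """Express a negative-strand allele on the forward strand."""
    return complement(allele)[::-1]


if __name__ == "__main__":
    # Single base: both operations agree.
    print(complement("A"), reverse_complement("A"))      # T T
    # Longer allele: only the reverse complement is correct on the forward strand.
    print(complement("AAC"), reverse_complement("AAC"))  # TTG GTT
```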


CLC Genomics Workbench 6.5

Release date: 2013-08-21

 

New features

  • Variant detection:
    • New tool for adjusting read mappings through local realignment. The Local Realignment tool can realign unaligned ends, can realign with a guidance variant track (e.g. obtained from external resources such as dbSNP, through the Indels and Structural Variants tool described below or from analysis of other read mappings), and allows for realignment of multiple samples. Has previously been available as a beta plugin.
    • New tool for detecting structural variants (detects insertions and deletions, intra-chromosomal translocations, tandem duplications and inversions) working on "unaligned ends (soft clippings)". Has previously been available as a beta plugin.
    • Important changes to variant reporting: adjacent variants are now reported as one variant instead of linked variants.
    • A new variant filter has been added to both “Probabilistic Variant Detection” and “Quality-based Variant Detection”: “Ignore variants in non-specific regions”. This new filter ensures that variants in regions covered by just a few non-specific reads are ignored.
    • Probabilistic Variant Detection: A new threshold filter, “Required variant count”, has been added to the wizard. This filter ensures that only variants present in a number of reads that exceeds the specified threshold are called.
    • Quality-based Variant Detection: Addition of a new column that reports hyper-allelic status of variants. This is based on the specified threshold “Maximum expected allele” in the “Set genome information” wizard under “Ploidy”. The output in the table is “Yes” or “No” with respect to whether the threshold has been exceeded.
    • A new column has been added to the variant track table that describes the length of the insertions, deletions, and replacements. This makes it possible to filter on the length of e.g. insertions/deletions.
    • VCF export is now using genotype fields. The tag CLCAD is used for count of a variant, and PL is used for coverage. In this version, one variant track will result in one VCF file.
  • Variant annotation:
  • Workflows:
    • Automatic adjustment of layout in workflows. It is now (again) possible to adjust the connected workflow elements automatically. Right click in the workflow editor to access a menu with the option "Layout". Clicking on "Layout" will adjust the layout of the workflow. The layout can also be adjusted with the quick command Shift + Alt + L.
    • Automatic update of tools in workflows. Tools in existing workflows will automatically be updated when opened from the Navigation Area. If new parameters have been added to the updated version of the tool, these will be used with their default settings. A workflow can be kept in its original form by saving the updated workflow with a new name as this will ensure that the old workflow is kept rather than being overwritten.
    • Information: In the “Manage Workflows” dialog a new tab has been added providing information about the workflow (such as when it was built and which version of the workbench was used).
    • Highlight used elements: In the Side Panel under “View mode” it is now possible to select “Highlight used elements”, which will show all elements that are used in the workflow. Unused elements are grayed out. The “Highlight used elements” can also be activated with the quick command Alt + Shift + U.
    • Highlight Subsequent Path: Is a further development of the old option “Mark Subsequent Path”.  If you right click on the name of one of the tools in a workflow, it is possible to select “Highlight Subsequent Path”, which will highlight the path in the workflow from the tool that was clicked on and further downstream.  All other elements in the workflow will be grayed out.
    • Create Installer: A workflow can now be installed directly from the workbench. This can be done with the “Create Installer” button (or the quick command Alt + Shift + I). Three options exist in the “Create Installer” dialog: 1) Install the workflow on your local computer, 2) Install the workflow on the current server (requires that you are logged on to the server and that you are the administrator), and 3) Create an installer file to install it on another computer.
    • Export can now be part of workflows.
    • Workflow enabled elements can be dragged directly from the Toolbox into the workflow editor.
    • Workflow images can be copied from the editor by using Ctrl + C (copy) and pasted into the desired destination with the Ctrl + V command.
    • All elements can be removed from the workflow with the shortcut Alt + Shift + R.
    • Previously, when running the “ChIP-Seq Analysis” tool, the result would be a copy of the read mapping with annotations added. Now the annotations are added to the read mapping used as input. Workflows using the "ChIP-Seq Analysis" tool must be manually updated, deleting the ChIP-Seq workflow element and adding it again.
    • Reinstallation of modified workflows can now be done directly with the “Create Installer” function. A pop-up dialog provides the option to make "forced import" of an already installed workflow.
    • Speed improvements in the workflow editor mean that the user experience when editing large workflows has been greatly improved.
    • New tools that are now workflow-enabled:
 
  • 3D Molecule Viewing: The integrated viewer of structure files, the 3D Molecule Viewer, has been completely redesigned. The Molecule Viewer offers a range of tools for inspection and visualization of proteins and other molecules stored in structure files from the Protein Data Bank (PDB).
  • De novo assembly
    • New tool: Map Reads to Contigs. This tool allows mapping of reads to contigs. This can be relevant in situations where contigs have been imported from an external source,  the output from a de novo assembly is contigs with no read mapping, or if you wish to map a new set of reads or a subset of reads to the contigs. Scaffolds can be exported in AGP format: scaffolded contigs are exported as individual contigs and not as a single scaffold with N's inserted in between contigs. This allows for submission-ready data.
    • Great performance improvement when updating the contig sequence based on reads that are mapped back to contigs.
  • Tracks: Several new features have been added

It is now possible to:

    • When there are more reads than can be shown in the available view area, an overflow graph will be displayed below the reads. The overflow graph was previously shown in grey. Now the overflow graph is shown in the same colors as the sequences. Hence, it is now possible to distinguish forward, reverse and paired reads in the overflow graph as well as mismatches in reads.
    •  Insertions from variant tracks and reads tracks can now be shown in tracks.
      • For variant tracks, a new side-panel option “Insertion” allows the user to select whether to display insertions or not.
      • For reads tracks, insertions seen in more than a given percentage of reads are shown. The default percentage is 1%. Setting it to 0% will show all insertions (like the cluster editor) and setting it to 100% will show no insertions.
      • Insertions in reads tracks that are present at a frequency below the specified threshold are shown with a vertical line in the reads to indicate their location.
      • Reads tracks now also have a mouse-over tooltip that provides information about insertions at specific positions. This tooltip reports the number of reads that contain the insertion and lists what was inserted.
    • Extract reads from read tracks in two different ways:
      1. Extract sequences from tracks. Allows extraction of all reads as single sequences or as sequence lists.
      2. Extract from selection. Allows the creation of a reads track containing only reads that fall within the selected region, and of specific types.
    • Four new options are available in the Side Panel for Track layout when viewing a reads track:
      1. Show quality scores: Makes it possible to adjust the colors of the residues based on their quality scores. In cases where no quality scores are available, blue (the color normally used for residues with a low quality score) is used as default color for such residues.
      2. Matching residues as dots: Replaces matching residues with dots in reads tracks. This option makes it easier to spot variants.
      3. Show read type specific coverage: When enabled, the coverage graph that summarizes those reads that could not be explicitly shown is now replaced by one coverage graph for each read type found in the Reads track. This can be used for easy and visual comparison of the strand specific coverage.
      4. Only show coverage graph: When enabled, only the coverage graph is shown and no reads are shown.
    • A new tool has been included: “Identify Graph Threshold Areas”. This tool uses graph tracks as input to identify graph regions that fall within certain limits (thresholds that have been specified by the user).
    • Extract annotations from track. This tool makes it very easy to extract parts of a sequence (or several sequences) based on its annotations.
    • Create a track list with the shortcut Ctrl + L
    • The create histogram tool now also accepts graph tracks as input.
    • The error message "Too much data for rendering. Either zoom in to view data, or adjust data aggregation threshold" has now been added to the big grey box that appears in cases where a track cannot be shown. Previously only a big grey box was shown with no further explanation.
    • Opening a large table view of a variant track is no longer blocking the user interface. It is running in the background, and it is possible to stop loading the data by closing the table view.
  • Export framework redesigned
    • Export of multiple files: you can export several files in one go. The naming of the file will default to the name used in the Navigation Area of the Workbench, but the user can specify a naming pattern to use instead.
    • Export formats: A new column “Exports selected” has been added to the “Select exporter” table that indicates with a “Yes”, “No” or “Partly” whether the currently selected element can be exported with the given exporters. Partly means that you have made a selection of elements where only some of them can be exported by the selected exporter.
    • Improved usability with a quick-select dialog for choosing the right export format. The dialog includes a description of each exporter that can be quickly filtered.
    • Export can be integrated into workflows
    • Support for direct compression of exported files in zip and gzip.
    • Previously, VCF export required the user to know that both a variant track and a sequence track should be selected before exporting. This has changed, so that the user only has to select the variant track as input, and the sequence track is supplied as a parameter. This means it is more obvious that it should be selected, and it also means that the choice of sequence track will be remembered for the next VCF export.
  • The folder view has been improved with the following:
    • It is now possible to drag and drop objects from the folder editor. This will create a copy of the objects at the selected destination.
    • Attribute columns will be left empty if the attribute has not been defined (previously attribute values that had not been defined were set to 0 and checkboxes were shown as unchecked).
    • A new column showing the first 50 residues of each sequence has been added as an option.
    • The column with the name “Length” has been renamed to “Size”.
    • The column “Size” shows the length of a sequence, or for sequence lists, the number of sequences e.g.:
      • Sequence or contig lists: the number of sequences/contigs
      • Read mappings: the length of the consensus sequence
      • De novo assemblies: the length of the reference
      • Alignments: the length of the alignment
  • The Side Panel “Save/Restore Settings” function has been expanded with a new feature:
    • The “Save/Restore Settings” function (found at the top of the Side Panel) has been redesigned. It is now possible to save settings in two different ways. 1) The settings can be saved for this element type in general, e.g. for a track it would be save settings “For Track View in General”. In this way the settings will be applied each time you open an element of the same type, which in this case means that each time one of the saved tracks is opened from the Navigation Area, these settings will be applied. These “general” settings are user specific and will not be saved with or exported with the element. 2) Settings can be saved with the specific element only, e.g. for a track it would be save settings “On This Track Only”. The settings are saved with only this element (and will be exported with the element if you later select to export the element to another destination).
  • Alignments: If you have one particular sequence that you would like to use as a reference sequence, it can be useful to move this to the top. This can now be done automatically by right clicking on the sequence name and then selecting “Move Sequence to Top”.
  • The Sequence List Table has been improved with a new feature. A new column showing the first 50 residues of each sequence has been added as an option.
  • SOLiD import now accepts XSQ files
  • The following Plug-ins are now fully integrated in the Workbench:
  • The tomato genome, Solanum lycopersicum SL2.40.18, is now available in the Download Genome tool.
  • Phylogenetic trees:
    • Create Tree now supports the Kimura 2-parameter substitution model for DNA sequences and Kimura's distance estimate for protein sequences (Kimura 1983).
    • It is now possible to construct Maximum Likelihood phylogenies from protein sequences.
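
As an illustration of the Kimura 2-parameter model mentioned above, the following is a minimal sketch of the textbook distance formula, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transition and transversion differences. The function name and the choice to skip positions with gaps or ambiguity codes are assumptions made for this example; the release notes do not describe how Create Tree implements the model.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration only. Positions where either
    sequence has a gap or a non-ACGT symbol are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A <-> G and C <-> T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable positions")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the formula to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For example, `k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")` has one transition in ten compared positions, so P = 0.1 and Q = 0, and the distance is -1/2 ln(0.8).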

Improvements

  • Scrolling in read mappings: Scrolling through read mappings with the mouse wheel can now be done faster. Shift + Alt + Mouse wheel scroll is 10x as fast as Alt + Mouse wheel scroll. When zoomed all the way in, each mouse wheel step scrolls 10 rows.
  • Sort Sequences by Name: The multiplexing tool now allows a delimiter between group names in the “Sort Sequences by Name” wizard. This means that the selected group names are separated by an underscore. Previously all selected group names were merged without any delimiter.
  • Cloning: The cloning editor can now work without having a designated vector. In essence this means that when no vector is selected you go directly into “Stitch mode” when a fragment has been selected, whereas you go into “Cloning mode” when a cloning vector and a fragment are selected.
  • Renaming of data in the Navigation Area by clicking twice has been improved. Previously, you could accidentally enter rename mode when the intention was to get focus in the Navigation Area. Now, you can only trigger rename by clicking when the Navigator has focus.
  • Filter Annotations on Name: The wizard layouts for the tool when used directly and when included in a workflow have been standardized.
  • Extract consensus sequence tool:
    • It is now possible to use the quality scores when resolving conflicts or disagreements between reads with “Insert ambiguity codes”. Previously, “Use quality scores” could only be selected when using the “Vote” option for conflict resolution.
    • Low coverage regions are now annotated in the consensus sequence produced.
    • The Extract Consensus Sequence dialog is now shown when extracting the consensus sequence when right-clicking a selection on the reference sequence in the mapping view, enabling the user to extract part of the consensus sequence.
    • The Extract Consensus Sequence dialog is now shown when extracting the consensus sequence when right-clicking the name of the consensus or reference sequence, or when clicking the Extract Consensus button in the mapping table. The right-click menu option on the consensus to Open Sequence Including Gaps has been removed, since this functionality is now covered by the Extract Consensus Sequence tool.
  • When using the “Translate to protein” tool, the max limit has been raised to 1GB.
  • The sequence action "Open Copy" has been removed and "Open This Sequence" has been renamed to "Open Sequence".
  • The alignment tool is now more memory efficient.
  • Tables: Improved auto-adjustment of the column width (based on content and number of columns).
  • Read mapping: The speed of running a read mapping against a masked reference has been improved significantly. When mapping reads to a reference sequence, it is possible to map reads to only selected annotated regions of the reference (= masking). Previously masking of a reference was performed by replacing the masked out nucleotides with N's. The new masking method discards the masked out nucleotides by splitting the reference into separate sequences. Hence, the masked out sequences are completely ignored in the analysis. The remaining sequence fragments are positioned according to the original unmasked reference sequence.
  • Read mapping: The status bar in the lower right corner now shows the corresponding positions on the reference/contig sequence.
  • The read mapper will now place ambiguous gaps to the left, as opposed to the right, to ensure better concordance with common variant databases.
  • BLAST has been upgraded to BLAST+ 2.2.28 that includes a number of improvements and bug fixes. A full list of BLAST+ 2.2.28 changes can be viewed at http://www.ncbi.nlm.nih.gov/books/NBK131777.
  • Usability improvement of simple table filtering:
    • A dedicated filter button has been added to apply the filter directly without having to wait until the filter is automatically applied.
    • For tables with more than 10000 rows, the filter will not be applied automatically after a delay. Instead, a help text asks the user to apply the filter through the "Filter" button. This avoids premature filtering before entry of the filter text has been completed. Since filtering can take some seconds for large tables, this used to be an annoyance because the user would have to wait until filtering was done to complete the entry.
  • Phylogenetic trees:
    • Bootstrapping with the "Maximum Likelihood Phylogeny" is now possible.
    • Bootstrap values are now displayed in percent instead of absolute numbers.
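
The masking change described above for read mapping (splitting the reference into separate sequences while keeping positions relative to the original reference) can be sketched as follows. This is only an illustration of the idea under stated assumptions: the function names are hypothetical, regions are taken to be 0-based half-open intervals, and nothing here reflects the Workbench's actual implementation.

```python
def split_masked_reference(reference, keep_regions):
    """Split a reference into the fragments that should be mapped against.

    `keep_regions` is a list of (start, end) pairs, 0-based and half-open
    (an assumption for this sketch). Each fragment is returned together
    with its offset in the original, unmasked reference.
    """
    fragments = []
    for start, end in sorted(keep_regions):
        fragments.append((start, reference[start:end]))
    return fragments


def to_reference_position(fragment_offset, position_in_fragment):
    """Translate a position within a fragment back to the original reference."""
    return fragment_offset + position_in_fragment
```

With the reference `"AAACCCGGGTTT"` and the regions `[(0, 3), (6, 9)]`, the masked-out bases `CCC` and `TTT` never reach the mapper, and a hit at position 1 of the second fragment is reported at position 7 of the original reference.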

Bug fixes

  • Numbering of amino acids when calculating amino acid changes was wrong for coding regions spanning the starting point of circular chromosomes. We recommend running amino acid calculation again. Please note that the actual amino acid change is called correctly, only the numbering is affected.
  • PDF export of the history of a result did not include the name and version number of the Workbench that produced the result.
  • Phylogenetic trees:
    • The Jukes-Cantor distance estimate now ignores all positions containing gaps in pairwise alignments.
    • Disabled substitution rate estimation when the corresponding option is deselected by the user in the Maximum Likelihood Phylogeny tool.
    • Fixed a bug that caused branch lengths to be estimated incorrectly for ML trees.
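
To make the Jukes-Cantor fix above concrete, here is a minimal sketch of the standard Jukes-Cantor distance for DNA, d = -3/4 ln(1 - 4p/3), where p is the proportion of differing positions and any position with a gap in either sequence is left out of the comparison. The function name and the gap character are assumptions for this example, not details taken from the Workbench.

```python
import math


def jukes_cantor_distance(seq_a, seq_b, gap_chars="-"):
    """Jukes-Cantor distance for a pairwise DNA alignment.

    Hypothetical helper for illustration only. Positions where either
    sequence contains a gap character are ignored entirely.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # gap positions contribute neither matches nor mismatches
        compared += 1
        if a != b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no ungapped positions to compare")

    p = mismatches / compared
    if p >= 0.75:
        raise ValueError("distance is undefined for p >= 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)
```

In this sketch, `jukes_cantor_distance("ACGT-A", "ACGTTA")` is zero because the only difference falls on a gap position, which is skipped rather than counted as a mismatch.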

Changes

  • Option to calculate RPKM values for genes without associated transcripts has been added to RNA-seq analysis.
  • System requirements for Linux have changed. From this release, SuSE is supported from version 10.2. This was previously version 10.0.
  • Secondary Peak Calling: The parameter “Fraction of max peak height for calling” in the “Secondary Peak Calling” wizard has been changed to use the interval 0-1 with 0.2 as default setting. Previously the interval was 0 – 100 with 20 as default setting.
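
For readers unfamiliar with the RPKM values mentioned in the first change above, the commonly used definition is reads per kilobase of feature length per million mapped reads. The sketch below applies that general definition; the function and parameter names are hypothetical, and the release notes do not state which reads or which length the Workbench uses for genes without associated transcripts.

```python
def rpkm(reads_mapped_to_gene, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase per Million mapped reads (general definition).

    RPKM = reads * 10^9 / (length in bp * total mapped reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return reads_mapped_to_gene * 1e9 / (gene_length_bp * total_mapped_reads)
```

As a worked example, 500 reads mapped to a 2,000 bp gene in a sample with 10,000,000 mapped reads gives 500 × 10^9 / (2,000 × 10,000,000) = 25.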


CLC Genomics Workbench 6.0.5

Release date: 2013-07-09

Bug fixes

  • Fixed problems downloading and importing COSMIC variation data introduced in CLC Genomics Workbench 6.0.4: Sex chromosomes and mitochondrial genome were not annotated. We recommend everybody having downloaded or imported COSMIC variations with CLC Genomics Workbench 6.0.4 to re-do the download or import and re-run all analysis where this COSMIC variant track has been used.
  • Various minor bug fixes.


CLC Genomics Workbench 6.0.4

Release date: 2013-05-15

Bug fixes

  • Fixed problems with annotation file download and import with genomes with more than 22 autosomes. Read more in the FAQ
  • Fixed problem in workflows: it was not possible to configure all elements when running a workflow that branched after the input element.
  • Fixed issue with automated association of chromosome names during import of track data for some non-human organisms.
  • Fixed problem when trying to start ChIP-Seq analysis on a CLC Genomics Server
  • Fixed error in Find primer binding sites
  • Fixed error in Quality-based Variant Detection
  • Fixed problem with zero coverage not reported properly in target region statistics


CLC Genomics Workbench 6.0.3

Release date: 2013-05-08

Bug fixes

  • Detailed mapping report: better labeling of plots
  • The Create Statistics for Target Regions tool began counting the reference positions at 0 rather than at 1, which caused a discrepancy with the reference positions reported in other tools.
  • Description text in progress area is now making full use of available width of the progress area
  • Fixed errors relating to exporting graphics of read mappings
  • Handling of line breaks in annotation notes improved
  • On Linux: User interface text has been changed to not use bold font to make a better visual appearance
  • ChIP-Seq annotations were not added when running ChIP-Seq on the Genomics Server. The fix means that workflows using ChIP-Seq will be broken and need to be re-configured by deleting the ChIP-Seq element and adding it again.
  • Create mapping graph tracks caused problems when part of workflows
  • Fixed error that caused variant detection to crash


CLC Genomics Workbench 6.0.2

Release date: 2013-03-19

Improvements

  • An update to the de novo assembly algorithm means that it will only include Ns in the contigs when doing scaffolding, or if the reads themselves contain Ns. Previously, ambiguities in the graph behind the assembly resulted in regions of Ns, but these have turned out to be problematic for customers submitting their results to NCBI, so the algorithm is now taking extra care to avoid this.
  • VCF export: The header now states the name and version of the software that produced the VCF file, and the identifier of the originating variant track is also encoded as a CLC URL in the header. By default, the Workbench installer associates the CLC URL with the Workbench, so that it can directly open the file. Alternatively, the id can be pasted into the search field in the Workbench to retrieve it.
  • Fragments generated from restriction site analysis can now be opened in batch. When multiple rows in the fragment table are selected, the right-click menu option will now create a sequence list with all the selected fragments.
  • For alignments, mappings, BLAST results and other sequence views, the right-click options to Open Copy of Sequence and Open This Sequence have been merged to Open Sequence. If a copy should be created, use Save As with the new sequence, or drag it into a folder in the Navigation Area.

Bug fixes

  • Import or download of UCSC variant tracks was only done partially with no warning to the user. Only variants on chr1 were annotated. This has now been fixed, but we strongly recommend all users downloading or importing variant data from UCSC using Genomics Workbench 6.0 to re-run the import/download using the new version.
  • Trio analysis tool did not report a reference allele as a de novo mutation, even if both mother and father only had variant alleles at this position. This has now been fixed so that reference alleles are not considered special when analyzing the inheritance.
  • The RNA-Seq Analysis produced only single reads in the unmapped reads list. This has now been fixed, and we encourage customers using paired reads as input and performing downstream analysis of the unmapped reads to rerun the RNA-Seq Analysis.
  • In the GO Enrichment Analysis tool for variant data, some columns were missing. This has now been fixed.
  • When trimming paired data, section 4 in the report did not show the right number of reads used as input.
  • Several errors related to workflow configuration and execution have been fixed.
  • Errors related to managing Workspaces have been fixed.
  • An error occurring when using variant tracks from old versions in the Compare Variants tool has been fixed.
  • Annotations were added by the Find Open Reading Frames tool, even though the option to add annotations was not selected. This is now fixed.
  • Fixed an out-of-memory problem in the Create Alignment tool.
  • The result of the Target Regions Statistics tool is now named after the input file.


CLC Genomics Workbench 6.0.1

Release date: 2013-01-23

Improvements

  • The RNA-Seq tool supports strand-specific mapping of paired reads.
  • Improved performance of showing and exporting coverage graphs
  • Added Legal and Tabloid formats for printing
  • Made the reporting of automatic pair distance ranges for de novo assembly and read mapping more user friendly

Bug Fixes

  • Workflows including variant detection need to be upgraded. The variant detection elements need to be re-created and connected.
  • Fixed error in probabilistic variant detection that caused it to crash.
  • Fixed an error in the trim report: When several trim methods were chosen, the numbers did not accurately reflect the number of sequences trimmed in each step.
  • Fixed an error in the figure showing the paired distance in the RNA-Seq results report
  • Fixed an error when translating DNA to protein. When more than 10 sequences were produced, the resulting protein sequence included X instead of * as stop symbol. We advise customers to re-run any analyses with the translation tool when using more than 10 sequences as input.
  • Non-specifically mapped reads (multihit reads) were colored red and green instead of yellow in packed view and when disconnect pairs is enabled. This is now fixed.
  • Fixed error in target region statistics when some regions were 0 bases long.
  • Links to the reference sequence were missing from the history of mapping results. This is now fixed.
  • Unmapped reads from de novo assemblies were not passed on to the next element in a workflow. This is now fixed.
  • Various minor bug-fixes


CLC Genomics Workbench 6.0

Release date: 2013-01-08

New features

    • Workflow: there are several important new features for workflows
      • It is possible to control which parameters should be locked or unlocked. This means that the creator of the workflow can decide which parameters should be left open for adjustment when the workflow is executed.
      • Usability of workflow editor greatly improved
        • There is a grid for helping layout
        • Visual indication of the number of possible connections as input to a workflow element
        • Visual indication to show if parameters have been changed
        • Handles for dragging and connecting elements have been made more clear
        • Side Panel for controlling grid layout and switching on compact visualization of workflow elements
        • You can high-light selected paths in a workflow
      • It is now an option to attach the workflow design file with the installer to allow users to edit the workflow
      • There is a special icon for workflows in the Toolbar to make the creation and installation of workflows more visible
      • Several tools are now workflow-enabled:
      • Workflow compatibility: with this release, all of the tools in the Resequencing folder and the Trim tool have changed. This is mainly due to the change in the variant format (explained below). Workflows using these tools need to be updated by deleting the tool, adding it again and restoring the connections and parameters that have been modified. When you open the workflow editor, the workflow elements that need to be updated are high-lighted in red. For installed workflows, this needs to be done in the original workflow design, and the installer needs to be re-built and installed again. We are sorry for the inconvenience caused by this, and we are working on a solution to make the upgrade mechanisms for the next release much more smooth.
    • Variant detection and resequencing
      • New variant data format. We recommend all users of the variant detection tools to read the change notes in the manual for this release. The main features are:
        • Variants are reported with one entry per allele. This means that heterozygous variants are represented as two lines, including one line for the reference allele.
        • Variants were previously joined to form MNVs. The MNV concept has been replaced by linkage groups that mark that two variants have been observed together and assures that tools like Amino Acid Changes will produce correct results.
        • As a consequence, the variant types have been updated.
      • As a consequence of the new data format, the Filter against Variant Database tool has been updated and renamed to Filter against Known Variants:
        • The auto-link feature is now obsolete
        • There are now three modes of filtering (learn more here). The filter for exact matches replaces the Haplotype Comparison tool which has been removed from this release
      • New tool for annotating variants with flanking sequence from the reference
      • New tool for removing reference allele variants
    • De novo assembly
      • Automatic paired distance estimation is now part of the de novo assembly
      • Guidance only option is now able to use single reads as well as paired reads
      • The number of Ns deriving from ambiguities in the graph data structure built by the assembler is reduced. Note that this does not refer to Ns inserted as part of scaffolding.
      • Fixed problem causing scaffold annotations to be removed when updating contig sequences based on mapping
      • Improved the scaffolding accuracy for overlapping contigs.
    • Mapping reads to circular chromosomes is now fully supported
      • Visualizations of reads that map across the starting point of the sequence are shown both at the start and end of the reference sequence, marked with >> to indicate that the alignment continues at the other end.
      • All algorithms and exporters support circular mappings
      • When downloading genomes using the Download Genome tool, circular chromosomes are marked as circular. If this information is important for the further analysis, please download or import a new copy of the reference genome, since this information is not part of existing tracks. Circular and linear versions of the same chromosome are compatible when used in comparisons and for track list visualization.
    • New tool for extracting consensus sequence from a read mapping or BLAST result:
      • A number of options for handling low-coverage regions, including putting in Ns or splitting the consensus sequence
      • Ability to decide for ambiguity or voting scheme taking quality scores into account when dealing with conflicts. A noise threshold can be added for the ambiguity option.
      • Consensus sequences are annotated with important events (low-coverage regions and conflicts).
      • Ability to run in batch and be part of workflows
    • New tool for merging overlapping pairs
    • Tracks
      • Scrolling in mapping tracks can be done by pressing Alt while scrolling with scroll wheel or track pad
      • Vertical zooming in graph tracks can be achieved by pressing Alt while scrolling with scroll wheel or track pad
      • Vertical panning in graph tracks can be achieved by pressing Alt while using the Pan tool
      • VCF export of variant tracks: Please note that you have to select both the variant track and the reference genome sequence track before you click Export.
    • Trim:
      • Runs on multiple cores. This will greatly speed up trim on computers with multiple cores.
      • The definition of adapters for adapter trim has changed from the preferences to its own file in the Navigation Area. This makes it easier to manage large sets of adapters, it solves some usability problems related to the old dialog, and it makes it possible to work with adapter trim from the CLC Genomics Server Command Line Tools. Adapters can be imported directly using the standard import framework, or they can be created from scratch by manually adding in the adapter list editor.
    • Target region statistics:
      • The minimum coverage value is used throughout the coverage report and tracks for defining low coverage thresholds
      • Additional table and plot in the report showing how many target regions have a certain percentage of the region above the low coverage threshold.
      • Additional information in the track: median coverage and fraction of fragment covered by the minimum coverage
      • New output type: per-base coverage table can now be created
    • Detailed mapping report includes more information:
      • The tables for non-specific and non-perfect matches display the fraction of all mapped reads in addition to the number of reads
      • Overview plot of lengths of insertions and deletions in the read alignments
      • Tables and plots showing differences between reads and reference for each base.
      • Information about quality score distribution for matches and mismatches
      • Distribution of mismatches on read position
      • Information about number of reads with unaligned ends and distribution of lengths of the unaligned ends
    • Mapping views:
      • New option to disconnect pairs in the view. This is particularly useful for overlapping pairs which can be hard to tell apart in packed view.
    • Small RNA annotation:
      • miRBase can now also be imported from a file. Previously only direct download was possible.
      • Grouping on mature now uses all mature sequences, not only the 5' end.
      • Statistics on ambiguities are now available in the annotation summary report.
    • RNA-Seq: fusion gene table has been changed to list broken pairs rather than gene combinations. The pairs can be extracted to a sequence list for further investigation.
    • Import of tabular mapping files is no longer supported. This format was produced by the early Illumina pipelines (with Eland) and this is no longer relevant. The SAM format has taken the place as the de facto standard for mapping data.
    • Toolbox improvements:
      • New Favorite Toolbox: A new tab next to the Toolbox holds
        • Frequently used tools. This is automatically updated based on which tools are used most frequently.
        • Favorite tools: Right-clicking a tool in the Toolbox allows you to add a tool to your favorites list.
      • Quick launch of tools: Pressing Ctrl + Shift + T shows a dialog for easy typing and launching tools.
    • Relevant view settings are now copied when switching between different views of the same data. As an example: if you have specified a set of restriction enzymes to display in the circular view of a sequence and switch to the linear view, these enzymes will also be listed in the Side Panel here. Note that the Side Panel settings are only copied to the new view if they have been changed by the user in the old view.
    • Performance when sorting large tables has been improved
    • Rename can now be done through right-click menu in Navigation Area.
    • Annotations on circular sequences:
      • When shown in linear view they have a cleaner appearance. Before, there was a line from beginning to end of the annotation, and this has now been removed.
      • When shown in circular view, it is no longer displayed as a joined annotation over the start point but as a continuous annotation.
    • Alignments: The performance of the algorithm for running multiple alignments has been improved and now runs on multiple cores.
    • Find Open Reading Frames can be run in batch and workflows
    • Translate to protein can be run in batch and workflows
    • Restriction map: Excel export now creates a sheet for both the cut sites table and the restriction map.
    • Alignments can be used as input for finding primer binding sites.
    • Export now has progress and can be canceled.
    • BLAST results and 3D structures can be exported as text.
    • Batching: Previously, results were saved in the same folder as the input data. This can now be changed and a new save folder can be specified. Sub-folders will be created for each batch unit.
    • Export to fastq now supports sequences up to 32k in length
    • The limit for the cloning editor has been increased to 6,000,000 bases (from 4,000,000 bases).
    • Naming of output from de novo assembly and read mapping made consistent
    • Create Expression Clone (LR) from Gateway Cloning produces sequence object rather than a list
    • Shortcut key for Translate to Protein has changed from Ctrl + Shift + T into Ctrl + Shift + P.

Bug fixes

      • Fixed a number of mapper errors causing the mapper to crash.
      • Fixed a problem in the read mapper when estimating paired distances. This led to very few reads mapping as pairs.
      • Fix to the proxy settings recognition meaning that Download Genomes and download of BLAST databases now work when there is a proxy setup.
      • Fixed problem of not correctly formatting qualifiers in EMBL export.
      • Fixed a problem sorting sequence lists on modification date.
      • Test on proportions: Fixed an error caused by the wrong group being used as reference, which means that the positive values should have been negative and vice versa.
      • Various bug fixes.

Plug-in releases

Structural variation plug-in has been updated with a completely new algorithm based on unaligned ends (soft clipping). The plug-in is still in beta. Read more in the user manual.


CLC Genomics Workbench 5.5.2

Release date: 2012-12-07

Bug fixes

  • Fixed a number of mapper errors causing the mapper to crash.
  • Fixed a problem in the read mapper when estimating paired distances. This led to very few reads mapping as pairs.
  • Fixed proxy settings recognition, so that Download Genomes and download of BLAST databases now work when a proxy is set up.
  • Fixed a problem where qualifiers were not formatted correctly in EMBL export.
  • Fixed a problem sorting sequence lists on modification date.
  • Test on proportions: Fixed an error caused by the wrong group being used as reference. As a result, values reported as positive should have been negative, and vice versa.
  • Various bug fixes.


CLC Genomics Workbench 5.5.1

Release date: 2012-09-05

Improvements

  • Improved accuracy of read mapping

Bug fixes

  • Important: In Genomics Workbench 5.5, the Process Tagged Sequences tool would sometimes switch the sample names of the results. We strongly recommend that everyone update to the new version and re-run all analyses made with this tool in Genomics Workbench 5.5.
  • Fixed: Various read mapper bugs that made the read mapper crash on certain data sets
  • Fixed: Workflows would fail when intermediate results were empty (e.g. if no variants were found and a variant track was used for subsequent analysis).
  • Fixed: Consensus generation when creating standard read mappings was slow in Genomics Workbench 5.5
  • Fixed: Some IonTorrent sff files would fail to import on Windows.
  • Various bug fixes


CLC Genomics Workbench 5.5

Release date: 2012-08-10

New features

Resequencing tools

  • New variant caller: Probabilistic variant detection.
    • This is based on a probabilistic model in contrast to the quality-based variant caller that is based on quality analysis and cut-offs.
    • Supports genomes with a ploidy of 1, 2, 3 or 4.
    • Pre-filtering for non-specific matches and intact pairs
    • Post-filtering of homopolymer regions and forward/reverse reads balance
  • The current SNP and DIP detection tools are merged into one: Quality-based Variant Detection
    • Pre-filtering for non-specific matches and intact pairs
    • Post-filtering of homopolymer regions and forward/reverse reads balance
  • Target regions statistics (previously a plug-in) is now integrated into the Workbench
    • A new parameter: Minimum coverage that will report the fraction of each region that is covered by at least this number of reads
    • Works on tracks: the regions of interest are defined in a track and the resulting per-region table is reported as a track
  • Annotation and filtering tools for variants
    • Annotate and filter against database variants (dbSNP, 1000 genomes or other databases that can be downloaded or imported)
    • Filtering of marginal variant calls based on average base quality, forward/reverse reads balance and frequency
    • Annotating variants with exon numbers
  • Variant comparison
    • Compare variants within group: Find variants that are shared between a number of samples
    • Fisher exact test: Compare variants between case and control groups to find variants that are more common in the case than in the control
    • Trio analysis: Compare child-father-mother variants to enable studies of inherited and de novo mutations
    • Filter against control reads: Compare a variant track against a control sample to remove variants that are also present in the control
    • Filter on haplotype comparison: Identifies variants that have the same haplotype in two samples.
  • Functional consequences of variants
    • GO enrichment analysis. This tool can be used to investigate the effect of candidate variants by analyzing the affected genes for a common functional role.
    • Amino acid changes: Classify synonymous and non-synonymous variants and see the effect on the protein.
    • Annotate with conservation scores: Annotate a variant with a score from conservation tracks that can be imported into the Workbench.
    • Predict splice site effect: A simple investigation to see if the variant is within two bases of an intron-exon boundary
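
The splice site check in the last item above can be pictured as a simple distance test. The following is a minimal sketch, not the Workbench's implementation: the function name, the 1-based inclusive exon coordinates and the way distance is measured are all assumptions made for illustration.

```python
def near_splice_site(variant_pos, exons, window=2):
    """Return True if variant_pos lies within `window` bases of an
    intron-exon boundary.

    exons: list of (start, end) tuples in 1-based inclusive coordinates
    (an assumed convention). The outer ends of the first and last exon
    are skipped because they do not border an intron.
    """
    exons = sorted(exons)
    boundaries = []
    for i, (start, end) in enumerate(exons):
        if i > 0:
            boundaries.append(start)  # intron -> exon boundary
        if i < len(exons) - 1:
            boundaries.append(end)    # exon -> intron boundary
    return any(abs(variant_pos - b) <= window for b in boundaries)


exons = [(100, 200), (300, 400)]
print(near_splice_site(201, exons))  # True: one base past the end of exon 1
print(near_splice_site(250, exons))  # False: deep inside the intron
```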

Download of reference genome and annotations

  • Integrated download of reference genome sequences and annotations for selected organisms
  • Example: for human hg19, you can directly download sequences, genes and transcripts, variants from 1000 genomes, Hapmap, COSMIC, and dbSNP (incl. common SNPs).

Tracks

  • Genomic information for re-sequencing analysis can now be stored as tracks.
  • Great power for comparison and visualization because different kinds of data (reads, variants, genes etc) are not bundled into one static file but are separated into one file per data type. This means that different data sources can be compared and visualized in a flexible way.
  • Track lists provide a mechanism for combining several de-coupled tracks into one list for visualization purposes while retaining the individual files that contain the data
  • All tools for re-sequencing have options to create and use tracks (e.g. read mapping, variant detection etc). More tools will be re-designed to work with tracks later.
  • Tools for converting between standard sequences and mappings and tracks:
    • Convert tracks to sequences, mappings etc
    • Convert sequences, mappings and annotations to tracks
  • Tools for filtering, annotating and merging tracks
  • Support for importing files as tracks from a number of new formats:
    • Fasta
    • VCF
    • BED
    • Wiggle
    • UCSC table format
    • GFF / GTF and GVF
    • Complete genomics master var files

Workflow

  • Workflows can be built in the Workbench to combine various tools from the Toolbox into one analysis, connecting the output from one tool to the input from another
  • Workflows can be distributed and installed either in the Workbench or in the CLC Genomics Server
  • The creator of the workflow can configure parameters for the workflow and these will be fixed when the workflow is distributed and installed
  • The creator of the workflow decides which outputs from the tools should be saved and which should be discarded
  • Workflows can be run in batch, making it a powerful tool for crunching high numbers of samples through the same pipeline.

New read mapper

  • Great improvement of speed for mapping (white paper to be released soon)
  • Support for complex genomes with many repeats
  • Re-design of wizard for read mapping to make it simpler and easier to use. Options to control consensus sequence building and annotating with conflict annotations have been removed, since they have very little relevance for the amounts of data created by NGS platforms today
  • Color space mapping is still performed with the old mapper
  • Automatic calculation of paired distance (only for base space data)
  • Report includes percentage of reads instead of only counts
  • Changed strategy for placement of gaps: previous versions tried to cluster gaps into as few units as possible. This sometimes caused problems for variant calling because the gaps could be placed differently from read to read.
  • Please note that the memory requirements are different than for the old mapper. The memory requirements depend largely on the size of the reference genome. We will soon update our system requirements page to reflect this.

Sequencing QC report: Create summary statistics for sequencing data in various ways:

  • General statistics on read length etc
  • Quality statistics on quality scores
  • Over-representation analysis of subsequences
  • Analysis of duplicated reads
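
As a rough illustration of the kind of summary statistics listed above (read lengths, quality scores and duplicated reads), here is a minimal sketch. It is not the Sequencing QC report's algorithm: the function name, the input layout and the definition of a duplicate (an exact sequence match with an earlier read) are assumptions, and over-representation analysis is left out.

```python
from collections import Counter
from statistics import mean


def qc_summary(reads):
    """Summarize a non-empty list of (sequence, qualities) pairs.

    qualities is a list of integer quality scores, one per base.
    A read counts as a duplicate if an identical sequence was seen before
    (an assumed definition).
    """
    lengths = [len(seq) for seq, _ in reads]
    all_quals = [q for _, quals in reads for q in quals]
    counts = Counter(seq for seq, _ in reads)
    duplicated = sum(n - 1 for n in counts.values() if n > 1)
    return {
        "reads": len(reads),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "mean_length": mean(lengths),
        "mean_quality": mean(all_quals),
        "duplicate_reads": duplicated,
    }


reads = [
    ("ACGT", [30, 30, 30, 30]),
    ("ACGT", [20, 20, 20, 20]),
    ("AC", [40, 40]),
]
print(qc_summary(reads))
```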

Re-organization of menus in general to be more genomics focused

  • Classical sequencing tools organized into a subfolder (for gene and protein analysis, alignments and trees etc)
  • Molecular biology tools like cloning, PCR primer design, Sanger sequencing analysis etc moved to a special folder
  • Two new folders for core NGS and core track tools
  • Application-specific folders for the various types of NGS applications: resequencing, de novo sequencing, transcriptomics and epigenomics
  • Search tools moved to the Download menu (available from the top menu and the Tool bar)
  • Different importers integrated into one menu, including the new track import. The Vector NTI import has been moved into a plug-in that can easily be installed from the plug-in manager.
  • The Local Search has been moved from the Search menu (now renamed to Download) and into the Edit menu

Special notes for customers already using the Genomics Gateway plug-in

  • New search tool in track list editor
  • New navigation and position panel at the top of the Side Panel in the track editor
  • Download tool for downloading genomic data replaces Ensembl download tool
  • Unlimited number of chromosomes in tracks
  • More streamlined conversion tools:
    • Convert tracks to sequences, mappings etc
    • Convert sequences, mappings and annotations to tracks
  • Export tracks to gff, sam
  • Print and graphics export of tracks
  • New tool for filtering marginal variant calls
  • New tool for annotating against database variants

Plugin updates

  • Genomics Gateway plug-in is integrated into the standard Genomics Workbench and Server.
  • Probabilistic variant detection plug-in is integrated into the standard Genomics Workbench and Server.
  • Sequencing QC plug-in is integrated into the standard Genomics Workbench and Server.
  • Target regions statistics plug-in is integrated into the standard Genomics Workbench and Server.
  • Grid integration plug-in is integrated into the general server plug-in. If a grid preset is present on the server, the Grid option becomes available in the Workbench dialog.
  • Old read mapper made available as a legacy plug-in that customers can download. This facilitates compatibility of results with previous versions and can be used when memory requirements for the new mapper are too large.
  • Beta read mapper is integrated into the standard Genomics Workbench and Server.
  • Biobase genome Trax is redesigned and split into two:
    • For downloading data (requires a download license)
    • For annotating a variant track (requires an online license)


CLC Genomics Workbench 5.1.5

Release date: 2012-04-11

Improvements

  • Ion Torrent paired protocols are now supported for both fastq and sff files. Read more...
  • MiSeq multiplexed data directly supported. This means that the barcoded samples are recognized on import and the reads are grouped accordingly. Reads from each sample are grouped into their own sequence list.
  • New broken pair mate locator tool for getting an overview of where the mates of broken pairs in a selected region are mapped. It includes the possibility to extract a sequence list with the broken pairs.
  • Aligned fasta import and export is now supported (see http://www.bioperl.org/wiki/FASTA_multiple_alignment_format). A consequence of this is that the standard fasta import of sequences will refuse to import sequences that contain gaps, assuming they should be imported as alignments instead.
  • User manual includes a section on which tools will benefit from computers with multiple cores.
  • The license order ID is visible in the License Manager, both for static and network licenses. For security reasons, the last 10 characters of the ID are masked. This will prevent unauthorized persons from copying the license order ID to another computer, but will allow the CLC staff to identify the license used.
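
The distinction between standard and aligned fasta import described above comes down to whether the sequences contain gap characters. The sketch below shows one way such a check could look. It is an illustration only: the function name is hypothetical, and treating "-" as the only gap character is an assumption rather than documented importer behavior.

```python
def parse_fasta_detect_gaps(fasta_text, gap_chars="-"):
    """Parse FASTA text and report whether any record contains gaps.

    Returns (records, has_gaps), where records maps each sequence name
    to its sequence. gap_chars defaults to "-" (an assumption).
    """
    records = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = ""
        elif name is not None:
            records[name] += line
    has_gaps = any(ch in seq for seq in records.values() for ch in gap_chars)
    return records, has_gaps


aligned = ">seq1\nAC-GT\n>seq2\nACGGT\n"
records, has_gaps = parse_fasta_detect_gaps(aligned)
print(has_gaps)  # True: seq1 contains a gap, so this looks like an alignment
```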

Bug fixes

  • Fixed: ChIP-Seq Analysis would sometimes yield no results when the FDR could not be estimated. This error was introduced with Genomics Workbench 5.0.1. If you have had ChIP-Seq samples where no peaks were reported, we recommend re-running the analysis with the new version.
  • Fixed: Cloning bug: when performing restriction cloning in regions with single-stranded DNA, you would get an error.
  • Fixed: 454 paired data import: quality scores on the second part of the read were not imported.


CLC Genomics Workbench 5.0.1

Release date: 2012-02-23

Plug-in updates

Probabilistic Variant Detection Plug-in updated

  • There is a new filter that requires sequencing reads from both strands to call a variant
  • The forward and reverse coverage for each allele is reported in the output

Bug fixes

  • Fixed: Downloading of protein sequences from NCBI fails.
  • Fixed: Calculation of cDNA-level changes in variant detection fails in some situations.
  • Fixed: Trimming tool in Sequencing Data Analysis (not for High-throughput Sequencing data) does not add annotation to sequences when the full sequence should be discarded.
  • Fixed: Opening external files (e.g. pdf files or Word documents) with spaces in the file name does not work on Windows.


CLC Genomics Workbench 5.0

Release date: 2012-02-13

New features

  • New de novo assembler.
    • Scaffolding is integrated into the assembly. This means better resolution of contigs and insertion of Ns when two contigs cannot be joined in sequence but there is pair information that connects them.
    • New extended report for the assembly with information about nucleotide distribution, contig length measurements and scaffolding regions.
    • User interface improvements: Wizard re-designed to better reflect the process of the assembly. The parameters related to the mapping step are only available when the user chooses to map the reads back to the contigs.
    • New parameter for specifying the maximum bubble size. There is a default value which is automatically calculated based on the input data.
    • New white paper with benchmarks and results from quality control.
    • The old de novo assembler is available as a plug-in. At the end of 2012, the plug-in will be discontinued, so it should only be used for backwards compatibility with results from older runs or if the new assembler fails.
  • Printing and pdf export of read mappings: the mappings are now wrapped to make better use of the paper.
  • SNP and DIP detection results include cDNA-level numbering and variant information compatible with www.hgvs.org/
  • SAM files exported from the Workbench now include basic information about read groups. Furthermore, read orientation for paired reads is now preserved when exporting to SAM and BAM files.
  • Improved exploitation of multi-core machines in read-mapping, RNA-Seq, and de-novo assembly.
  • Improved performance and memory management for high-throughput analyses in general.
  • Usability of the Close icon on tabs has been improved, both in terms of responsiveness and by making it impossible to accidentally initiate a drag and drop movement when clicking the Close icon to close a tab.
  • "Show" submenu has been removed from File Menu, and the right-click menu now includes only the relevant views and editors. This provides a better overview.
  • The behavior of the Close Other Tabs function has changed so that it will close all views, regardless of the way the view area is split.
  • The most common annotation types are assigned a special color per default. Other annotation types previously got the same color. This has been extended so that the Workbench attempts to find a special color for each type.
  • VectorNTI import is no longer in a separate plug-in but part of the Workbench. The functionality remains unchanged.

Plugin updates

  • Genomics Gateway plug-in updated
    • New tools for analyzing variants in groups of samples, enabling systematic analysis of genetic variants for whole genome, exome or targeted approaches.
      • Find Common Variations in Group. This can be used to find common variants in a group of variant tracks.
      • Fisher Exact Test. Compares two groups of variant tracks (e.g. for case-control studies). Using the Fisher Exact test, you can see which variants are more common in the case group than in the control group.
      • Filter against Control Reads. This can be used to compare a single case variant track against a negative control from the same sample. It will check whether a certain number of the reads in the control sample have the same allele present as in the case variant.
    • New tools for functional annotation of variants
      • GO Enrichment Analysis for identifying significant gene ontology terms, which are annotated to genes having at least one variation.
      • Annotation with Conservation Scores. By importing a conservation score track (e.g. PhyloP Scores), variants can be annotated with a conservation score. Variants with a high score are assumed to alter functionally important regions.
    • New data structure.
      • All tracks are now saved as single files, and you can create a Track List to visualize them together.
      • A tool is available for data conversion from track sets to single tracks
    • New organization of the "Tool box" to provide a better overview
    • Support for batching and running tools on a Genomics Server
    • The Track List view supports drag and drop for adding and re-arranging tracks
    • Several Graph tracks can be created and displayed
  • Probabilistic Variant Detection Plug-in updated
    • The probability used as threshold for the algorithm is now reported in the output
    • Variants are reported with cDNA-level numbering and variant information compatible with www.hgvs.org/
  • Additional Alignments Plug-in updated
    • The algorithms have been updated to the most recent versions
    • The list of algorithms has been reduced to two for compatibility reasons
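
For readers unfamiliar with the Fisher Exact Test mentioned for the Genomics Gateway plug-in above, the sketch below computes a one-sided p-value for a single variant from a 2x2 table of case and control samples. It illustrates the standard test only. Whether the plug-in uses a one-sided or two-sided test, and how it counts samples, is not stated in these notes, so those choices are assumptions.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact test p-value for the 2x2 table

                     with variant   without variant
        case              a               b
        control           c               d

    Returns the probability of seeing `a` or more case samples with the
    variant if the variant were unrelated to case/control status.
    """
    row1 = a + b          # number of case samples
    col1 = a + c          # number of samples carrying the variant
    n = a + b + c + d
    denom = comb(n, row1)
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        p += comb(col1, k) * comb(n - col1, row1 - k) / denom
    return p


# 3 of 4 cases and 1 of 4 controls carry the variant
print(fisher_exact_greater(3, 1, 1, 3))  # 17/70, roughly 0.243
```

A small p-value suggests the variant is more common in the case group than would be expected by chance. With only eight samples, as in the usage line, even a 3-to-1 split is not significant.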


CLC Genomics Workbench 4.9

Release date: 2011-11-29

New plug-ins and plug-in updates

  • New plug-in released: Ab Initio Transcript Discovery. Brand new tool for transcript discovery. Based on gapped alignments of RNA-Seq data, the plug-in identifies new transcripts and creates or extends annotations on the reference sequence that can be used for measuring gene expression using the RNA-Seq Analysis tool of the Genomics Workbench. The plug-in provides functionality a la Cufflinks/TopHat. Note that this used to be called the Large Gap Mapper plug-in.
  • Genomics Gateway plug-in updated
    • New refiner: variant frequency. This allows you to filter a variation track, so that only the variants that have a frequency above a user-defined threshold remain. Note that the filter only applies to the frequency of non-reference alleles.
    • Performance improvements when visualizing read tracks
    • Fixed: CDS annotations from Ensembl did not include start codons
    • Fixed: Some variation tracks were not always recognized as variations. This means that the variation-specific refiners could not be used.
    • Fixed: Table view of annotation tracks could have a very large number of columns that are now combined into one column.
    • Fixed: There was an error when closing a view without saving changes. This could lead to subsequent errors when trying to rename tracks.
  • MLST module updated
    • Possible to download MLST schemes from any web site compatible with mlstDBnet
    • When a new allele is called because the sequencing reads are not long enough, this is reported in the isolate view rather than “New allele”
  • Structural variation plug-in updated
    • Only detection of insertions, deletions and interchromosomal variations is now supported.
    • The plug-in has a problem with repeats. The best way to work around this is to ignore non-specific matches when doing the mapping, to run the structural variant detection with a very stringent p-value cutoff and filter repeats out afterwards if possible (this could be by refinement with the microsatellite track from Biobase or another repeat track using the Genomics Gateway).
    • Integration of exporter to export results in circos format.
See a list of all plug-ins here.
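
The variant frequency refiner listed under the Genomics Gateway plug-in above can be thought of as a threshold filter. The sketch below shows one possible reading of it, in which entries for the reference allele pass through untouched and only non-reference alleles are tested against the threshold. The dictionary keys and this interpretation are assumptions, not documented plug-in behavior.

```python
def filter_by_frequency(variants, threshold):
    """Keep variants whose frequency is above `threshold`.

    variants: list of dicts with hypothetical keys "reference", "allele"
    and "frequency" (in percent). Entries where the allele equals the
    reference are kept regardless of frequency (an assumed interpretation).
    """
    kept = []
    for v in variants:
        is_reference = v["allele"] == v["reference"]
        if is_reference or v["frequency"] > threshold:
            kept.append(v)
    return kept


track = [
    {"position": 10, "reference": "A", "allele": "G", "frequency": 45.0},
    {"position": 10, "reference": "A", "allele": "A", "frequency": 55.0},
    {"position": 25, "reference": "C", "allele": "T", "frequency": 3.0},
]
print(len(filter_by_frequency(track, threshold=5.0)))  # 2
```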

New and improved features

  • Process tagged sequences
    • A summary report is now available with an overview of the number of reads per bar code.
    • You can search for barcodes (MIDs) on both strands, supporting new 454 protocol.
  • Core management: you can restrict the maximum number of cores that the Workbench is allowed to use. This can be useful when the Workbench is running on a system with shared resources where other applications need reliable access to CPU when the Workbench is doing analyses. This is mainly an issue for the De novo assembly and Read Mapping algorithms but the restriction applies to all algorithms that use several cores.
  • Multi-site Gateway Cloning. You can perform multi-site gateway cloning and in a few clicks create your expression clones with multiple fragments. The existing Gateway Cloning tool has been expanded so that you can easily recombine several fragments as well as continue using it for the standard Gateway Cloning.
  • Find Binding Sites and Create Fragments improved:
    • If your template sequence contains ambiguity nucleotides (like N, Y etc), these will no longer count as mismatches when checking your primers. Note that the primer base of course needs to be covered by the ambiguity symbol (e.g. a T would still be a mismatch if the template sequence has an R, which means either A or G).
    • Fixed: When using multiple template sequences, the choices to open or annotate a fragment from the fragment table did not work properly. They always applied to the first sequence although the fragment was located on another sequence (as indicated in the table).
  • Exporting fastq format no longer includes redundant name of the read in the quality score line. Now the name only appears once per read.
  • Enhancing the nomenclature of reporting amino acid changes in variant detection:
    • p. prefix included
    • ? used for unknown (rather than non-standard “Unknown”)
    • = used to denote an allele which agrees with the reference sequence (rather than missing entries or entries like Ala45Ala)
    • [...] used around ,-separated lists of changes, each change coming from a different CDS annotation
    • [...];[...] scheme used to separate multiple alleles at same site
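
The ambiguity handling described for Find Binding Sites and Create Fragments above follows the standard IUPAC nucleotide codes: a primer base matches a template symbol if it is one of the bases that symbol stands for. The sketch below illustrates that rule. It is a simplification for illustration, not the tool's implementation: the function name is hypothetical, and the primer is compared directly against a template site of equal length in the same orientation.

```python
# Standard IUPAC nucleotide ambiguity codes and the bases each one covers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, template_site):
    """Count primer bases not covered by the template symbol at that position."""
    if len(primer) != len(template_site):
        raise ValueError("primer and template site must have the same length")
    return sum(
        1
        for p, t in zip(primer.upper(), template_site.upper())
        if p not in IUPAC.get(t, "")
    )


print(count_mismatches("T", "R"))  # 1: R means A or G, so T is a mismatch
print(count_mismatches("A", "R"))  # 0: A is covered by R
```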

Bug fixes

  • Fixed: Import of SOLiD data failed when multiple sets of paired data were selected.
  • Fixed: Annotations spanning the sequence from start to end were not displayed correctly when the sequence was wrapped. The annotation was only displayed on the first line.
  • Fixed: Set-up experiment would crash when using many samples.
  • Fixed: Calculation of consensus sequence in read mappings: Sometimes a majority of gaps would be ignored and a base erroneously introduced in the consensus sequence. This occurred when 1) there was no coverage in an initial segment of the reference sequence, and 2) a gap was encountered in the global read alignment. From that point onwards, gap counts were included in the consensus vote, but they were taken from the start of the mapping (where they are all 0), so they were out of sync with the associated base counts. High gap counts would then kick in further downstream, possibly making the consensus a gap where it should not be. We recommend checking your mapping results manually if you rely on using the consensus sequence for further analysis.
  • Fixed: importing adapters for trimming and barcodes for de-multiplexing did not work properly for CSV files and empty rows in Excel files were not allowed.
  • Fixed: Motif search did not exclude regions with Ns when the option “Exclude matches in N-regions for simple motifs” was selected.


CLC Genomics Workbench 4.8

Release date: 2011-10-12

New plug-ins and plug-in updates

New and improved features

  • De novo assembly improvements:
    • Word size can now be manually adjusted
    • When update contigs is not selected, the resulting mapping table will also include contigs where no reads map back. This means that the number of rows in the table will be identical to the number of “Simple contigs” produced by the de novo assembler. Previously contigs with fewer than two reads mapped back would be omitted from the table.
  • Merge Mapping Results will produce a mapping table when mapping tables are provided as input
  • New button to extract a subset of mappings from a mapping table
  • Mapping tables now include a row for reference sequences where no reads map. This is done to provide consistency of results. Opening such an entry in the table will just open the reference sequence in the table.
  • You can switch between compactness levels by pressing the Alt key while scrolling with your mouse or touchpad.
  • SNP detection no longer ignores ambiguity bases in the reads. Each ambiguity code is treated as a separate variant; no merging of the possible variants covered by each ambiguity code is attempted (this typically only has an effect when using Sanger sequencing data since standard NGS platforms do not use ambiguity base calls).
  • Translation in the Side Panel of nucleotide sequences now includes options to translate “All forward” or “All reverse” reading frames.
  • Conflict table view of read mappings: reference positions also reported in addition to the consensus sequence position.
  • Alignments: it is now possible to copy all annotations from one sequence to other sequences in the alignment.
  • Cloning editor: number of restriction cut sites and motifs are shown separately for the sequence currently displayed and for all sequences in the list.
  • Restriction enzymes updated with latest REBASE version.
  • Clean-up of the Workbench window so that it no longer holds information about which Workspace is active. This information is now displayed with check boxes in the Workspace menu.
  • SAM import and export format is now described in detail in the user manual.

Bug fixes

  • Fixed: Orientation of SOLiD mate-pair data was not set correctly on import. This meant that the reads were marked as broken pairs after mapping. We strongly recommend all users to re-run the import if using SOLiD mate pair data.
  • Fixed: Virtual tag lists created with RNA failed
  • Fixed: For circular molecules, the Find Open Reading Frames tool did not find reading frames on the negative strand. We recommend users to rerun any reading frame analyses on circular molecules.
  • Fixed: Experiments tables can now be exported in Excel and csv formats
  • Fixed: BLAST searches at NCBI always searched nr or nt, regardless of which database was specified. This has been a problem since the release of CLC Genomics Workbench 4.7
  • Fixed: If a combination of trim options is used, like quality trim or length trim in addition to adapter trimming on both strands, the reads could end up reverse complemented.
  • Fixed: Import of paired data generated by Illumina Casava 1.8 did not match the pairs correctly. Users are advised to re-import and re-analyze all data imported from Casava 1.8.
  • Fixed: Pattern discovery wizard failed when the tool was run for the second time.
  • Fixed: De novo assembly sometimes failed on Mac OS 10.7 Lion.
  • Fixed: Errors for read mappings with the text “premature end of .cas file” have now been fixed. This has only been a problem on Windows.
  • Fixed: Certain annotation types were mapped to generic annotation types when exporting sequences in Genbank format.


CLC Genomics Workbench 4.7.2

Release date: 2011-07-14

Bug fixes

  • Fixed: A cache-related bug which would sometimes result in errors when running large jobs.
  • Fixed: The UniProt search has been updated to reflect URL-changes at uniprot.org.
  • Fixed: A problem with interpretation of broken pairs on re-import from SAM format files.
  • Fixed: A problem with microarray experiments where large experiments could not be analyzed.


CLC Genomics Workbench 4.7.1

Release date: 2011-06-27

Bug fixes

  • De novo assembly produced empty results
  • Paired distances for read mapping were not recorded correctly in history
  • Read mapping in batch: the minimum and maximum paired distance fields were enabled even though the “Override” checkbox was unchecked
  • Improved performance of packed view rendering
  • Various minor bug-fixes


CLC Genomics Workbench 4.7

Release date: 2011-06-21

New and improved features

  • Mapping
  • New plugin to detect Structural variation
    • Action to detect structural genomic variation from paired read information
    • Action to detect copy-number variation (CNV) from coverage information
  • New and more flexible data structures to store information about paired data
  • All history entries will from now on include the version number of the software
  • Previous limit at 2 billion for the maximum number of reads in one analysis has been removed.
  • Reporting of amino acid changes in SNP and DIP detection now follows recommended nomenclature more closely w.r.t. changes that affect start codons and changes that cause indels at the amino acid level.
  • Performance of Excel 2010 exporter improved in terms of speed and memory requirements
  • When using a license server, the Workbench user can now specify a custom user name which can be checked in the license server configuration. This makes it possible to get more fine-grained control of the users of the license server.
  • Export of trace data in scf format.
  • BLAST
    • BLAST parameters have been changed so that number of threads is 1 per default (there is a bug in the BLAST code provided by NCBI which makes it fail on certain data sets when using multiple threads)
    • The “Mask lower case” option has been removed
    • Tool to download BLAST databases from NCBI within the Workbench
    • The BLAST Database Manager has been improved to show when referred databases are missing

Bug fixes

  • Fixed: References between SNP tables and mapping results were broken when exporting-importing data.
  • Fixed: Summary mapping report did not mention customized mapping parameters when running in batch mode
  • Fixed: Various SAM/BAM import and export errors
  • Fixed: When running adapter trimming searching on both strands, some reads were marked as “reversed” in the result. The only consequence is incorrect reporting of the number of forward and reverse reads in the mapping results.
  • Fixed: Mapping report did not calculate read length for individual reads when using paired data
  • Fixed: Alignment-based primer design failed for columns with many gaps
  • Fixed: “Find Binding Sites and Create Fragments” did not find binding sites where the primer extended the 5′ end of the template sequence
  • Fixed: DIP detection would crash on large data sets
  • Fixed: Import of certain special Genbank files failed
  • Fixed: RNA-Seq report with paired data: total number of reads was counted as pairs rather than individual reads
  • Changed: RNA-seq transcript variants are named using ‘.’ rather than ‘_’. Note that it is not possible to create transcript-level experiments based on old and new samples
  • Changed: Label of Illumina quality score selector has been changed to reflect the 1.8 update of the Illumina pipeline which now uses quality scores on the Phred scale
  • Various minor bug fixes
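
For context on the Phred scale mentioned in the quality score change above: a Phred score Q corresponds to an error probability P = 10^(-Q/10), and fastq files store each score as a single character. The sketch below decodes such a string. The ASCII offset of 33 is the encoding used by Illumina pipeline 1.8 and later, but it is given here as an assumption about the input file and is not taken from these notes. The function names are hypothetical.

```python
def decode_quality_string(qual, offset=33):
    """Convert a fastq quality string to a list of Phred scores.

    offset=33 assumes the Sanger-style encoding; older Illumina files
    used a different offset.
    """
    return [ord(ch) - offset for ch in qual]


def phred_to_error_probability(q):
    """Error probability for a Phred score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


print(decode_quality_string("I5!"))  # [40, 20, 0]
```

A score of 20 therefore means a 1 in 100 chance that the base call is wrong, and a score of 30 means 1 in 1000.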


CLC Genomics Workbench 4.6.1

Release date: 2011-04-06

Bug fixes

  • RNA-Seq would crash when selecting prokaryote as organism type


CLC Genomics Workbench 4.6

Release date: 2011-04-05

New and improved features

  • Import of Ion Torrent data. A special importer has been made for Ion Torrent data in fastq or sff format. Read more.
  • Checkbox for reporting merged SNPs. Read more.
  • An adapter trim setting for SOLiD 50bp small RNA reads has been added. Read more about adapter trimming.
  • SNP detection: When minimum paired coverage is set, reads from broken pairs will be completely ignored. Read more.
  • DIP detection: Reporting of amino acid changes better reflects nomenclature consensus.
  • RNA-Seq: the transcript-level sample includes a column for the ratio of unique to total transcript reads. Note that this means that results generated with this version cannot be used in older versions. Read more.
  • Better support for color space SAM/BAM files.
  • Local BLAST is faster when blasting against small databases
  • Export in color space fastq format. When data is marked as color space, exporting in fastq format will produce a file with color encoding rather than bases.
  • The plug-in used by the Workbench can now be installed using the Download Plug-ins tab in the Plug-in Manager.
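
As an illustration of what "color encoding rather than bases" means for the color space fastq export above, here is a minimal sketch of the standard SOLiD di-base encoding, in which each color describes the transition between two adjacent bases. The function name to_color_space and the default primer base "T" are assumptions made for this example; the release note does not describe how the Workbench writes the file.

```python
# Two-bit codes for the bases; the color of a base pair is the XOR of the codes.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_color_space(bases: str, primer_base: str = "T") -> str:
    """Encode a base-space read as a color-space string (hypothetical helper).

    The result starts with the primer base, followed by one color (0-3)
    per base in the read.
    """
    colors = []
    previous = primer_base
    for base in bases.upper():
        colors.append(str(BASE_CODE[previous] ^ BASE_CODE[base]))
        previous = base
    return primer_base + "".join(colors)


print(to_color_space("ATGGC"))  # T33103
```

Identical neighboring bases always give color 0, which is why a single color error changes every downstream base when a read is decoded.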

Bug fixes

  • Fixed: In certain situations, the data-specific parameters in read mapping did not take effect
  • Fixed: When running read mapping in batch, the data-specific parameters were only applied to the first data set in the batch
  • Fixed: Local BLAST did not work on Mac OS 10.5
  • Fixed: Read mapping and RNA-Seq crashed because the reference could not be found.
  • Fixed: Joined annotations did not get the right offset when inserting a sequence in the cloning editor.
  • Fixed: Import of csfasta paired data crashed when one read had a dot in the beginning of the sequence.
  • Fixed: Read pairs were not joined correctly when importing paired qseq files
  • Fixed: Import of GO annotation files did not work
  • Fixed: When processing tagged paired data sets, the resulting files were not marked as paired, so subsequent analyses did not make use of the paired information.
  • Various minor bug fixes


CLC Genomics Workbench 4.5.1

Release date: 2011-02-18

Bug fixes

  • Fixed issue with synonymous overhang characters in cloning editor
  • Graphics export now works with restriction sites shown
  • Gene Ontology Annotations can now be imported
  • ChIP-seq analysis adjusted for the use of the gapped aligner. ChIP-seq analyses performed with the previous version should be redone
  • Improved support for Mac OS X systems with Japanese language
  • Improved support for systems with 512 MB of memory or less
  • Blast: Fixed issue with BLAST database creation taking too long under certain circumstances
  • Blast: Fixed issue concerning errors when input sequence names contained illegal characters
  • Blast: Fixed issue with Extract And Open option being erroneously disabled
  • Blast: Option to enter custom Entrez Query limits in Blast at NCBI re-introduced.
  • Blast: Improved speed when using Blast results as input to wizards.


CLC Genomics Workbench 4.5

Release date: 2011-01-26

New features:

  • Batching functionality of all high-throughput sequencing tools. It is now possible to start batch runs, e.g. running 12 samples through RNA-Seq Analysis in one go. Read more.
  • RNA-seq: transcript-level expression values and support for paired data
    • Included option to use paired information in RNA-seq. Read more.
    • Expression values can now be stratified into transcript level expression values, both for single and paired reads. Read more.
    • SOLiD data: new algorithm for mapping reads allows much higher fraction of reads to be mapped. Rather than a score limit, you now specify the stringency of the mapping using length and similarity fractions. Read more.
    • Similarity fraction for mapping of long reads is now available as a user-specified option (this was previously automatically set). Read more.
    • Simple reporting of putative gene fusions when using paired data. Read more.
    • Note about compatibility: Results from earlier versions should not be compared with results from this version.
  • BLAST tools have been redesigned
    • New Blast Database manager for easy administration and management of local BLAST databases. Read more.
    • More user-friendly way of creating and accessing local BLAST databases. Read more.
    • Much more stable design of both BLAST at NCBI and Local BLAST when running large data sets.
    • The SNP Annotation using BLAST tool has been discontinued.
    • See migration notes for using your old databases here.
  • SOLiD data: new algorithm for mapping reads allows much higher fraction of the reads to be mapped.
    • Rather than a score limit, you now specify the stringency of the mapping using length and similarity fractions. Read more.
    • Note about compatibility: Results from earlier versions should not be compared with results from this version.
  • Multiplexing: Process tagged sequencing data
    • It is now possible to import and use a file with bar codes and sample names. This makes it easier to process data with a high number of multiplexed samples. Read more.
    • You can specify separate output folders for each sample, making it convenient to batch process the subsequent analyses.
  • High-throughput Sequencing Import includes an option to place data into sub-folders (useful for batching subsequent analyses)
  • Cloning tool re-designed to make it easier and faster to perform restriction cloning. Read more.
    • Restriction sites used to select target vector and fragment. Read more.
    • Sequences can now be displayed in circular mode in the cloning editor. Read more.
    • Only one sequence displayed at a time (there is a list at the top of the view to switch between sequences). Read more.
    • Option to clone several fragments and adjust overhangs and orientation in one dialog. Read more.
    • New cloning tutorial available for a quick introduction. Read more.
  • Improved layout of restriction site annotations
    • Linear view: There is a new option for displaying labels as “Stacked” which means that the labels of overlapping cut sites can be discriminated. Read more.
    • Circular view: There is a “Radial” option that will place restriction sites (and annotations) as close to the sequence as possible with a radial layout. Read more.
  • Improved layout of general annotations
    • Linear view: There is an option to separate restriction sites and annotations in separate layers.
    • Circular view: There is a “Radial” option that will place annotations (and restriction sites) as close to the sequence as possible with a radial layout.
  • Motif search available in Side Panel
    • Dynamic annotations will be added for motifs defined in the Side Panel (similar to restriction sites). Read more.
    • Use motif lists to add your own motifs to the Side Panel.
  • Annotation table now available for sequence lists, mappings, mapping tables, BLAST results and alignments. Read more.
  • SNP detection reports adjacent SNPs within the same codon as one SNP. Read more.
  • De novo assembly: post-processing options when mapping reads back to contig sequences have been expanded. It is now possible to preserve the original contig sequences from the assembler (they used to be replaced by the consensus sequence from the mapping). Read more.
  • Support for exporting tables as tab-delimited files.
  • Audit option: manual editing of sequences will be recorded with an annotation on the sequence (this has to be switched on in the Preferences dialog). Read more.
  • The default database of restriction enzymes can be expanded (requires manual edit of database file). Read more.
  • The default set of codon frequencies can be expanded (requires manual edit of table files). Read more.
  • Improved option to export and import Side Panel settings. Read more.
  • Memory allocation: the default memory allocation for the Workbench changes from 75% to 50% of available physical memory with a maximum at 50 GB.
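
The memory allocation rule in the last item above can be sketched as a simple calculation. This is only an illustration of the stated rule (50% of physical memory, capped at 50 GB); the function name is hypothetical, and how the Workbench determines "available physical memory" is not described in the release note.

```python
def default_memory_allocation_gb(physical_memory_gb: float) -> float:
    """Return the default allocation implied by the release note (hypothetical helper)."""
    fraction = 0.5   # share of physical memory, from the release note
    cap_gb = 50.0    # maximum allocation, from the release note
    return min(physical_memory_gb * fraction, cap_gb)


for ram_gb in (8, 64, 256):
    allocated = default_memory_allocation_gb(ram_gb)
    print(f"{ram_gb} GB physical memory -> {allocated} GB allocated")
```

Under this rule, the cap only takes effect on machines with more than 100 GB of physical memory.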

Bug fixes

  • SNP detection bug with corrupt complementary CDS annotations.
  • SNP detection: color correction errors now count when filtering SNPs (this has become important with the new mapping algorithm for SOLiD data).
  • The molecular weight calculation for the sequence statistics report is more accurate and is now reported for both single- and double-stranded molecules.
  • Various bug-fixes


CLC Genomics Workbench 4.0.3

Release date: 2010-10-28

Improvements:

  • Enhanced usability of GSEA analysis wizard: The “Remove duplicates” option is now a check box to switch on and off. Before, the choice of switching off was implicit by choosing Feature ID as the identifier. Now this is explicit using a check box.
  • Improved performance rendering large tables, particularly those with html formatting.

Bug fixes

  • SNP and DIP detection previously ignored overlapping pairs. Now they count (as one read) if they fulfill the quality criteria (SNP detection). In cases where the two parts of the pair disagree, the pair does not count. We recommend running all SNP and DIP detections based on overlapping pairs data sets again (this would be the case if the minimum distance when mapping the reads is lower than two times the read length). There is no need to re-run mappings – just the SNP/DIP detection.
  • ChIP-Seq: the “nearest gene” reported was not always correct. This was the case for the last peak on each chromosome and also in cases where the order of the gene annotations in the reference file did not correspond to the order of the annotations on the actual sequence. We recommend running all ChIP-Seq Analyses again to get the correct reporting of nearest genes. There is no need to re-run the mappings.
  • SNP Annotation Using BLAST failed with certain query sequences (the result could not be shown)
  • Fixed crash of 454 import on certain Linux and Mac systems
  • SOLiD import accepts read names with -P2 at the end
  • Improved import of SAM/BAM files:
    • Better support for files from SOLiD Bioscope
    • Preliminary support for Complete Genomics files (The actual alignment is not represented completely: insertions that relate to a consensus sequence will be represented as unaligned ends in the imported mappings. This should be taken into account when looking for variations.)
  • In the Sequencing Data Analysis-> Assemble Sequences to Reference, the conflict resolution was disabled when not including a reference sequence in the output.
  • When importing sequences from Genbank files, mRNA annotations now prefer taking name after “locus_tag” rather than “product”.
  • Various minor bug fixes
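
To decide whether a data set falls under the recommendation to re-run SNP and DIP detection, the condition given in the first item above (minimum distance lower than two times the read length) can be checked as follows. This is a minimal sketch; the function name is hypothetical, and it assumes the minimum distance and the read length are given in the same unit (bases).

```python
def pairs_may_overlap(min_distance: int, read_length: int) -> bool:
    """Return True if paired reads can overlap under the rule in the release note.

    The note states that overlapping pairs occur when the minimum distance
    used for mapping is lower than two times the read length.
    """
    return min_distance < 2 * read_length


# Example: 100 bp reads mapped with a minimum distance of 150 can overlap,
# so SNP/DIP detection on that mapping should be run again.
print(pairs_may_overlap(min_distance=150, read_length=100))
print(pairs_may_overlap(min_distance=250, read_length=100))
```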


CLC Genomics Workbench 4.0.2

Release date: 2010-08-19

Bug fixes:

  • Fixed error when importing 454 SFF files
  • Fixed error when importing SOLiD data with quality scores when the reads had “.”
  • Fixed error mapping large data sets on Windows 64-bit systems
  • Fixed error when opening tables generated by the Transfac plug-in and the primer search tool
  • Fixed errors when running analyses on experiments generated from RNA-Seq results
  • Genbank export: annotations on the negative strand were not in the right order
  • Fixed memory and performance issues related to import of many sequences, e.g. from ACE files.
  • Various minor bug fixes


CLC Genomics Workbench 4.0.1

Release date: 2010-08-10

New features:

  • Improved performance of table filtering. Removed limit on the number of rows that can be filtered.
  • Option to search for read names in mapping results (and also sequence lists and BLAST results).
  • Improved performance of conflict table.
  • Better layout of graphics export and printing of mapping results: reference and consensus sequence repeated to provide an orientation context on all pages.
  • Extracting consensus sequence of mapping tables is now running in the background to provide a better user experience.

Bug-fixes:

  • Problem regarding mapping of base-space data erroneously in color space. Under special circumstances, the user settings file contained the wrong default parameters and caused the mapping to be in color space rather than base space. We recommend running mappings performed in Genomics Workbench 4.0 again with Genomics Workbench 4.0.1.
  • Fixed problem with SNP detection on large data sets suddenly running very slowly.
  • Scalability improvements in mapping and de-novo assembly with drastic improvements in performance
  • Fixed various problems regarding editing alignment and read mappings.
  • In the detailed mapping report, the zero coverage section was empty when there was only one reference sequence.
  • Various smaller bug fixes.


CLC Genomics Workbench 4.0

Release date: 2010-06-15

New features:

  • Small RNA Analysis
    • Brand new tool for analyzing small RNA (including miRNA) data sets
    • Adapter trimming
    • Counting of tags
    • Annotation using miRBase and other resources
    • Visualization of miRNA variants
    • Expression analysis
  • Renaming and redefining concepts
    • Reference assembly -> Read mapping. We adjust to the common term used today for aligning sequencing reads to a reference sequence.
    • Contig -> Read mapping. The result of read mapping was previously called a contig (i.e. the alignment of reads to a reference sequence). Now, the term “contig” is used exclusively for results from de novo assembly. The result of mapping reads is called a “read mapping”.
    • Paired-end -> Paired. We now distinguish during import between Paired-end and Mate pair data. Once imported, there is no difference, and they are both called “Paired”.
  • Trim redesign
    • Brand new adapter trimming including library of adapters
    • Performance improved
    • Multiple data sets supported as input
    • Summary report of the trimming
  • Improved SAM/BAM import
    • BAM format now supported, both import and export
    • More robust implementation
    • Better performance
    • Preview panel making it easier to match reference and SAM/BAM file
    • Reference sequence name spaces automatically converted to underscores when comparing with SAM/BAM file
  • High-throughput Sequence Data Import
    • Gzip support
    • SOLiD fastq format supported (when downloading SOLiD data from Sequence Read Archive, SRA). Read more
    • 454 paired data: Support for both FLX and Titanium linkers (also the possibility to add custom non-palindromic linkers). Read more
    • Improved support for SOLiD paired-end data. Read more
    • Support for data from Illumina Pipeline 1.5. Read more
    • Import of tabular alignment files: it is now possible to specify a read name from the file to be imported with the read. Read more
  • Better compression of reference sequences (lower memory footprint and disk space usage)
  • Performance improvement of read mapping algorithm
  • Improved memory management in general: lower memory footprint and shorter management overhead pauses.
  • Improved memory handling of large tabular data sets.
  • RNA-Seq:
    • Directional RNA-Seq. Read more
    • Exon-intron reads are now counted under Total exon reads. When comparing new and old samples, please re-run the analysis on the old samples to ensure consistency. Read more.
  • New de novo assembler has replaced the old one, making the de novo assembly plug-in obsolete. Read more
  • SNP and DIP detection
    • Dialog usability improved by adding an advanced panel for advanced users
    • Minimum counts have been made more clear by creating a Minimum and Sufficient count
  • Contig report has been renamed to Detailed Mapping Report and has been split up to support data with many reference sequences (e.g. when mapping against contigs from de novo assembly). Read more.
  • Redesign of product graphics
  • Improved consistency of data handling including faster listing of folder contents
  • Performance when saving small files significantly improved
  • Performance of ACE export improved, especially for long reference sequences or read mapping tables.
  • Sequence annotations are packed to lower memory footprint and disk space usage, especially for SNP, DIP, and Conflict annotations.
  • Improved performance of reading data files from shared drives.
  • REBASE collection of enzymes updated to latest version
  • BLAST: In the overview BLAST table, it is now possible to extract query sequences. Read more
  • Process tagged sequences: it is now possible to input barcodes as a comma-separated list. Read more
  • Folder structure (expanded/collapsed folders) is preserved through the life-time of a wizard (e.g. when selecting input data and reference for read mapping)
  • Find in Side Panel: separators are allowed when performing position search (e.g. 1.000.000 or 1,000,000 or 1’000’000 or 1 000 000). Read more
  • It is now possible to pause and restart processes involving read mapping and de novo assembly (except the accelerated mapping part of the analyses). Read more
  • Normalization of expression data: it is now possible to do “Reads per 1,000,000”-style normalization of count-based data. Read more
  • New preference group called “Data” to hold information about adapter sequences and Gateway cloning primer additions. Read more
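
The “Reads per 1,000,000”-style normalization of count-based data mentioned above can be illustrated with a short sketch. It assumes that each count is scaled by the total count of its sample; the release note does not state which total the Workbench uses, and the function name and example gene names are hypothetical.

```python
from typing import Dict


def reads_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw counts so that the sample total corresponds to 1,000,000."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot normalize a sample with a total count of zero")
    return {feature: count * 1_000_000 / total for feature, count in counts.items()}


sample = {"geneA": 250, "geneB": 750}
print(reads_per_million(sample))
```

Scaling to a common total makes samples with different sequencing depths comparable at the level of individual features.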

Bug-fixes:

  • Print of folder content now takes settings in the Side Panel into account
  • Process tagged sequences of paired data: it was not possible to specify one read without sequence (necessary for Illumina barcodes using paired data)
  • Better memory handling in conflict table
  • Read mapping: fixed Windows errors on large data sets, fixed color space errors
  • RNA-Seq: max number of mismatches when running color space data could be set to three in the dialog but did not take effect. Now the limit at 2 is enforced in the dialog.
  • Find in Side Panel: spaces are now allowed
  • Genbank import: sequence name (LOCUS) was truncated to 18 characters


CLC Genomics Workbench 3.7.1

Release date: 2010-02-04

Bug fixes:

  • Fixed error concerning naming of dots in PCA plot
  • RNA-seq: reads that extend over more than two exons are now shown correctly
  • Error in folder editor that prevented all elements to be shown is fixed
  • Documentation on trim using quality scores has been updated
  • Names of results from reference assemblies are now named according to the input data
  • Fixed error preventing manual editing of contigs under special circumstances
  • Various bug fixes


CLC Genomics Workbench 3.7

Release date: 2009-12-15

New features

  • Global alignment for long reads when running reference assembly algorithm
  • Gapped color-space alignment when running reference assembly
  • Significantly improved speed of all operations with large data sets
  • RNA-Seq analysis:
    • Performance optimization: A run of 44 million reads against the mouse genome now takes 32 minutes on an eight-core computer with 32 GB RAM. This used to take more than two hours. The previous version created and deleted a large number of small temporary files, which took a long time and impaired the computer’s general responsiveness. The new version creates only a small fraction of those temporary files.
    • New option to specify minimum required exon-overlap of reads spanning an exon-exon junction. Read more…
    • New RNA-Seq report which gives statistical overview of the assembly process. Read more…
    • Result table now reports number of exon-exon- and intron-exon junction spanning reads.
    • Result table now reports chromosome location of genes. Read more…
    • Visualization of reads that span exon-exon junctions. Read more…
    • Reads mapping equally well to intron-exon and exon-exon boundaries are now identified as unique exon-exon spanning reads.
    • RPKM is better defined in the user manual. Read more…
    • Default setting for multi-hits is now 10 as in the Mortazavi paper Read more…
    • Very short reads are now assembled allowing more mismatches.
  • Expression analysis:
    • Volcano plots: you can now choose the values to plot on the x-axis. Choose between “Difference” and “Fold change”. Read more…
    • Table view of bar plots shows the same intervals as are shown in the bar plot.
    • Generic importer for expression array data in tabular format. Read more…
    • Generic importer for expression experiment annotation data in tabular format. Read more…
    • Gene Ontology (GO) files can now be used to annotate an expression experiment. Read more…
    • Tag profiling: You are no longer allowed to annotate tag samples, only experiments
    • Side panel of experiment table has been re-organized to provide better overview. Read more…
  • Import high-throughput sequencing data
    • Import tool moved from Toolbox to File menu and tool bar. Read more…
    • Import and export of the SAM alignment format. Read more…
    • Import of alignment data in tab-delimited format, including the ELAND alignment format. Read more…
    • Import of Illumina QSEQ file format. Read more…
    • Linker in 454 data is also found for non-perfect matches Read more…
  • Enhanced visualization of contigs:
    • Un-aligned nucleotides on the inside of paired-end reads are now shown
    • Paired-end reads have a single line connecting the pair rather than gaps
    • Drag handles to move the aligned/unaligned border are only shown when you can see the bases of the reads. This means that you need to have zoomed in to 100% or more and chosen Compactness levels “Not compact” or “Low”. Otherwise the handles for dragging are not available (this is done in order to make the visual overview more simple). Read more…
    • It is possible to display pairs that overlap
  • The unassembled reads from an assembly now preserve their paired-end status (this also means that you can get two lists: one with pairs and one with the remainder of the broken pairs)
  • SNP detection output table now reports if multiple non-synonymous SNPs exist in same codon
  • SNP detection dialog: Quality filtering is no longer disabled when quality scores are missing. Due to performance issues it is not possible to check if quality scores are present. The SNP detection will just omit the quality score filtering if quality scores are not present.
  • SNP detection: possible to detect variants with frequency less than 1 percent.
  • Contig report now includes information about coverage for both covered regions and whole reference. Read more…
  • Opening consensus sequence including gaps will also put Ns before the consensus sequence starts and after it ends
  • The trim functionality now includes the option to trim away a predefined number of nucleotides from either end of a read. Read more…
  • Gateway cloning. Simple and easy-to-use support for creating Gateway entry and expression clones. Read more…
  • Search for matches among all your saved primers. The Find Binding Sites tool has been greatly improved to now allow you to search among all your primers. In addition, you also get a tabular output of the binding sites and possible fragments. Read more…
  • In silico PCR: create PCR product based on primer pair and template sequence (including primer extensions). As part of the improved Find Binding Sites and Create Fragments tool, you can extract the PCR product from the list of fragments through a right-click menu. Read more…
  • Check primer specificity. As part of the improved Find Binding Sites and Create Fragments tool, you can search with a primer pair in a list of potential target sequences and see an overview table of binding sites and mismatches as well as potential PCR fragments. Read more…
  • Deployment
    • You can set a path to the default data location used when the Workbench starts for the first time. This is a feature to help system administrators control where new installations per default save their data. Read more…
    • Support for removing tools accessing the internet (NCBI BLAST, update notifications etc). Read more…
  • General import and export
    • Support for import of complex regions from GFF files
    • Export tables and reports in Excel format.
    • Import section of user manual re-structured to provide better overview. Read more… Expression data importers are now described in technical detail in a separate section. Read more…
    • You can now export multiple sequence lists in fasta format
    • Forced import of zip files is now supported (it will force import the contents of the zip file)
    • The standard import now accepts gzip and tar files as well as zip
    • If a forced import fails, there will be more technical information about what went wrong, allowing you to identify bad formatting of the import files
    • Both the Genbank and GFF importers now make several attempts at naming genes that do not have a gene name. They iteratively try the following qualifiers: “product”, “locus_tag”, “protein_id” and “transcript_id”
    • When importing genbank files where the length stated does not match the actual sequence, a warning is shown but the sequence is accepted.
    • When exporting in csv format, the Locale settings are used to determine whether comma or semi-colons should be used as delimiter (comma used for US locales)
    • GFF plug-in has been updated to accept complex annotations
  • Miscellaneous
    • Advanced retyping of annotations using the annotation table. Read more…
    • Improved reporting of situations when a full disk prevents saving of data
    • Downloading sequences using drag and drop from the search table no longer creates a “Downloading…” node in the folder. The download process can be monitored in the Processes tab.
    • Primer design now supports PCR fragments longer than 5000 bp.
    • Extract Sequences moved from File menu to Toolbox-> General Sequence Analysis. Read more…
    • Better progress feedback on various dialogs
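
The naming fallback described above for the Genbank and GFF importers can be sketched as a lookup that tries qualifiers in a fixed order. This is an illustration only: the dictionary representation of qualifiers, the use of a “gene” qualifier as the primary name, the placeholder default and the function name are assumptions, not the importer's actual implementation.

```python
from typing import Mapping

# Order of fallback qualifiers as listed in the release note.
FALLBACK_QUALIFIERS = ("product", "locus_tag", "protein_id", "transcript_id")


def choose_gene_name(qualifiers: Mapping[str, str], default: str = "unnamed") -> str:
    """Pick a name for a gene annotation (hypothetical helper).

    Uses the "gene" qualifier when present (an assumption), otherwise the
    first non-empty fallback qualifier, otherwise a placeholder.
    """
    name = qualifiers.get("gene")
    if name:
        return name
    for key in FALLBACK_QUALIFIERS:
        value = qualifiers.get(key)
        if value:
            return value
    return default


print(choose_gene_name({"locus_tag": "b0001", "protein_id": "NP_414542.1"}))  # b0001
```

Because “product” is tried first, an annotation that carries both a product and a locus tag would be named after the product in this sketch.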

Bug-fixes

  • Problem with order of genes when setting up RNA-Seq experiments. If the order of input sequences was not the same for all samples, the experiment would be wrong.
  • Fixed wrong orientation of SOLiD mate-pair data
  • Fixed problem with naming of tabs. On Windows and Linux, unsaved data is now marked with a * in the tab name rather than by showing the tab name in bold italics. (This has always been the behavior on Mac OS X.)
  • Fixed problem displaying the “Copying…” label when copying data and then updating the folder
  • Misleading label when assembling reads shorter than 15 bp. Now it says that these reads will be ignored in assembly


CLC Genomics Workbench 3.6.5

Release date: 2009-08-18

New features

  • Export of annotations in GFF format (note that annotations with joined regions are not supported)
  • Export of sequence data in fastq format
  • Now possible to perform detailed manual editing of contigs with up to 100,000 reads
  • Improved performance when zooming large contigs displaying a coverage graph
  • Now possible to change the linker used when importing 454 paired-end data

Bug-fixes

  • Fixed problems importing expression annotation files
  • Fixed error when trimming for vector sequences
  • Fixed tblastn numbering issue
  • Various bug-fixes
This update is recommended for all users.


CLC Genomics Workbench 3.6.1

Release date: 2009-07-09

Issues resolved

  • Problem when adding annotations to an Illumina array file
  • Error handling annotated tag-data
  • DNA Strider files could lose their name upon import
  • Rare misplacement of annotations when editing very large sequences
  • Problem when importing color space data alongside a .cas file


CLC Genomics Workbench 3.6

Release date: 2009-06-02

New features

  • Tag profiling: tag-based transcriptomics. Read more…
  • ChIP-Seq analysis is now able to (optionally) use a control sample. Read more…
  • Advanced view of elements in a folder including batch editing. Read more…
  • Create new contig from selection. Read more…
  • Import high-throughput sequencing data: you can now import without quality scores. Read more…
  • Reference assembly of short reads: user can now choose between local and global alignment Read more…
  • Reference and de novo assembly output options have been changed so that you no longer need to decide whether you want a contig table or single contigs. Whenever more than one contig is produced, the Workbench automatically creates a contig table Read more…
  • Contig report for reference assemblies: GC content of the reference sequence now included
  • Extract sequences improvements Read more…
    • Now contig tables, overview BLAST tables and RNA-Seq samples can be used
    • User feedback in the dialog is improved
    • Problem with extracting paired-end reads correctly is fixed
  • mRNA Sequencing tool changed name to RNA-Seq Analysis to reflect the consensus about this naming in the NGS community
  • Heat maps and clustering improved:
    • You can now perform different clusterings on an experiment and save them all. In the Side Panel you can switch between the different clusterings to show the corresponding heat map. Read more…
    • Terminology change in the clustering dialogs: “similarity measure” and “cluster distance metric” are replaced by “distance measure” and “cluster linkage”, respectively.
  • Annotating samples or experiments for expression analysis:
    • This is now possible even if the number of features doesn’t match the number of annotations
    • You can now decide which column in the annotation file to use for matching to the sample or experiment. Read more…
    • Because of this extra option, you can no longer include an annotation file when setting up an experiment. You need to add the annotations afterwards
  • Microarray import improved: Added support for import of more versions of native Illumina BeadChip and GEO expression files
  • “Find” in text view now accepts Enter as command to find the next hit
  • Importing VectorNTI archives previously resulted in a sequence list. Now it imports as single sequences.
  • Import list of sequences in csv format: each line in the file represents a sequence with name, optional description, and sequence. Typically useful for importing lists of oligos.
  • You can now drag results from NCBI searches into the view area to open directly (previously you could only drag into a folder to save)
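
The csv layout described above for importing a list of sequences (one sequence per line with name, optional description, and sequence) can be read with a few lines of code. This is a minimal sketch under stated assumptions: a two-column line is taken to be name and sequence, a three-column line to be name, description and sequence, and the delimiter is a comma. The function name and the example oligos are hypothetical.

```python
import csv
import io
from typing import List, Tuple


def parse_sequence_csv(text: str) -> List[Tuple[str, str, str]]:
    """Parse lines of 'name[,description],sequence' into (name, description, sequence)."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        fields = [field.strip() for field in row]
        if not any(fields):
            continue  # skip blank lines
        if len(fields) == 2:
            name, sequence = fields
            description = ""
        elif len(fields) == 3:
            name, description, sequence = fields
        else:
            raise ValueError(f"expected 2 or 3 columns, got {len(fields)}: {row!r}")
        records.append((name, description, sequence))
    return records


example = "oligo1,forward primer,ACGTACGT\noligo2,TTGACA\n"
for record in parse_sequence_csv(example):
    print(record)
```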

Bug fixes

  • Assembly against many reference sequences could run out of memory. This has been significantly improved.
  • Integration with the Genomics Server: fixed an error when selecting contigs from a contig table for analysis. This is no longer possible (i.e. you have to save the contig first).
  • Microarray import: Fixed a bug that prevented import of expression data with white spaces in column names.
  • Various bug fixes


CLC Genomics Workbench 3.5.1

Release date: 2009-06-11

Issues resolved

  • Rare failure when importing very large Illumina files
  • Memory problem when mapping against many (>20,000) references
  • Rare concurrency issue when translating DNA->protein in e.g. SNP detection
  • Problem rendering scatter plots without lines
  • Graphics export of contigs
  • ChIP-seq table did not show the right distance to nearest gene


CLC Genomics Workbench 3.5

Release date: 2009-06-02

Data formats

  • Data generated with version 3.5 cannot be read in earlier versions

New features

Bug fixes

  • Fixed error when trimming reads for vectors
  • Fixed out-of-memory error in mRNA sequencing
  • Fixed error in mRNA sequencing when gene annotations were present outside the reference sequence
  • Fixed error when parsing files from Clone Manager (cm5-files)
  • UniProt search works again

Note

  • This version introduces a new data format which is not readable by older versions of the software.


CLC Genomics Workbench 3.2.0

Release date: 2009-03-12

New features

  • DIP detection – automatic examination and reporting of insertions/deletions in reference assembly contigs. In the Toolbox under High-Throughput Sequencing. Can be used together with SNP detection to systematically examine positions where the reads differ from the reference sequence. This eliminates the need for manually inspecting gaps and conflicts in the contig. Learn more…
  • 15% less disk space usage of imported NGS data sets.
  • 25% faster assembly of NGS data sets.

Bug fixes

  • Under certain circumstances, trim failed on Mac OS X
  • mRNA Sequencing: Downstream/upstream options should be disabled when using un-annotated reference sequences
  • Color space information now shown per default for mixed data sets including color space reads
  • De novo assembly report: the number of reverse matches was sometimes reported as negative
  • Corrections to the ACE export
  • Better performance of files with many annotations
  • Fixed an error in RNA Structure Evaluation
  • Fixed error and improved performance of Join Sequences tool
  • Fixed error in Find Binding Sites on Sequence: no longer distinguish between lower and upper case
  • Various small fixes


CLC Genomics Workbench 3.1.0

Release date: 2009-02-26

New features

  • Support for reference assembly of SOLiD data in color space (learn more). You need to reimport your data to make use of color space.
  • Viewing of color space data in contig results (learn more).
  • Option of using non-annotated sequences (e.g. EST-library) for RNA-seq (learn more).

Bug fixes

  • Assembly and mRNA sequencing errors (“Empty match not allowed” and “Could not read from temporary file”) fixed
  • Under special circumstances, quality scores were not aligned correctly
  • SNP detection with an RNA sequence as reference failed
  • SNP detection performance for annotated sequences improved
  • Find in the Side Panel did not support spaces when searching for annotations
  • In the cloning editor under special circumstances, an error occurred when replacing a selection with a fragment
  • Sequence statistics: codon counts were not correct when using RNA sequences


CLC Genomics Workbench 3.0.1

Release date: 2009-02-03

Updates

  • Fixed an error when trimming NGS data
  • Fixed an error in the contig view when deleting a sequence that was selected
  • Fixed an error when changing the filter of a sorted table
  • Fixed error when assembling a mix of paired ends and single reads under special circumstances
  • Fixed error in import of cas file based on SOLiD data from the CLC NGS Cell
  • Fixed a rare error when running SNP detection on a contig table
  • Made mRNA Sequencing accept a sequence list as reference
  • Fixed table view of contigs: sometimes an empty entry would appear which did not reflect the reads at the current position


CLC Genomics Workbench 3.0

Release date: 2009-01-27

New features

Transcriptomics

  • Support for both microarray- and sequencing-based (RNA-Seq) expression data
  • Visualization: Interactive heat map, table and scatter plot views
  • Transformation and normalization tools
  • Quality control tools including principal component analysis, MA- and boxplots
  • Experimental design tools for two- or multiple group comparisons
  • T-tests and ANOVA analysis with support for paired/repeated measures
  • Multiple testing corrected p-values (Bonferroni and/or FDR)
  • Clustering algorithms: hierarchical clustering, k-means and Partitioning Around Medoids (PAM) with support for various distance and linkage measures.
  • Ability to import NetAffx annotation arrays and add annotations to experiments
  • Tools for Gene Set Enrichment Analysis (GSEA) and for Hyper-Geometric based tests for overrepresented annotation categories (e.g. ‘GO’stats or specific protein pathways).
  • Ability to work with Expression Arrays and RNA-seq results at the same time, enabling comparison of results
  • Facility for annotating sequences from GFF or GTF files (as used by Ensembl and the UCSC Genome Browser), useful for annotating reference genomes before assembly
  • Statistics on numbers of matching and unique gene, exon and exon-exon boundary spanning reads
  • Calculation of gene expression measures (RPKM) from mRNA sequence data and generation of gene expression profiles (RNA-Seq analysis)
  • Discovery of novel transcripts/exons through mapping of mRNA reads to whole chromosomes or genomes, comparing matches with known exons
  • Interactive views of assemblies and derived gene expression data
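
The RPKM measure mentioned above (reads per kilobase of exon model per million mapped reads) can be sketched as follows. This is a minimal illustration of the commonly used definition, not the Workbench's own implementation; the function name and parameter names are hypothetical.

```python
def rpkm(reads_mapped_to_gene: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads.

    Hypothetical helper for illustration only: the Workbench may count
    reads or exon lengths differently than this sketch assumes.
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return reads_mapped_to_gene * 1e9 / (exon_length_bp * total_mapped_reads)


# Example: 500 reads on a gene with 2,000 bp of exons, 10 million mapped reads in total.
example_value = rpkm(500, 2000, 10_000_000)
```

Normalizing by both exon length and total mapped reads is what makes values comparable between genes of different length and between samples of different sequencing depth.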

Assembly

  • Long reads assembly significantly faster
  • No upper limit on number of reads in de novo assembly (there is still a limit regarding the size of the genome)
  • New simple output option for de novo assembly: only generate consensus sequence instead of full contigs. At the last step of the de novo assembly wizard, you can now choose between “Full contigs” and “Simple contig sequences”. The latter option will result in a sequence list with all the consensus sequences. This is much faster and less demanding for the computer. You can always create full contigs later by running a reference assembly with the consensus sequences as references.
  • Quality of trimming for contamination from own sequences improved. It is now possible to trim off smaller primer sequences.
  • SNP detection:
    • Accepts multiple contigs and table of contigs (the table output includes a new column for the name of the contig)
    • For coding regions (annotated with CDS/ORF annotations): changes on the amino acid level as a consequence of a SNP are now reported (both in the table and in the annotations).
    • General performance improvements
  • Right-clicking a graph (e.g. coverage) on a contig lets you export the data points to a csv file.
  • Contig table shows Latin and common name of reference sequences. This is beneficial if you perform a reference assembly against references from different species.
  • Multiplexing – Process Tagged Sequences now has an option to filter away groups with few sequences. This is an advantage if you have very ambiguous barcode definitions where sequencing errors would lead to a lot of “false” groups. These groups can now be filtered because of their small size. (The option is called “Minimum number of sequences” and is found in the third step of the wizard.)
  • Coverage info is now included when you export a table of contigs in ace format. (It contains a “Contig Tag” of type comment (a CT clause) containing a textual description of the coverage in the form “Average coverage: 14.65”.)
  • Coverage info is put into the description of consensus sequences extracted from a table of contigs (this means that if you export to fasta, this information will be included).
  • Importing assemblies with more than one contig creates multi contig tables (ace and cas file import)
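
The “Minimum number of sequences” filter for Process Tagged Sequences described above can be illustrated with a small sketch. It assumes, purely for illustration, that the barcode is a fixed-length prefix of each read; the function names are hypothetical and the Workbench's actual tag definitions are more flexible than this.

```python
from collections import defaultdict


def group_by_barcode(reads, barcode_length):
    """Group reads by their leading barcode (assumed fixed-length prefix)."""
    groups = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        groups[barcode].append(insert)
    return dict(groups)


def filter_small_groups(groups, min_sequences):
    """Drop barcode groups with fewer than min_sequences reads.

    Small groups are typically "false" groups created by sequencing
    errors in the barcode itself.
    """
    return {bc: seqs for bc, seqs in groups.items() if len(seqs) >= min_sequences}


# Three reads share barcode ACGT; one read has an erroneous barcode ACGA.
reads = ["ACGTTTGA", "ACGTCCAA", "ACGTGGTT", "ACGATTGA"]
groups = group_by_barcode(reads, barcode_length=4)
kept = filter_small_groups(groups, min_sequences=2)
```

With a minimum group size of 2, the single-read ACGA group is discarded while the ACGT group is kept.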

Improved user experience of processes

  • Non-modal feedback from processes:
    • When there is a message (e.g. from a BLAST search: no hits found)
    • If you have chosen to save the results in the last step of the wizard, you will be notified when the process is done.
    • Processes running on the CLC Science Server will notify when they are done.
  • Possibility to open results by clicking the button next to the process
  • Possibility to find and select results in the Navigation Area by clicking the button next to the process
  • You can see a log of your process by clicking the button next to the process (even if you did not choose to see the log in the last step of the wizard)

Support for interacting with CLC Science Server

  • Read more at http://www.clcbio.com/index.php?id=1260

3D editor re-design

  • The 3D editor now allows you to select individual structure subunits, residues, active sites, disulfide bridges and even atoms, and to customize their appearance

General improvements

  • Limited mode: when using a license server – if there are no more licenses left, you can still access your data. The Workbench will then run in Limited mode where only a few tools are available (corresponds to the tools found in CLC Sequence Viewer). Click “Limited Mode” in the license dialog.
  • Tables:
    • New advanced filter to use numerical data for filtering and to combine several filter criteria. Click the small button next to the normal filter to see the advanced filter.
    • Visual feedback when sorting and filtering tables
    • Improved automatic detection of column width
  • Performance of graphs and plots improved
  • Local BLAST is upgraded to use NCBI BLAST version 2.2.19
  • More elaborate error reports including error logs
  • You can specify which folder the Workbench should use for temporary files
  • Extract sequences from a sequence list, contig or alignment by right-clicking the white empty space. You will then be able to extract the sequences into a list or as separate sequences.
  • The “Find” option in the Side Panel of sequence views automatically detects if you have entered a position instead of a sequence.

Plug-ins

  • Extract Annotations plug-in has been improved:
    • Possibility to specify the naming of the sequences (based on annotation name, type etc)
    • Performance improvements to make it possible to extract annotations of large genomes.
  • MLST plug-in: various bug fixes

Bug fixes

  • Locale settings were not automatically set right on the first start-up. The locale settings determine whether . or , should be used as the decimal separator. For new installations of the Workbench, it will now be set to the locale of the computer’s operating system. For existing installations, you will have to change this in the Edit->Preferences dialog.
  • Fixed problem when BLASTing with an empty sequence
  • Various performance improvements and bug fixes


CLC Genomics Workbench 2.1.1

Release date: 2008-11-21

Updates

  • Reference assembly: fixed an error which meant that in some cases, reference assembly produced different results depending on the amount of memory available.
  • SNP Detection: Reads were dismissed because of gaps even though the reference sequence also had gaps.
  • The Side Panel’s Find only highlighted the first hit. This is now fixed.
  • Fixed error when importing 454 fna/qual files
  • Extract sequences: fixed an error when extracting paired-ends sequences from contigs and sequence lists
  • Local BLAST: solved problem applying command-line parameters; a checkbox now determines whether command-line options should take effect
  • BLAST: it was possible to use a BLAST result as input and database
  • Trace data: fixed an error when deleting parts of an unsaved sequence with traces
  • Better performance when zooming a dot plot
  • Better performance when using the Side Panel’s Find in large contigs and sequence lists
  • When right-clicking a CDS annotation and translating into protein, gaps were erroneously introduced into the protein sequence
  • There was an error related to selecting sequences in the Cloning editor
  • Multi-select (using Ctrl / Command key) did not work for sequence lists
  • Various bug fixes


CLC Genomics Workbench 2.1

Release date: 2008-11-11

Updates

  • Support for paired-end Sanger reads
  • Support for paired-end FASTA reads
  • Improved user interface of High-throughput Sequencing Data import dialog
  • Assembly report includes information about assembly parameters
  • Corrected error when opening multiple consensus sequences
  • Fixed problem with import of NGS data in FASTA format
  • Improved error handling for assembly
  • Fixed issue with contig selections while scrolling
  • Corrected error introduced by overlapping mate-pairs


CLC Genomics Workbench 2.0.4

Release date: 2008-10-08

Updates

  • Fixed problems when assembling large or mixed data sets
  • Ensured correct setting of limit for assembly of short reads


CLC Genomics Workbench 2.0.3

Release date: 2008-10-02

Updates

  • Fixed problems with de novo assembly
  • Status properly updated when a conflict is resolved
  • Assembly programs now run on older versions of Linux


CLC Genomics Workbench 2.0.2

Release date: 2008-10-02

Updates

  • Fixed problems when scrolling very large sequences
  • Fixed problem when importing very large GenBank files
  • Improved possibilities for navigating contigs
  • Improved stability when importing non-standard data
  • Improved memory handling and stability of assembly algorithms
  • Support for import of Illumina long insert paired-end data


CLC Genomics Workbench 2.0.1

Release date: 2008-09-18

New features

General performance improvements

  • Improved performance when handling large data sets

High-throughput Sequencing Assembly

  • Support for reference assembly against the human genome (i.e. reference sequences of any size)
  • New and much faster algorithm for assembling short reads (less than 55 nucleotides)
  • Significant performance improvements of reference assembly.
  • True support for reference assembly of mixed data sets in one go. Sequencing data from different platforms (and both single and paired ends) can now be assembled together. Previously this could be accomplished by making separate assemblies and joining the contigs afterwards, but now this process is automated. Read more…
  • Reference sequences can be masked based on annotations. This could be used to e.g. mask off repeat regions or only include exons in the assembly. The reference sequences have to be annotated in order to use masking. Read more…
  • Assembly report includes the number of contigs produced
  • Contigs from a reference assembly can also be shown in an overview table. This was previously only possible for De novo assembly. In the last step of the reference assembly wizard, there is an option: Create overview table including all contigs. Read more…
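
The idea of masking a reference based on annotations, as described above, can be sketched as follows. This is only an illustration under stated assumptions: regions are given as 0-based, half-open (start, end) intervals and masked bases are replaced by N. How the Workbench represents masked regions internally is not stated in these notes, and the function names are hypothetical.

```python
def mask_regions(sequence, regions, mask_char="N"):
    """Mask the given regions, e.g. annotated repeat regions.

    regions: iterable of 0-based, half-open (start, end) intervals
    (an assumed convention for this sketch).
    """
    masked = list(sequence)
    for start, end in regions:
        for i in range(max(start, 0), min(end, len(masked))):
            masked[i] = mask_char
    return "".join(masked)


def keep_only_regions(sequence, regions, mask_char="N"):
    """Mask everything except the given regions, e.g. keep only exons."""
    keep = [False] * len(sequence)
    for start, end in regions:
        for i in range(max(start, 0), min(end, len(sequence))):
            keep[i] = True
    return "".join(base if k else mask_char for base, k in zip(sequence, keep))


reference = "ACGTACGTAC"
repeat_masked = mask_regions(reference, [(2, 5)])
exons_only = keep_only_regions(reference, [(2, 5)])
```

The two functions correspond to the two uses named above: masking off annotated regions such as repeats, or including only annotated regions such as exons.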

Import and export

  • Support for large numbers of Sanger sequencing reads with trace information. Using the Import functionality under High-throughput Sequencing you can import huge amounts of e.g. abi files. This will import quality scores but discard trace data to produce a sequence list in the Workbench which makes it possible to assemble thousands of Sanger reads. Read more
  • SOLiD import of paired-ends data improved. In some cases paired-ends data also contains single reads which are now removed during import.
  • Possible to import cas files created by the CLC NGS Cell (the command-line version of the assembly algorithms of the CLC Genomics Workbench)
  • Contigs can be exported in ACE format
  • Improvement of ACE file importer
  • Trim information in sff files can be used during import
  • Support for import of SCARF files (from Illumina Genome Analyzer systems)
  • Export of graph data points in csv format

Various high-throughput sequencing improvements

  • SNP detection now also reports position relative to the reference sequence as well as the consensus sequence. The table includes both positions per default (can be checked on and off), and the user decides where annotations should be added. Read more…
  • SNP detection table includes information about the name of annotations covering the SNPs. Previously only the annotation type was reported.
  • Trimming now also supports paired-ends data. If one of the reads in a pair is trimmed off, the whole pair will be removed.
  • Partially matched reads are reported as a graph along the contig.
  • Possibility to open consensus sequence with gaps. Right-click the label of the consensus sequence in the contig view and select: Open Copy of Sequence Including Gaps. The gaps will be represented by Ns in the new sequence.
  • Dynamic consensus graph removed from contig view. Since contigs now have a “real” consensus sequence which is also updated to reflect changes in the reads, the dynamic consensus sequence which is switched on in the Side Panel has been removed.
  • Annotations can be transferred from reference to consensus sequence in bulk. Right-click one of the annotations and choose “Copy to Consensus Sequence” or “Copy Annotations of Type xx to Consensus Sequence”.
  • Multiplexing now also possible for paired-end reads
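
The paired-end trimming rule described above (if one read of a pair is trimmed off, the whole pair is removed) can be sketched like this. The sketch assumes the reads have already been trimmed and that a read counts as "trimmed off" when it is shorter than a minimum length; the function name and the min_length parameter are hypothetical.

```python
def filter_trimmed_pairs(pairs, min_length=1):
    """Keep a pair only if both of its trimmed reads survive.

    pairs: iterable of (forward, reverse) read strings after trimming.
    min_length: assumed threshold below which a read counts as trimmed off.
    """
    return [
        (forward, reverse)
        for forward, reverse in pairs
        if len(forward) >= min_length and len(reverse) >= min_length
    ]


trimmed_pairs = [
    ("ACGT", "TTGA"),  # both reads survive
    ("", "GGCC"),      # forward read trimmed away completely
    ("AC", "TTTT"),    # forward read is very short after trimming
]
surviving = filter_trimmed_pairs(trimmed_pairs, min_length=3)
```

Removing the whole pair keeps the data set consistent: every remaining read still has its mate, which is what paired-end assembly expects.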

Plug-in updates

  • New plug-in! GFF/GTF support: You can now annotate a sequence using a GFF/GTF file. The plug-in is available for all Workbenches (not CLC Sequence Viewer). Once installed, you find it in Toolbox->General Sequence Analysis->Annotate from GFF/GTF File. Read more…
  • Extract annotations plug-in updated: it now uses the name of the annotation as the name of the new sequence.

Annotation handling

  • Annotation table has been greatly improved:
    • supports very long, heavily annotated genomes
    • usability of the filtering has been improved with feedback on the filtering process
  • Advanced renaming options. Read more…

Bug fixes

  • Fixed bugs related to contig editing.
  • Various bug-fixes.
  • Fixed problem with import in 2.0 release.


CLC Genomics Workbench 1.1.1

Release date: 2008-07-10

New features

  • Scrollbars can be adjusted manually

Problems fixed

  • Fixed problems when aligning sequences with lowercase characters
  • Fixed import of trace files without quality scores
  • Fixed problem when removing location
  • A new sequence list can be created from a selection in the table view
  • Better memory handling and management of large contigs
  • User definable scrollbar areas for contig views
  • A few other minor bugs have been fixed.


CLC Genomics Workbench 1.1

Release date: 2008-06-27

New features

  • Increased speed of de novo assembly
  • Option of generating a contig table as a result of de novo assembly. This way the workspace is not polluted by a large number of contigs.
  • Multi contig table has options for opening contigs or extracting consensus sequences for further analysis
  • Much smoother scrolling on contigs when there is very high coverage

Problems fixed

  • Problems with import of .ACE files
  • Problems with excessive generation of files when doing de novo assembly of short read data
  • Problems with the use of quality scores in SNP detection

