Latest improvements for CLC Main Workbench early access

CLC Main Workbench early access 20.0

Release date: 2019-11-19

These are the draft release notes for CLC Main Workbench 20.0, due for release on December 11, 2019. The draft manual is available in PDF and HTML format. Installers for this product are available as "early access" via links at the bottom of this page. These products are not supported, and we recommend they are not used in production during the early access period. To download a commercial license for this product, you must have a license covered by Maintenance, Upgrades and Support (MUS). A 2-week evaluation license is available via the License Manager within the software.

New features

Protein structure and homology
  • Generate Biomolecule: A new tool available from the side panel of Molecule Projects, allowing biomolecules to be generated or extracted based on symmetry information in PDB files.
  • Find and Model Structure: A new tool that finds suitable protein structures for representing a given protein sequence. From the resulting table, a structure model (homology model) of the sequence can be created with one click, using one of the found protein structures as a template.
  • Download 3D Protein Structure Database: A new tool for downloading a curated database containing sequences with known 3D structures obtained from the Protein Data Bank.
  • Molecule structures in a Molecule Project can now be exported to a PDB format file.
  • Search for PDB Structures at NCBI is now available when running the Workbench in Viewing Mode.
Workflows
  • When launching workflows, batch units can now be defined using metadata, supplied either as a CLC metadata table or by selecting an Excel format file containing information about the data.
  • Workflows with multiple inputs, where those inputs should be matched with each other, can now be launched in batch mode, making use of the ability to define batch units based on metadata. For example, a workflow where sets of reads should be mapped to different reference sequences can now be launched in batch mode.
  • Two new workflow elements have been introduced, Iterate and Collect and Distribute, which allow workflows to be designed where the execution of different parts of a workflow can be finely controlled. For example, using these elements, a single workflow can contain an analysis step that will be run once per sample, as well as elements that should only be run once for a set of samples.
  • Workflows now produce a Workflow Result Metadata table, which contains one row per output, with the relevant data element associated with that row. When launched in batch mode, the batch the row relates to is clearly indicated.
  • CLC sequence files not yet imported can be imported on the fly, as an initial action when a workflow is run, avoiding the need to import such data prior to launching the workflow. This is mostly of relevance if sharing data with other sites.
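
The idea of defining batch units from metadata can be illustrated outside the Workbench. The following is a minimal sketch, not Workbench code: the column names (`Name`, `Sample`, `Reference`), the example rows and the `batch_units` helper are all hypothetical, and the sketch assumes that every row in a batch unit shares the same reference.

```python
from collections import defaultdict


def batch_units(rows, key):
    """Group metadata rows into batch units by the value of one column."""
    units = defaultdict(list)
    for row in rows:
        units[row[key]].append(row)
    return dict(units)


# Hypothetical metadata: one row per input data element.
metadata = [
    {"Name": "reads_A_1", "Sample": "A", "Reference": "ref_1"},
    {"Name": "reads_A_2", "Sample": "A", "Reference": "ref_1"},
    {"Name": "reads_B_1", "Sample": "B", "Reference": "ref_2"},
]

# Each batch unit pairs a set of reads with the reference named in its rows.
for sample, rows in batch_units(metadata, "Sample").items():
    read_names = [row["Name"] for row in rows]
    reference = rows[0]["Reference"]  # assumption: one reference per unit
    print(sample, read_names, reference)
```

Grouping on a metadata column is what lets inputs of different types be matched with each other: every element that shares a value in that column ends up in the same run.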
Export
  • Reports can now be exported in PDF format.
  • When exporting to PDF, there is now an option to export the history of the report.
BLAST
  • A new option for the BLAST tool, Filter out redundant results, culls HSPs on a per subject sequence basis when enabled, removing HSPs that are completely enveloped by another HSP.
  • The NCBI blast executables have been updated to version 2.9.0.
  • The option "Choose filter to mask low complexity regions" has been renamed to "Mask low complexity regions".
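
The effect of the Filter out redundant results option can be sketched as interval culling. This is an illustration only, not the Workbench implementation: the `Hsp` tuple and the helper names are hypothetical, and the sketch assumes that "enveloped" is judged on a single coordinate range per HSP, which the notes above do not specify.

```python
from collections import namedtuple

# Hypothetical minimal HSP record: subject sequence name plus one coordinate range.
Hsp = namedtuple("Hsp", "subject start end")


def envelops(outer, inner):
    """True if inner lies completely within outer."""
    return outer.start <= inner.start and inner.end <= outer.end


def cull_enveloped(hsps):
    """Drop HSPs that are enveloped by another HSP on the same subject."""
    kept = []
    # Sorting by start (ascending) and end (descending) guarantees that any
    # enveloping HSP is visited before the HSPs it contains.
    for hsp in sorted(hsps, key=lambda h: (h.subject, h.start, -h.end)):
        if any(k.subject == hsp.subject and envelops(k, hsp) for k in kept):
            continue
        kept.append(hsp)
    return kept


hsps = [
    Hsp("s1", 10, 100),
    Hsp("s1", 20, 50),   # inside s1:10-100, so it is culled
    Hsp("s1", 90, 120),  # only overlaps s1:10-100, so it is kept
    Hsp("s2", 20, 50),   # different subject, so it is kept
]
print(cull_enveloped(hsps))
```

Two HSPs with identical ranges envelop each other; in this sketch one of them is kept.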

Improvements

Workflows
  • All installed workflows can now be updated in a single operation from the Workflow Manager using the new Update All Workflows button.
  • Placeholder-based naming of outputs in workflows can now be configured at a finer level: the {input} or {2} placeholder is now replaced by the name of the first workflow input by default. This can then be further configured to use the names of other inputs by specifying them by number after a colon in the placeholder. For example, {2:1,3} would be replaced by the names of workflow inputs number 1 and 3. Previously, a workflow output configured as {2} was replaced by a concatenation of all the workflow input names.
  • The listing of items in the "Add Element" dialog in the Workflow Editor has been improved. It now includes installed workflows in the workbench, and no longer matches text in the tooltips of the tools.
  • When running a workflow configured to use reference data, the Reference Data Set selection step has been updated to show the list of preconfigured elements in the tooltip.
  • The "Export to PDF" tool can now be used in workflows to export reports in PDF format.
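
The placeholder behavior described above for output naming can be sketched as a small substitution routine. This is a hypothetical illustration rather than the Workbench's own logic: the `resolve` function, the example names and the `-` separator used to join several input names are assumptions, as is the choice to accept a number list after both {input} and {2}.

```python
import re

# Matches {input}, {2}, and the numbered forms such as {2:1,3}.
PLACEHOLDER = re.compile(r"\{(?:input|2)(?::([0-9]+(?:,[0-9]+)*))?\}")


def resolve(pattern, input_names, sep="-"):
    """Replace input placeholders in an output name pattern.

    sep is an assumed separator; the release notes do not say how
    several input names are joined.
    """
    def substitute(match):
        numbers = match.group(1)
        if numbers is None:
            # New default: the name of the first workflow input only.
            return input_names[0]
        # Numbers after the colon are 1-based workflow input positions.
        return sep.join(input_names[int(n) - 1] for n in numbers.split(","))

    return PLACEHOLDER.sub(substitute, pattern)


names = ["reads", "reference", "targets"]
print(resolve("{2}_mapped", names))
print(resolve("{2:1,3}_report", names))
```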
Performance improvements
  • Saving analysis results to an SSD is now considerably faster.
  • The import of ZIP files has been improved: temporary objects are cleaned up during the import process, reducing the required disk space.
  • Moving and deleting many elements at once is now faster.
  • Emptying the Recycle bin now takes place in the background.
  • Messages from tools are no longer presented in the form of black bubbles in the Processes area. Messages are still written to the log.
  • There are general performance improvements in the following areas:
    • The Navigation Area
    • BLAST and Add attB Sites tools when using large sequence lists
    • Opening large protein sequences
    • Making BLAST databases where most sequences have the same name.
Import and export
  • The CSV importer has been updated:
    • Values no longer need to be enclosed in quotation marks in the CSV file to be successfully imported.
    • Data values starting with a numeric character but also containing non-numeric characters are now interpreted as text. Such values were previously converted to numbers and then only imported up to the first non-numeric character.
  • The import of Nexus files has been updated to more closely match the format specifications.
  • When selecting files to import from an import/export directory on a CLC Genomics Server, right-clicking on a folder name now brings up a menu with the options: "Add the content of a directory" or "Add the full content (recursively) of a directory".
  • The "Excel 2010" and "Excel 97-2007" exporters now export NaN and +/-Infinity values as #N/A.
  • When importing multiple files using the Standard Import, the process now ends with an error if at least one of the files fails to import. The details of which file failed and why can be seen in the log.
  • The GenBank exporter now replaces any spaces in annotation names with underscores.
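
The change in how the CSV importer interprets mixed values can be illustrated as follows. This is a sketch of the behavior described in the CSV importer notes above, not the importer itself; `interpret_new` and `interpret_old` are hypothetical names, and Python's `float` stands in for whatever number parsing the importer uses.

```python
import csv
import io
import re


def interpret_new(value):
    """Current behavior: a number only if the whole value is numeric."""
    try:
        return float(value)
    except ValueError:
        return value  # e.g. "12abc" stays text


def interpret_old(value):
    """Previous behavior as described: keep only the leading numeric part."""
    match = re.match(r"[0-9]+(?:\.[0-9]+)?", value)
    return float(match.group()) if match else value


# Unquoted values, which the importer now accepts.
sample = io.StringIO("name,label,score\ns1,12abc,3.5\n")
header, row = list(csv.reader(sample))

print([interpret_new(v) for v in row])
print([interpret_old(v) for v in row])
```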
Create Box Plot
  • Create Box Plot now calculates the median and percentile values in the same way as the "quantile" method in R.
  • Whiskers of box plots now range from the lowest data point within 1.5 times the interquartile range (IQR) of the lower quartile to the highest data point within 1.5 IQR of the upper quartile. Previously, they extended 1.5 times the length of the box (IQR).
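
The two box plot changes can be worked through with a short example. This is a sketch, not the tool's code: it assumes that "the quantile method in R" refers to R's default, type 7, and the function names are hypothetical.

```python
import math


def quantile_r7(sorted_values, p):
    """Quantile by linear interpolation, matching R's default (type 7)."""
    n = len(sorted_values)
    h = (n - 1) * p
    lo = math.floor(h)
    hi = min(lo + 1, n - 1)
    return sorted_values[lo] + (h - lo) * (sorted_values[hi] - sorted_values[lo])


def box_stats(values):
    """Quartiles plus whiskers that end on actual data points."""
    xs = sorted(values)
    q1, median, q3 = (quantile_r7(xs, p) for p in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    # Whiskers stop at the most extreme data points inside the 1.5 * IQR fences,
    # rather than being drawn all the way out to the fences themselves.
    lower = min(x for x in xs if x >= q1 - 1.5 * iqr)
    upper = max(x for x in xs if x <= q3 + 1.5 * iqr)
    return {"q1": q1, "median": median, "q3": q3, "lower": lower, "upper": upper}


print(box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30]))
```

For this data the upper fence is 7.75 + 1.5 × 4.5 = 14.5, so the upper whisker ends at 9, the highest data point below the fence, and 30 falls outside the whiskers.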
Other improvements
  • Outputs of tools provided by plugins now include the plugin name and version in the element history.
  • A new option when right-clicking on a table cell, Edit | Copy Cell, allows individual cells to be copied to the clipboard. Previously, only whole rows could be copied.
  • Tool and workflow logs now display an "Elapsed time" column.
  • In the tree view of phylogenetic trees, the "Reset Tree Topology" button will now also uncollapse any collapsed nodes.
  • The name of a non-default workspace is now shown in the Workbench title bar.
  • The table view ("Show Table") of plots has been improved in the cases where multiple data series are shown in the plot. The table now includes all of the x values from all data series, instead of the x values from just the first data series. If a data series is missing a y value for a specific x value, then the entry in the table will be empty.
  • The maximum size of a plot in a report displayed in the Workbench has been increased to 800 pixels, and the width/height ratio has been changed from 2/3 to 1/2.
  • The ranking of search results in Quick Launch has been improved.
  • CLC URLs have been made more compact.
  • The icon for the sequence view has been changed for protein sequences, so it is possible to distinguish protein sequence views from nucleotide sequence views based on the icon.
  • In the Reference Data Management dialog, the "usable free space" is shown instead of the previous "free space".
  • In the Batch Rename tool, the fields for the 'Replace part of the name' option have been renamed from 'From' and 'To' to 'Replace' and 'With' for clarity.
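
The table view change for plots with multiple data series, described earlier in this list, amounts to taking the union of x values across series and leaving gaps empty. A minimal sketch, with a hypothetical `merge_series` helper and made-up data:

```python
def merge_series(series):
    """Build table rows from several data series.

    series maps a series name to a dict of x -> y. Every x value from every
    series gets a row; a missing y value becomes an empty entry.
    """
    xs = sorted({x for points in series.values() for x in points})
    header = ["x"] + list(series)
    rows = [[x] + [points.get(x, "") for points in series.values()] for x in xs]
    return header, rows


series = {"a": {1: 10, 2: 20}, "b": {2: 5, 3: 7}}
header, rows = merge_series(series)
print(header)
for row in rows:
    print(row)
```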

Bug fixes

  • Fixed an issue affecting macOS setups with accessibility settings enabled, where the "Replace Selection with Sequence" functionality available from within the Cloning editor could fail with an error.
  • Fixed a bug where workflow installer files did not include the specified icon.

Changes

  • The Java version bundled with CLC Main Workbench 20.0 is Java 11, using the JRE from AdoptOpenJDK.
  • Using Local Search, searches for sequences with a specific length or length range now only return individual sequence elements that meet the search requirements. Previously, searches were also done within other types of elements, e.g. sequence lists, read mappings, etc.
  • The Help -> Tutorials menu item has been replaced with Help -> Online Tutorials, which opens the online tutorials in a browser.
  • The Reverse Sequence tool has been moved to the Legacy folder of the Workbench Toolbox and "(legacy)" appended to its name. It will be removed in a future version of the software.

Plugin notes

New plugins

  • Navigation Tools: Provides the functionality formerly provided by the Bookmark Navigator and Recent Items Navigator plugins.
  • SignalP and TMHMM: Provides the functionality formerly provided by the SignalP and TMHMM plugins.

Plugin retirements

The following plugins have been retired, with their functionality being provided by a new plugin:
  • Bookmark Navigator
  • Recent Items Navigator
  • SignalP
  • TMHMM
The following plugins have been retired and their functionality is no longer available through the CLC Main Workbench:
  • PPfold
  • TRANSFAC

Advance notice

The Reverse Sequence (legacy) tool will be removed in a future version of the software. The "Run in Batch Mode..." functionality for installed workflows with multiple inputs will be retired in a future release. Workflows with multiple inputs can now be launched in batch mode by checking the "Batch" checkbox when selecting input data. If you are concerned about these proposed changes, please contact our Support team by emailing [email protected].

Early Access installers

These products are not supported, and we recommend they are not used in production during the early access period.