Introduction

The requirements to make an iceLogo are a multiple sequence alignment and a reference set.

The easiest way to start is to just jump in with the test data set and start to experiment with the different settings. Each example given in the visualisations chapter of the manual is directly usable and tweakable.

For the SOAP service, we also offer examples that can be used as a starting point to generate the images you require.

Multiple sequence alignment

The multiple sequence alignment may contain the 20 classic amino acids, an X to denote an unknown residue, or a dash (-) to denote a gap at that position. It must be supplied as the raw alignment; any extra information will cause a failure. Many programs, such as ClustalW and MUSCLE, offer this output option. FASTA format is accepted, but headers are discarded and play no part in generating the logos.

While there is a limit to how many characters can be shown on one line in the positive and negative set text boxes, iceLogo treats each line (as delimited by a newline control character such as \n) as a single sequence. Empty lines are discarded.
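The input rules above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not iceLogo's own code: the function name parse_alignment is hypothetical, and both the conversion to upper case and the rejection of unknown characters with an error are assumptions.

```python
# Characters the manual lists as valid: 20 amino acids, X (unknown), - (gap).
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY") | {"X", "-"}


def parse_alignment(text):
    """Return one sequence per non-empty line, skipping FASTA headers."""
    sequences = []
    for line in text.splitlines():
        line = line.strip()
        # Empty lines and FASTA header lines are discarded.
        if not line or line.startswith(">"):
            continue
        sequence = line.upper()  # assumption: input is treated case-insensitively
        bad = set(sequence) - ALLOWED
        if bad:
            # Assumption: anything beyond the raw alignment is reported as an error.
            raise ValueError(f"unexpected characters: {sorted(bad)}")
        sequences.append(sequence)
    return sequences
```

Running a pasted alignment through a check like this before submitting it can help catch stray annotation lines early.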

Reference set

The reference set comes in two flavours: a predetermined amino acid distribution compiled from UniProt FASTA files, or a distribution compiled from a FASTA file you provide. The website offers a comprehensive list of species with a pre-compiled amino acid frequency reference set that can be used as background. The same rule applies as for the multiple sequence alignment: a line of amino acids is considered a complete sequence only when it ends with a new line, and empty lines are discarded.
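To make the idea of an amino acid frequency reference set concrete, the sketch below computes an overall percentage per amino acid from FASTA text. It assumes a single, position-independent background; how iceLogo actually compiles its reference sets is not described here, and the function name background_frequencies is hypothetical.

```python
from collections import Counter


def background_frequencies(fasta_text):
    """Return {amino acid: percentage} over all sequence lines."""
    counts = Counter()
    for line in fasta_text.splitlines():
        line = line.strip()
        # Empty lines and FASTA headers do not contribute.
        if not line or line.startswith(">"):
            continue
        counts.update(line)
    total = sum(counts.values())
    # With no sequence lines, counts is empty and an empty dict is returned.
    return {aa: 100.0 * n / total for aa, n in counts.items()}
```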

Different visualisation methods

General information about the different visualisation methods

iceLogo offers different visualisation methods. When using the website to generate images, the colours of the amino acids can be changed to fit the required view. This is not possible with the SOAP service, where a significantly regulated amino acid is always coloured pink.

IceLogos

While an iceLogo has a lot in common with sequence logos, especially visually, it has a number of advantages. It will always use a reference set, and it allows for changing the scoring system used.

Reference sets

For more information about the different reference sets, please see the dedicated section.

Scoring methods

Percentage difference

This is the default scoring method. It compares the percentage frequency of an amino acid at a given position in the multiple sequence alignment with its percentage frequency in the reference set. iceLogo uses the resulting differences between the observed and reference values for each amino acid to create the requested logos.
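As a rough illustration of this score, the sketch below computes the observed percentage of each amino acid in one alignment column and subtracts the reference percentage. The function name and the dictionary-based reference are hypothetical, gaps and X are counted like any other character (an assumption), and the significance testing that iceLogo applies is left out.

```python
from collections import Counter


def percentage_difference(sequences, position, reference):
    """Return {amino acid: observed % - reference %} for one column."""
    # Sequences shorter than the requested position are skipped (assumption).
    column = [s[position] for s in sequences if position < len(s)]
    counts = Counter(column)
    observed = {aa: 100.0 * c / len(column) for aa, c in counts.items()}
    residues = set(observed) | set(reference)
    return {
        aa: observed.get(aa, 0.0) - reference.get(aa, 0.0)
        for aa in residues
    }
```

A positive value means the amino acid is more frequent at that position than in the reference set, a negative value that it is less frequent.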

Fold change

Primarily aimed at finding regulation of low-abundance amino acids, the fold change is a measure of the variability between a beginning and a final element. Concretely, for iceLogo this means that the changes across the sequence are emphasized rather than the values of each element. The values used are the observed fold changes. Fold changes in the down-regulated direction are translated to their negative counterpart (for example, a fold change of 0.5 indicates a two-fold down-regulation, which translates to -2). This helps to remove systematic errors that might be present. Please keep in mind that large differences combined with small ratios between the elements lead to a high miss rate.
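The translation of down-regulated ratios to negative values can be written out as follows. This is a minimal sketch assuming the fold change is the observed percentage divided by the reference percentage; the function name is hypothetical, and returning 1.0 when both percentages are zero is an assumption.

```python
import math


def signed_fold_change(observed_pct, reference_pct):
    """Return the fold change, with down-regulation mapped to -1/ratio."""
    if observed_pct == 0 and reference_pct == 0:
        return 1.0  # assumption: absent in both sets counts as no change
    if reference_pct == 0:
        return math.inf  # present only in the alignment
    if observed_pct == 0:
        return -math.inf  # present only in the reference set
    ratio = observed_pct / reference_pct
    # A ratio of 0.5 becomes -2, as in the example in the text.
    return ratio if ratio >= 1 else -1.0 / ratio
```

The infinite results are the cases that the height rules for the SOAP server deal with.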

Additional colouring rules

When using the SOAP server to create iceLogos, some specific rules apply when the fold change is used. Because a fold change can become infinite, iceLogo falls back to a fixed height in the following cases.

  • When only one amino acid is regulated, the height will be the maximum height possible in the iceLogo.
  • When multiple amino acids are regulated and all of them have an infinite size, the height of each amino acid will be the maximum height divided by the number of regulated amino acids at that position.
  • When multiple amino acids are regulated and only some have an infinite size, the height of the infinite amino acids will be 10% larger than that of the largest non-infinite amino acid.
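The three rules above can be sketched as a single function that takes the fold changes of the regulated amino acids at one position. The function name and the max_height parameter are hypothetical, and keeping the sign of the fold change on the resulting height is an assumption; the SOAP server's real implementation may differ.

```python
import math


def resolve_heights(fold_changes, max_height):
    """Replace infinite fold changes by finite heights for one position."""
    infinite = [aa for aa, v in fold_changes.items() if math.isinf(v)]
    if not infinite:
        return dict(fold_changes)  # nothing to fall back on
    if len(fold_changes) == 1:
        # Rule 1: a single regulated amino acid gets the maximum height.
        aa = infinite[0]
        return {aa: math.copysign(max_height, fold_changes[aa])}
    if len(infinite) == len(fold_changes):
        # Rule 2: all infinite, so the maximum height is shared equally.
        share = max_height / len(fold_changes)
        return {aa: math.copysign(share, v) for aa, v in fold_changes.items()}
    # Rule 3: infinite entries become 10% larger than the largest finite one.
    largest = max(abs(v) for v in fold_changes.values() if not math.isinf(v))
    capped = 1.1 * largest
    return {
        aa: math.copysign(capped, v) if math.isinf(v) else v
        for aa, v in fold_changes.items()
    }
```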

Examples

Example set with percentage difference
Example set with fold change

The SOAP service

What is SOAP?

SOAP stands for Simple Object Access Protocol and serves to allow programmatic access to the backend of a service. This allows requests to be automated, which in the case of iceLogo is much faster for generating images in bulk.

Description of the SOAP server functionality

The full definition of all calls and their parameters can be found in the WSDL file. Examples of how the SOAP server works can be found on the dedicated page.

Elaborate manual

For more insight into the different visualisation methods and statistics, download the PDF manual here.