Introduction

Large sequence-based datasets are often scanned for conserved sequence patterns to extract useful biological information¹. Sequence logos² were the first tool to visualize conserved patterns in oligonucleotide and protein sequences; they rely on Shannon’s information theory to calculate the conservation level at each position in a multiple sequence alignment. A sequence logo is a histogram-like presentation in which the bars are vertical stacks of symbols: the stack height reflects the level of conservation, and the height of each individual symbol is a measure of its frequency at that position. However, no existing tool can compare, in a statistically sound manner, an experimental peptide or protein sequence set to the background of species-specific natural occurrences of amino acids, to a position-specific background set, or to a background set that is influenced by the experimental protocol. In addition, underrepresented elements – non-tolerated amino acids or nucleotides – are generally either not shown at all or not presented in a statistically sound way.
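The way a classic sequence logo derives its stack and symbol heights can be sketched in a few lines of Python. This is only an illustration of the information-theoretic idea described above, not code from any logo tool: the function name `logo_heights` and the `alphabet_size` parameter are hypothetical, and the small-sample correction used in published implementations is deliberately left out.

```python
import math
from collections import Counter


def logo_heights(sequences, alphabet_size=20):
    """Return, for each alignment column, a dict mapping symbol -> height in bits.

    Assumption: all sequences are already aligned and have the same length.
    No small-sample correction is applied (a simplification).
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")

    max_bits = math.log2(alphabet_size)  # maximum information per position
    columns = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        total = sum(counts.values())
        freqs = {sym: n / total for sym, n in counts.items()}
        # Shannon entropy of the observed symbol distribution at this position
        entropy = -sum(f * math.log2(f) for f in freqs.values())
        # Stack height: conservation level of the whole column
        information = max_bits - entropy
        # Symbol height: the symbol's share of the stack, by frequency
        columns.append({sym: f * information for sym, f in freqs.items()})
    return columns
```

For a nucleotide alignment one would call `logo_heights(sequences, alphabet_size=4)`. A fully conserved column then gets the maximum stack height of 2 bits, while a column in which all four nucleotides are equally frequent gets a height of zero.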

Recently we introduced iceLogo³, which takes the analysis and visualisation of consensus patterns in aligned peptide sequences to a new level. IceLogo is a free, open source Java application that can be downloaded at http://icelogo.googlecode.com/. Here we present an iceLogo web application and a SOAP web server. Instead of relying on information theory, iceLogo builds on probability theory. This theory and the iceLogo algorithm are explained in the manual. In short, the algorithm takes the experimental set that would normally be used to generate a sequence logo and compares it with a reference set. The experimental sequence set is generally a multiple sequence alignment of peptides that are expected to share sequence features. The reference set can be configured by the user, so that it can be tailored to approximate the expected background distribution as closely as possible. These two sets are used in a probability analysis, and the result is shown in complementary illustrations such as heat maps, amino acid parameter graphs and so-called iceLogos, all of which were developed to aid the analysis, visualisation and understanding of consensus sequences in an intuitive way.
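The comparison of an experimental set with a reference set can be illustrated with a minimal Python sketch. It is not the iceLogo implementation, whose exact algorithm is described in the manual. It assumes a simple per-position z-score in which the reference frequency is treated as the expected proportion and the standard deviation is taken from a binomial model. The names `position_frequencies` and `compare_sets` and the `z_cutoff` parameter are hypothetical.

```python
import math
from collections import Counter


def position_frequencies(sequences, pos):
    """Relative frequency of each residue at one alignment position."""
    counts = Counter(seq[pos] for seq in sequences)
    total = sum(counts.values())
    return {res: n / total for res, n in counts.items()}


def compare_sets(experimental, reference, z_cutoff=1.96):
    """Per position, return {residue: percentage difference} for residues whose
    experimental frequency deviates from the reference frequency.

    Assumptions (not taken from the iceLogo manual): both sets are aligned to
    the same length, and significance is judged with a z-score based on the
    binomial standard deviation sqrt(p * (1 - p) / n).
    """
    n = len(experimental)
    length = len(experimental[0])
    results = []
    for pos in range(length):
        exp = position_frequencies(experimental, pos)
        ref = position_frequencies(reference, pos)
        significant = {}
        for res in set(exp) | set(ref):
            p_exp = exp.get(res, 0.0)
            p_ref = ref.get(res, 0.0)
            sd = math.sqrt(p_ref * (1.0 - p_ref) / n)
            if sd == 0.0:
                continue  # reference frequency is 0 or 1: z-score undefined
            z = (p_exp - p_ref) / sd
            if abs(z) >= z_cutoff:
                # Positive: overrepresented; negative: underrepresented
                significant[res] = (p_exp - p_ref) * 100.0
        results.append(significant)
    return results
```

Because the difference is signed, residues that are underrepresented in the experimental set appear with negative values, which is the kind of information a classic sequence logo does not show.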

1. Hulo, N. et al. Nucleic Acids Res 36, D245-249 (2008).
2. Schneider, T. D. & Stephens, R. M. Nucleic Acids Res 18, 6097-6100 (1990).
3. Colaert, N. et al. Nature Methods 6, 786-787 (2009).

Reference

If you use the iceLogo web application or the iceLogo stand-alone version, please cite the iceLogo publication:
Colaert, N. et al. Nature Methods 6, 786-787 (2009).

Acquiring iceLogo

A stand-alone Java application that can generate iceLogos can be found here. The iceLogo server and SOAP client examples can be found here.

Computational proteomics, VIB, University Ghent