7.3.2 conservation::Statistics Module

SYNOPSIS

use conservation::Statistics;

#create a statistics object, using the alignment object

$stats = conservation::Statistics->new(-alnObj => $para->{'-alnObj'});

#another form of creation, using file directly

$stats = conservation::Statistics->new(-file => 'filename', -format => 'alignment_format');

#get the unweighted frequencies for residues in the alignment columns

$uwFreqs = $stats->uwFrequencies();

#get the independent count based frequencies for the residues in the alignment

$icFreqs = $stats->independentcounts();

#get the Henikoff and Henikoff (1994) based frequencies for the residues in the alignment

$wFreqs = $stats->wFrequencies();

#get the distribution of each amino acid in the whole alignment using the unweighted scheme

$uwTFreqs = $stats->uwTotalFrequency();

#get the distribution of each amino acid in the whole alignment based on the ind. count scheme

$icTFreqs = $stats->icTotalFrequency();

#get the distribution of each amino acid in the whole alignment based on the Henikoff and Henikoff (1994) scheme

$wTFreqs = $stats->wTotalFrequency();

DESCRIPTION

The Statistics class provides the basic statistics needed to calculate the various conservation measures. It relies directly on the conservation::AlnWrapper object, which makes sub-parts of the alignment available, and on the conservation::Weights object, which supplies the calculated weights of the sequences in the alignment as well as the independent counts of the amino acids in each alignment column.

METHODS

Title: new

Usage: $stats = conservation::Statistics->new(-alnObj => alignment_object);

#another form of creation, using file directly

$stats = conservation::Statistics->new(-file => 'filename', -format => 'alignment_format');

Function: Creates a new conservation::Statistics object.

Returns: conservation::Statistics object

Args: alignment_object - Bio::Align::AlignI compliant object.

filename - Name of the file containing the alignment.

alignment_format - Format of the alignment in the file. BioPerl will try to guess the alignment format.

Title: gapChar

Usage: $gpchar = $stats->gapChar();

Function: Returns the character used to represent gaps in the current alignment.

Returns: Character

Args:

Title: getBGFreqs

Usage: $bgfreq = $stats->getBGFreqs();

Function: Returns the background frequencies of the amino acids, pre-calculated from the amino acids in SwissProt.

Returns: Reference to a hash in which the amino acids are the keys and the values are the frequencies of each amino acid in SwissProt.

Args:

Title: uwFrequencies

Usage: $uwFreq = $stats->uwFrequencies();

Function: Calculates and returns the unweighted frequency of each amino acid in each column of the alignment.

Returns: Reference to a hash of hashes. The outer hash is keyed by column index in the alignment; each value is a reference to an inner hash whose keys are the amino acids in that column and whose values are the frequencies of those amino acids.

Args:
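The nested return structure is easier to see on a toy alignment. The following is a minimal standalone sketch, not the module's own code: uw_frequencies_sketch is a hypothetical helper, the alignment is passed as plain strings rather than an alignment object, column indices are assumed to start at 0, and gap handling is left out.

```perl
use strict;
use warnings;

# Hypothetical stand-in for uwFrequencies(): takes a reference to an array of
# equal-length aligned sequences and returns { column => { residue => frequency } }.
sub uw_frequencies_sketch {
    my ($seqs) = @_;
    my $nseq = scalar @$seqs;
    my %freqs;
    for my $seq (@$seqs) {
        my @residues = split //, $seq;
        for my $col (0 .. $#residues) {
            # Unweighted scheme: every sequence contributes 1/N to its residue.
            $freqs{$col}{ $residues[$col] } += 1 / $nseq;
        }
    }
    return \%freqs;
}

my $uw = uw_frequencies_sketch([ 'ACD', 'ACE', 'AGE', 'TCE' ]);

# Walk the hash of hashes: outer keys are columns, inner keys are residues.
for my $col (sort { $a <=> $b } keys %$uw) {
    for my $aa (sort keys %{ $uw->{$col} }) {
        printf "column %d  %s  %.2f\n", $col, $aa, $uw->{$col}{$aa};
    }
}
```

The same traversal pattern applies to any method here that returns a hash keyed by column with an inner hash keyed by amino acid.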

Title: uwPositionalAAFreq

Usage: $uwCount = $stats->uwPositionalAAFreq();

Function: Calculates the unweighted count of each amino acid at each position.

Returns: Reference to a hash of hashes. The outer hash is keyed by column index in the alignment; each value is a reference to an inner hash whose keys are the amino acids in that column and whose values are the counts of those amino acids.

Args:

Title: uwTotalPositionalFreq

Usage: $uwCountSum = $stats->uwTotalPositionalFreq();

Function: Calculates the sum of the unweighted counts for each position.

Returns: A hash reference, with the keys being column indices and the values the sum of the unweighted counts.

Args:

Title: independentcounts

Usage: $indCounts = $stats->independentcounts();

Function: Calculates and returns the independent counts for the entire alignment.

Returns: A hash reference, with the keys being column indices and each value being a reference to another hash, which has the amino acids in the column as keys and their estimated independent counts as values.

Args:

Title: wFrequencies

Usage: $wFreq = $stats->wFrequencies();

Function: Calculates and returns the weighted frequencies for the whole alignment.

Returns: A hash reference, with the keys being column indices and each value being a reference to another hash, which has the amino acids in the column as keys and their weighted frequencies as values.

Args:
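For orientation, the weighted frequencies can be illustrated with the position-based sequence weights of Henikoff and Henikoff (1994), in which each sequence receives, per column, 1/(k*n), where k is the number of distinct residue types in the column and n is the number of sequences sharing its residue. The sketch below is an assumption about how such a scheme can be written, not the code of conservation::Weights: both helper names are hypothetical, the weights are normalised to sum to 1, and gaps are not treated specially.

```perl
use strict;
use warnings;

# Hypothetical helper: position-based weights, normalised to sum to 1.
sub hh94_weights_sketch {
    my ($seqs) = @_;
    my @weights = (0) x scalar(@$seqs);
    my $ncol = length $seqs->[0];
    for my $col (0 .. $ncol - 1) {
        my %count;
        $count{ substr($_, $col, 1) }++ for @$seqs;
        my $k = scalar keys %count;    # distinct residue types in this column
        for my $i (0 .. $#$seqs) {
            my $aa = substr($seqs->[$i], $col, 1);
            $weights[$i] += 1 / ($k * $count{$aa});
        }
    }
    my $total = 0;
    $total += $_ for @weights;
    return [ map { $_ / $total } @weights ];
}

# Hypothetical stand-in for wFrequencies(): each sequence adds its weight
# (instead of 1/N) to the residue it carries in every column.
sub w_frequencies_sketch {
    my ($seqs) = @_;
    my $w = hh94_weights_sketch($seqs);
    my %freqs;
    for my $i (0 .. $#$seqs) {
        my @residues = split //, $seqs->[$i];
        $freqs{$_}{ $residues[$_] } += $w->[$i] for 0 .. $#residues;
    }
    return \%freqs;
}

my @aln     = ('ACD', 'ACE', 'AGE', 'TCE');
my $weights = hh94_weights_sketch(\@aln);
my $wf      = w_frequencies_sketch(\@aln);
printf "weight of sequence %d: %.4f\n", $_, $weights->[$_] for 0 .. $#aln;
```

In this toy alignment the second sequence shares its residue with two others in every column, so it ends up with the smallest weight; that down-weighting of redundant sequences is the point of the scheme.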

Title: wPositionalAAFreq

Usage: $wCounts = $stats->wPositionalAAFreq();

Function: Calculates the weighted count of each amino acid at each position.

Returns: Reference to a hash of hashes. The outer hash is keyed by column index in the alignment; each value is a reference to an inner hash whose keys are the amino acids in that column and whose values are the weighted counts of those amino acids.

Args:

Title: wTotalPositionalFreq

Usage: $wCountSum = $stats->wTotalPositionalFreq();

Function: Calculates the sum of the weighted counts for each position.

Returns: A hash reference, with the keys being column indices and the values the sum of the weighted counts.

Args:

Title: uwTotalFrequency

Usage: $uwTFreq = $stats->uwTotalFrequency();

Function: Calculates the unweighted frequency of each amino acid in the whole alignment.

Returns: A hash reference, with the keys being amino acids and the values the associated alignment-wide frequencies.

Args:
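Unlike the per-column methods, the alignment-wide frequency pools residues over all columns. A minimal sketch of that idea follows; uw_total_frequency_sketch is a hypothetical helper, it assumes a non-empty alignment given as plain strings, and it makes no assumption about how the module treats gap characters (the toy input has none).

```perl
use strict;
use warnings;

# Hypothetical stand-in for uwTotalFrequency(): returns { residue => frequency }
# computed over every residue in the alignment, regardless of column.
sub uw_total_frequency_sketch {
    my ($seqs) = @_;
    my %count;
    my $total = 0;
    for my $seq (@$seqs) {
        for my $aa (split //, $seq) {
            $count{$aa}++;
            $total++;
        }
    }
    my %freq = map { $_ => $count{$_} / $total } keys %count;
    return \%freq;
}

my $tot = uw_total_frequency_sketch([ 'ACD', 'ACE', 'AGE', 'TCE' ]);
printf "%s  %.3f\n", $_, $tot->{$_} for sort keys %$tot;
```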

Title: icTotalFrequency

Usage: $icTFreq = $stats->icTotalFrequency();

Function: Calculates the frequency of each amino acid in the whole alignment based on the independent count scheme.

Returns: A hash reference, with the keys being amino acids and the values the associated alignment-wide ind. count frequencies.

Args:

Title: wTotalFrequency

Usage: $wTFreq = $stats->wTotalFrequency();

Function: Calculates the weighted frequency of each amino acid in the whole alignment.

Returns: A hash reference, with the keys being amino acids and the values the associated alignment-wide weighted frequencies.

Args:

Title: getCutoffIndices

Usage: $cutoffInd = $stats->getCutoffIndices(-cutoff => cutoff_value, -method => 'method_value');

Function: Returns the indices of those columns whose percentage of gaps is less than the specified cutoff.

Returns: A hash reference containing as keys only the indices of the columns to be retained, with the values set to 1.

Args: cutoff_value - A number between 0 and 1 specifying the limit above which a column's percentage of gaps causes the column's conservation score to be discarded.

method_value - Either 'hh94' (for Henikoff and Henikoff (1994) weights) or 'indcount'; it may also be left out (the unweighted scheme is used for indcount and unweighted frequencies).
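The gap filter can be sketched on a toy alignment as follows. This is only an illustration under stated assumptions: both helper names are hypothetical, only the unweighted case is shown (the 'hh94' option is not modelled), the gap character is passed in explicitly, and the strict "less than" comparison follows the Function description above.

```perl
use strict;
use warnings;

# Hypothetical helper: fraction of gap characters in each column (unweighted).
sub percentage_gaps_sketch {
    my ($seqs, $gap) = @_;
    my $nseq = scalar @$seqs;
    my %pgaps;
    for my $col (0 .. length($seqs->[0]) - 1) {
        # grep in scalar context returns the number of matching sequences.
        my $ngaps = grep { substr($_, $col, 1) eq $gap } @$seqs;
        $pgaps{$col} = $ngaps / $nseq;
    }
    return \%pgaps;
}

# Hypothetical stand-in for getCutoffIndices(): keep only the columns whose
# gap fraction is below the cutoff, as { column => 1 }.
sub cutoff_indices_sketch {
    my ($seqs, $cutoff, $gap) = @_;
    my $pgaps = percentage_gaps_sketch($seqs, $gap);
    my %keep  = map { $_ => 1 } grep { $pgaps->{$_} < $cutoff } keys %$pgaps;
    return \%keep;
}

my $keep = cutoff_indices_sketch([ 'AC-D', 'A--E', 'AG-E', 'TC-E' ], 0.5, '-');
print join(',', sort { $a <=> $b } keys %$keep), "\n";    # prints "0,1,3"
```

Returning a hash with values set to 1 makes the later test "is this column retained?" a simple lookup such as `$keep->{$col}`.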

Title: percentageGaps

Usage: $pgaps = $stats->percentageGaps(-method => 'method_value');

Function: Calculates the percentage of gaps in each column of the alignment.

Returns: A hash reference containing as keys the index of each column and as values the percentage of gaps in that column.

Args: method_value - Either 'hh94' (for Henikoff and Henikoff (1994) weights) or 'indcount'; it may also be left out (the unweighted scheme is used for indcount and unweighted frequencies).

Title: maxF

Usage: $mFreq = $stats->maxF(-freq => $frequencies);

Function: Gets the residues having the maximum frequency in a column.

Returns: A reference to an array containing the residue(s) having the maximum frequency.

Args: frequencies - A hash ref containing the residues in a column and their associated frequencies.
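Because several residues can share the top frequency, the result is an array rather than a single residue. A minimal sketch of that behaviour, assuming exact numeric equality decides ties and using the hypothetical helper name max_f_sketch (the sorted output order is also an assumption made here for reproducibility):

```perl
use strict;
use warnings;

# Hypothetical stand-in for maxF(): takes { residue => frequency } for one
# column and returns an array reference of the residue(s) with the top value.
sub max_f_sketch {
    my ($freq) = @_;
    my $max;
    for my $f (values %$freq) {
        $max = $f if !defined $max || $f > $max;
    }
    # An empty input hash yields an empty list, so $max is never compared undefined.
    my @best = grep { $freq->{$_} == $max } keys %$freq;
    return [ sort @best ];
}

my $top = max_f_sketch({ A => 0.4, C => 0.4, D => 0.2 });
print join(',', @$top), "\n";    # prints "A,C"
```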

DEPENDENCIES
