Continuation analysis tools from pAss, a gene-centric assembly tool, but written in Python

etheleon/pAss2

Introduction

pAss2 continues the analysis tools from pAss, a gene-centric assembly tool, but is written in Python.

Shannon Entropy

We calculate the Shannon entropy, in bits, for each base position (column) of the alignment:

import math

def _shannon_entropy(self, list_input):
    """Return the Shannon entropy (in bits) of one alignment column."""
    unique_base = set(list_input)          # unique bases in the column
    # Note: gap characters ("-") are counted like any other symbol.
    M = len(list_input)                    # number of residues in the column
    entropy_list = []
    for base in unique_base:
        n_i = list_input.count(base)       # number of residues of type i
        P_i = n_i / float(M)               # relative frequency of type i
        entropy_list.append(P_i * math.log(P_i, 2))
    sh_entropy = -sum(entropy_list)
    return sh_entropy
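
To show how the per-column calculation might be applied to a whole alignment, here is a minimal sketch. The function names `column_entropy` and `msa_entropy` and the toy alignment are assumptions for illustration, not part of pAss2; the sketch also assumes all aligned sequences have the same length.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (bits) of one alignment column; gaps count as a symbol."""
    total = len(column)
    counts = Counter(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def msa_entropy(sequences):
    """Entropy of every column, assuming equal-length aligned sequences."""
    # zip(*sequences) transposes rows (sequences) into columns (positions)
    return [column_entropy(col) for col in zip(*sequences)]

# Toy alignment with made-up data (hypothetical, for illustration only)
alignment = ["ACGT", "ACGA", "ATGC", "AT-G"]
profile = msa_entropy(alignment)
```

A fully conserved column scores 0 bits, and a column with four equally frequent symbols scores 2 bits, so peaks in `profile` mark the most diverse positions.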

Shannon entropy across the max diversity region of a given KO (figure: shannonplot)

Shannon entropy across the max diversity region of 31 single-copy genes (figure: shannonPlotSCG)

Spanning Analysis

We identify contigs that span the whole length of the max diversity region (MDR); spanning is one of the criteria.
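
The spanning test can be sketched as a simple interval check. This is an illustration only: the function `spans_mdr`, the inclusive alignment coordinates, and the example contigs are assumptions, not the repository's actual implementation.

```python
def spans_mdr(contig_start, contig_end, mdr_start, mdr_end):
    """True if the contig covers every position of the MDR.

    Coordinates are assumed to be inclusive positions on the same alignment.
    """
    return contig_start <= mdr_start and contig_end >= mdr_end

# Hypothetical contigs as (start, end) pairs and a hypothetical MDR
contigs = {"contig_1": (10, 480), "contig_2": (150, 300), "contig_3": (90, 350)}
mdr = (100, 340)

spanning = [name for name, (start, end) in contigs.items()
            if spans_mdr(start, end, *mdr)]
```

Here `contig_2` starts inside the MDR and ends before it finishes, so it is excluded from `spanning`.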

The following plot shows the lengths of the contig ends extending before and after the MDR for 31 single-copy genes (figure: ribbon)
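
The quantities in such a plot could be derived as below. This is a hedged sketch: `tail_lengths` and the coordinate convention (positions on a shared alignment) are assumptions made for illustration, not taken from the pAss2 code.

```python
def tail_lengths(contig_start, contig_end, mdr_start, mdr_end):
    """Return (before, after): contig bases lying before and after the MDR.

    A tail is reported as 0 when the contig does not extend past that MDR edge.
    """
    before = max(0, mdr_start - contig_start)
    after = max(0, contig_end - mdr_end)
    return before, after

# Hypothetical spanning contig (10..480) against a hypothetical MDR (100..340)
before, after = tail_lengths(10, 480, 100, 340)
```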
