Paper   IPM / Biological / 13164
School of Biological Sciences
  Title:   Assignment Of Protein Sequences To Protein Family Profiles Using Spatial Statistics.
1.  Vahid
2.  H. Pezeshk.
3.  M. Sadeghi.
4.  C. Eslahchi.
  Status:   Published
  Journal: MATCH Communications in Mathematical and in Computer Chemistry
  No.:  1
  Vol.:  69
  Year:  2013
  Pages:   7-24
  Supported by:  IPM
A central problem in genomics is to determine the functions of newly discovered proteins from the information contained in their amino acid sequences. In this research we introduce a novel spatial association on a regular lattice for assigning a protein sequence to a protein family. Our model assumes that, for each residue at any position in a sequence, not only the adjacent residues but also the residues of closer homologs carry information. To this end, we model the observations with autocorrelated errors on a rectangular grid and use the information in the left, right, top and bottom residues of each amino acid at any position in a multiple sequence alignment (MSA) of the query sequence with the members of each family. Spatial statistics are applied to analyze these observations, and the classification problem is solved by computing the probability that the query sequence belongs to each protein family; the query is assigned to the family whose MSA yields the highest probability. Using actual data, we propose the application of spatial prediction to the assignment of protein sequences to protein profiles and assess the performance of the model. Following the spatial associations on a regular lattice, we use the top ten profiles in the Pfam database that are very different from each other to analyze the amino acid sequences in a profile. The results show that in all cases the protein sequences are assigned correctly to the corresponding protein profiles.
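The lattice view described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `lattice_neighbors` and `assign_family`, the toy alignment, and the per-family probabilities are all hypothetical, and the spatial model that would actually produce those probabilities is not reproduced here. The sketch only shows how the left, right, top and bottom residues of a cell in an MSA grid can be looked up, and how a query is assigned to the family with the highest probability.

```python
from typing import Dict, List, Optional


def lattice_neighbors(msa: List[str], row: int, col: int) -> Dict[str, Optional[str]]:
    """Return the left, right, top and bottom residues of cell (row, col).

    The MSA is treated as a rectangular grid: rows are aligned sequences,
    columns are alignment positions. Neighbors outside the grid are None.
    """
    n_rows, n_cols = len(msa), len(msa[0])

    def cell(r: int, c: int) -> Optional[str]:
        inside = 0 <= r < n_rows and 0 <= c < n_cols
        return msa[r][c] if inside else None

    return {
        "left": cell(row, col - 1),
        "right": cell(row, col + 1),
        "top": cell(row - 1, col),
        "bottom": cell(row + 1, col),
    }


def assign_family(probabilities: Dict[str, float]) -> str:
    """Pick the family whose MSA gave the highest probability for the query."""
    return max(probabilities, key=probabilities.get)


# Toy alignment (hypothetical data): three aligned sequences, six columns.
toy_msa = ["MKV-LA", "MKVALA", "MRV-LG"]
neighbors = lattice_neighbors(toy_msa, row=1, col=3)

# Hypothetical per-family probabilities, standing in for the model's output.
best = assign_family({"family_A": 0.12, "family_B": 0.81, "family_C": 0.07})
```

Returning `None` for off-grid neighbors is one possible way to handle the border of the lattice; the treatment of edge cells in the published model is not stated in the abstract.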