About Bioinformatics
Bioinformatics or computational biology is the use of techniques from applied mathematics, informatics, statistics, and computer science to solve biological problems. Research in computational biology often overlaps with systems biology. Major research efforts in the field include sequence alignment, gene finding, genome assembly, protein structure alignment, protein structure prediction, prediction of gene expression and protein-protein interactions, and the modeling of evolution. The terms bioinformatics and computational biology are often used interchangeably, although the latter typically focuses on algorithm development and specific computational methods. (In the biology-mathematics-computer science triad, bioinformatics will intimately involve all three components while computational biology will focus on biology and mathematics.) Due to interest from computer scientists and mathematicians and the popularity of computational techniques in the field of genomics, it is commonly referred to as computational biology; a more accurate term is computational genomics. There are also lesser known but equally important areas of computational biochemistry and computational biophysics, that are also a part of computational biology. A common thread in projects in bioinformatics and computational genomics is the use of mathematical tools to extract useful information from noisy data produced by high-throughput biological techniques. (The field of data mining overlaps with computational biology in this regard.) Representative problems in computational biology include the assembly of high-quality DNA sequences from fragmentary "shotgun" DNA sequencing, and the prediction of gene regulation with data from mRNA microarrays or mass spectrometry.
May. 21st, 2007 @ 02:31 pm Genome Annotation Paper
In other news, my genome annotation paper is progressing well; I am still in talks with the JGI about when and how we publish it.

About this Entry
Broiler Chicken Protein Gel, Aspergillus Niger, DNA, Proteomics and Functional Genomics Group
May. 21st, 2007 @ 02:23 pm PMF Consensus Profiles For Cross Species Proteomics
Current Location: Dover St, Manchester, UK
Current Mood: cheerful
Well, after collecting MALDI-TOF spectra last week from my "Zoo Gels" (1D SDS-PAGE gel electrophoresis of "Rat", "Chinchilla", "Vole", "Pig", "Sheep", and "Goat" muscle protein samples), I have begun compiling the results into consensus spectra.

Looks like I can only really do it for "Creatine Kinase" at this stage because I didn't get many Mascot protein identifications. That is mainly because not many proteins have been sequenced for most of these species, and because the samples were quite old, so I'm not sure that the spectra are of any usable quality.

Anyway, it does seem to work, which could mean that I will be over in Liverpool collecting lots of Peptide Mass Fingerprint data in the near future.
May. 21st, 2007 @ 02:13 pm Joint BSPR / EBI Proteomics Conference 2007
Current Location: Dover St, Manchester, UK
Current Mood: busy
I have decided to attend this year's Joint BSPR / EBI Proteomics Conference 2007 in Hinxton, Cambridge (not a massively exciting location, but lots of interesting speakers).

Ruedi Aebersold is speaking which should be very interesting as I referenced a lot of his work in the review paper I wrote for CCHTS.



Rob will fund me, but I could do with finding some travel grants from somewhere...
May. 17th, 2007 @ 10:52 am Update
Current Location: Brownlow Hill, Liverpool
Current Mood: busy
Right, I really need to start using this account more and try to keep up to date with my work. It will come in very useful for thesis writing.

So, in the world of Proteome Informatics, I have a review paper coming out sometime soon in the journal Combinatorial Chemistry and High Throughput Screening (CCHTS). It is not a massively prevalent journal, but it's my first publication, so it's just the start. The review itself is quite interesting, providing a good introduction to the field of proteome informatics as well as discussing some of the very latest developments to have come about (namely proteotypic peptide prediction, missed cleavage prediction and new peptide scoring systems).

In other news, I am writing a research paper about using proteomics data to annotate genomes, and have a new project building consensus spectral libraries.

Anyway, got to run; I am booked onto the MALDI-TOF now to get some more PMF data.
Mar. 27th, 2006 @ 12:51 pm GENOMES to SYSTEMS Manchester 2006
Well, I spent most of last week at the Genomes to Systems conference in the GMEX here in Manchester. There was a good turnout of about 900 people, quite a few of whom I knew as well: everyone from the ex-UMIST/Manchester Bioinformatics, the guys from the Protein Functions Group in Liverpool, a few people from my Post Genomes MSc course, the guys from GAPSIA and even a few of my colleagues from AstraZeneca.

In general there were a lot of interesting talks. As always happens at these events, a lot of the talks went a bit off topic, but there was a lot of choice in what I went to see.

All in all, 'twas a great seminar; I look forward to its next incarnation in a few years' time.
Mar. 21st, 2006 @ 01:48 pm Relative protein sequence shannon entropy
Current Mood: busy
So my 6 sets of protein profiles are almost built, after fighting with the cluster to get it to run everything. (I have clustered proteins from two different protein sequence datasets, one containing 5 yeast species and one containing 20 vertebrate species including human. I have used three different methods to cluster each of the datasets: PSI-BLAST, MCL TRIBE and PFAM. Hence 6 sets of profiles. Each cluster is then aligned using MUSCLE.)

I am now looking at how to compare different protein clusters from a peptide (digested protein fragment) point of view, and I have decided to calculate 3 statistics for each profile.
For each cluster I will select a QUERY protein which will act as the anchor for all my calculations.
(I will probably use the Aspergillus fumigatus proteins in the yeast data and the Human proteins in the Chordata data.)

1. Split each protein in a cluster on lysine (K) and arginine (R) amino acids (as if being digested by the proteolytic enzyme trypsin).
This will divide each protein into a number of peptides; I will then calculate the standard deviation of the peptide count relative to the Query Protein.

2. I will then calculate the average RELATIVE SEQUENCE ENTROPY for each peptide. This is a calculation based on information theory which can be used to score how well each position in a sequence is conserved between all the proteins in a cluster. I have written a script to build a matrix of amino acid frequencies by looking at their occurrence in all the protein sequences in the Swiss-Prot database (a massive number of sequences). This matrix is then used to adjust normal Shannon entropy (minus the sum, over the amino acids, of each position frequency times its log base 2) by dividing each position frequency inside the log by the natural frequency of that amino acid:

H = - SUM Pi * log2(Pi / Qi)

where Pi is the fraction of amino acid i at the current position and Qi is the natural frequency of amino acid i,
SUMed over the 20 amino acids.

if H > 2 the position is VARIABLE and will be marked as a V
if 1 < H <= 2 the position is CONSERVED and will be marked as a C
if H <= 1 the position is HIGHLY CONSERVED and will be marked as an H
(Litwin et al, 1992)

3. I will also calculate the relative sequence entropy using a fixed window size, rather than the peptides; the window will move along the protein sequence, advancing by a set step size.
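The three statistics above could be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions, not the script described in this post: every function name is made up for illustration, the trypsin rule is the simple "cut after every K or R" version with no proline exception, the "standard deviation from the Query Protein" is read as a root-mean-square deviation around the query's peptide count, the entropy formula is read as Pi * log2(Pi / Qi) with the leading minus sign as written, and the uniform background is a placeholder rather than real Swiss-Prot frequencies. Whether the thresholds of 1 and 2 carry over unchanged to the background-adjusted score has not been checked here.

```python
import math
import re
from collections import Counter


def tryptic_digest(sequence):
    """Split a protein after every K or R (simple rule, no proline exception)."""
    return re.findall(r"[^KR]*[KR]|[^KR]+$", sequence)


def peptide_count_deviation(cluster, query):
    """Root-mean-square deviation of peptide counts around the query's count
    (one possible reading of 'standard deviation from the Query Protein')."""
    n_query = len(tryptic_digest(query))
    squared = [(len(tryptic_digest(seq)) - n_query) ** 2 for seq in cluster]
    return math.sqrt(sum(squared) / len(squared))


def relative_entropy(column, background):
    """H = -SUM Pi * log2(Pi / Qi) for one alignment column; gaps ('-') are skipped."""
    residues = [aa for aa in column if aa != "-"]
    total = len(residues)
    h = 0.0
    for aa, count in Counter(residues).items():
        p = count / total
        h -= p * math.log2(p / background[aa])
    return h


def classify(h):
    """Mark a position H (highly conserved), C (conserved) or V (variable)."""
    if h <= 1:
        return "H"
    if h <= 2:
        return "C"
    return "V"


def windowed_means(values, window, step):
    """Mean of per-position scores in a fixed-size window moved along by 'step'."""
    return [
        sum(values[i:i + window]) / window
        for i in range(0, len(values) - window + 1, step)
    ]


# Toy aligned cluster (hypothetical sequences) and a placeholder uniform background.
cluster = ["MKTAR", "MKSAR", "MRTAK"]
background = {aa: 1 / 20 for aa in "ACDEFGHIKLMNPQRSTVWY"}

scores = [relative_entropy(column, background) for column in zip(*cluster)]
labels = "".join(classify(h) for h in scores)
smoothed = windowed_means(scores, window=3, step=1)

print(tryptic_digest(cluster[0]))
print(peptide_count_deviation(cluster, query=cluster[0]))
print(labels, smoothed)
```

Averaging `scores` over the positions that fall inside each peptide from `tryptic_digest` would give the per-peptide value in step 2, while `windowed_means` gives the fixed-window version in step 3.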

I have not yet thought of the best way to compare all this data, as I will have 3000+ clusters for each of the 6 profile sets... I think some scatter charts and histograms may be the way forward at that point.
Feb. 7th, 2006 @ 11:32 am Binary Genetics
Current Mood: contemplative
Current Music: Goteki - Fight the Saucermen
I was sitting getting frustrated at the damned Beowulf cluster (very cool specs: a 32-node Beowulf cluster of dual 2gig Opterons with lots of memory and storage) that doesn't work (probably because, very cheesily, it is called 'agent.smith').

Anyway, I was considering how the way that we currently play with nucleotides and genes is very similar to what it must have been like trying to program the very first computers using binary: very slow and low level. Which makes me think that the future is going to be genetic programming!! Computer science already models many of its ideas for neural networks and evolutionary computing on biological systems, so what if we did the reverse and applied the concepts of programming to genetics?

In a way our DNA is like a computer program, each gene being an object (each gene having its own properties and features, along with its own list of operations denoted by the promoter and regulatory sequences that different types of enzyme bind to). Which means that, in theory, by understanding the way in which genes work and how this affects the proteins they encode (and the function of said proteins), you could write a higher-level programming language based on codons, promoters and sequence words (short sequences of nucleotides that serve a purpose, e.g. a restriction (cutting) site), which could be considered the equivalent of assembly code. You could then create an even higher-level language which could be used to design and program simple evolving life forms on a computer with specific adaptations for a particular task, almost like they already do with the programmed bacteria. Add this to the recent steps forward in training brain cells for specific tasks (Rat Brains flying F11 jets) and we pretty much have everything you need for a totally organic-based technology that is massively versatile, from organic transport (living spaceships) to an organic house. 'Tis so very crazy, and it fills me with renewed energy to battle once again with agent.smith and finally submit my alignment MATRIX.....!
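To make the gene-as-object analogy a little more concrete, here is a toy Python sketch. It is purely illustrative and everything in it is an assumption: the `Gene` class, its fields and its method names are invented for this example, the sequences are made up, and real gene regulation is far richer than two attributes and two methods. The only biology it relies on is that mRNA reads like the coding strand with U in place of T, and that GAATTC is the EcoRI restriction site.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    """A gene viewed as an object: properties plus a few 'operations'."""
    name: str
    promoter: str                # sequence the transcription machinery binds to
    coding_sequence: str         # coding strand, written 5' to 3'
    regulatory_sites: list = field(default_factory=list)

    def transcribe(self):
        """Return the mRNA: the coding strand with T replaced by U."""
        return self.coding_sequence.replace("T", "U")

    def find_word(self, word):
        """Return every start index of a sequence 'word' in the coding sequence."""
        positions = []
        start = self.coding_sequence.find(word)
        while start != -1:
            positions.append(start)
            start = self.coding_sequence.find(word, start + 1)
        return positions


# Hypothetical gene: an illustrative promoter and a tiny coding sequence
# that happens to contain one EcoRI site (GAATTC).
demo = Gene(name="demo", promoter="TATAAT", coding_sequence="ATGGAATTCTAA")
print(demo.transcribe())
print(demo.find_word("GAATTC"))
```

In this picture the sequence words play the role of assembly-level instructions, and a higher-level language would compose objects like `Gene` rather than individual nucleotides.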
Jan. 25th, 2006 @ 11:14 am Insilico Mass Spectrometry
I have recently had the IT people install a great Perl module for proteomics for me:

InsilicoSpectro

I have made use of the digest function so far and it is proving to be very useful.
It did take a lot of messing about with ENV variables to get it fully working, though, and I am getting an error for one of my sequences which I haven't quite figured out yet.
Jan. 24th, 2006 @ 02:41 pm Genomes to Systems Conference
Current Mood: busy
Important conference of the year taking place in Manchester, presented by "The Consortium for Post-Genome Science":

GENOMES TO SYSTEMS CONFERENCE



Am already registered, don't think I will submit an abstract though.
Jan. 24th, 2006 @ 02:16 pm Current Work
Well, let's start with a quick summary of my current PhD work, which is split between Liverpool University and Manchester University.

I am researching computational solutions for cross species protein identification in proteomics.

I am currently looking at the use of a protein/peptide profile database to improve cross species peptide mass fingerprinting. I have begun testing different clustering algorithms to build these profiles:

PSI-BLAST
MCL TRIBE
PFAM DB

I am testing these three programs on two sample data sets:

the first is a yeast / filamentous fungi dataset
the second is a general chordate dataset
Jan. 24th, 2006 @ 02:02 pm New Journal Blog
Current Mood: busy
This journal is meant for bioinformatics and work-related posts.

A place for Ideas, Notes and Information.