Monday, 19 October 2015

Novor -- De novo peptide sequencing that's faster than your data acquisition

De novo sequencing has traditionally been a pain in the neck. If you are looking at low resolution MS/MS spectra there are often tens, maybe hundreds, of possible sequences out there to explain a given fragmentation spectrum. High resolution accurate mass MS/MS spectra with fragments that are often only a few hundred parts per billion off in mass accuracy? That improves things a LOT! And that's why we're seeing things like de novo and wide mass accuracy windows making a resurgence.
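To put "a few hundred parts per billion" in perspective, here is a minimal Python sketch of the usual relative mass error arithmetic. The function names and the example m/z values are made up for illustration; they don't come from Novor or any particular software.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million (1 ppm = 1000 ppb)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def tolerance_window(mz, ppm):
    """Lowest and highest m/z that fall within +/- ppm of mz."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


# Hypothetical fragment: theoretical m/z 1000.0000, observed at 1000.0003.
error = ppm_error(1000.0003, 1000.0)
print(f"mass error: {error:.2f} ppm ({error * 1000:.0f} ppb)")

# How wide is a +/- 0.3 ppm window around a hypothetical fragment at m/z 500?
low, high = tolerance_window(500.0, 0.3)
print(f"window width: {high - low:.6f} m/z units")
```

The narrower that window, the fewer candidate fragment assignments fit inside it, which is the whole reason high resolution data makes de novo less painful.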

You know what would be useful? A free (for academics...) super fast new de novo algorithm! And that is what Novor is. Novor makes some intelligent assumptions based on biological data (from spectral libraries) and can take a shortcut or two that other de novo algorithms can't. What you end up with is a crazy amount of sequencing speed without a loss in match certainty, since it's based on real data.

Novor has already been picked up in the new version of the awesome DeNovoGUI and is about to (or has, I forget...) appear in the super cool PeptideShaker.

If you're an academic you should check this out!  If you aren't, you should check out their legal stuff or contact them to find out how you can use it.

You can read about Novor in this Open Access paper here.

And you can visit the RapidNovor website here.

Sunday, 18 October 2015

Happy birthday, SEQUEST!


Man, time flies when you're solving all the biological problems no one else can! SEQUEST is 20 years old now! To commemorate the fact that this awesome step forward for science will be old enough to have a drink with me next year, JASMS has a special issue focusing on and around this accomplishment.



You can check the issue out here. Some articles are open access.

Saturday, 17 October 2015

An improved TMT node for PD 1.4!


Pretty soon I'm going to be rambling on about all sorts of improvements in reporter ion quan that we'll be seeing in Proteome Discoverer 2.1.

In the meantime, however, here's a great new node for PD 1.4 that improves TMT quan for this software package.


How's it work? Well, it takes a bunch of things into account, including the reporter ion isotope distribution!  I can't wait to give it a shot. Unfortunately, I think I've let my PD 1.4 licenses expire and I've got to get on that....

You can download this node, along with installation and processing instructions, from pd-nodes.org.

Friday, 16 October 2015

Amino acid composition and effects of mutations


That's pretty clear, isn't it? Just kidding, I have NO idea what that is, but it's a figure from this paper, which does actually make a lot of sense.

Here is the gist of it: mutations are gonna happen. If they didn't, evolution, to a large degree, wouldn't happen either (hugely complicated higher organism silly reproductive procedures aside). What Sahand Hormoz is proposing here is that it is a mathematical certainty that evolutionary pressure on essential proteins and their structures works to minimize the nasty effects of mutations. That's a long sentence, I know.


Here, I've stolen the genetic coding table from this Wikipedia page. Say you've got an essential leucine at this essential location. That's great! 'Cause many point mutations can happen to that section of the DNA and the protein won't change at all! You can change UUA to UUG or CUA and you're still gonna have leucine! Now, if it goes to UAA...well, you've got a truncated protein...so maybe this wasn't the best example.
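If you want to poke at the leucine example yourself, here is a small Python sketch that builds the standard genetic code table and sorts every single-base change of a codon into synonymous (same amino acid), missense (different amino acid), or nonsense (stop codon). The function name is mine and this is just an illustration of the codon table lookup, not anything from the paper.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutations(codon):
    """Group all nine single-base changes of an RNA codon by their effect."""
    original = CODON_TABLE[codon]
    results = {"synonymous": [], "missense": [], "nonsense": []}
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a mutation
            mutant = codon[:pos] + base + codon[pos + 1:]
            amino_acid = CODON_TABLE[mutant]
            if amino_acid == original:
                kind = "synonymous"
            elif amino_acid == "*":
                kind = "nonsense"
            else:
                kind = "missense"
            results[kind].append(mutant)
    return results


for kind, codons in classify_point_mutations("UUA").items():
    print(kind, codons)
```

For UUA, the synonymous list holds the two changes mentioned above (UUG and CUA), and UAA shows up in the nonsense list alongside UGA.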

There are also changes where we switch amino acids and it isn't all that bad either. For example, it would be better to have a mutation that changes a polar amino acid with a small functional group to another similar one; changing it to one with a huge non-polar group would be a whole lot worse.

From the genetic perspective, this makes all sorts of sense. What this guy did was show that you can calculate it with that Maths stuff to show that this wasn't accidental at all. This was evolutionary pressure over billions of years to find the best possible way to protect the proteins that life would end without.  Really cool read, even if you don't know what any of the equations mean!

Hey, Maryland and D.C. (and surrounding areas!) Wanna come to the NIH for this awesome workshop?


Hey! I know this is self-serving, but enrollment for the NIH Proteome Discoverer 2.0 workshop number 2 is still kinda low. I don't think it's in danger of being cancelled, but if more attendees register then I'll be more confident we can have this thing.

The program is starting to take shape.  The advanced concepts in the afternoon will definitely include:
How to combine quantitative whole protein and quantitative PTM (probably phosphoproteomics) into a single report  AND
How to combine the results of multiple TMT and iTRAQ experiments.

We are still taking suggestions for other concepts to cover!

If you want to come, please register at this link:  https://www.surveymonkey.com/r/?sm=PenWchsGrQuQZQmLH6odMzQoGuh4cDU%2fWF5ilEAy6oI%3d

Thursday, 15 October 2015

MaxQuant summer school 2015 videos are all up!



It looks like all the MaxQuant videos are now available on YouTube:

These are the ones I've been able to find so far:

Basics 1 and 2

Protein quantification

Quantitative proteomics

From Discovery to Targeted Quan (with Brendan from Skyline)

Moving MS based quan to the clinic!

Label free quan

Recent developments in the Orbitrap (by Dr. Makarov!!!!!!)

Lysine acetylomics



An application interface for all those funny MS file types


All of our friends on the bioinformatics side of the proteomics world have been throwing out all these funny letters for years. They tend to start with an "m" and end with an "L" and have something random in the middle: mzML, mzXML, mzTab (no L! cheater!), mzIdentML, and on and on. On cursory examination these are all attempts to store our data with better efficiency, without the loss of data that we see when converting our data to MGF (where we lose almost all of our MS1 data!).

Problem is, some of us have used one or another of these things, and the public repositories may have cool data hidden in any of these formats.

This new program (definitely meant for the bioinformaticians out there who can code and stuff!) is called ms-data-core-api. It is an Application Programming Interface that should take care of all these formats for you. Adding this to your programs will allow you to pull data in from any of these sources and read it in a unified format, so you aren't all jumbled in your downstream processing.

You can read about it at BioCode's notes here.

And it can be downloaded at GitHub here.
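To make the MGF point from earlier concrete, here is a minimal Python sketch of an MGF reader. It has nothing to do with ms-data-core-api (which is its own library with its own interface); the function name and the example spectrum are made up. What it shows is what an MGF entry carries: a few KEY=VALUE lines and an MS/MS peak list between BEGIN IONS and END IONS. The precursor survives only as the PEPMASS line, so the MS1 scan itself is gone.

```python
EXAMPLE_MGF = """\
BEGIN IONS
TITLE=example spectrum (made up)
PEPMASS=500.2500 12000.0
CHARGE=2+
175.1190 350.0
288.2030 910.5
END IONS
"""


def parse_mgf(text):
    """Return a list of spectra, each with 'params' and 'peaks' entries."""
    spectra = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Header line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                current["params"][key] = value
            else:
                # Peak line: m/z, then intensity (0.0 if it is missing)
                parts = line.split()
                intensity = float(parts[1]) if len(parts) > 1 else 0.0
                current["peaks"].append((float(parts[0]), intensity))
    return spectra


for spectrum in parse_mgf(EXAMPLE_MGF):
    print(spectrum["params"]["PEPMASS"], len(spectrum["peaks"]), "peaks")
```

A real reader would need to handle more than this, which is exactly the kind of chore an API like the one above is meant to take off your hands.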