Tuesday, 29 December 2015

CPTAC shows high reproducibility in Orbitrap quan between systems AND methodologies!


Once in a while I run into someone who heard from someone else that Orbitraps aren't good for quantification.
...and I try really hard to not make this face...

Our good friends at CPTAC decided to make the ultimate comparison. Over 1,000 (one-thousand!!!) LC-MS/MS runs. From different mass spectrometers. From different institutes. With different quantification technologies. On xenografts! (That's a human tumor grown on a mouse. You don't get much more variable).

They compared iTRAQ quan with XIC-based label-free quan (peak area integration) and spectral counting. What did they find? I'll just quote it.

"If laboratories deploy different methodologies to analyze the differences between the same two complex samples, then they will assuredly see differences in the gene or protein lists produced by the two technologies. The degree of conformity observed in this study, however, was encouraging. When label-free data were analyzed by spectral counting rather than precursor intensity, the differences yielded a high degree of overlap. When iTRAQ rather than label-free methods are deployed, the differential genes were again quite similar. These overlaps suggest a degree of maturity in proteomic methods that has grown through years of development along multiple tracks.
At base, biologists need to know that differential proteomics technologies can produce meaningful results. Our assessment showed that biological pathway and network analysis is highly consistent across instruments."
Right?!? Ben's interpretation: We're still getting a subset of the data in something as complex as a human tumor. We can bias this subset by using completely different methodologies, but even on the most complex human samples and experiments, we're at a point where we are HIGHLY reproducible. And this is the global/fractionated stuff....

Monday, 28 December 2015

Protein carbamylation is a hallmark of aging - and how to detect it


A recent paper in PNAS makes the statement right in its title: "Protein Carbamylation is a Hallmark of Aging." You can find it here.

They find that you can almost assess the age of a mammal by looking at the degree of carbamylation in the proteins of that mammal. I'm not 100% awake yet, so it took me a minute or two to remember what carbamylation was and why it sets off a little alarm in my head. Then I found the image above. Most of the time when I think about carbamylation, it's 'cause it's a sample prep issue.

Here is a paper that discusses this modification. When I run Preview on a sample and it pulls up carbamylation as a modification to consider, I've always assumed it came from a protein prep in which either excessive urea was used, or urea was used and the prep was performed at too high a temperature. Turns out, it might be detecting old samples as well? Interesting thought, right?

Detection of this modification is very straightforward in any search engine. In PD you just need to activate the modification in the Administration --> Maintain chemical modifications tab.
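If you want to see what the search engine is actually looking for, here is a minimal Python sketch. It assumes the standard monoisotopic carbamyl delta of +43.005814 Da (CHNO) and the usual candidate sites of the peptide N-terminus and lysine side chains. The function names and the example peptide are made up for illustration and are not part of PD or any other search engine.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565      # H2O added to the residue sum for an intact peptide
PROTON = 1.007276      # mass of a proton, used for m/z
CARBAMYL = 43.005814   # monoisotopic mass added per carbamylation (+CHNO)


def peptide_mass(seq, n_carbamyl=0):
    """Neutral monoisotopic mass of a peptide carrying n_carbamyl carbamyl groups."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER + n_carbamyl * CARBAMYL


def mz(mass, charge):
    """m/z of a neutral mass at the given positive charge state."""
    return (mass + charge * PROTON) / charge


def max_carbamyl_sites(seq):
    """Candidate sites assumed here: the peptide N-terminus plus every lysine."""
    return 1 + seq.count("K")


if __name__ == "__main__":
    peptide = "PEPTIDEK"  # hypothetical example peptide
    for n in range(max_carbamyl_sites(peptide) + 1):
        m = peptide_mass(peptide, n)
        print(n, round(m, 4), round(mz(m, 2), 4))
```

Each carbamyl group moves a doubly charged precursor up by about 21.5 m/z, which is why it is cheap to add as a variable modification on K and the peptide N-terminus when a prep looks suspicious.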


With this valuable new information, I expect y'all to get on reversing this aging stuff 'cause the more I experience it the more I realize I'm NOT a fan of it.

Sunday, 27 December 2015

Pinnacle -- the best translational software I've ever seen.


I've been wanting to talk about this one for months!!! Unfortunately, I do have a day job and there are rules I have to follow to keep that day job, so I held my tongue until I found out I was finally allowed to talk about it this week.

At HUPO I got to see Pinnacle. Pinnacle is software specifically meant for all you translational people out there. I know, there is a ton of software out there, but I'm going to argue that you ought to demo this one if:

1) You have so many clinical samples (especially high resolution ones) that you can't process them in anything like a reasonable amount of time
2) You are doing label free quantification
3) You are doing data-independent analysis (DIA, pSMART, WiSIMDIA)
4) You just want to use a piece of software that is graphically pretty.

This software is fast. Sick fast. It-shouldn't-possibly-be-this-fast FAST.  Put in HUNDREDS of Q Exactive Raw files -- targeted, untargeted, DIA, whatever -- and watch it pull the data out in minutes.

Wonder what the data quality is like? Just look at color and shape of the icons on the left (click on the pic above to zoom in) and get a feel for the quality OR look to the right of the peptide sequence where you ACTUALLY SEE THE INTEGRATED PEAKS.  Sorry to shout, but how cool is that?  "Wow, that is crazy upregulated! Should I investigate it? Nope, that is obviously just a poor integration. Better readjust that integration right now". In real time. Without changing the settings and reprocessing the data. Just fixing that peak. Click, click, done.

Pinnacle has a bunch of other functions. It's a thorough software package and you purchase the modules that you need for your work. You can also download a free trial version here that lets you process one dataset and see what I'm talking about.

Saturday, 26 December 2015

Interesting, though somewhat morbid, article on elite scientists and progress


I'm not entirely sure what to think of this. Partially because I'm having a little bit of trouble wrapping my head around it. Maybe part of the difficulty is that the article is from the National Bureau of Economic Research. Which, Wikipedia tells me, is a real thing.

Anywho...you can read the article I found on Vox here.

And the abstract for the original article is here (there is a $5 charge to download the complete article)

Friday, 25 December 2015

Christmas Magic -- Multiply charged proteins ionized with no energy!


This is really interesting. What if you could just mix up your proteins, including the big ones, with your matrix compound(s) and then magically get multiply charged species into your mass spectrometer? No energy. They just grab some protons and go flying into the air? Well, it sounds like you could save a lot of energy on lasers....AND....maybe you could finally give up on that weird old TOF in the corner that can go to 100kDa (you know...the one that is 8 foot tall and has accuracy within 1kDa...or 2...)

Well, that appears to be exactly what happens. What?!!?  I know!

Check out this paper from Sarah Trimpin for more details. Hey, if nothing else, it has one of the single most amusing abstracts I've ever read.

And it's got this great chart!


Thursday, 24 December 2015

CaspDB - A database of caspase cleavage products!


Another tool to help find identifications for unmatched MS/MS spectra!  Caspases are proteins that hang around just to destroy other proteins. They are a critical component of apoptosis and normal cell maintenance, and if you believe the recent in silico protein cycle predictions -- they are active constantly. If my mix of proteins I just harvested is full of incomplete, complete, modified AND degraded proteins, then all these unmatched spectra start to make sense.

Caspases have specific substrates for degradation and a bunch of them have been worked out. CaspDB is a new online tool to help you work with this information. It is described in this new Open Access paper from Sonu Kumar et al.

While the paper is totally neat and all, you can go directly to check out this online tool here.  You'll quickly find out that this tool requires a good bit of pre-existing information before it is useful. Once you've got some data, you can use it to run through your protein of interest and different caspases to see if you've got stuff that makes sense. The paper goes forward to show how awesome these prediction tools are by going ahead and proving that a ton of their software predictions are totally true.

This is obviously a very powerful and interesting tool and this will generate some great data from the validation end. But first you need to get some observations....


...(how did we ever get anything done before this...????...)

Check out this thing!!!  It's called Pripper and, wait, we'll need this...


...to go WayBack to 2010....to this paper from Mirva Piippo et al. that describes Pripper. Pripper is a Java tool that will take any FASTA database you give it, perform in silico caspase cleavages on that database, and give you a new FASTA that has all the predicted caspase cleavage products.

If you're thinking "How can I trust a tool that is 5 whole years old?" Never fear, it has been updated multiple times (the version I just unzipped is time stamped from 2013). Oh. If you download Pripper here you might want to right click on the zip file, go to properties, click "unblock" then "apply" and THEN unzip it. Windows Defender on my PC blocked it as a threat.

Now you have a tool that will make you a predicted caspase cleavage FASTA that you can run against your samples. If it comes up with something really cool then you can go to the CaspDB and search those observations against their more advanced prediction models (and validated data!)
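To make the "FASTA in, cleavage-product FASTA out" idea concrete, here is a toy Python sketch. Big assumption: it uses a bare-bones "DxxD" motif and cuts after the second Asp, which is only a crude stand-in for real caspase specificity. It is NOT Pripper's prediction model or CaspDB's, and every function name, header tag, and the demo sequence is hypothetical.

```python
import re

# Assumption: simplified "DxxD" motif, cut right after the second Asp.
# The lookahead lets overlapping motifs all be found.
MOTIF = re.compile(r"(?=(D..D))")


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def cleavage_sites(seq):
    """0-based index of the first residue after each predicted cut."""
    return [m.start() + 4 for m in MOTIF.finditer(seq) if m.start() + 4 < len(seq)]


def cleavage_products(records, min_len=6):
    """Keep each intact protein and add its N- and C-terminal fragments per site."""
    out = []
    for header, seq in records:
        out.append((header, seq))
        for site in cleavage_sites(seq):
            for tag, frag in (("N", seq[:site]), ("C", seq[site:])):
                if len(frag) >= min_len:
                    out.append((f"{header}_casp{site}{tag}", frag))
    return out


def write_fasta(records):
    """Serialize (header, sequence) tuples back to FASTA text."""
    return "\n".join(f">{h}\n{s}" for h, s in records)


if __name__ == "__main__":
    demo = ">demo\nMKTAYDEVDGSLLKRPEPTIDE\n"  # made-up sequence with one DEVD site
    print(write_fasta(cleavage_products(read_fasta(demo))))
```

Keeping the intact entries alongside the fragments means the search can still match the uncleaved protein, and the fragment termini show up as new peptide ends that a semi-specific or no-enzyme search can pick up.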


Wednesday, 23 December 2015

What is a PFAM? And how do they deal with all this data?


Personally, I think the biologists and biochemists need to hurry up and annotate the function of every protein from every organism under every biological condition. Until they stop slacking and get that stuff done, we need to use some shortcuts to extract biological data from our peptide spectral matches. Fortunately, smart people have been working on this gap for us.

Gene Ontology (GO) is tricky stuff. If we don't know exactly what a gene does, can we infer what the heck it does from its similarity to genes we understand better?

More tricky, and way more biologically relevant? Protein Ontology (PO?)!  One way of getting this data is via PFAM (which you can access here).  I'll be honest. I didn't really know what this was for a long time. I just knew that it was an option in the Annotation node in Proteome Discoverer.  Cool, I have a new column that says that all this stuff that is upregulated shares a PFAM ID (actually, I made that part up. It's never that easy, is it?)

Turns out that the people making PFAM are working really hard making this data:
1) More accurate
2) More relevant
3) More current

As you can imagine, all of this is hard, but...

(holy cow)...

Can you imagine what the 3rd one is like these days?

The amount of sequencing information in databases is increasing EXPONENTIALLY and the current tools for creating PFAM information scale at a linear rate. It doesn't take a stolen GoogleImage to show that this is a problem, but...I'm nervously waiting for an important phone call...so...
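If you'd rather see the problem in numbers than in a stolen picture, here is a tiny Python sketch of exponential input versus linear throughput. Every number in it (starting size, doubling time, yearly capacity) is invented purely for illustration and is not taken from PFAM or from any sequence database.

```python
def exponential_growth(start, doubling_time_years, years):
    """Size of something that doubles every doubling_time_years."""
    return start * 2 ** (years / doubling_time_years)


def linear_capacity(start, added_per_year, years):
    """Total handled by a process that adds a fixed amount per year."""
    return start + added_per_year * years


def backlog_by_year(start, doubling_time_years, added_per_year, horizon):
    """Rows of (year, sequences, annotated, backlog) out to the horizon."""
    rows = []
    for year in range(horizon + 1):
        seqs = exponential_growth(start, doubling_time_years, year)
        done = linear_capacity(start, added_per_year, year)
        rows.append((year, seqs, done, max(0.0, seqs - done)))
    return rows


if __name__ == "__main__":
    # Hypothetical values: 100 units of sequence, doubling every 2 years,
    # with annotation capacity growing by 50 units per year.
    for year, seqs, done, backlog in backlog_by_year(100, 2, 50, 10):
        print(year, round(seqs), round(done), round(backlog))
```

However generous you make the yearly capacity, the doubling curve eventually passes it, and the backlog keeps widening from there.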


So, what do we do about it? Well, Robert Finn et al. say in this new Open Access paper that we fix the algorithms to deal with this glut of data. So they did.

When I clicked on this link in Twitter this morning, I honestly expected a dense paper that I would hardly be able to read and would likely not understand at all. I was pleasantly surprised to find that this team can seriously write and that I not only learned a lot about how PFAM works, but I also (think I) got a good understanding of their challenges and how their new algorithms power through them. Solid and interesting paper that makes me want to add this column to all of my processed data from now on!