Monday, 21 December 2015

PTMs in centromeres!


I had to dig deep in my brain and then finally just look at Google Images to remember what a centromere is and why it's important.  Hopefully the nice sketch I found above clarifies it for you as well. It's the region of the chromosome that holds the sister chromatids together, and it has its own set of proteins that assemble there. It's gonna be deeply involved in cell/chromosome division, sexual reproduction and probably all sorts of other things.

In this new paper from Aaron Bailey et al., in press (and currently open access) at MCP, the group looked at the post-translational modifications that can show up on these important proteins.

They started with a HeLa cell line that stably expressed an affinity-tagged centromere protein and then immunoprecipitated to get at their proteins of interest. Chemicals were used to arrest the cells at particular stages of mitosis. Multiple enzymes, including LysC and AspN, were used to get big chunks of the cleaned-up protein for effective PTM identification and localization.

What did they find?


Sunday, 20 December 2015

BetterExplained -- a great site for math concepts


I seem to have forgotten what little I ever knew about math. This site, BetterExplained, uses clever examples to either teach or remind you of what a math concept is.

Wednesday, 16 December 2015

Open Genomics Engine


Sorry, this is something I just stumbled on that I didn't want to forget about!  I lost the password to my EverNote account...but it does look super cool, right? If you're into that weird DRNA sequencing stuff, that is...

Monday, 14 December 2015

Use protein solubility to get around protein abundance issues in biofluids?


For biofluids, one of the biggest problems is the high abundance junk. "Junk" probably isn't the right word since evolution probably wouldn't have erred toward filling our fluids with albumin if it wasn't important, but...you know what I mean....

In an interesting take on this problem, Bollineni et al. tried a protein solubility approach. Rather than specifically depleting the most abundant proteins using an immuno-affinity approach, they used different concentrations of ammonium sulfate to precipitate or solubilize different populations of plasma proteins. This gave them a less directly biased way of fractionating out the high abundance things.

To my friends out there who are in the "do not deplete!" camp, sure, you're probably going to run into the same problems, like the fraction that has albumin pulling down tons of interesting things with it. But for people who will accept this loss in order to see the stuff that isn't at 1e9 copies per uL, this might be a simple approach to see something different than what your Top 4, 10, or 14 depletion column is giving you.


Sunday, 13 December 2015

proBAMsuite! Great new proteogenomics tools!


Man, I love a software package with a catchy title. And I love a free software package that has a ton of promise!  proBAMsuite has all of these things!

It is a set of R tools that are meant to help you integrate the data from your next gen sequencing files with your LC-MS/MS spectra. This is an overview of the steps involved.


Of course, the process isn't trivial. The RNAseq data needs to be lined up and QC'ed, and so do the MS/MS spectra, the PSMs and the peptide matches. When we're looking at millions of measurements the number of false discoveries has to go up, just mathematically, never mind the fact that not every MS/MS spectrum or next gen read is as good as the others.

In order to control the false discoveries, the capabilities are in place to control the FDR at the PSM and peptide level. Even cooler, maybe, is this idea:  The decoy matches are kept and allowed to be mapped against the total genomics data, so you can get a good idea of the FDR at the complete, reassembled level!  Total system FDR.
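If you haven't thought about how decoys turn into an FDR number in a while, here is a minimal sketch of the generic target-decoy estimate at the PSM level. To be clear, this is not proBAMsuite's code (that is an R package and I haven't read its internals); the function name, the (score, is_decoy) input layout and the decoys/targets estimate are my assumptions for illustration.

```python
def fdr_score_cutoff(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR stays <= max_fdr.

    psms: iterable of (score, is_decoy) pairs, where a higher score means a
    better match. This input layout is hypothetical, just for the sketch.
    Returns None if no cutoff satisfies the requested FDR.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything scoring at least this well:
        # decoy hits stand in for the false target hits we cannot see.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


example = [(10, False), (9, False), (8, False), (7, True), (6, False)]
print(fdr_score_cutoff(example, max_fdr=0.25))
```

The same counting works at the peptide level if you first collapse the PSMs to the best-scoring one per peptide, which is roughly what "control the FDR at the PSM and peptide level" means in practice.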

Why would we go to all this trouble?

1) How bout more data about your protein than you'd maybe even want? Check out the suite's sweet output!



And, of course, more explanations for what those weird MS/MS spectra are!

Open access pre-release of paper here!

Saturday, 12 December 2015

The second version of the OpenMS LFQ nodes is available! Now for PD 2.1!


The label free quan nodes from OpenMS I keep going on about?  Version 2 is now available!  More stable, faster, and works in Proteome Discoverer 2.1.

You can get them here. Once this PC stops looking like this:


I'll install 'em and give 'em a good hard run!

Keep this good code coming, people!

Thursday, 10 December 2015

Find unidentified differentially regulated biomarkers in reporter ion datasets!


I feel kind of smart for this one, though I'm afraid I'm getting to the point where I really really should get an indoor hobby of some kind since this is most of what I did last weekend. What do you guys do when it's too cold to rock climb but you can't snowboard yet?

Anyway. I have access to an amazingly cool set of TMT/iTRAQ samples. I have access because there is a distinct and observable phenotype. Not a little one, either. The hundreds of samples in group 1 and group 2 are extremely different. Proteomics, so far, has shown just about nothing different between the two. Weird, right?  For years we've been suspecting a novel mutational system or PTM that we've just never seen before, but we've not been able to find a way to hunt it down.

So, here was the thought that killed this last weekend: What if I completely ignored the IDs? What if I only looked at the spectra that showed a significant difference at the reporter ion level?  And then I tried to figure out what they were later?

In PD 2.1 + Quan you can do this. There is a tab in your report that is your "Quan spectra".

You can actually go to that and look at every MS/MS spectrum. You can see the RAW reporter values and you can even see your quantification spectra zoomed in.

So, you can actually go through and see all the stuff that is different. See the reporter ions above? This is exactly the trend I should be seeing in this sample set based on the phenotype. Exactly. And this MS/MS spectrum is the most differentially regulated observation in this entire sample set of 1M or so MS/MS spectra. And this PSM shows up just like this three times in different, overlapping fractions. I think the precursor intensity for this is 1e6-5e6. More importantly, since in PD 2.1 we can plot our reporter ion intensities by their SIGNAL TO NOISE (yay!!!!!!), the S/N of these reporter ions is >500!!!

In sum, this is the perfect biomarker for this experiment and maybe the thing we've been trying to find in one form or another for 5 years (Holy cow, I don't think I'm exaggerating. It's 2015?!?!).  Not to get my hopes up too high or anything....
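For anyone who wants to try the "ignore the IDs" idea outside of the PD interface, here is a rough sketch of the filter in plain Python. Everything about the input is an assumption on my part: the dict layout, the channel names, which channels belong to which group, and the S/N and fold-change thresholds are all hypothetical and would need to be matched to however you export your quan spectra.

```python
import math
from statistics import mean


def flag_spectra(spectra, group_a, group_b, min_sn=20.0, min_abs_log2fc=1.0):
    """Return (scan, log2 fold change) for spectra that differ between groups.

    spectra: list of dicts like {"scan": 1234, "sn": {"126": 510.0, ...}}
    group_a, group_b: lists of reporter channel names (hypothetical layout).
    No peptide ID is used anywhere; only the reporter ion S/N values.
    """
    floor = 1.0  # keeps the ratio finite when one group is essentially absent
    hits = []
    for spec in spectra:
        mean_a = mean(spec["sn"][ch] for ch in group_a)
        mean_b = mean(spec["sn"][ch] for ch in group_b)
        # Require a solid signal in at least one group, so an on/off
        # marker is kept but pure noise is thrown out.
        if max(mean_a, mean_b) < min_sn:
            continue
        log2fc = math.log2(max(mean_a, floor) / max(mean_b, floor))
        if abs(log2fc) >= min_abs_log2fc:
            hits.append((spec["scan"], log2fc))
    # Strongest differences first.
    hits.sort(key=lambda h: abs(h[1]), reverse=True)
    return hits


demo = [
    {"scan": 1, "sn": {"126": 600, "127": 500, "128": 50, "129": 60}},
    {"scan": 2, "sn": {"126": 100, "127": 100, "128": 100, "129": 100}},
    {"scan": 3, "sn": {"126": 5, "127": 4, "128": 1, "129": 1}},
]
print(flag_spectra(demo, ["126", "127"], ["128", "129"]))
```

With hundreds of samples you would want a real statistical test across plexes rather than a simple mean ratio, but as a first pass for building a short list of scans this kind of filter gets you there.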

Where it gets difficult, however, is linking that back to the full fragmentation spectra.

For example, check this out, and I'd LOVE it if you guys had advice. I'm putting in a feature request and will be bugging the great people at PD.Support but I'll take any ideas I can get.


Anything from the Protein/Peptide/PSM and MS/MS spectrum can be checked and exported to .DTA, mgf, or whatever. Then I can do big DeltaM searches in Byonic or DeNovo GUI it or PEAKS it.

But I've got to go through one at a time and find the MS/MS spectrum info to export. Kinda looks like next weekend's gonna be a wash if I can't find a shortcut (cause I have about 200 interesting things to look at now that I have NO idea what the fudge they are!)
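One possible shortcut, sketched here and untested on real PD output: export everything to a single MGF once, then pull out only the spectra whose scan numbers are on the interesting list. The big assumption is that each spectrum carries its scan number either in a SCANS= line or as "scan=NNN" inside the TITLE line; MGF titles vary a lot between exporters, so the patterns below may need adjusting. The file names in the usage comment are made up.

```python
import re

# Assumed places a scan number might appear; adjust to your exporter.
SCAN_PATTERNS = [
    re.compile(r"^SCANS=(\d+)"),
    re.compile(r"scan=(\d+)", re.IGNORECASE),
]


def scan_of(block):
    """Return the first scan number found in one spectrum block, or None."""
    for line in block:
        for pattern in SCAN_PATTERNS:
            match = pattern.search(line)
            if match:
                return int(match.group(1))
    return None


def filter_mgf(lines, wanted_scans):
    """Yield the lines of every MGF spectrum whose scan is in wanted_scans."""
    wanted = set(wanted_scans)
    block = []
    inside = False
    for line in lines:
        line = line.rstrip("\n")
        if line.strip() == "BEGIN IONS":
            block = [line]
            inside = True
        elif inside:
            block.append(line)
            if line.strip() == "END IONS":
                inside = False
                if scan_of(block) in wanted:
                    yield from block


# Usage sketch (hypothetical file names):
# with open("all_spectra.mgf") as src, open("interesting.mgf", "w") as dst:
#     for out_line in filter_mgf(src, wanted_scans=[1234, 5678]):
#         dst.write(out_line + "\n")
```

The smaller MGF could then go into the big DeltaM or de novo searches in one shot instead of 200 manual exports. If the same scan number shows up in multiple raw files or fractions, the filter would also need to key on the file name.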

I suspect I'm looking at a PTM but I don't have anything to match any of our normal suspects. Or...I'm looking at unique class-switch sequences in the variable regions of antibodies!  Either way, there are biomarkers in this dataset that traditional peptide searching cannot identify and the dataset is just too big for Byonic WildCard, but here I've vastly reduced (computationally, at least...) the complexity of this problem!  Will I find my biomarkers this way? Who knows, but on some of these hard datasets we need every lead we can get, right?

Again, if you have any advice or thoughts on how I might simplify this, I'd love to hear it!!!