Thursday, December 24, 2015

CaspDB - A database of caspase cleavage products!


Another tool to help find identifications for unmatched MS/MS spectra! Caspases are proteases that hang around just to cut up other proteins. They are a critical component of apoptosis and normal cell maintenance and, if you believe the recent in silico protein cycle predictions, they are active constantly. If the mix of proteins I just harvested is full of incomplete, complete, modified AND degraded proteins, then all these unmatched spectra start to make sense.

Caspases have specific substrates for degradation and a bunch of them have been worked out. CaspDB is a new online tool to help you work with this information. It is described in this new Open Access paper from Sonu Kumar et al.

While the paper is totally neat and all, you can go directly to check out this online tool here.  You'll quickly find out that this tool requires a good bit of pre-existing information before it is useful. Once you've got some data, you can use it to run through your protein of interest and different caspases to see if you've got stuff that makes sense. The paper goes forward to show how awesome these prediction tools are by going ahead and proving that a ton of their software predictions are totally true.

This is obviously a very powerful and interesting tool and this will generate some great data from the validation end. But first you need to get some observations....


...(how did we ever get anything done before this...????...)

Check out this thing!!!  It's called Pripper and, wait, we'll need this...


...to go WayBack to 2010....to this paper from Mirva Piippo et al., that describes Pripper. Pripper is a Java tool that will take any FASTA database you give it and will perform in silico caspase cleavages on that database and give you a new FASTA that has all the predicted caspase cleavage products.
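To make the idea concrete, here is a minimal Python sketch of what "in silico caspase cleavage of a FASTA" can look like. This is NOT Pripper's algorithm (I haven't looked at how it scores sites). It assumes a deliberately crude rule: cut after the second D of any D-x-x-D motif (DEVD is the classic example), since caspases cleave after aspartate. The function names, the `_casp_` header suffix and the `min_len` cutoff are all made up for illustration.

```python
import re

def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

# ASSUMPTION: a toy motif, not a real caspase specificity model.
# The lookahead lets overlapping D-x-x-D motifs all be reported.
MOTIF = re.compile(r"(?=(D..D))")

def cleavage_products(seq, min_len=6):
    """Return (cut_position, n_term_fragment, c_term_fragment) per predicted site."""
    products = []
    for match in MOTIF.finditer(seq):
        cut = match.start() + 4  # cleave after the final D of the motif
        n_term, c_term = seq[:cut], seq[cut:]
        # Skip fragments too short to be worth searching (hypothetical cutoff).
        if len(n_term) >= min_len and len(c_term) >= min_len:
            products.append((cut, n_term, c_term))
    return products

def cleaved_fasta(text, min_len=6):
    """Return FASTA text with each intact protein plus its predicted fragments."""
    out = []
    for header, seq in read_fasta(text):
        words = header.split()
        acc = words[0] if words else "seq"
        out.append(f">{header}\n{seq}")
        for cut, n_term, c_term in cleavage_products(seq, min_len):
            out.append(f">{acc}_casp_1-{cut}\n{n_term}")
            out.append(f">{acc}_casp_{cut + 1}-{len(seq)}\n{c_term}")
    return "\n".join(out) + "\n"
```

Feed `cleaved_fasta()` the text of your FASTA and write the result to a new file to search against. Keeping the intact entries alongside the fragments means the search engine can still match the full-length protein.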

If you're thinking "How can I trust a tool that is 5 whole years old?" Never fear, it has been updated multiple times (the version I just unzipped is time stamped from 2013). Oh. If you download Pripper here you might want to right click on the zip file, go to properties, click "unblock" then "apply" and THEN unzip it. Windows Defender on my PC blocked it as a threat.

Now you have a tool that will make you a predicted caspase cleavage FASTA that you can run against your samples. If it comes up with something really cool, then you can go to CaspDB and search those observations against their more advanced prediction models (and validated data!).


Wednesday, December 23, 2015

What is a PFAM? And how do they deal with all this data?


Personally, I think the biologists and biochemists need to hurry up and annotate the function of every protein from every organism under every biological condition. Until they stop slacking and get that stuff done, we need to use some shortcuts to extract biological data from our peptide spectral matches. Fortunately, smart people have been working on this gap for us.

Gene Ontology (GO) is tricky stuff. If we don't exactly know what a gene does, can we infer what the heck it does from its similarity to genes we understand better?

More tricky, and way more biologically relevant? Protein Ontology (PO?)!  One way of getting this data is via PFAM (which you can access here).  I'll be honest. I didn't really know what this was for a long time. I just knew that it was an option in the Annotation node in Proteome Discoverer.  Cool, I have a new column that says that all this stuff that is upregulated shares a PFAM ID (actually, I made that part up. It's never that easy, is it?)

Turns out that the people making PFAM are working really hard making this data:
1) More accurate
2) More relevant
3) More current

As you can imagine, all of this is hard, but...

(holy cow)...

Can you imagine what the 3rd one is like these days?

The amount of sequencing information in databases is increasing EXPONENTIALLY, while the capacity of the current tools for creating PFAM information increases at a linear rate. It doesn't take a stolen GoogleImage to show that this is a problem, but...I'm nervously waiting for an important phone call...so...


So, what do we do about it? Well, Robert Finn et al., say in this new OpenAccess paper, we fix the algorithms to deal with this glut of data. So they did.

When I clicked on this link in Twitter this morning, I honestly expected a dense paper that I would hardly be able to read and would likely not understand at all. I was pleasantly surprised to find that this team can seriously write and that I not only learned a lot about how PFAM works, but I also (think I) got a good understanding of their challenges and how their new algorithms power through in dealing with them. Solid and interesting paper that makes me want to add this column to all of my processed data from now on!



Tuesday, December 22, 2015

Updated guide to connecting your NanoLC-MS!


Got a Thermo nanoLC? Wanna connect it to a Thermo mass spec? Want every frickin' part number and easy to follow diagrams?

TAAADAAA!!! This link will lead you to a new and updated version of the nanoLC connection guide. It is at PlanetOrbitrap so you might need to log in and then re-click the link to get directly in.

Monday, December 21, 2015

PTMs in centromeres!


I had to dig deep in my brain and then finally just look at Google Images to remember what a centromere is and why it's important.  Hopefully the nice sketch I found above clarifies it for you as well. 'Cause it's the region of the chromosome, along with its associated proteins, that holds sister chromatids together. It's gonna be deeply involved in cell/chromosome division, sexual reproduction and probably all sorts of other things.

In this new paper from Aaron Bailey et al., in press (and currently open access) at MCP, this group looked at the post-translational modifications that can show up on these important proteins.

They started with a HeLa cell line that had a stable affinity tag at some centromere and then immunoprecipitated to get at their proteins of interest. Chemicals were used to arrest the cells in certain stages of mitosis or something. Multiple enzymes, including LysC and AspN, were used to get big chunks of the cleaned-up protein for effective PTM identification and localization.
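If it helps to see why you'd bother with more than one enzyme, here is a rough Python sketch of an in silico digest. It only assumes the textbook rules (LysC cuts on the C-terminal side of K, AspN cuts on the N-terminal side of D) with no missed cleavages. The `digest` function and the example sequence are made up for illustration and have nothing to do with the authors' actual workflow.

```python
def digest(seq, enzyme):
    """Return the peptides from a complete digest (no missed cleavages)."""
    if enzyme == "LysC":
        # LysC cleaves after each lysine, so the cut sits one past the K.
        cuts = [i + 1 for i, aa in enumerate(seq) if aa == "K"]
    elif enzyme == "AspN":
        # AspN cleaves before each aspartate, so the cut sits at the D itself.
        cuts = [i for i, aa in enumerate(seq) if aa == "D"]
    else:
        raise ValueError(f"unknown enzyme: {enzyme}")
    # Add the protein termini, drop duplicates, and slice between boundaries.
    bounds = sorted(set([0] + cuts + [len(seq)]))
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

# Hypothetical toy sequence, not a real centromere protein.
toy = "MKTADIDEVKGSR"
lysc_peptides = digest(toy, "LysC")
aspn_peptides = digest(toy, "AspN")
```

The same stretch of sequence ends up in different peptides with each enzyme, so a modified residue buried in an awkward peptide from one digest may land in a much friendlier one from the other.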

What did they find?


Sunday, December 20, 2015

BetterExplained -- a great site for math concepts


I seem to have forgotten what little I ever knew about math. This site, BetterExplained, uses clever examples to either teach or remind you of what a math concept is.

Wednesday, December 16, 2015

Open Genomics Engine


Sorry, this is something I just stumbled on that I didn't want to forget about!  I lost the password to my EverNote account...but it does look super cool, right? If you're into that weird DRNA sequencing stuff, that is...

Monday, December 14, 2015

Use protein solubility to get around protein abundance issues in biofluids?


For biofluids, one of the biggest problems is the high abundance junk. "Junk" probably isn't the right word since evolution probably wouldn't have erred toward filling our fluids with albumin if it wasn't important, but...you know what I mean....

In an interesting take on this problem, Bollineni et al. tried a protein solubility approach. Rather than specifically depleting the most abundant proteins using an immuno-affinity approach, they used different concentrations of ammonium sulfate to precipitate or solubilize different populations of plasma proteins. This gave them a less directly biased way of fractionating out the high abundance things.

To my friends out there who are in the "do not deplete!" camp, sure, you're probably going to run into the same problems, like the fraction that has albumin pulling down tons of interesting things with it. But for people who will accept this loss in order to see the stuff that isn't at 1e9 copies per uL, this might be a simple approach to see something different than what your Top 4, 10, or 14 depletion column is giving you.