Thursday, November 5, 2015

A pan-cancer proteomic perspective on The Cancer Genome Atlas


Okay. (Ben slowly gathers thoughts...)...

Now I'm going to tell you about a paper that is so cool that even though I have no idea how they did it, I still think it's worth sharing.  I'm hoping I'll figure it out as I write this.

First of all, it's Open Access (yay!) and available here!  Second of all, it's cool enough that 2 people have sent it to me since it came out, and this morning I thought I'd get it on the second read-through.

What I do get:  The Cancer Genome Atlas is not a leather bound book that sits in a room that smells of rich mahogany....


...instead, it is a huge cohort of clinical cancer samples that have been or are in the process of being studied with a ton of different genomics techniques. The homepage of the project is here.

Browsing through the papers that have been done on this Atlas (to construct this Atlas? that makes more sense...) shows that there is a lot of bioinformatics firepower at work here.

So...in this study this group took these samples and did an interesting protein array analysis of them. This is where I get foggy. The array they used is called an RPPA. This is a Reverse Phase Protein Lysate Microarray (Wikipedia link) (and if you are a JoVE user, or care enough to register for a free trial, here is a video that shows how an RPPA works.)

Okay. So they are using fancy antibody arrays to show the presence/absence/abundance of proteins.  Got it. What do the arrays detect? Well, they went for a whopping 181 antibody probes! Wait? What? Just 181 targets? And the targets were selected based on what we know of current cancer pathways and stuff. My assumption is that the arrays are very fast and/or very cheap...or we would have done this with a mass spec and looked at hundreds of targets with PRM (people are routinely doing 700+ per assay these days on Q Exactives) or more with SRM, right?

But this is where it gets impressive -- monitoring all 181 targets on these arrays, they looked at over 3,000 different samples...which is a lot...   And these samples have been previously clustered by neat things like disease type and primary driving mutation.  So, you can see how different genes interact across hundreds of samples of the same disease that follow the same -- or different -- cancer-driving pathways.

Take home point for me is: For you guys out there generating insane amounts of clinical data, we need to steal more genomics tools! Cause these guys seem (at least...to an outsider...) to be able to do stuff with the data!


Wednesday, November 4, 2015

Discoverer International User's Meeting!


Hey! I meant to put this up a bit ago. This was one of my favorite events all last year. My attendance this year isn't all that likely....though not out of the question yet! I still have vacation days. We'll see.

You can register here. Warning, it fills up fast!  Oh, and this is what Bremen looks like in December...


...yeah, it totally sucks...

Downstream analysis of proteomics data!


Alright!  This painfully thought-out and beautifully executed experiment yielded a big list of differentially regulated proteins! Woooo!  So...now what....?

This review from Karimpour-Fard et al., is a great place to start. This concise little piece in Human Genomics walks you through some tools and approaches that can help you figure out:

1) What is significant in your list (and all the stuff that isn't)

2) What those words mean that the stats and bioinformatics people are always using (ANOVA?)

3) How to extract some biologically meaningful data out of all that stuff.

A nice short review (and Open Access!) that might help you make that next step forward!
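Since ANOVA is one of those words, here is a minimal sketch of what a one-way ANOVA actually computes for a single protein measured across a few conditions. This is my own illustration and is not taken from the review; the function name and the example intensities are made up, and it only returns the F statistic (turning that into a p-value needs the F distribution, which a stats package will handle for you).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values.

    F = (between-group variance) / (within-group variance).
    A large F suggests the group means differ more than replicate noise explains.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of conditions
    n = len(all_values)    # total number of measurements

    # Variation of each group mean around the grand mean, weighted by group size
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each measurement around its own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical abundances for one protein in three conditions, three replicates each
control = [1.0, 2.0, 3.0]
treated_low = [2.0, 3.0, 4.0]
treated_high = [5.0, 6.0, 7.0]
f_stat = one_way_anova_f([control, treated_low, treated_high])
print(f"F = {f_stat:.2f}")
```

Run that for every protein in your list and you have one F statistic per protein, which is where multiple testing correction (the "what is significant" part) comes in.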

Shoutout to my aunt Beth who took this cool picture downstream off a bridge near home!



Monday, November 2, 2015

SIM-XL -- Let's identify those crosslinked peptides!


Mapping protein-protein interactions in complexes is a tough job. We can go one of two ways with it:
1) The relative way: When I pull down proteins under condition A and under condition B, I see relative upregulation of this protein, so it must be associated.
and
2) The crosslinking way: Under these conditions I throw in a crosslinking compound, then pull everything down, digest and identify the crosslinked peptides.

Both are hard, but the relative way is a good bit simpler from the data processing perspective.  Analysis of crosslinked MS/MS spectra? That's hard. There are some nice approaches like XComb and StavroX/MeroX.  SIM-XL is a new one. If you wonder why you might want to try a different piece of software, look at this result output (click to zoom!)


Um....how frickin' cool is that?!?!?

It's a GUI-driven interface with all sorts of cool graphical maps to help make sense of your crosslinked data. It'll accept all sorts of converted MS/MS data files (and if you've got the MSFileReader installed, it'll directly read Thermo .RAW files!) and it's even possible to map your data against spatial constraint data obtained from 3D protein structures to see if what you are seeing is possible at the biological level!
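To give a feel for what a spatial constraint check means, here is a rough sketch of the idea. To be clear, this is not SIM-XL's code or its file format; the function name, the residue labels, the coordinates, and the 30 angstrom cutoff are all assumptions I made up for illustration. The real cutoff depends on your crosslinker's spacer arm and how much side-chain wiggle room you allow.

```python
from math import dist

# Hypothetical maximum C-alpha to C-alpha distance (in angstroms) that the
# crosslinker could plausibly span; pick this for your own reagent.
MAX_CA_DISTANCE = 30.0


def check_crosslinks(ca_coords, crosslinks, max_distance=MAX_CA_DISTANCE):
    """Compare identified crosslinks against distances in a 3D structure.

    ca_coords:  dict mapping a residue label to its (x, y, z) C-alpha coordinate
    crosslinks: list of (residue_a, residue_b) pairs reported by the search
    Returns a list of (residue_a, residue_b, distance, is_plausible) tuples.
    """
    results = []
    for res_a, res_b in crosslinks:
        d = dist(ca_coords[res_a], ca_coords[res_b])
        results.append((res_a, res_b, d, d <= max_distance))
    return results


# Made-up coordinates for three lysines
coords = {
    "K10": (0.0, 0.0, 0.0),
    "K25": (3.0, 4.0, 0.0),
    "K90": (0.0, 0.0, 40.0),
}
for res_a, res_b, d, ok in check_crosslinks(coords, [("K10", "K25"), ("K10", "K90")]):
    label = "plausible" if ok else "too far apart in this structure"
    print(f"{res_a}-{res_b}: {d:.1f} A, {label}")
```

A crosslink that fails the check isn't automatically wrong. It could be a false identification, or it could be telling you the protein doesn't look like the crystal structure under your conditions.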

You can check out the SIM-XL website here.

And you can find the original paper by Lima et al., here.

Saturday, October 31, 2015

Happy Halloween!


Remember the time I taught "Advanced Proteome Discoverer" dressed as a realistically scaled Q Exactive? Honestly, it was a little distracting and kinda uncomfortable so I didn't keep it on long!

My co-instructor managed to get a good picture!

Friday, October 30, 2015

Thermo Fisher Cloud. The next step in your processing pipeline?



Alright, so...now I have a big list of proteins....what do I do now? What a great question. There are lots of things. If you own an institutional license for an expensive pathway software, you could try that. You could go to KEGG. If you're one of those highly employable people who know R really well, there are a ton of cool scripts and on and on.

One thing you might want to check out is the Thermo Fisher Cloud. Why?
'Cause it looks pretty cool. And it's free. And you get 10GB of free data storage on the Cloud just for registering and checking it out. Oh, and there are these tools I've never seen outside of papers on R scripts, like Pathway Over Representation and Pairwise Significants, that are super easy to use in this format. And if we generate interest in this, then more tools will be added, and faster. The bioinformatician behind the scenes in this project has some great insight into what this field needs and I think we'll continue to see more cool things added to this interface all the time.
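If you're wondering what a pathway over-representation test is doing, the textbook version is a hypergeometric test: given how many proteins you identified in total and how many of them belong to a pathway, how surprising is it that this many pathway members landed in your regulated list? I don't know what the Cloud tool does under the hood, so treat this as a generic sketch; the function name and the numbers are hypothetical.

```python
from math import comb


def over_representation_p(total_proteins, pathway_size, list_size, hits):
    """One-sided hypergeometric p-value for pathway over-representation.

    total_proteins: size of the background (everything you identified)
    pathway_size:   how many background proteins belong to the pathway
    list_size:      how many proteins are in your regulated list
    hits:           how many proteins in your list belong to the pathway
    Returns P(X >= hits) if the list were drawn at random from the background.
    """
    most_possible = min(pathway_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(total_proteins - pathway_size, list_size - k)
        for k in range(hits, most_possible + 1)
    )
    return tail / comb(total_proteins, list_size)


# Hypothetical numbers: 3000 proteins identified, 40 of them in the pathway,
# 150 proteins in the regulated list, 8 of those in the pathway
p = over_representation_p(3000, 40, 150, 8)
print(f"p = {p:.3g}")
```

In real life you would run this for every pathway and then correct for multiple testing, because with hundreds of pathways something will look enriched by chance.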

You can register to use this resource here.

Wednesday, October 28, 2015

I'm gonna see over 60 of you tomorrow?!?!


I just saw an update on the attendees for tomorrow's NIH PD workshop. 60+ people!  I'm super psyched. Sorry the blog has been slow lately. I started a new role recently for my day job and I've been putting all of my free time into new content for the workshop. There are people flying in from far away to attend!!!!?!?!  I don't want anyone to be disappointed.
Thank you PRIDE Repository and to you guys who put tons of cool experiments in there!

And to everyone who can't make it, I can't make promises yet, but I think at least some of the material should be accessible to you later. I'm working on it! Cannot wait to get back to Maryland today!!!

EDIT: 10/29/15  So...I found out the hard way (after lugging a tripod and good camera into the NIH and through 3 security checkpoints..) that all video recording on NIH campuses is done by an organized and unionized group that considers any attempt to record on campus as a threat to their livelihood. However, for the price of a good used car, they will record a workshop for you.   We will have some slides to share, though!