How to analyze a paper

Reading and, more importantly, analyzing papers is a key part of developing a research project. It can be intimidating, however, especially if you don’t have a lot of experience analyzing papers yet. Of course the only way to get experience reading papers is to, well, jump in and start reading papers! But I think there are some pointers or ways to frame the experience that might be helpful when you’re starting out.

How do you structure reading vs. looking at figures? Do you read all the way through, then look at the figures? Do you immediately jump to the figure as soon as it is mentioned in the text? Do you look at the figures before reading? Something else? All of those are viable options and you should choose whatever works for your particular brain. My general strategy is to read one section uninterrupted from start to finish, then have a look at the figure that section referred to. Often one section corresponds roughly to one figure, so this is a nice way to avoid the whiplash of going back and forth between figure and text, while still getting a look at the data when the text is fresh in your mind. I also usually wait to look at the supplementary figures until I’m finished with the main text and figures, unless there is something in the supplement I really need to know before moving on.

How deeply should I read the paper / how much time should I spend reading the paper? That depends! In an ideal world with infinite time, you would read each paper in depth, but that is not realistic. Determine what information you need to get out of the paper and adjust how deeply you read based on that. For example, is this foundational for your project? Are you going to present this in front of a large audience at journal club? In those cases you may wish to spend a lot of time on a deep-dive. In contrast, perhaps you are just trying to figure out whether to tag a protein of interest at the N- or C-terminus, and are looking for relevant information, in which case you can probably just look at the methods section. Note, however, that it’s not always clear what is going to end up being relevant for your research… the more you know the better you can move your project forward.

How much time should I spend following up on interesting citations? Again, that depends! Do the authors make a shocking statement, perhaps one that contradicts your previous understanding, and attempt to support it with a citation? Does the citation point to a key reagent/method/result that would impact your project if true? Is the citation a recent review that covers topics relevant for your project, fellowship application, etc.? All of those might be good reasons to hunt down and read the cited papers. Finding new literature that you want to read within a paper you are currently reading is a common rabbit hole that is often worth going down!

What information should I highlight? Try not to go crazy highlighting the whole thing; that defeats the purpose 🙂 Consider, as you highlight, what information you would like to easily find again when referring to the paper later. For example, perhaps you are writing a review on why werewolves turn with exposure to moonlight, and this paper discusses a potential novel mechanism in the third results section. Highlight that portion so you can easily find it when you start writing. Below I list things that I frequently highlight.

  • The first time an acronym you don’t know is introduced, so you can refer back when the acronym is used again later and you have forgotten what it stands for.
  • Key context (intro, background, etc.), especially something that is repeatedly used (e.g., “we are using sox32 as a marker for endoderm”– now every time you see sox32 you know you are supposed to think “endoderm”. Or “SB-505124 inhibits Nodal signaling.” Now when you see SB-505124 you know to expect Nodal inhibition.)
  • Techniques / reagents that you might want to try (please tell me about these too!)
  • Key results
  • Key ideas or interpretations
  • Surprising or unexpected results
  • Consider using the “Notes” function in your reference software to write down ideas that occur to you as you’re reading. You probably won’t remember them if you don’t!

How do I understand what is happening? Scientific papers cram a lot of information into a tiny space. (Paradoxically, the longer papers, like Cell papers, are often easier to digest because there is enough space to provide context; with shorter papers, the writer has to assume the audience knows a lot more.) It is easy to get overwhelmed, or to fall into the rhythm of looking at the words without really understanding what they mean. To start, I suggest that each time the main text brings up a term you have come across before, you pause and reflect on what you already know about it. It could be a gene, technique, hypothesis, biological process, etc. For example, if they bring up Nodal signaling: What does Nodal signaling do? What effectors, target genes, and cell fates might be important in this story? Take a minute to remind yourself what you already know, which will help set up expectations, and then note if they claim anything surprising or unexpected based on your prior knowledge.

Below I list what I think through each time I analyze a figure:

  • What hypothesis are they trying to test?
  • What methods are they using? Do you know these methods?
    • Consider googling any methods you are not familiar with. The better you understand the method the easier it will be to interpret, but often you don’t need to know all of the specifics in order to make sense of their experiments. For example, if a graph has the title “qRT-PCR of sox32”, you know that, in principle, you’re looking at relative levels of sox32 mRNA (as opposed to, e.g., protein). As long as you know what qRT-PCR measures, you can understand the experiment. BUT, the more you know about qRT-PCR, the better positioned you are to interpret their results, because you will understand the caveats and technical limitations. (Side note: It probably is a good idea to have an in-depth understanding of how qRT-PCR works 🙂)
  • How does their experimental design test the hypothesis?
    • DOES their experimental design test the hypothesis??
    • Alternatively, the goal of the experiment might be to establish “baseline rules” for their system which will be built upon in later figures. In that case, consider what information they are trying to establish. Is it convincingly established?
  • BEFORE YOU LOOK AT THE RESULTS, consider what outcomes would support their hypothesis. What would contradict it?
    • It is MUCH easier to interpret data if you’ve “pre-digested” what you expect to see. BUT, be sure to look at what is ACTUALLY there, and not just what you EXPECT to see. Actively look for surprises.
    • Do their results support or contradict their hypothesis? Do they suggest a different hypothesis? Or is the experiment uninformative?
  • What controls are needed to interpret this experiment? Are they there?
  • What are the caveats to interpreting this?
  • Do you believe their interpretation? How do you interpret their results?
  • What experiments do you suggest to further test this?

Bonus suggestion: Right after reading the paper, write a summary or explain it to a lab mate. This is a great way to discover what parts you really understood, and what parts you did not.

A final note: Just because someone said something in a paper DOES NOT MEAN IT IS TRUE. A statement could be a misinterpretation of data, it could have been made in the absence of important context that changes the interpretation, it could be based on flimsy evidence, or it could be an outright lie. Thankfully, in my experience the last of these is rare, but all of these things are possible. Try not to approach a scientific paper as a “textbook” that explains reality. Rather, treat it as an imperfect, perhaps valiant attempt to come to a better understanding of reality, and really try to grapple with it as “let me consider what I think this means” rather than “this paper is telling me the absolute truth about how this system works”. We are not doing science because we understand how things work, we are doing it because we decidedly DO NOT. Keep in mind the quote attributed to George Box: “All models are wrong, but some are useful.”