Friday, March 29, 2024

Reading Papers in Volume

I sat at my desk with a pile of academic papers next to me waiting to be read. I worked at a research lab and reading papers was part of my job. My colleagues amazed me by reading scores of academic papers, calling up the right paper at the right time. I couldn’t compete. I couldn’t even keep up. I had received my Ph.D. a few years prior and felt I should have been able to. I wondered if I just didn’t belong at a research lab.
 
Then I wondered if I was missing something. I searched the internet and quickly found I was. I found Steve McConnell’s How to Read a Technical Article and Greg Phillips’s How to read technical papers—in quantity! McConnell’s article was itself based upon (and acknowledged) How to Read a Book, the classic by Mortimer Adler. I learned that efficient reading is a learnable skill, just like any other. Here is what I wish I had been taught when I started grad school.

First, a caveat. If you are in school, you need to learn to fully read individual technical papers, in detail and considering every aspect, before you can read them in volume.

A Close-Up Shot of Paper Clipped Documents
Image by Kindel Media



I am not an academic. While I have a Ph.D. and used to work in a research lab, I am not paid to do research or to publish papers. Yet I still find it valuable to read papers such as conference proceedings and journals, as well as blog posts and white papers. Reading helps me to stay up to date in my field and to do my job better. Adding other people’s ideas to my experiences helps me to come up with better solutions to my technical problems.


I am paid to solve problems, not to read. Therefore, I must make the most of my reading time. I cannot read everything. Instead, I read papers through the lens of my problems.

 
Selecting Papers to Read

I read papers in three different situations, all in service of doing my job better:
  1. I’m trying to solve a specific technical problem.
  2. I’m keeping up on a general professional interest.
  3. A colleague sends me something interesting.

Solving a specific technical problem often involves a lot of very focused reading. I cast a wide net for papers: I run keyword searches in Google Scholar, trace references from other papers, and ask colleagues, in person and beyond. This generates a large pile of papers that I need to thin into a manageable collection.

For example, I collected and summarized papers on flaky tests for my colleagues. We run a very large number of correctness tests to ensure our software works properly and without bugs. Flaky tests, which fail sporadically, are a distraction. They slow development and increase the risk of bugs slipping into our code base. Based on my reading, we reduced the pain of flaky tests.

Keeping up on general professional interests is a more continual background activity. I am an expert on performance testing of software. I follow academic literature on the subject, keep an eye out for relevant industrial papers, and look out for other resources, all so that I can remain an expert in the field. Important conferences are a rich source of academic papers in one place.

Finally, I enjoy new and interesting ideas. Colleagues forward things to me because they know I appreciate it. These papers may be performance related, performance adjacent, or touch on other interests of mine such as peer groups, technical education, and DEI, among others.

I suspect that I could fill all my time reading the papers I accumulate and still not read all of them. I must be efficient if I want to get the most out of them.

How I Read Papers

When I have a set of papers to read, I follow a specific process. I take the first paper and set a timer for five minutes. Within that limit, I proceed in the following order: I read the abstract, the conclusion, and the introduction, then scan the figures. Doing so gives me the best opportunity to get a sense of the paper. I actively look for and record the questions that come to mind as I scan: questions to dig deeper on, questions about the parts that confuse me. “How do they do X?” “How do they prove Y?” “Could we use Z?”

At the end of the 5 minutes I review my notes. If I have no questions, I am done with this paper. If I have questions, I put the paper aside for a deeper read specifically to answer those questions. Then I pick up the next paper to scan. 

With this process I can scan 12 papers in an hour. That might not sound like much, but without my process in place, I could easily spend an hour or more fully understanding one paper.
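
The triage step above can be sketched in a few lines of Python. This is only an illustration, not a tool described in this post: the Paper class, the triage and papers_per_hour functions, and the sample titles are hypothetical names made up for the sketch. It shows the time-box arithmetic and the rule that a paper with no recorded questions is done.

```python
from dataclasses import dataclass, field

SCAN_MINUTES = 5  # time box per paper, taken from the text

# Order in which parts of a paper are scanned inside the time box.
SCAN_ORDER = ["abstract", "conclusion", "introduction", "figures"]


@dataclass
class Paper:
    """Hypothetical record for one scanned paper."""
    title: str
    questions: list = field(default_factory=list)  # questions noted while scanning


def triage(papers):
    """Split scanned papers into (deeper_read, done).

    A paper with at least one recorded question is set aside for a
    deeper read; a paper with no questions is finished.
    """
    deeper_read = [p for p in papers if p.questions]
    done = [p for p in papers if not p.questions]
    return deeper_read, done


def papers_per_hour(scan_minutes=SCAN_MINUTES):
    """Number of papers that fit in one hour at a fixed time box."""
    return 60 // scan_minutes


if __name__ == "__main__":
    scanned = [
        Paper("Paper A", ["How do they do X?", "Could we use Z?"]),
        Paper("Paper B"),  # no questions came up during the scan
    ]
    deeper_read, done = triage(scanned)
    print([p.title for p in deeper_read])  # ['Paper A']
    print([p.title for p in done])         # ['Paper B']
    print(papers_per_hour())               # 12
```

The questions list does double duty here: it decides whether a paper gets a second look, and it becomes the reading agenda for that second look.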

A Deeper Read

After triage, I have a smaller collection of papers, each with a list of questions to answer. I’ll pick up a promising paper. I review my notes and questions for it, then read the paper solely to answer those questions. The questions provide a focus, allowing me to read faster. If the introduction is not relevant to my question, I skip it. If I only have a question about the implementation, I go straight to the implementation section.

This all felt unnatural when I started doing it. I was not reading the paper as the author intended. That is okay. I (and you) do not owe it to the author to read their paper, nor to read it as it was “meant to be read.” Good readers and writers know this. We owe it to ourselves to get the most out of the papers we read. We only owe the authors our gratitude and an acknowledgement when we use their ideas. Sometimes I send a thank-you note to acknowledge ideas I find particularly helpful.

Remembering Papers


My wonder at my colleagues’ reading ability was not just because they read so much, but also because they remembered details and made use of those details. I must take notes if I want to do that.

I use two tools to take, organize, and remember my notes from technical papers: Zotero and Anki.

I use Zotero to organize all my academic papers. New papers go directly into a triage folder in Zotero. Zotero supports highlighting PDFs and extracting highlighted text. It also automatically collects the bibliographical data for each paper.

In Zotero, I use folders to organize papers by subject. When I want to find details on a given topic, I look through the appropriate folder and review my notes on the papers. Sometimes I rescan papers. Each part of this process is so much faster and more efficient than when I photocopied papers and stored them in a filing cabinet in graduate school.

While I can find things quickly in Zotero, simply using the tool does not put the information at the tip of my tongue. For ideas and details I want to talk about or otherwise have available in my mind, I create digital flashcards in Anki. See The Nerdiest Thing I Do for more on my Anki usage.
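
As a rough sketch of how notes could become flashcards: Anki can import plain-text files whose fields are separated by tabs, so a short Python script can turn question-and-answer pairs into an importable file. The write_flashcards function and the sample card are hypothetical, and the two-field front/back layout is an assumption about the note type; this is not a description of the workflow used in this post.

```python
import csv
import io


def write_flashcards(cards, out):
    """Write (front, back) pairs as tab-separated lines, one card per line.

    Assumes a two-field note type (front, back); adjust the fields to
    match whatever note type the cards are imported into.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for front, back in cards:
        writer.writerow([front, back])


if __name__ == "__main__":
    # Hypothetical card, based on the flaky-test example earlier in the post.
    cards = [("What is a flaky test?", "A test that fails sporadically.")]

    buffer = io.StringIO()
    write_flashcards(cards, buffer)
    print(buffer.getvalue(), end="")

    # To produce a file for import instead (hypothetical file name):
    # with open("flashcards.txt", "w", encoding="utf-8", newline="") as f:
    #     write_flashcards(cards, f)
```

Using the csv module rather than joining strings by hand means that a card containing a tab or a quotation mark is quoted properly instead of silently breaking the field layout.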

Reading to Learn to Read

I read much more efficiently now than I did that day at my desk, overwhelmed by papers. There are still people who can read and retain much more than I can, but my reading is no longer a weakness. It’s a strength.