Managing writing project metadata with org-mode
Blaine Mooers (he/him) - Pronunciation: Blane Moors, blaine-mooers@ouhsc.edu
Format: 22-min talk ; Q&A: BigBlueButton conference room https://media.emacsconf.org/2024/current/bbb-project.html
Etherpad: https://pad.emacsconf.org/2024-project
Discuss on IRC: #emacsconf-gen
Status: Q&A open for participation
Saturday, Dec 7 2024, ~7:40 AM - 8:00 AM MST (US/Mountain)
Saturday, Dec 7 2024, ~6:40 AM - 7:00 AM PST (US/Pacific)
Saturday, Dec 7 2024, ~2:40 PM - 3:00 PM UTC
Saturday, Dec 7 2024, ~3:40 PM - 4:00 PM CET (Europe/Paris)
Saturday, Dec 7 2024, ~4:40 PM - 5:00 PM EET (Europe/Athens)
Saturday, Dec 7 2024, ~8:10 PM - 8:30 PM IST (Asia/Kolkata)
Saturday, Dec 7 2024, ~10:40 PM - 11:00 PM +08 (Asia/Singapore)
Saturday, Dec 7 2024, ~11:40 PM - 12:00 AM JST (Asia/Tokyo)
Duration: 21:38 minutes
00:00.000 Introduction
02:20.080 Starting a new writing project
04:05.480 The writing log
04:36.960 Starting the research paper
05:25.310 Outline
06:11.440 Another kind of writing log - accountability
07:17.458 Reducing switching costs
07:46.480 Motivation
09:31.520 Overview of the writing log
10:17.295 LaTeX preamble in opened drawer
10:42.668 Informative header
12:21.400 Four workflows
13:28.080 Project initiation workflow
14:56.960 Daily workflow
17:05.751 Metadata and metacognition
17:48.885 Periodic assessment workflow
18:56.960 Project closeout workflow
19:49.640 Conclusions
20:34.520 Acknowledgements
Description
The planning and writing of a scientific manuscript is an intricate process that requires focused effort. Scientists must make many decisions about what to include and exclude from the paper, often capturing these decisions in notes in the margins, appended notes, or external files. This ad hoc approach becomes unmanageable when the notes exceed the length of the manuscript, which is often the case. Nonetheless, these notes can be vital when responding to reviewers' critiques.
Great scientists like Linus Pauling used laboratory notebooks effectively to store metadata on their manuscripts. Pauling's cross-referencing system resembled that of Niklas Luhmann in his physical zettelkasten. These paper-based approaches have pros and cons, but they are no longer popular because of the hard work required to make them work well. In comparison, the org-roam-ui view of my zettelkasten provides a garden of endlessly forking paths I can wander in all day.
I sought a more focused approach to managing my attention and the metadata for one writing project. I developed a project-specific writing log for this purpose about a decade ago. The writing log helps me manage anxieties about forgetting where I left off on an interrupted project (Fear of Forgetting, FoF). In this talk, I will highlight the features of my writing log template in org-mode.
The first section supports gathering the initial thoughts about the project needed to assemble a central hypothesis around which to build the paper. Subsections support listing the experiments required to address the central hypothesis and the key discussion points. These subsections include plans for graphical items like images, data plots, tables, equations, and code blocks. Of course, this section will evolve as the results accumulate. When largely completed, this section supports drafting a quarter to a third of a manuscript on day one of the project.
The following two sections support project administration and assessment. The administration section includes plans to apply for funding and approvals for the work. The assessment section supports periodic checks of the project's current state, what holds the manuscript back from submission today, and what is missing that would make a larger impact. This section includes a timeline and milestones to finish the project promptly. These can be displayed in tables, which org-mode supports so strongly.
The central section of the template contains daily accounts of accomplishments, decisions, and correspondence about the project. I read this section after a hiatus to resume work on the project quickly. An open-ended to-do list and a section for collecting ideas for future projects follow the daily log. The last section contains protocols and guidelines for the various tasks involved in completing the project.
Here, context switching between the writing log and the manuscript is fine because it usually happens only at the beginning and the end of the writing session. My project-specific approach keeps my mind focused on the project at hand and my FoF under control. I share my writing log template in org-mode on GitHub.
About the speaker:
Blaine Mooers is an associate professor of Biochemistry and Physiology at the University of Oklahoma. He uses X-ray diffraction to study the molecular structure of proteins and RNAs important in disease. He writes grant applications, progress reports, manuscripts, lectures, seminars, and talks each year in Emacs. To control his fear of forgetting (FoF), he uses an external document, the writing log, to store metadata about each writing project. He switched from using LaTeX to Org-mode recently. He will discuss the features of the writing log and the joys of editing it in Org-mode.
Transcript
[00:00:00.000] Introduction
Good morning. I'm Blaine Mooers. I'm an associate professor of biochemistry and physiology at the University of Oklahoma Health Sciences in Oklahoma City. I'm going to be talking about the utilization of Org mode to write a specific kind of log file for thinking about writing projects, in particular research articles. I have stored a template for this file on GitHub. You can find it at Mooers Lab. If you go to the landing page and scroll down to Emacs-related, you'll find a link to it. I am a structural biologist. I utilize X-ray crystallography to determine the structures of proteins and nucleic acids that are important in human health. Our workflow is shown across the top. We start out with a purified material that we crystallize as shown by that elongated rod-shaped crystal on the left. We will mount that in a cold stream and collect diffraction data with X-rays in the instrument to the right. That instrument will generate an image like the one to the right where you see a bunch of spots. That's a diffraction pattern from the crystal. After rotating the crystal for one degree, we'll rotate the crystal 180 degrees to get a full data set that we'll process with a computer. This will lead to the chicken-wire map of electron density shown further to the right. Then on the far right, we have compared the structures of two drug molecules from two different structures, overlapped after superimposing a wild type protein and a mutant protein. We're trying to analyze how the mutant was preventing one of the drugs from binding. From these kinds of analyses, we can develop better drugs. In this case, the drugs are being used to treat lung cancer.
[00:02:20.080] Starting a new writing project
When I start a new writing project, I will assign it a number. In this case, I'm developing a review article about the detection of crystals in images collected with microscopes like the image in the upper left. The article is about the utilization of AI to help with that detection of crystals. I start the name of the folder with this index number, and I store the manuscript folders in the top level of my home directory to ease navigation. Whenever I pop open a terminal window, I just enter 0573, hit TAB to autocomplete the name of the folder, and I'll be right in the appropriate folder. I also use that index number to label the names of the files. I start every project with three files: a manuscript, the log file that I'll be talking about today, and an annotated bibliography, which is kind of like one on steroids. Annotated bibliography for the 21st century, not the 20th century annotated bibliography you worked on as an undergraduate. I have developed templates not only for Org Mode, but also for other markup languages, like R Markdown and LaTeX. I actually developed this log file template over a dozen years ago in LaTeX. I also have developed it for Typst. Typst is independent of LaTeX. It's inspired by LaTeX, but it's written in Rust, and it's extremely fast.
[00:04:05.480] The writing log
My writing process involves having the writing log at the center of the process. That's where I begin the writing project. On the right, I have the manuscript with all its components highlighted in yellow. On the left, I have the annotated bibliography.
[00:04:36.960] Starting the research paper
When I start a research paper, I will do this after I have built up a strong idea from various sources, and then I'll sit down and go through a series of steps outlined in the writing log to develop that central hypothesis into several paragraphs that are used in the introduction of the manuscript. The rest of the manuscript is built around that central hypothesis, so the results section will include experiments that address the central hypothesis, and it will exclude experiments that have nothing to do with it. Likewise, the discussion points address the central hypothesis.
[00:05:25.310] Outline
When I'm done developing that introduction in, say, three or four hours, I'll have an outline in hand. At least for the results and discussion section, the outline will be detailed down to at least a sub-heading level. I'll move those components over to the manuscript on the right. As work is done to address that central hypothesis, the manuscript will be updated. Also as exploration of the literature continues, new ideas will flow in to the manuscript through the log file.
[00:06:11.440] Another kind of writing log - accountability
You've probably heard of another kind of writing log, which is more of an accountability tool, a tool you use to hold yourself accountable in terms of your commitment to work on your writing projects. So, this idea of carrying out this documentation is supported by research done by Robert Boice. He found that those academics who record their writing are four times more productive than those that do not. Those that actually share their writing with colleagues are nine times more productive. This is sort of a case in point. This is a snapshot of a Google sheet of such a writing log that I was sharing as part of a Google workbook. I was sharing it with three other colleagues. I had the possibility of them taking a peek at my Google sheet, and that possibility I found to be highly motivating.
[00:07:17.458] Reducing switching costs
As you can see, on July 24th, 2023, I worked on five different writing projects. This would not have been possible if it had not been for having five separate writing logs where I could figure out where I had started and where I would report the day's progress before maybe taking a break and then switching to another writing project. The writing log helps reduce switching costs between projects.
[00:07:46.480] Motivation
My motivation for developing this project-specific log that I'm presenting here is to support clearer thinking about the science that I'm trying to do, hopefully leading to better science, as well as accelerating the completion of the writing project. The secondary purpose is to enable working on multiple writing projects in parallel. This is important to be able to harness your subconscious. If you work on project A for a few hours in the morning, say early morning, then late morning you work on project B. While you're working on project B, your subconscious is busy working away on project A. As a result, perhaps the following morning, when you wake up or while you're taking a shower or commuting, new ideas will emerge for projects A and B as a result of these background jobs that you have launched. If you don't work on project A, then you're not going to get the benefit the following morning. The side effects of using this writing log are that it reduces the fear of forgetting and also reduces the fear of losing momentum. These are two barriers to attempting to carry out work on multiple writing projects in a given day. This problem of dealing with multiple writing projects is one that is not discussed in books about writing. It's apparently a very difficult problem. I think my writing log is a successful solution to that problem.
[00:09:31.520] Overview of the writing log
This is an overview of the writing log in Org mode. It has various components. I don't have time to go through all of them in detail, but you can see its structure. We get this summary view when you open up the file, because you have the startup setting "overview" in the header. Then I just click on the heading and hit TAB to see the contents below. So normally, I'm just going to go straight to the daily log. In this case, it starts on line 944.
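As a rough illustration of the folded summary view described above, a log file might begin as follows. The `#+STARTUP: overview` setting is standard Org; the title and the heading names are assumptions drawn from the talk description, not copied from the speaker's template.

```org
# Sketch only: the title and heading names are assumed, not the speaker's.
#+TITLE: Writing log for project 0573
#+STARTUP: overview

* Project initiation
* Project administration
* Periodic assessment
* Daily log
* To-do list
* Ideas for future projects
* Protocols and guidelines
```

With this setting, opening the file shows only the top-level headings, and pressing TAB on a heading cycles its visibility.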
[00:10:17.295] LaTeX preamble in opened drawer
I don't have to scroll all the way down to it, thanks to the support for folding of these sections in Org mode. If I open up the drawer labeled :PREAMBLE:, you can see that I have imported a number of LaTeX packages to enhance the format of the PDF file that is generated upon export.
[00:10:42.668] Informative header
I have commands that are listed below at the bottom for providing a fancy header. This header has the current date as well as a running title and the current page number and total number of pages. You can see in the center the header at the start of page 2. You can see the bottom of page 1 where the page number is at the bottom of the page. These headers are very useful if you happen to print out several log files and their corresponding manuscripts and take them with you to work on them while traveling. Invariably, the pages will get intermingled, and you'll have to sort them out when you return home. These headers ease that problem. You can see that the table of contents that begin the writing log is hyperlinked to various sections. In addition to the table of contents, the log file, of course, will support various graphical objects like images, tables, equations, code listings. I also have added LaTeX support for an index, a list of acronyms, glossary, mathematical notation, and literature cited. It takes no effort to add these in, so why not have them available? These features are also available in the annotated bibliography template, which helps support making that annotated bibliography far more relevant and interesting.
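The preamble drawer and running header described above could be sketched like this. The talk does not say which LaTeX packages are used, so `fancyhdr` and `lastpage` are assumptions, as is the running title text.

```org
# Sketch only: package choices and the running title are assumptions.
:PREAMBLE:
#+LATEX_HEADER: \usepackage{fancyhdr}
#+LATEX_HEADER: \usepackage{lastpage}
#+LATEX_HEADER: \pagestyle{fancy}
#+LATEX_HEADER: \fancyhf{}
#+LATEX_HEADER: \fancyhead[L]{\today}
#+LATEX_HEADER: \fancyhead[C]{Writing log 0573}
#+LATEX_HEADER: \fancyhead[R]{Page \thepage\ of \pageref{LastPage}}
:END:
```

Keeping these lines in a drawer lets them fold out of sight while editing. On a title page, LaTeX normally switches to the plain page style, which would account for page 1 carrying its number at the bottom rather than the fancy header.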
[00:12:21.400] Four workflows
This shows a list of four workflows that I'm going to discuss, since I don't have time to go through each of the items. Obviously, project initiation occurs on day one. If I have a three- or four-hour block of time, that's sufficient to finish project initiation. Then the daily workflow is obviously what occurs every day to move the project forward. The periodic assessments are done on a monthly or weekly basis, generally on the weekly basis as the submission deadline approaches. Then after you have received the galley proofs and sent them back, there are a few chores that need to be done in terms of project closeout. This is an example of a protocol that could be followed to do that, and an example of the kinds of more or less appendix material that could be included in the writing log to help get these things done.
[00:13:28.080] Project initiation workflow
This shows a project initiation section of the workflow. I go through a series of sections that include advice about what I need to do to complete each section. The rationale section asks me things like: why are you doing this? Why should you do this? Why not somebody else? Those sorts of fundamental questions. Then I have a drawer labeled guidance, and on the headline immediately above it, I have this :noexport: tag so that the guidance is not written out upon export to the PDF unless you want it. If you want it, you have to remove the :noexport: tag. Then I have the response to these questions: in this case, a list of journals that I'm targeting for submission of this review article. I have a plan B journal picked out in case the editors decide to reject it. Having a plan B journal picked out is a decision you can make at the time of submission, so that you're prepared to move quickly if the article is rejected.
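A minimal sketch of the guidance-and-response pattern described above follows. The heading names, the drawer contents, and the list entries are placeholders, not text from the speaker's template; only the `:noexport:` tag and the drawer syntax are standard Org.

```org
# Sketch only: headings and wording are placeholders.
** Target journals
*** Guidance                                                  :noexport:
:GUIDANCE:
List the journals you are targeting, and pick a plan B journal.
:END:
*** Response
1. First-choice journal
2. Plan B journal
```

Because the Guidance headline carries the `:noexport:` tag, its whole subtree is left out of the exported PDF, while the Response subtree is exported as usual.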
[00:14:56.960] Daily workflow
This shows the daily workflow section. Each entry has a date. I sometimes annotate the dated entries with a small phrase to highlight certain events. Within a given entry, I'll have a list of accomplishments. That's sort of the bare minimum of what I include. This just demonstrates how relatively brief these entries are. Just whatever distinct accomplishments were made are listed. Sometimes I'll include the goals for that day. I'll always include the correspondence related to the project. I'll copy and paste an email into a quote environment from LaTeX. I have a snippet template for auto-generating these entries. It will insert the date, for example, in the subheading. Then below that, I'll have the next action, following David Allen's Getting Things Done approach where you identify the next thing that needs to be done. That may have come from a to-do list that's indicated below that. Beyond that, there's sections for some writing accountability, and then a reminder to go about updating your Zettelkasten and Org-roam if you have come across any nuggets of knowledge you want to add to your Org-roam. Then below that, there's another section for the storage of additions to be made to the manuscript. Maybe they're not ready to go yet, so this provides a spot for them to be incubated, a sandbox, if you will, where you have room to develop them further before they're ready to be transferred over to the main manuscript. I also have a section there too for the incubation of new ideas for new projects.
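One dated entry of the kind described above might look like the following. This is a sketch of what a snippet could generate, assuming these subheading names; the speaker's actual snippet is not shown in the talk. The quote block is Org's counterpart of the LaTeX quote environment.

```org
# Sketch only: subheading names and placeholder text are assumptions.
** 2024-12-07 Saturday
*** Accomplishments
- One line per distinct accomplishment.
*** Correspondence
#+BEGIN_QUOTE
Pasted email text goes here.
#+END_QUOTE
*** Next action
*** To-do
- [ ] Open-ended task
*** Additions to the manuscript (incubating)
*** Ideas for new projects
```

Reading the most recent entry and its next action is usually enough to resume work after a break.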
[00:17:05.751] Metadata and metacognition
So this kind of metadata and metacognition about the project is often stored in commented-out regions or in comments, as in MS Word documents. These are often stripped out in the rush to submit the manuscript, and they're quite often lost. Yet they can be invaluable, not only for the preparation of future manuscripts but also for responding to critiques by reviewers. This writing log provides ample room for the safe storage of such information, such knowledge.
[00:17:48.885] Periodic assessment workflow
Then periodically, every several months or weeks, we'll carry out an assessment of the project. We go through a checklist for the completion of the manuscript. We also have a timeline with milestones identified. Of course, Org has these wonderful tables that are very dynamic. If you need a wider column to accommodate a new entry, it self-adjusts. These self-adjusting tables are one reason why I was attracted to Org mode, coming from LaTeX, where trying to make changes to tables is quite difficult. Below that, there's a section to make assessments. There are four questions that I address about the status of the project. One really good question is, why can't you submit this project today? What's holding it back? Other such existential questions are important to ask from time to time.
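The timeline and assessment parts described above could be laid out roughly as follows. The milestone names and date placeholders are illustrative assumptions; only the submission question is taken from the talk.

```org
# Sketch only: placeholder milestones and dates, to be replaced.
** Timeline
| Milestone         | Target date | Done |
|-------------------+-------------+------|
| Outline complete  | YYYY-MM-DD  |      |
| First full draft  | YYYY-MM-DD  |      |
| Submit to journal | YYYY-MM-DD  |      |

** Assessment
- Why can't you submit this project today? What's holding it back?
```

Pressing TAB inside the table realigns it, so a column widens on its own when a longer entry is typed.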
[00:18:56.960] Project closeout workflow
Then finally, the project closeout workflow. So this is in the form of a checklist. This checklist in the main template is already included, but you could include it from an external file. Of course, that checklist will be only in the PDF when it's included in this fashion. It won't be in the Org file, but you can view that checklist by clicking on its file path. It serves as a link that will open up in an Org buffer. The advantage of taking a modular approach to this sort of appendix material is that you can update your protocols and the updated protocols will be available to all log files across all projects.
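The modular approach described above, where a shared checklist is pulled in at export time, can be sketched with Org's `#+INCLUDE` keyword. The file path below is hypothetical.

```org
* Project closeout
# Hypothetical path to a shared checklist file; adjust to your own layout.
#+INCLUDE: "~/protocols/closeout-checklist.org"
```

With point on the `#+INCLUDE` line, `C-c '` visits the included file, so one copy of the protocol can be edited and then reused by every log file that includes it.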
[00:19:49.640] Conclusions
In conclusion, this project-specific log file helps narrow the focus on one project. It provides space to harbor the thinking about that project, and it helps support the project initiation, sustain its momentum, and facilitate its completion. The side effects of using this log file for one project are that it dampens the fear of forgetting and the fear of losing momentum, which inhibit us from working on more than one project in a given day.
[00:20:34.520] Acknowledgements
I would like to thank my friends at the Oklahoma Data Science Workshop. We hold this workshop every third Friday at noon central time by Zoom. It's open to participation by people from all around the world. Send me an email if you are interested in the applications of computing to scientific research. I participate occasionally in these Emacs meetups, and I have shared this writing log with members of the UK Research Software Engineer group through the Emacs Research Slack channel. My efforts are supported by funding from these grants. I'll be happy to take any questions.
Captioner: sachac
Questions or comments? Please e-mail blaine-mooers@ouhsc.edu