Back to the schedule
Previous: Babel for academics
Next: Reproducible molecular graphics with Org-mode

Managing a research workflow (bibliographies, note-taking, and arXiv)

Ahmed Khaled


Q&A: maybe live
Duration: 8:47

This talk was also streamed at an alternate time for APAC hours: https://libreau.org/past.html#emacsconf21

If you have questions and the speaker has not indicated public contact information on this page, please feel free to e-mail us at emacsconf-submit@gnu.org and we'll forward your question to the speaker.

00:00 Introduction
00:51 Elfeed
02:30 org-ref
03:50 BibLaTeX
05:48 Notes and org-roam

Description

Configuration I use in Doom Emacs as part of my academic reading/notetaking workflow

Researchers and knowledge workers have to read and discover new papers, ask questions about what they read, write notes and scratchwork, and store much of this information for use in writing papers and/or code. Emacs allows us to do all of this (and more) using simple text interfaces that integrate well together. In this talk I will talk about the following:

a. Using elfeed and elfeed-score to read new papers from arXiv.
b. Using org-ref to import arXiv papers of interest into a local bibliography.
c. Using Emacs hooks with biber and rebiber in order to keep the local bibliography clean and up-to-date with conference versions of papers.
d. Using org-roam and org-roam-bibtex to take linked, searchable notes in org on research papers.

This text-based workflow allows for keeping everything accessible under version control and avoids the platform lock-in of binary formats (e.g. Mendeley). I will share my Doom Emacs configuration for this workflow, but it is not limited to Doom.

Discussion

  • Are there any good packages for emacs/Lisp libraries that are similar to Matplotlib/Pyplot/Numpy?
    • use numpy with org-mode and babel
      • plotting is a bleak spot in the lisp space, racket has a built in plot library that is probably the best on that front
  • are these helper functions public?
  • this talk just gave me an idea, I organize repos inside ~/code/{github.com,gitlab.com,gnu.org,etc}/author@repository-name.git - and I can instead use a single directory and use this strategy for projectile-switch-project where author is one column, repository name is another, git remote is another, etc

Outline

  • 5-10 minutes: I will demo the packages I use in 5 minutes.

Transcript

[00:00:00.480] Hello, everyone. My name is Ahmed and I am very happy to be here. Today I'll talk about my academic workflow inside Emacs. So the main needs that I have is to keep up with the current research in my field, and to be able to take notes, and write, and use these notes later in writing my papers inside Emacs. Emacs is a great program for this because it is very extendable and we can basically write whatever we are missing. It replaced my earlier proprietary workflow that involved using Mendeley and Visual Studio Code and many other tools in order to do the things that I'll show today.

[00:00:51.760] So the first concern that I have is to keep up with new papers. To do that, I use this package called elfeed. Elfeed is basically just an RSS reader, and here I fetch all the papers that I need from arXiv, which is the main source of papers in my field and many other scientific fields. It allows me to view these papers with the abstracts and so on. In order to simplify viewing and searching for relevant papers, I used this other package called elfeed-score, and elfeed-score enables me to assign a numerical score like this to each of the arXiv entries. This numerical score is very simple. It's just based on matching things. So, for example, we can ask elfeed to explain this. So if we say = x, then this just says that this paper matches three rules for a score of 76. This is simply because I am searching for these keywords that are very interesting to me, such as neural networks or federated learning. And now, if I see a paper here that I am interested in...
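
Editor's sketch of the scoring idea described above: each keyword rule that matches an entry adds its points to the entry's score, and an "explain" view lists the rules that matched. This is a minimal illustration in plain Emacs Lisp, not elfeed-score's implementation or its rule-file format; the names `my/keyword-rules`, `my/score-text`, and `my/explain-score`, and the point values, are hypothetical.

```elisp
;; Illustrative only: NOT elfeed-score's code or rule format.
;; All names and point values here are hypothetical.
(require 'seq)

(defvar my/keyword-rules
  '(("neural network" . 30)
    ("federated learning" . 40))
  "Alist of (REGEXP . POINTS) matched against an entry's title or abstract.")

(defun my/score-text (text &optional rules)
  "Return the sum of POINTS for every rule in RULES whose REGEXP matches TEXT.
Matching ignores case.  RULES defaults to `my/keyword-rules'."
  (let ((case-fold-search t)
        (score 0))
    (dolist (rule (or rules my/keyword-rules) score)
      (when (string-match-p (car rule) text)
        (setq score (+ score (cdr rule)))))))

(defun my/explain-score (text &optional rules)
  "Return the rules in RULES that match TEXT, as a simple explanation."
  (let ((case-fold-search t))
    (seq-filter (lambda (rule) (string-match-p (car rule) text))
                (or rules my/keyword-rules))))
```

With these example rules, a title mentioning both federated learning and neural networks collects the points of both rules, while an unrelated title scores zero.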

[00:02:30.239] Let's say I'm interested in this paper about Gaussian Process Inference, then I want to store it in my local library. So I want the PDF and I want to be able to cite it in the future. To do that, I use a package called org-ref that allows me to fetch papers from arXiv. So here I wrote a helper function, this elfeed-entry-to-arxiv that automatically gets this paper. It asks me where to put it, it completes with my default libraries, and then it fetches the paper from arXiv and places it in this folder, and also places it in my bibliography file which is written in BibLaTeX. So here, if we search for this paper now, we find that it is in our library. This library interface is from a package called citar, and I have customized it quite a bit to display all of the papers in my library in this format.
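
Editor's sketch of what such a helper might look like; it is not the speaker's `elfeed-entry-to-arxiv`. It assumes that elfeed's `elfeed-entry-link` accessor returns the entry URL and that org-ref's `arxiv-get-pdf-add-bibtex-entry` takes an arXiv number, a bibliography file, and a PDF directory. The names `my/arxiv-id-from-url` and `my/elfeed-entry-to-library` are hypothetical, and only new-style identifiers such as 2110.01234 are handled.

```elisp
;; Hypothetical helper names; assumes elfeed and org-ref are loaded
;; before `my/elfeed-entry-to-library' is actually called.

(defun my/arxiv-id-from-url (url)
  "Return the new-style arXiv identifier in URL without its version, or nil."
  (when (string-match
         "arxiv\\.org/\\(?:abs\\|pdf\\)/\\([0-9]\\{4\\}\\.[0-9]\\{4,5\\}\\)"
         url)
    (match-string 1 url)))

(defun my/elfeed-entry-to-library (entry bibfile pdfdir)
  "Fetch the arXiv paper behind elfeed ENTRY into BIBFILE and PDFDIR.
Assumption: org-ref provides `arxiv-get-pdf-add-bibtex-entry' with the
argument order (ARXIV-NUMBER BIBFILE PDFDIR)."
  (let ((id (my/arxiv-id-from-url (elfeed-entry-link entry))))
    (unless id
      (user-error "This entry does not look like an arXiv paper"))
    (arxiv-get-pdf-add-bibtex-entry id bibfile pdfdir)))
```

Keeping the identifier extraction in its own pure function makes it easy to check without a network connection or either package installed.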

[00:03:50.560] This just reads from a BibLaTeX file. So if we open it like this, you'll see that this is the entry that it placed. One of the interesting things here is that org-ref actually doesn't really fetch all of the entries in this format. Moreover, I want all the entries in my file to look quite similar, and to have this very similar look, and the way I accomplish that is by using several tools and chaining them. So in order to see this... So here, this is the function that I used to... This is basically run as a hook after each time Emacs modifies the bibliography file, and it runs rebiber which gets the conference versions of papers that I fetch from arXiv, because arXiv is a pre-print directory, and then biber normalizes the arXiv file to have a consistent look, and then I apply just some substitutions which I like more. Finally, I have the whole thing under version control. This function, reformat-bib-library, I make it into a hook and I run it every time after I save. It just checks if the current buffer is the main bib library. We will just reformat the library. This allows me to keep the library looking all consistent like this. By the way, all of the code is available. You don't have to get it from the video. I will attach it as a GitHub gist.
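
Editor's sketch of the hook pattern described above, not the speaker's `reformat-bib-library` (see the gist they mention for that). It assumes the `rebiber` and `biber` executables are on the PATH, that `rebiber -i FILE` rewrites FILE in place, and that biber's tool mode accepts the formatting flags shown. The path in `my/main-bib-file` and all `my/` names are hypothetical; the extra substitutions and the version-control step from the talk are omitted.

```elisp
;; Hypothetical names and path; command-line flags are assumptions.
(defvar my/main-bib-file (expand-file-name "~/bib/library.bib")
  "Hypothetical path of the main bibliography file.")

(defun my/main-bib-buffer-p ()
  "Return non-nil when the current buffer visits `my/main-bib-file'."
  (and buffer-file-name
       (string= (file-truename buffer-file-name)
                (file-truename my/main-bib-file))))

(defun my/reformat-bib-library ()
  "After saving the main bibliography, run rebiber and then biber on it."
  (when (and (my/main-bib-buffer-p)
             (executable-find "rebiber")
             (executable-find "biber"))
    (let ((bib (file-truename my/main-bib-file))
          (log "*bib-reformat*"))
      ;; Step 1: replace arXiv entries with their published versions.
      (call-process "rebiber" nil log nil "-i" bib)
      ;; Step 2: normalize alignment, indentation, and field case.
      (call-process "biber" nil log nil
                    "--tool" "--output-align" "--output-indent=2"
                    "--output-fieldcase=lower" "-O" bib bib)
      ;; Reload the rewritten file without prompting.
      (revert-buffer t t t))))

(add-hook 'after-save-hook #'my/reformat-bib-library)
```

Because the hook returns immediately for every buffer except the main bibliography, it can stay on the global `after-save-hook` without slowing down ordinary saves.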

[00:05:48.720] One of the things that are really important is that I want to be able to keep notes on papers that I read. For example, here are some of my existing notes. Now, let's add a note to the paper that we just got. So the pipeline here is that I use citar with embark, which is another library, but you can use any other library just for completion and acting upon completion, like ivy, and I ask it to open notes and then it asks me how to capture it. So these capture templates are handled by the org-roam package, which is a very, very interesting package for note-taking. org-roam, among other things, allows us to write linkable notes in Org mode, and moreover, it is very extensible. There is another package called org-roam-bibtex that allows us to attach these nodes to bibliography files, which is what I'm doing right now. For example, I set up the capture template such that when I press s for short bibliography reference, it will make a new headline in my "Reference Notes" note, and I can write things here (so, for example, "seems interesting") and then note here that it added this paper to ROAM_REFS, so this means that when I look at these papers using citar, it will be able to find this note. Similarly, we can also add long-form notes. For example, if I do this and I add r, it will create an entirely new file that I can take detailed notes in. The strength of org-roam is that I can do things like linking papers. For example, here are several books that I am reading. This file just collects these books so that I can find them for easy reference. Of course, I can link these files from inside. You can see here that I also use org-cite to cite other files, and I can act upon this and open the notes corresponding to this other book. So I'm a little short on time. I cannot go into detail on everything, but I will share my configuration, and I hope that this will inspire other people to also use Emacs for their academic workflows. Thank you so much.
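
Editor's sketch of a long-form capture template of the kind bound to r above. It is a configuration fragment, not the speaker's configuration, and it assumes org-roam v2 together with org-roam-bibtex, which is assumed to supply the `${citekey}` placeholder when a note is opened for a bibliography entry. The `references/` directory is a hypothetical choice, and the short-reference template bound to s is not reproduced here.

```elisp
;; Assumes org-roam v2 and org-roam-bibtex are installed and loaded.
;; Note: `setq' replaces any capture templates that are already defined.
(setq org-roam-capture-templates
      '(("r" "bibliography reference (long-form note)" plain "%?"
         ;; One file per paper, named after its citation key.
         :target (file+head "references/${citekey}.org"
                            "#+title: ${title}\n")
         :unnarrowed t)))
```

Naming each file after the citation key keeps the mapping between a bibliography entry and its note predictable, which is convenient when the notes directory is under version control.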
captions by sachac
