Writing up a PhD: some numbers
Today I finally submitted my PhD thesis (🎉) and what better way to celebrate that day than doing some data analysis and visualization, right? So, let’s get some idea of how much time the actual write-up took and how I spent my time during these weeks. Luckily, I’m running RescueTime on my machines. The service allows you to monitor how, when and for how long you’re using your computer. It also keeps track of which programs you’re using and which websites you’re visiting, and even classifies these into very productive, productive, neutral, distracting and very distracting usage. In short: Writing, good. Facebook, bad.
Thanks to the RescueTime API it’s rather easy to export this data and end up with a nice spreadsheet. If Python is your language of choice, there is even an easily usable example on GitHub. As I did more or less daily Git commits of my writing progress, I know that my writing process (and all the related paperwork, which included a bazillion forms) took from 2017-07-10 to 2017-09-20. Let’s have a look at how much time I spent working during this period. To have a baseline for comparison I quickly grabbed data for a similar timeframe (from 2017-05-03 to 2017-07-03, bounded by traveling before and after). I plotted the cumulative time I worked during these two periods, categorized by the alleged productivity levels, which shows some fun differences between the time before the writing (subfigure A below) and during the writing process (subfigure B below). Not only has the total productive time doubled during the writing period, the weekends spent writing also show up as slopes that stay constant over long stretches.
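For anyone who wants to build a similar cumulative plot, here is a minimal sketch of the bookkeeping in plain Python. It is not the code behind the figures: the row layout `(day, productivity level, seconds)` and the function name `cumulative_hours` are assumptions for illustration, and the sample rows are made up rather than taken from my export.

```python
from collections import defaultdict
from datetime import date, timedelta


def cumulative_hours(rows, start, end):
    """Return {level: [(day, cumulative_hours), ...]} for every day in [start, end].

    `rows` is assumed to be an iterable of (day, level, seconds) tuples,
    a hypothetical layout and not necessarily what an export looks like.
    """
    # Sum the logged seconds per (level, day), ignoring rows outside the range.
    per_day = defaultdict(float)
    levels = set()
    for day, level, seconds in rows:
        if start <= day <= end:
            per_day[(level, day)] += seconds
            levels.add(level)

    n_days = (end - start).days + 1
    result = {}
    for level in sorted(levels):
        running = 0.0
        series = []
        # Walk every calendar day so that days without logged time
        # show up as flat stretches in the curve.
        for offset in range(n_days):
            day = start + timedelta(days=offset)
            running += per_day.get((level, day), 0.0)
            series.append((day, running / 3600))
        result[level] = series
    return result


# Made-up sample rows, only to show the shape of the result.
rows = [
    (date(2017, 7, 10), "very productive", 7200),
    (date(2017, 7, 10), "distracting", 1800),
    (date(2017, 7, 12), "very productive", 3600),
]
curves = cumulative_hours(rows, date(2017, 7, 10), date(2017, 7, 12))

# Length of the writing period in days.
writing_days = (date(2017, 9, 20) - date(2017, 7, 10)).days
```

Each list in `curves` can be handed to any plotting library as one line per productivity level. Because every calendar day gets a point, a weekend without work appears as a plateau, while a weekend spent writing keeps the curve climbing.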
We can also have a quick look at which activities kept me most busy during the 72 days of writing (subfigure A below). Don’t judge me, but I did actually end up writing my thesis in Microsoft Word. If you’re a LaTeX follower and are mad now, maybe have a look at this paper before judging me too harshly (if you’re a dedicated Word user and feel smug now, look at this critique of the paper). In any case, not really surprisingly, I spent around 30% of my time somehow interacting with MS Word. Even combined, my second and third most heavily used pieces of software during this time, Terminator (my favorite terminal on Linux) and iTerm2 (my favorite terminal on Mac OS), don’t come close in total usage time. There’s also some offline time (which you can log manually with RescueTime), like the Bioinformatics Open Source Conference, which took place in Prague in July, and further PhD-related offline duties, such as journal clubs, group seminars, and general mentoring things. If one looks at the cumulative hours spent on some of the activities (subfigure B below, the grey backdrop is the cumulative sum of productive time) one can nicely see how basically all other things give way to the final writing and editing.
Still, there’s a lot of time missing from these plots: time that is categorized as productive but didn’t make the cut for the top-ranking tools. This is somewhat expected, as RescueTime classifies everything web-related at the level of individual websites (how else would it know when I’m slacking off, tweeting about whatever?). For that reason, all literature research on the web is basically absent, as there are a thousand different scientific journals, each of which ends up as its own tiny activity. Luckily there’s some higher-level classification provided in RescueTime as well, and at the Category level this can be nicely seen. Here, things like iTerm2, Terminator, Atom, GitHub etc. are all grouped into General Software Development. And it is basically only now that the category of General Reference and Learning pops up, made up of lots of visits to all the different journals.
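The effect of this roll-up is easy to reproduce in a small sketch. RescueTime provides the categories itself, so the hand-written `CATEGORY` mapping, the `hours_by_category` helper, the placeholder journal domains and all the hours below are assumptions made up purely for illustration.

```python
from collections import Counter

# Hypothetical activity -> category mapping; the real assignment comes
# from RescueTime, this one is written by hand for the example.
CATEGORY = {
    "iTerm2": "General Software Development",
    "Terminator": "General Software Development",
    "Atom": "General Software Development",
    "GitHub": "General Software Development",
    "journal-a.example": "General Reference and Learning",
    "journal-b.example": "General Reference and Learning",
    "journal-c.example": "General Reference and Learning",
}


def hours_by_category(activity_hours, mapping, default="Uncategorized"):
    """Sum the hours of all activities that share a category."""
    totals = Counter()
    for activity, hours in activity_hours.items():
        totals[mapping.get(activity, default)] += hours
    return totals


# Made-up hours: each journal site is tiny on its own.
activity_hours = {
    "Terminator": 4.0,
    "iTerm2": 3.0,
    "Atom": 2.0,
    "journal-a.example": 1.5,
    "journal-b.example": 1.0,
    "journal-c.example": 1.5,
}
by_category = hours_by_category(activity_hours, CATEGORY)
ranked = by_category.most_common()  # categories sorted by total hours
```

In this toy data no single journal site beats even the least-used development tool, yet summed up the reading time becomes a visible category, which is the same thing that happens in the real plots.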
It’s fun to track your own writing progress like this and to be able to get a retrospective picture of the time put in. For the next thesis I’d totally do that again. I’ll deliver some plots on how much coffee I had as soon as I can calculate those numbers. 😂