Project workflow at SporeData.


Over the years, we have progressively polished a project workflow together with our collaborators in academia, government, and the corporate world. Below we provide some of the pillars behind this workflow.

  1. Written summary of the project. At the beginning of each project, we provide a written summary outlining the problem we are addressing and its importance, the methods used to address that problem, and our deliverables along with their expected impact. In essence, this summary is an abstract in which the Results section is replaced with a mock version outlining the format of the final results. The summary ensures that all collaborators understand, from the beginning, what the project will accomplish, and it serves as a guide for all subsequent steps in the project.
  2. Weekly deliverables. Once the project starts, we send our collaborators weekly updates. Collaborators review these updates, and we adjust the project in response to their feedback. These iterative cycles are crucial to ensure that the final project meets our collaborators’ expectations.
  3. Mock results. When faced with complex datasets requiring extensive and time-consuming data management, we usually cannot promise immediate results. In these situations, we will often generate mock results through simulation analyses. Because mock results are based on simulated data, they bear no relationship to the final results, which are based on real-world data. Even so, they get both our collaborators and ourselves to think about the overall analytical strategy and about which variables and datasets we might be missing.
  4. Responsiveness to feedback. The primary reason for our short, iterative cycles is that they keep us in sync with our collaborators’ expectations. Without them, each collaborator would likely have a different idea about the project’s focus and ultimate deliverables. These iterative loops also keep the project flexible. For example, if the initial design turns out to be unrealistic, this workflow allows us to change course while still delivering a successful project.
  5. Dynamic projects with flexible payment schedules. To align our dynamic workflow with a payment model, we have developed the concept of Data Science support. In this model, our sponsors pay for a certain number of person-hours from our Data Scientists. We keep an online spreadsheet that is constantly updated to show where our Data Scientists spent these hours. Compared with hiring in-house staff, this arrangement is highly cost-efficient for our sponsors, since they pay only for the time spent on their projects and can audit how our Data Scientists spent that time. Note that we do not apply the support model when the project responds to a competitive proposal in which we have worked under an “at-risk” model.
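
As a rough illustration of the mock results described in item 3, the sketch below simulates a small two-group dataset and prints a results table in the format a final report might use. It is a minimal sketch and not our actual pipeline: the group names, sample size, means, and standard deviation are hypothetical placeholders, and the function names `simulate_mock_results` and `format_table` are invented for this example.

```python
import random
import statistics


def simulate_mock_results(n_per_group=200, seed=42):
    """Simulate a hypothetical two-group study and summarize it.

    All parameters are arbitrary placeholders: the output shows the
    shape of a results table and says nothing about real data.
    """
    rng = random.Random(seed)  # fixed seed so the mock table is reproducible
    groups = {
        "control": [rng.gauss(50.0, 10.0) for _ in range(n_per_group)],
        "treatment": [rng.gauss(55.0, 10.0) for _ in range(n_per_group)],
    }
    rows = []
    for name, values in groups.items():
        rows.append({
            "group": name,
            "n": len(values),
            "mean": round(statistics.mean(values), 1),
            "sd": round(statistics.stdev(values), 1),
        })
    return rows


def format_table(rows):
    """Render the summary rows as a plain-text table."""
    lines = [f"{'group':<10} {'n':>5} {'mean':>7} {'sd':>6}"]
    for r in rows:
        lines.append(
            f"{r['group']:<10} {r['n']:>5} {r['mean']:>7.1f} {r['sd']:>6.1f}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print("MOCK RESULTS (simulated data, not real findings)")
    print(format_table(simulate_mock_results()))
```

Reviewing a table like this early on tends to surface questions such as whether the grouping variable exists in the available datasets, or whether additional covariates are needed, well before the real data are ready.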