Phenotyping: high precision in the creation of patient cohorts based on electronic health record data.

SporeData services

Jack is on a mission. He is trying to build a cohort of Intensive Care Unit patients diagnosed with a ruptured cerebral aneurysm. Jack is excited, since the cohort would become a significant platform for his research program, but the challenges of selecting patients from an electronic health record (EHR) keep mounting. For one, ICD-10 codes in the EHR are not entirely reliable: they either miss cases (false negatives) or include patients who do not have an aneurysm (false positives). [Read More]
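To make the two kinds of error concrete, here is a minimal sketch that compares a cohort selected by ICD-10 code against a chart-review reference. The function name, the patient identifiers, and the use of chart review as the reference standard are all assumptions made for illustration; they are not taken from the post or from any SporeData pipeline.

```python
def code_accuracy(flagged_by_code, confirmed_by_review):
    """Compare a code-based cohort with a chart-review reference (hypothetical helper)."""
    flagged = set(flagged_by_code)
    confirmed = set(confirmed_by_review)

    true_pos = flagged & confirmed    # coded and truly have the condition
    false_pos = flagged - confirmed   # coded, but no condition on review
    false_neg = confirmed - flagged   # condition present, but code missing

    # Sensitivity: share of true cases that the code captures.
    sensitivity = len(true_pos) / len(confirmed) if confirmed else float("nan")
    # Positive predictive value: share of coded patients who are true cases.
    ppv = len(true_pos) / len(flagged) if flagged else float("nan")

    return {
        "false_positives": sorted(false_pos),
        "false_negatives": sorted(false_neg),
        "sensitivity": sensitivity,
        "ppv": ppv,
    }


# Made-up patient identifiers, for illustration only.
flagged = ["p01", "p02", "p03", "p04"]
confirmed = ["p02", "p03", "p04", "p05", "p06"]
result = code_accuracy(flagged, confirmed)
print(result)
```

In this toy example the code-based cohort wrongly includes one patient and misses two, so both errors would have to be addressed before the cohort could support a research program.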

Data analysis workflow at SporeData

SporeData services

Researchers often ask us about our data analysis workflow. Most of our projects fall into two main categories: first, projects that start after a proposal is awarded; second, projects in which we provide Data Science support for research teams, departments, health systems, companies, governments, or other organizations. Here are our main principles: Continuous delivery. The overarching principle behind our workflow is continuous delivery. In other words, we deliver a new set of results every week, usually on a Friday. [Read More]

Project workflow at SporeData.


Over the years, we have progressively polished a project workflow together with our collaborators in academia, government, and the corporate world. Below we provide some of the pillars behind this workflow. Written summary of the project. At the beginning of each project, we provide a written summary outlining the problem we are addressing and its importance, the methods used to address that problem, and our deliverables along with their expected impact. [Read More]

REDCap and custom item banks for Computerized Adaptive Testing (CAT).

Novel methods

Although PROMIS (Patient-Reported Outcomes Measurement Information System) has created several Computerized Adaptive Testing (CAT) systems, at this point, most of them target general conditions. In contrast, the literature has repeatedly demonstrated that condition-specific assessment tools tend to be more sensitive. This characteristic translates into a higher probability of detecting real differences between treatments, as well as smaller sample sizes in clinical trials. We clean and prepare your dataset. This phase of the project ensures that your dataset is in a format that is ready for analysis. [Read More]

Machine learning calculators

Novel methods

We build state-of-the-art machine learning-based calculators with the following characteristics: Specific to your patient population. Data scientists create your predictive model based on data from your patient population. This population-specific data helps make the predictions as precise as possible for your patients. We build different models for a range of patient outcomes, including mortality, complications, readmissions, and cost. Personalized risk calculation. After our models demonstrate adequate precision, we turn them into prediction calculators. [Read More]

Risk sharing model for grant proposal writing.

SporeData business models

Below we explain our risk-sharing model for research proposals: Risk sharing concept. The concept behind risk-sharing is that when we come in as a subcontractor in your proposal, we will be sharing the risk of the proposal not being funded. In other words, if the funding agency awards the project, SporeData is paid to deliver the Data Science services outlined in the proposal. If the agency does not award the project, then you have no costs, and we will usually attempt to reshape the proposal and resubmit. [Read More]

Data Science support.

SporeData services

SporeData provides Data Science support services to a variety of groups in the US, including academic, government, and startup teams. Below we give details on these services. Traditional statistics. We cover a wide spectrum of traditional statistics, including longitudinal analyses (survival and mixed models), spatial statistics, and causal models. Machine learning models. We have a mature pipeline to process machine learning models at three levels: prognostic models (precision medicine), Natural Language Processing (conversion of free text to a spreadsheet format), and image processing (automated recognition of imaging signs and diagnoses). [Read More]

Improving the quality of communication in grant proposals.

Novel Designs

We have been working on a series of improvements to our grant writing pipeline and thought it would be useful to share these features with our collaborators. Plots and tables to represent preliminary work. We have now further integrated our AI-based research design system, which we mentioned in an earlier post, to create tables and plots based on preliminary work relevant to the grant project. [Read More]

Communicating results through transmedia.

Novel Designs

Transmedia means that the same healthcare information is conveyed to patients through different forms of media, with each medium adding a different perspective. For example, while an infographic might list risk factors that a patient should avoid, a video can tell a story about how a specific patient changed her behavior to avoid those same risk factors. Below we describe how SporeData creates transmedia: Clinical topics. Our transmedia covers therapeutic options and their indications, risk factors, causes, and prognosis. [Read More]

Affordable multisite trials

Novel Designs

At SporeData, we do not focus on data collection. Instead, we dedicate all of our time to the design and analysis of studies. This focus means that we can design and analyze your trial while also training your company staff to monitor each participating site. Since we are not coordinating the data collection, your costs are significantly lower. Here are some details about our services: Research design. Our trial designs cover a wide range of options, including patient-centered methods, cluster trials such as stepped-wedge designs, Bayesian adaptive, n-of-1, equivalence, and cross-over trials, among others. [Read More]