Survival Analysis

Understanding how long it takes for an event to occur is important for many applications. For predictive maintenance, for example, we want to know which pieces of hardware have the shortest time until failure. For customer support, we want to know how long it takes to resolve new tickets. Survival analysis, also known as time-to-event modeling, duration analysis, and reliability analysis, is the family of tools that helps us answer these kinds of questions.

Applications of survival analysis (that aren't clinical research)

Survival analysis has been a standard tool for decades in clinical research, but data scientists in other domains have mostly ignored it. Here are some applications for which you might use survival analysis, to jump-start your creative data science engine.

How to compute Kaplan-Meier survival curves in SQL

Decision-makers often care how long it takes for important events to happen. In this article, I show how to compute Kaplan-Meier survival curves and Nelson-Aalen cumulative hazard curves directly in SQL, so you can answer time-to-event questions in your SQL-based analytics tables and dashboards.

How to construct survival tables from duration tables

The survival table is the workhorse of univariate survival analysis. Previously, I showed how to build a duration table from an event log; now I show how to take the next steps, from duration table to survival table and estimates of survival and hazard curves.
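
As a rough illustration of that step, here is a minimal pandas sketch that turns a small duration table into a survival table, with Kaplan-Meier survival and Nelson-Aalen cumulative hazard estimates. The data, the column names (`duration`, `observed`), and the function `build_survival_table` are all made up for this example; the article itself may organize the computation differently.

```python
import pandas as pd

# Hypothetical duration table: one row per subject.
# observed=False means the subject was censored (event not yet seen).
durations = pd.DataFrame({
    "duration": [2, 3, 3, 5, 5, 8],
    "observed": [True, True, False, True, False, False],
})


def build_survival_table(df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate a duration table into a survival table."""
    grouped = df.groupby("duration")["observed"]
    table = pd.DataFrame({
        "events": grouped.sum(),                      # events at each duration
        "censored": grouped.count() - grouped.sum(),  # censored at each duration
    })

    # Subjects at risk at a given duration are all subjects minus those
    # who left the study (event or censoring) at strictly earlier durations.
    exits = table["events"] + table["censored"]
    table["at_risk"] = len(df) - exits.cumsum().shift(1, fill_value=0)

    # Discrete hazard, Kaplan-Meier survival, Nelson-Aalen cumulative hazard.
    table["hazard"] = table["events"] / table["at_risk"]
    table["survival"] = (1 - table["hazard"]).cumprod()
    table["cumulative_hazard"] = table["hazard"].cumsum()
    return table


survival_table = build_survival_table(durations)
print(survival_table)
```

Each row of the result corresponds to one distinct duration, so the `survival` column can be plotted directly as a step function.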

How to plot survival curves with Plotly and Altair

Survival curve plots are an essential output of survival analysis. The main Python survival analysis packages show only how to work with Matplotlib and hide plot details inside convenience functions. This article shows how to draw survival curves with two other Python plot libraries, Altair and Plotly.

How to build duration tables from event logs with SQL

Duration tables are a common input format for survival analysis but they are not trivial to construct. In our last article, we used Python to convert a web browser activity log into a duration table. Event logs are usually stored in databases, however, so in this article we do the same conversion with SQL.

How to convert event logs to duration tables for survival analysis

Survival models describe how much time it takes for some event to occur. This is a natural way to think about many applications but setting up the data can be tricky. In this article, we use Python to turn an event log into a duration table, which is the input format for many survival analysis tools.
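
To make the idea concrete, here is a minimal pandas sketch of the conversion, under simplifying assumptions: each user enters the study at their first logged event, the target event is a hypothetical `purchase`, and users who have not purchased by a chosen study end date are treated as censored. The log, the column names, and the function `build_duration_table` are invented for this illustration and are not taken from the article.

```python
import pandas as pd

# Hypothetical event log: one row per user action.
event_log = pd.DataFrame({
    "user_id": ["a", "a", "b", "b", "c"],
    "event": ["visit", "purchase", "visit", "visit", "visit"],
    "timestamp": pd.to_datetime([
        "2024-01-01", "2024-01-04", "2024-01-02", "2024-01-05", "2024-01-03",
    ]),
})


def build_duration_table(log, target_event, end_of_study):
    """Collapse an event log into one row per user with a duration and a flag."""
    # Entry time: the first event of any kind for each user.
    entry = log.groupby("user_id")["timestamp"].min().rename("entry_time")

    # Event time: the first occurrence of the target event, if there is one.
    target = (
        log[log["event"] == target_event]
        .groupby("user_id")["timestamp"]
        .min()
        .rename("event_time")
    )

    table = pd.concat([entry, target], axis=1)
    table["observed"] = table["event_time"].notna()

    # Censored users are measured up to the end of the study instead.
    end_time = table["event_time"].fillna(end_of_study)
    table["duration_days"] = (end_time - table["entry_time"]).dt.days
    return table[["duration_days", "observed"]]


durations = build_duration_table(
    event_log, target_event="purchase", end_of_study=pd.Timestamp("2024-01-10")
)
print(durations)
```

The two output columns, a duration and an observed flag, are the pair that most survival analysis tools expect as input.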

Conversion rate modeling: worth the effort?

Conversion rates are essential for understanding and optimizing a business. In this article, we compare conversion rate modeling to a common analytics approach and show how to decide between the two methods.