Conversion rate modeling: worth the effort?
Conversion rates are foundational for understanding and optimizing a business. Decision-makers may ask, for example:

What's the click-through rate on our landing page?

What fraction of users sign up for our premium-tier service?

How long does it take users to upgrade to a new version of our product?

What fraction of job candidates accept our offer?
These are deceptively simple questions. They are tricky to answer correctly because successful conversions are delayed by variable lengths of time, and because we often don't get any explicit signal at all for failed conversions.
In this article, we'll compare conversion modeling to the common analytics approach, and explore how to decide between the two. Throughout, we'll use a hypothetical example where we want to know the rate at which new users sign up for our premium-tier service.
TL;DR

A common approach to computing conversion rates is to pick a fixed-length observation window. Quick and easy, doable entirely in SQL, but crude.

Conversion models describe both the conversion rate and how long it takes for conversions to happen. They are more powerful than the fixed-window method but require some investment to learn, implement, and communicate.

For a given task, if you can choose an observation window length that captures almost all conversions with high certainty and is short enough to serve as useful feedback for business decisions, go with the fixed-window method. If not, or if you want to optimize time-to-conversion instead of the final rate, go with a conversion model.
The quick-and-dirty approach
Take our example of new users signing up for a premiumtier service. To be concrete, let's say today is March 10th. To keep it simple, suppose we get one new user per day. Our data look like the following plot, where each user is represented by a row. The circles show when new users arrive, stars indicate conversions, and hourglasses indicate the users who have not (yet) converted.
Four users have converted so far, but it took each of them a few days to think about it. We have no signal from the other eight users.
The naïve conversion rate computation is simply the number of converted users divided by the total number of users (4/12 = 33% in our toy example). This approach is not a good idea; users who arrived more recently haven't had enough time to make up their minds yet.
A better way—especially when comparing conversion rates across segments, cohorts, or experiment variations—is to use fixed-length observation windows. In our demo scenario, we might choose a 7-day window, for example. The clock for a user (or session, job offer, etc.) starts as soon as they arrive.

If the user converts within the window, we count a success. In our running example, users B, D, and F fall into this category.

If the user doesn't convert during the observation window, we count a failure. Even if we know they convert later, we still count a failure. Users A, C, and E are in this category (user A did convert, but after the window closed).

If the user's observation window is still open, we ignore them entirely.
The conversion rate is the number of successful conversions divided by successes plus failures. With this method, our example conversion rate would be 3/6 = 50%, quite a bit different from the naïve result.
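The fixed-window classification above can be sketched in a few lines of Python. The dates below are hypothetical, chosen to loosely match the toy example (with "today" as March 10th); in practice this logic would typically live in a SQL query over your events table.

```python
from datetime import date, timedelta

# Hypothetical arrival and conversion dates (None = no conversion signal yet).
users = {
    "A": (date(2024, 2, 28), date(2024, 3, 9)),  # converted, but after the 7-day window
    "B": (date(2024, 2, 29), date(2024, 3, 3)),  # converted within the window
    "C": (date(2024, 2, 29), None),
    "D": (date(2024, 3, 1), date(2024, 3, 4)),   # converted within the window
    "E": (date(2024, 3, 2), None),
    "F": (date(2024, 3, 3), date(2024, 3, 6)),   # converted within the window
    "G": (date(2024, 3, 6), None),               # window still open: ignored
}

def fixed_window_rate(users, today, window_days=7):
    """Successes / (successes + failures), ignoring still-open windows."""
    window = timedelta(days=window_days)
    successes = failures = 0
    for arrived, converted in users.values():
        if today - arrived < window:
            continue  # observation window still open: ignore this user
        if converted is not None and converted - arrived <= window:
            successes += 1
        else:
            failures += 1  # no conversion yet, or conversion after the window closed
    return successes / (successes + failures)

print(fixed_window_rate(users, date(2024, 3, 10)))  # 3 successes / 6 closed windows = 0.5
```

Note that user A is counted as a failure even though we know they eventually converted, and user G contributes nothing to the metric.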
To track this metric over time, we group users into time-based cohorts and compute the conversion rate for each cohort that has a full observation window. Cohorts still under observation should not be reported. If we use daily cohorts for our example (and assume now that we have many new users per day), our tracking plot would look like this:
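The cohort tracking described above can be sketched as follows, using hypothetical event data. Each cohort is keyed by arrival date, and cohorts whose 7-day window hasn't closed yet are silently dropped.

```python
from collections import defaultdict
from datetime import date, timedelta

def cohort_conversion_rates(events, today, window_days=7):
    """Fixed-window conversion rate per daily cohort (cohort = arrival date).

    events: iterable of (arrival_date, conversion_date_or_None).
    Cohorts whose observation window is still open are not reported.
    """
    window = timedelta(days=window_days)
    tallies = defaultdict(lambda: [0, 0])  # arrival date -> [successes, total]
    for arrived, converted in events:
        if today - arrived < window:
            continue  # this cohort's window hasn't closed yet
        tallies[arrived][1] += 1
        if converted is not None and converted - arrived <= window:
            tallies[arrived][0] += 1
    return {day: s / n for day, (s, n) in sorted(tallies.items())}

# Two fully observed cohorts plus one still under observation (dropped).
events = [
    (date(2024, 3, 1), date(2024, 3, 3)),   # converted in 2 days
    (date(2024, 3, 1), None),               # no signal
    (date(2024, 3, 2), date(2024, 3, 4)),   # converted in 2 days
    (date(2024, 3, 2), date(2024, 3, 12)),  # converted after the window closed
    (date(2024, 3, 8), None),               # window still open on March 10th
]
rates = cohort_conversion_rates(events, today=date(2024, 3, 10))
```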
A powerful alternative
Our hypothetical example shows how the standard approach can fail: user A converted after the observation window ended, so they were counted as a failure. We could lengthen the observation period to avoid this, but the longer the window, the longer we have to wait to observe the metric and the slower our decision-making cadence gets.
Conversion models avoid this delicate balance by modeling time-to-conversion as best we know it so far, instead of binary conversion status. In our scenario, for example, we don't know if or when user E will convert, but we do know their time-to-conversion is at least 8 days. In tabular form, this looks like:
user id | minimum time to convert (days) | has converted
A       | 10                             | true
B       | 3                              | true
C       | 10                             | false
D       | 3                              | true
E       | 8                              | false
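Building this censored dataset from raw arrival and conversion timestamps is mechanical. Below is a sketch using hypothetical dates chosen to reproduce the table above, with "today" as March 10th:

```python
from datetime import date

today = date(2024, 3, 10)

# (user, arrival date, conversion date or None) -- hypothetical dates chosen
# to reproduce the table above.
events = [
    ("A", date(2024, 2, 28), date(2024, 3, 9)),
    ("B", date(2024, 2, 29), date(2024, 3, 3)),
    ("C", date(2024, 2, 29), None),
    ("D", date(2024, 3, 1), date(2024, 3, 4)),
    ("E", date(2024, 3, 2), None),
]

rows = []
for user, arrived, converted in events:
    has_converted = converted is not None
    # Converted users get their exact time-to-convert; censored users get
    # the minimum time implied by how long we've observed them so far.
    min_days = ((converted if has_converted else today) - arrived).days
    rows.append((user, min_days, has_converted))
```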
Conversion models adjust statistically for the cases where conversion hasn't happened yet (known as censored data). The canonical model output is a plot that shows the estimated conversion rate as time passes from the moment conversion becomes possible, e.g., when a new user arrives.
The conversion curve for our running example shows that about 35% of new users convert within 7 days, but 72% convert if we just wait another couple of weeks. Our fixed-window approach was very misleading because we chose a bad window length.
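In practice you'd reach for a library (lifelines' KaplanMeierFitter, or Convoys for parametric conversion models), but the core idea behind one such curve estimate fits in a few lines. Below is a minimal Kaplan-Meier sketch applied to the toy censored data from the table above; note this is just one way to estimate the curve, not the only conversion model.

```python
def conversion_curve(durations, converted):
    """Kaplan-Meier estimate of the conversion curve, 1 - S(t).

    durations[i] is days until conversion (if converted[i] is True)
    or days observed so far (censored). Returns a list of
    (time, cumulative conversion fraction) step points.
    """
    # Process subjects in time order; at ties, conversions come first.
    order = sorted(range(len(durations)), key=lambda i: (durations[i], not converted[i]))
    at_risk = len(durations)  # subjects still "at risk" of converting
    survival = 1.0            # estimated P(not yet converted)
    curve = []
    for i in order:
        if converted[i]:
            survival *= 1 - 1 / at_risk
            curve.append((durations[i], 1 - survival))
        at_risk -= 1          # censored subjects silently leave the risk set
    return curve

# Toy censored data: users A-E from the table above.
curve = conversion_curve([10, 3, 10, 3, 8], [True, True, False, True, False])
```

The censored users (C and E) never trigger a step, but they keep the early steps honest by staying in the risk set until their observed minimum time-to-conversion.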
For more on how conversion models work, we highly recommend Erik Bernhardsson's posts on the topic (first, second), as well as his Convoys Python package. Conversion models are closely related to survival analysis, which has an extensive literature. Two introductory sources we like are the lifelines Python package and Keith McNulty's book Handbook of Regression Modeling in People Analytics. We will dig into conversion model interpretation and decision-making in a future article.
Which approach should I use?
If you can find a good observation window, use the fixed-window method
The fixed-window approach is simpler and faster than conversion modeling. It can be done entirely in SQL without leaving your data warehouse or BI platform, requires no explanation for business operators, and you don't have to spend any time upfront learning new models and software packages.
To use the fixed-window approach, however, you must be able to pick an observation window that
 is long enough to catch almost all conversions with high certainty, and
 is short enough to serve as useful feedback for product and business decisions.
Estimating click-through rates on a landing page, for example, is a great candidate for the fixed-window approach because a window of 12 or 24 hours is likely long enough to capture almost all click-throughs but still short enough that the output metric can serve as feedback for optimization. The job offer acceptance example would also work well with a fixed-window approach; the observation window would have to be several weeks long, but the optimization cadence for this use case is typically even slower.
If you can't pick an observation window that balances these criteria, then you should switch to conversion modeling. Premium-tier signup rate is a good example of this: conversions may be delayed for weeks or even months, which is far too long for most businesses to wait for feedback on mission-critical product changes.
Pro tip: even if you prefer a fixed-window approach, fitting a conversion model is a great way to check whether the fixed-window approach is feasible and to choose the length of the observation window.
If time-to-conversion matters, use a conversion model
Maximizing the conversion rate is not the only way for a business to optimize; we can also reduce the time-to-conversion. In cases with high conversion rates, this may be the only optimization avenue left. If reducing conversion time matters, you should invest in conversion modeling.
The canonical model output shows both the final asymptotic conversion rate and the distribution of time-to-conversion. Suppose we A/B tested a new email marketing campaign to persuade new users to sign up for our hypothetical premium service. The blue curve below describes the control group and the orange describes the treatment group. The final conversion rate is the same for both groups (72%), but the treatment group converted much more quickly (60% conversion at 5 days for treatment vs. 12 days for control).
In this scenario, the fixed-window approach would either incorrectly show different conversion rates (with a short window), or miss the time-to-conversion impact altogether (with a long window).
A good example where time-to-conversion is more important than conversion rate is new product rollout and adoption. In this use case, there isn't any observation window that makes sense; the adoption rate rises steadily until the whole user base is on the new product version (or an even newer version comes out). Conversion modeling is a much more natural fit for this scenario.
Other things to consider

Both conversion rate approaches we've discussed apply only to situations with binary outcomes (e.g. click-through or not, premium signup or not, etc.). If your task has more than two possible outcomes, you'll either need to collapse the outcomes into two categories or use a different method.

Don't confuse fixed-length observation windows with calendar-based observation windows, where the observation period starts on the same date and ends on the same date for all users, regardless of when they arrived. Unless the group enrollment window is very short relative to the observation window, late arrivals will have less time to convert than early arrivals—the original problem we wanted to avoid.
What next?
We now have a better sense of when conversion modeling is worth the investment vs. the quick-and-dirty fixed-window approach. Check out Erik Bernhardsson's blog posts (here and here) for more on the nuts and bolts of conversion modeling. In our next installment, we will take a closer look at how to interpret conversion models and conversion curves and how to make business decisions based on them.