This repository has been archived by the owner on Mar 22, 2024. It is now read-only.

Simultaneous Testing Docs #670

Draft · wants to merge 5 commits into main

Conversation

dpannasch
Collaborator

Motivation / Description

Changes introduced

Linear ticket (if any)

Additional comments

@dpannasch dpannasch marked this pull request as ready for review January 29, 2024 17:10
@RCGitBot
Contributor

RCGitBot commented Feb 1, 2024

Previews

temp/configuring-experiments-v1.md


Before setting up an experiment, make sure you've created the Offerings you want to test. This may include:

  • Creating new Products on the stores
  • Setting Offering Metadata, and/or
  • Creating a RevenueCat Paywall

You should also test the Offerings you've chosen on any platform your app supports.
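
For example, a quick sanity check before creating the experiment is to fetch your Offerings with the SDK and confirm each one returns packages. Below is a minimal Swift sketch (iOS SDK 4.x); the Offering identifiers are placeholders for whatever you've configured.

```swift
import RevenueCat

// Minimal sketch: confirm the Offerings you plan to test are configured and
// return packages. The identifiers below are placeholders; substitute the
// identifiers of your own control and treatment Offerings.
func verifyExperimentOfferings() {
    Purchases.shared.getOfferings { offerings, error in
        if let error = error {
            print("Failed to fetch Offerings: \(error.localizedDescription)")
            return
        }
        for identifier in ["default", "higher_price_treatment"] {
            if let offering = offerings?.offering(identifier: identifier) {
                print("\(identifier): \(offering.availablePackages.count) package(s)")
            } else {
                print("\(identifier): not found, check your Offering configuration")
            }
        }
    }
}
```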

Setting up a new experiment

First, navigate to your Project. Then, click on Experiments in the Monetization tools section.

[block:image]
{
"images": [
{
"image": [
"https://files.readme.io/c7867f1-Screenshot_2024-01-29_at_10.45.53_AM.png",
"Screen Shot 2022-11-30 at 3.39.08 PM.png",
2510
],
"align": "center"
}
]
}
[/block]

Select + New to create a new experiment.

[block:image]
{
"images": [
{
"image": [
"https://files.readme.io/62d3199-Screenshot_2024-01-29_at_10.46.43_AM.png",
null,
"Creating a new experiment"
],
"align": "center",
"caption": "Creating a new experiment"
}
]
}
[/block]

📘 Setting up an Offering for your treatment

If you haven't set up multiple Offerings before, you'll be prompted to do so now, since you'll need at least 2 available Offerings to run an experiment.

The treatment Offering represents the hypothesis you're looking to test with your experiment (e.g. higher or lower priced products, products with trials, etc).

For App Store apps, we recommend setting up the new products you want to test as a new Subscription Group, so that customers who are offered those products through Experiments will see only that same set of products to choose from in their subscription settings.

Required fields

To create your experiment, you must first enter the following required fields:

  • Experiment name
  • Control variant
    • The Offering that will be used for your Control group
  • Treatment variant
    • The Offering that will be used for your Treatment group (the variant in your experiment)

Audience customization

Then, you can optionally customize the audience that will be enrolled in your experiment through Customize enrollment criteria and New customers to enroll.

Customize enrollment criteria

Select from any of the available dimensions to filter which new customers are enrolled in your experiment.

| Dimension | Description |
| --- | --- |
| Country | Which countries are eligible to have their new customers enrolled in the experiment. |
| App | Which of your RevenueCat apps the experiment will be made available to. |
| App version | The app version(s) of the specified apps that a new customer must be on to be enrolled in the experiment. |
| RevenueCat SDK version | The RevenueCat SDK version(s) of the specified SDK flavor that a new customer must be on to be enrolled in the experiment. (NOTE: This is most likely to be used in conjunction with features like RevenueCat Paywalls, which are only available in certain SDK versions.) |

New customers to enroll

You can modify the % of new customers to enroll in 10% increments based on how much of your audience you want to expose to the test. Keep in mind that the enrolled new customers will be split between the two variants, so a test that enrolls 10% of new customers would yield 5% in the Control group and 5% in the Treatment group.

Once done, select CREATE EXPERIMENT to complete the process.

Starting an experiment

When viewing a new experiment, you can start, edit, or delete the experiment.

  • Start: Starts the experiment. Customer enrollment and data collection begin immediately, but results will take up to 24 hours to begin populating.
  • Edit: Change the name, enrollment criteria, or Offerings in an experiment before it's been started. After it's been started, only the percent of new customers to enroll can be edited.
  • Delete: Deletes the experiment.

🚧 Sandbox

Test users will be placed into the experiment Offering variants, but sandbox purchases won't be applied to your experiment.

If you want to test your paywall to make sure it can handle displaying the Offerings in your experiment, you can use the Offering Override feature to choose a specific Offering to display to a user.
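
Because an Offering Override is applied by RevenueCat's backend, no special code path is needed to test it: the overridden Offering is simply returned as that customer's current Offering. Here's a minimal Swift sketch (iOS SDK 4.x) for confirming which Offering a test user is being served; it's an illustration, not required setup.

```swift
import RevenueCat

// Minimal sketch: log which Offering the current (test) user is being served.
// If an Offering Override is applied to this user in the dashboard,
// `offerings.current` will reflect it with no client-side changes.
func logCurrentOffering() {
    Purchases.shared.getOfferings { offerings, error in
        guard let current = offerings?.current else {
            print("No current Offering: \(error?.localizedDescription ?? "unknown error")")
            return
        }
        print("Current Offering: \(current.identifier)")
        for package in current.availablePackages {
            print("  \(package.identifier): \(package.storeProduct.localizedPriceString)")
        }
    }
}
```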

Running multiple tests simultaneously

You can use Experiments to run multiple tests simultaneously as long as:

  1. All audiences being enrolled in running tests are mutually exclusive (e.g. either two tests have exactly the same audience, or have fully unique audiences)
  2. A given audience has no more than 100% of its new customers enrolled in experiments

If a test that you've created does not meet the above criteria, we'll alert you to that in the Dashboard and you'll be prevented from starting the test, as seen below.

[block:image]
{
"images": [
{
"image": [
"https://files.readme.io/6e94d93-Screenshot_2024-01-29_at_11.22.45_AM.png",
"",
""
],
"align": "center"
}
]
}
[/block]

Examples of valid tests to run simultaneously

Scenario #1 -- Multiple tests on unique audiences

  1. Test A is running against 100% of new customers for your App Store app
  2. Test B, targeting 100% of new customers for your Play Store app, can also be run since its targeted audience is mutually exclusive with Test A, and no more than 100% of each audience's new customers are being enrolled in running experiments

Scenario #2 -- Multiple tests on identical audiences

  1. Test A is running against 20% of new customers in Brazil
  2. Test B, targeting 40% of new customers in Brazil, can also be run since its targeted audience is identical with Test A, and no more than 100% of that audience is being enrolled in running experiments

Examples of invalid tests to run simultaneously

Scenario #3 -- Multiple tests on partially overlapping audiences

  1. Test A is running against 100% of new customers for your App Store app
  2. Test B, targeting 100% of new customers in Brazil, cannot be run because there is partial overlap between the audience of Test A and the audience of Test B (new customers using your App Store app in Brazil).
    1. To run Test B, either Test A will need to be paused, or the audience of Test B will need to be modified to exclude new customers from the App Store app.

Scenario #4 -- Multiple tests on >100% of an identical audience

  1. Test A is running against 20% of new customers in Brazil
  2. Test B, targeting 100% of new customers in Brazil, cannot be run because the targeted audience would have > 100% of new customers enrolled in experiments, which is not possible.
    1. To run Test B, either Test A will need to be paused, or the enrollment percentage of Test B OR Test A will need to be modified so that the total does not exceed 100%.

📘 Editing running experiments

When an experiment is running, only the percent of new customers to enroll can be edited. This is because editing the audience being targeted would change the nature of the test, rendering its results invalid.

FAQs

[block:parameters]
{
"data": {
"h-0": "Question",
"h-1": "Answer",
"0-0": "Can I edit the Offerings in a started experiment?",
"0-1": "Editing an Offering for an active experiment would make the results unusable. Be sure to check before starting your experiment that your chosen Offerings render correctly in your app(s). If you need to make a change to your Offerings, stop the experiment and create a new one with the updated Offerings.",
"1-0": "Can I run multiple experiments simultaneously?",
"1-1": "Yes, as long as they meet the criteria described above.",
"2-0": "Can I add multiple Treatment groups to a single test?",
"2-1": "No, you cannot add multiple Treatment groups to a single test. However, by running multiple tests on the same audience to capture each desired variant you can achieve the same result.",
"3-0": "Can I edit the enrollment criteria of a started experiment?",
"3-1": "Before an experiment has been started, all aspects of enrollment criteria can be edited. However, once an experiment has been started, only new customers to enroll can be edited; since editing the audience that an experiment is exposed to would alter the nature of the test.",
"4-0": "Can I restart an experiment after it's been stopped?",
"4-1": "After you choose to stop an experiment, new customers will no longer be enrolled in it, and it cannot be restarted. If you want to continue a test, create a new experiment and choose the same Offerings as the stopped experiment. \n \n(NOTE: Results for stopped experiments will continue to refresh for 400 days after the experiment has ended)",
"5-0": "What happens to customers that were enrolled in an experiment after it's been stopped?",
"5-1": "New customers will no longer be enrolled in an experiment after it's been stopped, and customers who were already enrolled in the experiment will begin receiving the Default Offering if they reach a paywall again. \n \nSince we continually refresh results for 400 days after an experiment has been ended, you may see renewals from these customers in your results, since they were enrolled as part of the test while it was running; but new subscriptions started by these customers after the experiment ended and one-time purchases made after the experiment ended will not be included in the results."
},
"cols": 2,
"rows": 6,
"align": [
"left",
"left"
]
}
[/block]

temp/experiments-overview-v1.md


Experiments allow you to answer questions about your users' behavior and your app's business by A/B testing two Offerings in your app and analyzing the full subscription lifecycle to understand which variant is producing more value for your business.

While price testing is one of the most common forms of A/B testing in mobile apps, Experiments are based on RevenueCat Offerings, which means you can A/B test more than just prices, including: trial length, subscription length, different groupings of products, etc.

You can even use our Paywalls or Offering Metadata to remotely control and A/B test any aspect of your paywall. Learn more.
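
For instance, if you attach Offering Metadata to each variant, your paywall can read those values at runtime and render differently per variant. Here's a minimal Swift sketch, assuming an SDK version that exposes Offering Metadata; the "headline" and "show_trial_badge" keys are hypothetical examples, not required keys.

```swift
import RevenueCat

// Minimal sketch: drive paywall copy from Offering Metadata so each experiment
// variant can carry its own values. The "headline" and "show_trial_badge" keys
// are hypothetical; use whatever keys you define on your Offerings.
func configurePaywall(from offering: Offering) {
    let headline = offering.metadata["headline"] as? String ?? "Go Premium"
    let showTrialBadge = offering.metadata["show_trial_badge"] as? Bool ?? false
    print("Rendering paywall with headline '\(headline)', trial badge: \(showTrialBadge)")
}
```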

📘

Experiments is available to Pro & Enterprise customers. Learn more about pricing here.

How does it work?

After configuring the two Offerings you want and adding them to an Experiment, RevenueCat will randomly assign users to a cohort where they will only see one of the two Offerings. Everything is done server-side, so no changes to your app are required if you're already displaying the current Offering for a given customer!

📘

To learn more about creating a new Offering to test, and some tips to keep in mind when creating new Products on the stores, check out our guide here.

[block:image]{"images":[{"image":["https://files.readme.io/229d551-experiments-learn.webp","ab-test.png",null],"align":"center"}]}[/block]

As soon as a customer is enrolled in an experiment, they'll be included in the "Customers" count on the Experiment Results page, and you'll see any trial starts, paid conversions, status changes, etc. represented in the corresponding metrics. (Learn more here)

📘

We recommend identifying customers before they reach your paywall to ensure that one unique person accessing your app from two different devices is not treated as two unique anonymous customers.
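
As an illustration, identifying the customer with the iOS SDK before fetching Offerings might look like the sketch below; the app user ID shown is a placeholder for your own stable identifier.

```swift
import RevenueCat

// Minimal sketch: identify the customer before showing the paywall so the same
// person on two devices is assigned to the same experiment variant.
// "your_backend_user_id" is a placeholder for your own stable user ID.
func identifyThenLoadPaywall() {
    Purchases.shared.logIn("your_backend_user_id") { _, _, error in
        if let error = error {
            print("logIn failed: \(error.localizedDescription)")
            return
        }
        // Fetch Offerings after identification so the current Offering reflects
        // this user's variant assignment.
        Purchases.shared.getOfferings { offerings, _ in
            print("Current Offering: \(offerings?.current?.identifier ?? "none")")
        }
    }
}
```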

Implementation requirements

Experiments requires you to use Offerings and have a dynamic paywall in your app that displays the current Offering for a given customer. While Experiments will work with iOS and Android SDKs 3.0.0+, it is recommended to use these versions:

| SDK | Version |
| --- | --- |
| iOS | 3.5.0+ |
| Android | 3.2.0+ |
| Flutter | 1.2.0+ |
| React Native | 3.3.0+ |
| Cordova | 1.2.0+ |
| Unity | 2.0.0+ |

If you meet these requirements, you can start using Experiments without any app changes! If not, take a look at Displaying Products. The Swift sample app has an example of a dynamic paywall that is Experiments-ready.
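
As a rough illustration (not the sample app's exact implementation), an Experiments-ready paywall simply renders whatever packages the current Offering returns, so variant assignment stays entirely server-side:

```swift
import SwiftUI
import RevenueCat

// Rough sketch of a dynamic, Experiments-ready paywall: it renders whatever
// packages the current Offering contains and never hard-codes products.
struct PaywallView: View {
    @State private var packages: [Package] = []

    var body: some View {
        List(packages, id: \.identifier) { package in
            Button {
                Purchases.shared.purchase(package: package) { _, _, error, userCancelled in
                    if let error = error, !userCancelled {
                        print("Purchase failed: \(error.localizedDescription)")
                    }
                }
            } label: {
                Text("\(package.storeProduct.localizedTitle): \(package.storeProduct.localizedPriceString)")
            }
        }
        .onAppear {
            Purchases.shared.getOfferings { offerings, _ in
                packages = offerings?.current?.availablePackages ?? []
            }
        }
    }
}
```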

🚧

Programmatically displaying the current Offering in your app when you fetch Offerings is required to ensure customers are evenly split between variants.

Visit Configuring Experiments to learn how to set up your first test.

Tips for Using Experiments

Decide how long you want to run your experiments

There’s no time limit on tests. Consider the timescales that matter for you. For example, if you're comparing monthly vs. yearly subscriptions, yearly might outperform in the short term because of its higher upfront revenue, but monthly might outperform in the long term.

Keep in mind that if the difference in performance between your variants is very small, then the likelihood that you're seeing statistically significant data is lower as well. "No result" from an experiment is still a result: it means your change was likely not impactful enough to help or hurt your performance either way.

**Test only one variable at a time**

It's tempting to try to test multiple variables at once, such as free trial length and price; resist that temptation! The results are often clearer when only one variable is tested. You can run more tests for other variables as you further optimize your LTV.

Run multiple tests simultaneously to isolate variables & audiences

If you're looking to test the price of a product and its optimal trial length, you can run 2 tests simultaneously that each target a subset of your total audience. For example, Test #1 can test price with 20% of your audience, and Test #2 can test trial length with a different 20% of your audience.

You can also test different variables with different audiences this way to optimize your Offering by country, app, and more.

**Bigger changes will validate faster**

Small differences ($3 monthly vs $2 monthly) will often show ambiguous results and may take a long time to show clear results. Try bolder changes like $3 monthly vs $10 monthly to start to triangulate your optimal price.

**Running a test with a control**

Sometimes you want to compare a different Offering to the one that is already the default. If so, you can set one of the variants to the Offering that is currently used in your app.

**Run follow-up tests after completing one test**

After you run a test and find that one Offering won over the other, try running another test comparing the winning Offering against another similar Offering. This way, you can continually optimize for lifetime value (LTV). For example, if you were running a price test between a $5 product and a $7 product and the $7 Offering won, try running another test between an $8 product and the $7 winner to find the optimal price for the product that results in the highest LTV.

temp/experiments-results-v1.md


Within 24 hours of your experiment's launch you'll start seeing data on the Results page. RevenueCat offers experiment results through each step of the subscription journey to give you a comprehensive view of the impact of your test. You can dig into these results in a few different ways, which we'll cover below.

Results chart

The Results chart should be your primary source for understanding how a specific metric has performed for each variant over the lifetime of your experiment.

By default you'll see your Realized LTV per customer for all platforms plotted daily for the lifetime of your experiment, but you can select any other experiment metric to visualize, or narrow down to a specific platform.

📘 Why Realized LTV per customer?

Lifetime value (LTV) is the standard success measure you should be using for pricing experiments because it captures the full revenue impact on your business. Realized LTV per customer measures the revenue you've accrued so far divided by the total customers in each variant so you understand which variant is on track to produce higher value for your business.

Keep in mind that your LTV over a longer time horizon might be impacted by the renewal behavior of your customers, the mix of product durations they're on, etc.

You can also click Export chart CSV to receive an export of all metrics by day for deeper analysis.

📘 Data takes 24 hours to appear

The results refresher runs once every 24 hours.

If you're not seeing any data or are seeing unexpected results, try:

  • Ensuring each product that is a part of the experiment has been purchased at least once
  • Waiting another 24 hours until the model can process more data

When you stop an experiment, the results will continue to be updated for the next 400 days to capture any additional subscription events, and allow you to see how your Realized LTV matures for each variant over time.

Customer journey tables

The customer journey tables can be used to dig into and compare your results across variants.

The customer journey for a subscription product can be complex: a "conversion" may only be the start of a trial, a single payment is only a portion of the total revenue that subscription may eventually generate, and other events like refunds and cancellations are critical to understanding how a cohort is likely to monetize over time.

To help parse your results, we've broken up experiment results into three tables:

  1. Initial conversion: For understanding how these key early conversion rates have been influenced by your test. These metrics are frequently the strongest predictors of LTV changes in an experiment.
  2. Paid customers: For understanding how your initial conversion trends are translating into new paying customers.
  3. Revenue: For understanding how those two sets of changes interact with each other to yield overall impact to your business.

In addition to the results per variant illustrated above, you can also analyze most metrics by product. Click on the caret next to "All" within metrics that offer it to see the metric broken down by the individual products in your experiment. This is especially helpful when trying to understand what's driving changes in performance, and how it might impact LTV. (A more prominent yearly subscription, for example, may decrease initial conversion rate relative to a more prominent monthly option; but those fewer conversions may produce more Realized LTV per paying customer.)

The results from your experiment can also be exported in this table format using the Export data CSV button. This will include aggregate results per variant and per-product results, for flexible analysis.

🚧 Automatic emails for poor performing tests

If the Realized LTV of your Treatment is performing meaningfully worse than your Control, we'll automatically email you to let you know about it so that you can run your test with confidence.

Metric definitions

Initial conversion metric definitions

| Metric | Definition |
| --- | --- |
| Customers | All new customers who've been included in each variant of your experiment. |
| Initial conversions | A purchase of any product offered to a customer in your experiment. This includes products with free trials and non-subscription products as well. |
| Initial conversion rate | The percent of customers who purchased any product. |
| Trials started | The number of trials started. |
| Trials completed | The number of trials completed. A trial may be completed due to its expiration or its conversion to paid. |
| Trials converted | The number of trials that have converted to a paying subscription. Keep in mind that this metric will lag behind your trials started due to the length of your trial. For example, if you're offering a 7-day trial, for the first 6 days of your experiment you will see trials started but none converted yet. |
| Trial conversion rate | The percent of your completed trials that converted to paying subscriptions. |

Paid customers metric definitions

[block:parameters]
{
"data": {
"h-0": "Metric",
"h-1": "Definition",
"0-0": "Paid customers",
"0-1": "The number of customers who made at least 1 payment. This includes payments for non-subscription products, but does NOT include free trials. \n \nCustomers who later received a refund will be counted in this metric, but you can use "Refunded customers" to subtract them out.",
"1-0": "Conversion to paying",
"1-1": "The percent of customers who made at least 1 payment.",
"2-0": "Active subscribers",
"2-1": "The number of customers with an active subscription as of the latest results update.",
"3-0": "Churned subscribers",
"3-1": "The number of customers with a previously active subscription that has since churned as of the latest results update. A subscriber is considered churned once their subscription has expired (which may be at the end of their grace period if one was offered).",
"4-0": "Refunded customers",
"4-1": "The number of customers who've received at least 1 refund."
},
"cols": 2,
"rows": 5,
"align": [
"left",
"left"
]
}
[/block]

Revenue metric definitions

[block:parameters]
{
"data": {
"h-0": "Metric",
"h-1": "Definition",
"0-0": "Realized LTV (revenue)",
"0-1": "The total revenue you've received (realized) from each experiment variant.",
"1-0": "Realized LTV per customer",
"1-1": "The total revenue you've received (realized) from each experiment variant, divided by the number of customers in each variant. \n \nThis should frequently be your primary success metric for determining which variant performed best.",
"2-0": "Realized LTV per paying customer",
"2-1": "The total revenue you've received (realized) from each experiment variant, divided by the number of paying customers in each variant. \n \nCompare this with "Conversion to paying" to understand if your differences in Realized LTV are coming the payment conversion funnel, or from the revenue generated from paying customers.",
"3-0": "Total MRR",
"3-1": "The total monthly recurring revenue your current active subscriptions in each variant would generate on a normalized monthly basis. \n \nLearn more about MRR here.",
"4-0": "MRR per customer",
"4-1": "The total monthly recurring revenue your current active subscriptions in each variant would generate on a normalized monthly basis, divided by the number of customers in each variant.",
"5-0": "MRR per paying customer",
"5-1": "The total monthly recurring revenue your current active subscriptions in each variant would generate on a normalized monthly basis, divided by the number of paying customers in each variant."
},
"cols": 2,
"rows": 6,
"align": [
"left",
"left"
]
}
[/block]

📘 Only new users are included in the results

To keep your A and B cohorts on equal footing, only new users are added to experiments. Here's an example to illustrate what can happen if existing users are added to an experiment: an existing user who is placed in a cohort might make a purchase they wouldn't otherwise make because the variant they were shown had a lower price than the default offering they previously saw. This might mean that the user made a purchase out of fear that they were missing out on a sale and wanted to lock in the price in anticipation of it going back up.

FAQs

[block:parameters]
{
"data": {
"h-0": "Question",
"h-1": "Answer",
"0-0": "What is included in the "Other" category in the product-level breakdown of my results?",
"0-1": "If the customers enrolled in your experiment purchased any products that were not included in either the Control or Treatment Offering, then they will be listed in the "Other" category when reviewing the product-level breakdown of a metric. \n \nThis is to ensure that all conversions and revenue generated by these customers can be included when measuring the total revenue impact of one variant vs. another, even if that revenue was generated from other areas of the product experience (like a special offer triggered in your app).",
"1-0": "Why do the results for one variant contain purchases of products not included in that variant's Offering?",
"1-1": "There are many potential reasons for this, but the two most common occur when (1) there are areas of your app that serve products outside of the Current Offering returned by RevenueCat for a given customer, or (2) the offered Subscription Group on the App Store contains additional products outside of that variant's Offering. \n \nFor the first case, please check and confirm that all places where you serve Products in your app are relying on the Current Offering from RevenueCat to determiner what to display. \n \nFor the second case, we recommend creating new Subscription Groups on the App Store for each Offering so that a customer who purchases from that Offering will only have that same set of options to select from one when considering changing or canceling their subscription from Subscription Settings on iOS.",
"2-0": "When I end an Experiment, what Offering will be served to the customers who were enrolled in that Experiment?",
"2-1": "When an Experiment is ended, all customers previously enrolled in it will be served the Default Offering the next time they reach a paywall in your app.",
"3-0": "How can I review the individual customers who were enrolled in my experiment?",
"3-1": "When using the Get or Create Subscriber endpoint you'll be able to see if an individual subscriber was enrolled in an experiment, and which variant they were assigned to, and can then pass that fact to other destinations like an analytics provider like Amplitude & Mixpanel, or your own internal database.",
"4-0": "How can I review the individual transactions that have occurred in my experiment?",
"4-1": "Our Scheduled Data Exports include the experiment enrollment of each subscriber in the reported transactions, and by subscribing to them you can receive daily exports of all of your transactions to analyze the experiment results further.",
"5-0": "How can I filter my results by other dimensions like Country in the Dashboard?",
"5-1": "Our Dashboard only supports filtering by Platform today, but if there are specific countries you're looking to distinctly measure results for you can instead run simultaneous tests targeting each set of countries. Then, each tests results will tell you how the experiment performed in that country set so you can determine where the change was and was not successful"
},
"cols": 2,
"rows": 6,
"align": [
"left",
"left"
]
}
[/block]
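
As referenced in the FAQ above, here's a minimal Swift sketch of calling the Get or Create Subscriber REST endpoint to inspect a customer's experiment enrollment. The app user ID and API key are placeholders; inspect the returned JSON for the experiment-related fields and forward them wherever you need.

```swift
import Foundation

// Minimal sketch: fetch a subscriber from the REST API and inspect the raw
// response for their experiment enrollment. "app_user_id_123" and the API key
// are placeholders; use your own values.
func fetchSubscriber(appUserID: String = "app_user_id_123") {
    guard let url = URL(string: "https://api.revenuecat.com/v1/subscribers/\(appUserID)") else { return }
    var request = URLRequest(url: url)
    request.httpMethod = "GET"
    request.setValue("Bearer YOUR_REVENUECAT_API_KEY", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")

    URLSession.shared.dataTask(with: request) { data, _, error in
        if let error = error {
            print("Request failed: \(error.localizedDescription)")
            return
        }
        if let data = data, let body = String(data: data, encoding: .utf8) {
            // The response includes the subscriber's experiment enrollment, which
            // you can pass along to an analytics provider or your own database.
            print(body)
        }
    }.resume()
}
```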

temp/experiments-v1.md


Take the guesswork out of pricing & paywalls

RevenueCat Experiments allow you to optimize your subscription pricing and paywall design with easy-to-deploy A/B tests backed by comprehensive cross-platform results. With Experiments, you can A/B test two different Offerings in your app and analyze the full subscription lifecycle to understand which variant is producing more value for your business.

📘

Experiments is available to Pro & Enterprise customers. Learn more about pricing here.

[block:image]
{
"images": [
{
"image": [
"https://files.readme.io/93dfcae-cross-platform-data.webp",
"",
""
],
"align": "center"
}
]
}
[/block]

With Experiments, you can remotely A/B test:

  1. Product pricing
  2. Product offers (e.g. trial length, trial presence, paid introductory offers, etc.)
  3. The number and mix of products offered
  4. Paywall imagery, copy, layout, and more with our Paywalls or Offering Metadata

Plus, you can run multiple tests simultaneously on distinct audiences or on subsets of the same audience to accelerate your learning.

Get started with Experiments

  • If you're just getting started, make sure your app is ready to use experiments
  • If you've not yet created an Offering for your test hypothesis, check out our guide to get started
  • For instructions on how to set up an Experiment in your Project, read our guide on configuring experiments
  • Once the data from your Experiment starts coming in, learn how to interpret the results

@dpannasch dpannasch marked this pull request as draft April 1, 2024 12:48