Teach yourself growth marketing: How to perform growth experimentation through A/B testing

Without customers, there can be no business. So how do you drive new customers to your startup and keep existing customers engaged? The answer is simple: Growth marketing.

As a growth marketer who has honed this craft for the past decade, I’ve been exposed to countless courses, and I can confidently attest that doing the work is the best way to learn the skills to excel in this profession.

I am not saying you need to immediately join a Series A startup or land a growth marketing role at a large corporation. Instead, I have broken down how you can teach yourself growth marketing in five easy steps:

  1. Setting up a landing page.
  2. Launching a paid acquisition channel.
  3. Booting up an email marketing campaign.
  4. Running growth experiments through A/B testing.
  5. Deciding which metrics matter most for your startup.

In this fourth part of my five-part series, I’ll take you through a few standard A/B tests to begin with, then show which tests to prioritize once you have assembled a large enough list. Finally, I’ll explain how to run these tests with minimal external interference. For the entirety of this series, we will assume we are working on a direct-to-consumer (DTC) athletic supplement brand.

A crucial difference between typical advertising programs and growth marketing is that the latter employs heavy data-driven experimentation fueled by hypotheses. Let’s cover growth experimentation in the form of A/B testing.

It is important to consider secondary metrics and not always rely on a single metric for measuring impact.

How to properly do A/B tests

A/B testing, or split testing, is the process of sending traffic to two variants of something at the same time and analyzing which performs best.

There are hundreds of different ways to invalidate an A/B test, and I’ve witnessed most of them while consulting for smaller startups. During my tenure leading the expansion of rider growth at Uber, we used advanced internal tooling simply to ensure that the tests we performed ran almost perfectly. One of these tools was a campaign name generator that kept naming consistent so that we could analyze accurate data when the tests had concluded.

Some important factors to consider when running A/B tests:

  • Do not run tests with multiple variables.
  • Ensure traffic is being split correctly.
  • Set a metric that is being measured.

The most common reason for tests getting invalidated is confounding variables. At times it isn’t obvious, but even testing different creatives in two campaigns that have different bids can skew results. When setting up your first A/B test, ensure there’s only one difference between the two email campaigns or datasets being tested.
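A minimal sketch of how such a single-variable test is analyzed once both variants have run (the campaign numbers below are hypothetical): a two-proportion z-test tells you whether the difference in conversion rates between the two email campaigns is larger than chance alone would explain.

```python
import math

def ab_test_z(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: did variant B convert differently from A?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled conversion rate
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical campaign: 5,000 sends per variant, one difference between them
z, p = ab_test_z(conv_a=400, n_a=5000, conv_b=465, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")  # significant at the 95% level if p < 0.05
```

If p is below 0.05, the variants genuinely performed differently; if not, the difference is noise and the test should keep running or be redesigned.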

Teach yourself growth marketing: How to perform growth experimentation through A/B testing by Ram Iyer originally published on TechCrunch

Eppo, a product experimentation platform, raises $19.5M for expansion

Despite the demand for platforms that let developers experiment with different versions of apps, the infrastructure required remains relatively complex to build. Beyond data pipelines and statistical methods, experimentation infrastructure relies on analytical workflows often sourced from difficult-to-configure cloud environments.

Plenty of startups have emerged in recent years to abstract away the app experimentation infrastructure, including Split, Statsig, and Optimizely. A more recent arrival is Eppo, which today emerged from stealth with $19.5 million in funding: a $16 million Series A led by Menlo Ventures and a $3.5 million seed round led by Amplify Partners.

According to CEO Che Sharma, Eppo was inspired by his experiences building experimentation platforms as an early data scientist at Airbnb and Webflow, a website builder. “Nothing in the commercial landscape provided the power of experimentation systems like Airbnb, which meant building the same system over and over,” he told TechCrunch via email. “I built Eppo to leverage the modern data stack and the latest in causal inference literature, allowing companies to tie product team efforts to business metrics like revenue, with boosted statistical power.”

Sharma concedes that the app experimentation space is becoming congested, if not saturated, with competitors. But he says that Eppo is differentiated by its analysis tools, which use confidence intervals to make it ostensibly easier to understand and interpret the results of a randomized experiment. Eppo also supports experimentation with AI and machine learning models, leveraging techniques to perform live experiments that show whether one model is outperforming another.

Sharma claims that Eppo is one of the first commercial platforms to include CUPED variance reduction, an approach that tries to remove variance in a metric that can be accounted for by pre-experiment information. For example, say a property-booking company runs an experiment aiming to increase the number of daily bookings it receives. The number of bookings per property per day can range from zero to thousands. But the average bookings-per-day for each property can often be determined before the experiment; through CUPED, this knowledge can be used to test whether properties start to receive more, fewer, or about the same number of bookings-per-day after the experiment compared to before it.


Image Credits: Eppo

“Of all products in the modern data stack, experimentation has one of the clearest relationships to revenue return on investment because it injects C-suite- and board-level metrics into every decision a product team makes,” Sharma said. “Especially in tough recession markets, the C-suite needs their product teams to provably drive business metrics like revenue. Without experimentation, product teams are in a constant cycle of shipping, pointing at engagement- and click-level vanity metrics, but never having confidence that the business’ financial outlook has improved from their work.”

Sharma also asserts that Eppo is more privacy-preserving than most experimentation platforms because it performs all of its data computation in the cloud, on Snowflake. As opposed to collecting clicks, engagements, and other personally identifiable information, the Eppo platform only stores aggregated, anonymized experiment results.

“We are major evangelists of a new way of building analytics products that is much more privacy-focused,” he said. “Other experimentation platforms require sending the universe of data to them, essentially storing replicas of each customer’s own data ecosystem.”

Of course, even the best experimentation software isn’t helpful if employees don’t use it. Buy-in can be tough to achieve, in part because experimentation can expose the true, sometimes lower-than-anticipated success rate of product development. Even at tech giants like Google and Bing, the vast majority — about 80% to 90% — of experiments fail to generate positive results.

But Sharma, while declining to answer a question about revenue, says that uptake remains strong. Eppo’s customer base grew over the past year to include Goldbelly, Netlify, Kumu, and at least one unnamed Fortune 50 company, he said.

“We have seen a resurgence in the interest of experimentation with the recent market downturns. Across our existing customers and our customer pipeline, we have seen this pattern: layoffs are centered on teams building net-new future product lines that won’t return revenue quickly, and are instead centering on core product development with a focus on revenue which is inherently experimentation-centric,” Sharma said. “Concretely, despite many customers having layoffs, across the board none of the experimentation teams have had layoffs.”

With the new funding, San Francisco, California-based Eppo plans to expand its team from 15 employees to 25 by the end of the year.

Experimentation for product teams: Part 1

Experiments allow us to measure the impact of our products and determine what works and what doesn’t. They should be a quick and inexpensive way for product managers to validate and prioritise their ideas. In the first of this two-part series we’ll dig into the types of experimentation product managers can use and reveal some [...] Read more »

The post Experimentation for product teams: Part 1 appeared first on Mind the Product.

Sydney-based startup Upflowy raises $4M to optimize web experiences with its no-code solution

The Covid-19 pandemic has affected consumers’ behaviors and purchasing patterns; data-driven decision-making is even more crucial to ensure that companies’ products or services genuinely benefit users in times of uncertainty. The demand for SaaS products that enable online transactions has dramatically increased during Covid-19, according to Upflowy CEO Guillaume Ang.

Upflowy thinks it has the tools to help businesses generate high-performing user flows. The Australia-based startup, which just raised $4 million, has built a platform that offers drag-and-drop tools for A/B testing and personalization on the web and mobile apps, and the best part is businesses don’t need to know any code to use it. The latest funding was led by Counterpart Ventures, joined by returning investors Tidal, Global Founders Capital, Black Nova and Antler.

Converting visitors to a website or app into signups and sales requires significant time and cost, and as a result, many businesses struggle to achieve it, Ang told TechCrunch. To help entrepreneurs and marketers, especially startups, boost conversion rates and user flows, Ang and two other founders, Matthew Browne and Alexandre Girard, founded Upflowy in 2020. The startup says that, for too long, businesses have been dependent on development or engineering teams that are consumed with improving the products and don’t have time to support marketing endeavors.

Upflowy founders (from left to right): CTO Alex Girard, CIO Matthew Browne, CEO Guillaume Ang

The startup will use a good part of the new capital to enhance its platform capabilities by leveraging data science areas like predictive personalization and developing additional features. It also wants to support the team by increasing its headcount to over 30 full-time employees. 

“After seeing low-engagement forms lead to as much as a 60% drop in conversion, translating into a huge waste of advertising spend presented a huge uplift opportunity for businesses. This is just the first step. More effectively qualifying leads to the right product and personalizing the sales approach is key to converting into sales,” Ang told TechCrunch. “Upflowy’s data visualization and A/B testing interface mean that understanding their customers’ drop-off and behavior becomes a lot clearer, paving the way for experimentation and optimization.”

Hundreds of businesses now use Upflowy, Ang said, adding that it has a range of clients from B2B tech, SaaS and healthcare to B2C companies like fashion brands and a national sports team.  

Alongside improving weekly user growth over the last few weeks, the startup has seen its activation rates grow 40% and its monthly user base double, according to Ang.

“The Australian tech scene is driving innovation globally. Upflowy was born out of this growing market of talent,” Ang said in a statement. “We are already active and tested on a global stage to provide the validation of our platform. A signup flow is often the first interaction a prospective customer has with a business, and we are the first to make it easy to create and take them to live – improving the flow of information and ultimately ensuring prospects can be moved through the funnel in a smarter way.”

The COVID-19 pandemic created a catalyst to start Upflowy as a remote company from the beginning. In the early stages of 2020, being a remote-first business was a fairly new concept, but the startup has been able to source talent from all over the world, Ang said. Upflowy is due to set up a base in the U.S. this year to increase its presence in the region.

“Upflowy has managed to solve an issue that nearly every company faced,” said Dan Ross, former managing director of APAC at Optimizely, who invested in Upflowy. “There are currently no other tools on the market that give teams the ability to quickly create, test and iterate on full sign-up flows and feed data straight into any other platform, which are looking to convert visitors into customers.”

“Modern organizations need simple, no-code solutions that remove the friction between data collection and customer experience,” Patrick Eggen, co-founder and general partner at Counterpart Ventures, said in a statement. “The market is full of clunky solutions that rely on engineers to create web experiences, which inhibits testing and improvement. Upflowy is in the unique position to re-envision this market, enabling teams to create the web experiences that consumers need and demand.”

AWS adds user monitoring and A/B testing to CloudWatch

Amazon CloudWatch was introduced way back in 2009 to help AWS customers view data about their cloud usage and spending. Today at the dawn of AWS re:Invent, the company’s cloud customer conference taking place in Las Vegas this week, the cloud division announced a couple of enhancements to the product.

Amazon has been building on the types of data provided by CloudWatch, and today it added user monitoring. With Real User Monitoring, AWS customers can understand when there is a problem with a deployment and take corrective action before customers really begin to feel it.

“Amazon CloudWatch RUM will help you to collect the metrics that give you the insights that will help you to identify, understand, and improve this experience. You simply register your application, add a snippet of JavaScript to the header of each page and deploy,” Amazon’s Jeff Barr wrote in a blog post announcing the feature.

This doesn’t exactly fall under the category of stunning innovation. It’s something companies like AppDynamics and New Relic have been doing for years, but as with most things Amazon, the company is providing a soup-to-nuts experience for customers inside AWS, and this type of monitoring lets you know when things could be going wrong with your AWS application.

The other new feature is a new experiments tool called CloudWatch Evidently, which helps developers set feature flags and run A/B tests inside an application they are building on top of AWS. Rather than just updating an app for every user, developers may want to test it on a limited subset of users and see if the new feature breaks anything, or if users prefer a particular approach or design more.

They can limit who sees a new feature by setting a feature flag in the code and configuring the parameters for that feature. A/B testing works the same way: the feature is shown to a defined subset of users to see which version or design people prefer.
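A generic sketch of the deterministic bucketing that underlies this kind of feature flagging (this is not Evidently's actual API; the feature name and user IDs below are made up): hashing the user and feature together puts each user in a stable bucket, so the same user always gets the same answer.

```python
import hashlib

def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Deterministically bucket a user: the same user always gets the same
    answer, so a feature can be shown to a stable subset of users."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 100  # map the hash to a bucket 0..99
    return bucket < percent

# Roll a hypothetical "new-checkout" feature out to 10% of users
enabled = [u for u in (f"user-{i}" for i in range(1000))
           if in_rollout(u, "new-checkout", 10)]
print(len(enabled))  # roughly 100 of 1,000 users, and always the same ones
```

Hashing on both the feature name and the user ID also means different experiments get independent buckets, so one test's treatment group doesn't silently overlap another's.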

Neither of these is new either. Companies like Split.io have been doing more broad feature flag management for some time, and companies like Optimizely have been building companies around A/B testing.

CloudWatch Evidently is already available in 9 Amazon cloud regions with pay-as-you-go pricing, while CloudWatch RUM is also available now in 10 regions at a cost of $1 per 100,000 events collected.

79% more leads without more traffic: Here’s how we did it

In this case study, we’ll show how we used research-driven CRO (conversion rate optimization) techniques to increase lead conversion rate by 79% for China Expat Health, a lead generation company.


Before-and-after screenshots of the mobile version of ChinaExpatHealth after a marketing test.

Image Credits: Jasper Kuria

The mobile site view on the left labeled “before” is the control (the “A” version), while that on the right labeled “after” is the optimized page (the “B” version). We conducted a split test (aka an A/B test), directing half of the traffic to each version, and the result attained 95% statistical significance. Below is a description of the key changes made.

Used a headline with a more compelling value proposition

The headline on the control version is “Health Insurance in China.” If I am an expat looking for health insurance in China, at least I know I am in the right place but I don’t immediately have a reason to choose you. I have to scroll and infer this from multiple elements.

For revenue-generating landing pages it is best to always follow the Bauhaus design aesthetic (from architecture). Form follows function, ornament is evil!

The winning version instantly conveys a compelling value proposition: “Save Up to 32% on Your Health Insurance in China,” accompanied by “evidentials” to support this claim — the number of past customers and a relevant testimonial with a 4.5 star rating (by the way, it is better to use a default static testimonial rather than a moving carousel).

As the famed old-school direct response marketer John Caples taught us, “The reader’s attention is yours only for a single instant. They will not spend their valuable time trying to figure out what you mean.” What was true in Caples’ 1920s heyday is doubly so in the mobile age, when attention spans are shorter than a fruit fly’s!

Benchmark-backed Optimizely confirms it has laid off 15% of staff

Optimizely, a San Francisco-based startup that popularized the concept of A/B testing, has laid off 15% of its staff, the company confirmed in a statement to TechCrunch. The layoff impacts around 60 people, and those laid off were given varied levels of severance. Each employee was given 6 months of COBRA and was allowed to keep their laptops.

“As with so many other businesses globally, Optimizely has been impacted by COVID-19. Today, we have had to make a heartbreaking decision to reduce the size of our workforce,” Erin Flynn, chief people officer, wrote in a statement to TechCrunch, adding that “today’s difficult decision sets up our business for continued success.”

The startup was founded in 2009 by Dan Siroker and Pete Koomen on the idea that it helps to show customers different versions of a website (also known as A/B testing) to see which iteration sticks best. The startup went through Y Combinator in 2010, and in 2013 it signed a lease for a 56,000-square-foot office in San Francisco.

Optimizely last raised $50 million in Series D financing from Goldman Sachs, bringing its total venture capital secured to date to $200 million. Other investors include Index Ventures, Andreessen Horowitz and GV.

In June, Optimizely said it handles more than 6 billion events a day. Customers include Visa, BBC, IBM, Wall Street Journal, Gap, StubHub, and Metromile.

Optimizely was not listed as applying for a PPP loan, a program created by the government to help businesses avoid laying off staff. The loans were met with controversy in Silicon Valley, as some thought venture-backed businesses should turn to investors, instead of the government, for extra capital.

Optimizely’s layoffs are somewhat surprising given recent earnings reports that show that enterprise SaaS companies have broadly benefited from the coronavirus pandemic. In an online work world, infrastructure and software services become more vital by the day. Box, for example, helps people manage content in the cloud and it beat expectations on adjusted profit and revenue. So why is Optimizely struggling?

There are a ton of reasons for layoffs beyond what the market thinks about a business. Optimizely’s customers are a mix of heavy-hitters in enterprise, but also include businesses that have struggled during this pandemic, including StubHub and Metromile — both of which had layoffs.

While the pace of layoffs is slowing down, cuts themselves aren’t disappearing. As the stocks show us, it’s a volatile time and businesses are looking for ways to stay financially safe.

Product Distribution Models

One reason why product management is so complex is because context matters. Different product distribution models fundamentally change the way that products are built, validated, and launched. When I visualize the different product distribution models, I think of a spectrum from left to right:  On the left are internal products. These are typically platform products or internal productivity tools, and ... Read More

The post Product Distribution Models appeared first on Product Manager HQ.

Data-Driven Blunders and how to Avoid Them

“We are a data-driven company”. You’ll travel far and wide before you find a tech startup that doesn’t pride itself on this claim. It’s become a tech staple – alongside free beer on Fridays, table football, and all the fruit you could dream of. And, while the logic behind a data-driven approach is undeniable, too often the expectations that come with it aren’t met. This data-driven approach permeates events, dashboards, metrics, and reports, and leaves most of us feeling less like Neo at the end of The Matrix and more like a dog whose owner just hid a tennis ball after pretending to throw it – confused, our excitement transmuted into frustration so deep we feel like chewing on our favorite plush toy.

Let’s be clear, good tracking and hypothesis validation with data is essential for any product manager. The problem arises when we expect data to be the “secret sauce” that will immediately improve all aspects of our product, and that the answer to every question is always more (events, dashboards, tests). This borderline zealous belief in “data” as the answer to all our prayers is dangerous for many reasons.

For starters, any system (for example all our fancy algorithms, events, tests…) can’t accurately predict the outcomes of a more complex system than itself (that of us humans deciding what to buy and how). This is a principle of the universe, so there is little we can do other than attach a healthy amount of scepticism to our test results, and repeat them when possible.

On the bright side, sometimes we aren’t facing complexity problems, simply flawed practices. Thankfully there’s plenty we can do to identify and tackle these. There are probably as many flawed data practices as there are product managers in the world, but in my experience they can be roughly grouped into three categories.

Improper Testing

Have you ever run an A/B test and ended up with more questions than answers? More often than not, I see this happening as a byproduct of the sheer number of variables that are involved in human interactions. The more variables, the more data you need for a statistically significant result. A simple test to determine whether colour A or colour B has a higher conversion rate on a call to action (CTA) requires thousands of impressions to yield consistent results. Each added variable (shape of the CTA, copy, position) multiplies the number of impressions you’ll need, not to mention other metrics that might be affected: What if sign-up goes down even though conversion goes up for that particular CTA? Or conversion drops slightly for the other CTA but the size of the basket doubles? Adding a single extra layer of complexity to a test systematically raises the number of impressions necessary for statistical significance.
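The arithmetic behind "thousands of impressions" can be sketched with the standard two-sample size formula (the 5% baseline and one-point lift below are hypothetical):

```python
import math

def sample_size_per_variant(p_base, lift):
    """Approximate impressions needed per variant to detect an absolute lift
    in conversion rate at 95% significance (two-sided) and 80% power."""
    z_alpha, z_beta = 1.96, 0.84  # z-scores for alpha = 0.05 and power = 0.80
    p_avg = p_base + lift / 2      # average rate across the two variants
    n = 2 * (z_alpha + z_beta) ** 2 * p_avg * (1 - p_avg) / lift ** 2
    return math.ceil(n)

# Detecting a 1-point lift on a 5% baseline CTA conversion rate:
print(sample_size_per_variant(p_base=0.05, lift=0.01))  # → 8150 per variant
```

Halving the detectable lift roughly quadruples the required sample, which is why every extra variable (each with its own smaller expected effect) blows up the impressions needed.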

So we need to be extremely precise on what we are testing and the associated impacts we can foresee: If we suspect (and we should) that our CTA test might impact not just conversion, but also basket size and retention, we should include those metrics in our test reporting. We also need to create our test in a way that, should there be a clear winner, we are able to say with certainty what element exactly had the impact. If A is red, with rounded corners, and has different copy from B, which is blue and has square corners, you might get a clear winner, but you will have a hard time determining which of those three different elements delivered the impact.

Keep it simple and test one variable at a time. You will have to run multiple tests or create more than two variants, which might feel like a waste of valuable time, but compared to getting consistently unactionable results or worse, false positives, you’ll be happy you put in the extra effort. If the test you want to run requires too many variables then you’re probably better off doing some qualitative testing first (focus groups, user interviews, guerrilla testing…) and shaving off the excess variables before setting up an A/B test.

Be sure, also, to measure not simply the metric that you’re aiming to move, but also those that you think might be affected. But be careful to not overdo it, otherwise you might fall into the second deep data hole we all want to avoid.

Too Much Data, too Often

Imagine you are trying to lose a few kilos. You set yourself realistic goals and provide the means to reach them. You start to eat healthier, exercise at least three times a week, and vow to cut down on those midnight grilled-cheese sandwiches.

You could stay true to your new habits and only check the scales every week, giving yourself and your body enough time to adapt and start shedding weight consistently, and only making any changes after you’re sure of the impact.

Or, you could start weighing yourself multiple times a day, overreacting to even the slightest weight gain. You try to compensate by skipping meals, going out to run multiple times a day, only to then become too exhausted to do any sport for the rest of the week and binge eat snacks every time you skip a meal. By the end of the month you will weigh exactly the same as on day one, and you’ll be extremely frustrated at yourself for the wasted sacrifice.

Just like diets, product improvements often take time to have any visible effects. If you release a product or feature and start looking at the KPIs every waking hour, changing things on the fly when the KPI doesn’t immediately move in a positive direction, you will be acting against background noise and ruining any results by course-correcting against it. Oscillation of metrics (within reasonable margins) is perfectly normal and expected; there are lots of external factors that can have a small influence on any given KPI: weather, day of the week, bank holidays, available stock, website outages, discounts from the competition, seasonality…  Furthermore, users might need some time to adjust to the feature or learn how to use it, thus delaying the visibility of its impact.

Back to our CTA example: if conversion drops on the first day of testing, and you immediately add more variables to the test in an attempt to “fix” it, you will not only forfeit any measurable results for the test, you might be killing a change that would have made a positive impact later on. Sometimes users have a weird day, sometimes they need time to get used to change, sometimes you just get unlucky on the first few days and get the “bad” users on your test. Wait it out, check the data only when you might learn something from it, and check it at the level of granularity that makes the most sense, often weekly (if not even monthly) to prevent daily fluctuations from throwing you off. Very few metrics will be consistent on a daily basis and paying attention to background noise fluctuations will only be a waste of effort.
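The point about granularity can be illustrated with a quick sketch (the conversion rates below are made up): the daily values bounce around noisily, but weekly averages of the same data barely move.

```python
import statistics as stats

# Hypothetical daily conversion rates (%) over four weeks
daily = [3.1, 2.4, 3.8, 2.9, 3.5, 2.2, 3.6,   # week 1
         2.8, 3.3, 2.5, 3.9, 3.0, 2.6, 3.4,   # week 2
         3.2, 2.7, 3.7, 2.3, 3.1, 3.5, 2.9,   # week 3
         2.6, 3.4, 3.0, 2.8, 3.6, 2.5, 3.3]   # week 4

# Aggregate the same data into weekly averages
weekly = [stats.mean(daily[i:i + 7]) for i in range(0, len(daily), 7)]
print([round(w, 2) for w in weekly])
print(stats.stdev(daily) > stats.stdev(weekly))  # daily noise dwarfs weekly drift: True
```

Reacting to any single daily value here would mean chasing noise; the weekly series is what can actually tell you whether the metric moved.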

Picking the Wrong North Star Metric

A North Star metric is the single KPI that works best as a proxy for success for your product and business. North Star metrics are popular because they avoid the problem of having too many data points to measure. A single KPI that you know gives you an indication of success means you don’t need to dig deeper into other KPIs. As long as that North Star is pointing in the right direction, you will save time and cognitive load for you and your team.

But North Star metrics are not easy to set. Products are complex and multifaceted and it’s hard to stick to a single overarching metric. On top of that, truly good North Star metrics are often lagging, and that interferes with their purpose of serving as a simplified proxy to understand daily if your business is going well or not. So, while a metric like Customer Lifetime Value might be a much better North Star than Conversion to Order, for example, many companies and PMs tend to take the latter: conversion is easy to grasp, it’s a good enough proxy for success, and it’s real-time. Conversely, Customer Lifetime Value can be very complex and require months before it settles, but it is a far better proxy for success than conversion.

Let’s be clear, there’s no perfect North Star metric: Any single metric will be unable to grasp all the ins and outs of even the most simple of businesses. But there are better and worse North Star metrics. To start, the metric should be paired with the value you bring to your customers. If you’re bringing true value to users then all other metrics will fall into place. Additionally, your North Star shouldn’t be easily “hacked”: if you can artificially boost a metric then it’s not as good a proxy of success as you thought. There are a few ways of boosting conversion while having a horrible impact on the business: fill your website with dark patterns, make every button a CTA to confirm your purchase, give discounts to customers to the point of making them unprofitable, halt your acquisition ads and let only organic users reach your website, sign your customers up automatically (and inconspicuously) for a monthly subscription…  All these initiatives would earn you a bonus if attached to raising conversion, but they would also probably bring your company down and you with it. While there may be ways to raise Customer Lifetime Value that have negative side effects, they’re probably fewer than for Conversion to Order.

As an alternative to using your North Star as the single go-to daily metric, try to select a few “health” metrics within your product that you check daily. Remember that slight fluctuations are normal and that other departments might be running initiatives (like offering discounts or reducing acquisition) that influence them. Keep your North Star as a proxy for the long-term health of your product, not as an indicator of its current status. As with our diet example, your North Star could be your weight: it’s a good indicator of your general success, but it takes time to move and is influenced by many factors. The amount of exercise you’ve done that week, along with how healthily you’ve been eating, would be the daily KPIs you can check and course-correct on.

In Summary

We live in a time of abundance of data and data processing capacity, but even if we can add an event to every single action in our app or website, and create a dashboard for every single KPI in our business, we are still limited by our own cognitive capacity.

With data, quality is even more important than quantity. You only have the capacity to process and draw conclusions from a finite amount of data, so make sure it’s the bit that will help you improve your product and get a better understanding of your users, not the bit that will make you feel like your human made the tennis ball vanish mid-throw one more time.

The post Data-Driven Blunders and how to Avoid Them appeared first on Mind the Product.