Why am I not getting the expected benefits from using AI in marketing?

Interest in artificial intelligence (AI) is not waning. Almost every day brings announcements of new breakthroughs in the field. The public is most excited about the achievements of so-called generative AI, which are the most spectacular and impressive: talking to a computer in natural language, or a computer painting pictures or creating a movie from a script it is given, appeals to the imagination. However, it is worth remembering that AI also includes more mundane models which, operating in the background and without attracting as much attention, play an important role in many business processes, including marketing. Sometimes, however, they do not deliver the expected benefits, and their performance can be disappointing. What mistakes contribute to such situations, and what can be done to avoid them?

Problem worded incorrectly

Sometimes a fundamental problem arises at the very beginning of an AI project: a clear disconnect between the actual business need and the definition of the problem the AI team sets out to solve. For example, a marketing team has the goal of reducing the number of departing customers. It intends to do so with a special campaign involving coupons with attractive discounts. Naturally, the budget for this activity is limited. As a first step, the team wants to identify the customers most at risk of churn.

So it commissions the AI unit to develop a model that estimates the probability of leaving for each customer. The AI team does its job brilliantly and builds a model with very high prediction accuracy. The marketing department decides to use the model and qualifies those with the highest probability of leaving for the campaign until the budget is exhausted. The campaign runs. Quite a few at-risk consumers stay. Everyone has the feeling of a job well done and a budget reasonably spent. But was the budget really used optimally? Could something have been done better? It turns out that yes.

Instead of the probability of leaving, one could predict the chance of a positive reaction to the campaign. A seemingly minor difference, yet it could yield dramatically better results. The variant used qualified those most at risk. Among them, however, were people who could not be persuaded to stay by any action. These are very often precisely the people at the top of the at-risk list: frustrated with customer service, disappointed with the quality of the product, already looking for an alternative supplier for some time. By spending the budget on these consumers, those at slightly lower (but still high) risk were left out, even though they were more likely to change their decision thanks to the campaign. Part of the budget was wasted on trying to convince those who could not be convinced, while the opportunity to convince those whose decision could still be influenced was missed. A better, more precise definition of the problem in the context of the expected business effect would have made it possible to benefit much more from the opportunities offered by AI algorithms.
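
To make the distinction concrete, below is a minimal, purely illustrative sketch of a two-model (“T-learner”) uplift approach in Python: instead of ranking customers by churn probability, they are ranked by the estimated lift in retention that the coupon would cause. The file names, the column names (treated, stayed) and the budget of 10,000 coupons are assumptions for the example, not details of the project described above.

```python
# Minimal uplift sketch: rank customers by how much the coupon changes their
# probability of staying, not by how likely they are to leave.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

pilot = pd.read_csv("retention_pilot.csv")   # hypothetical historical pilot with a control group
features = [c for c in pilot.columns if c not in ("customer_id", "treated", "stayed")]

treated = pilot[pilot["treated"] == 1]
control = pilot[pilot["treated"] == 0]

model_t = GradientBoostingClassifier().fit(treated[features], treated["stayed"])
model_c = GradientBoostingClassifier().fit(control[features], control["stayed"])

current = pd.read_csv("current_customers.csv")   # hypothetical customers to score
uplift = (model_t.predict_proba(current[features])[:, 1]
          - model_c.predict_proba(current[features])[:, 1])

# Spend the limited budget on those whose decision the action is most likely
# to change, rather than simply on those most likely to leave.
selected = current.assign(uplift=uplift).nlargest(10_000, "uplift")
```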

Inappropriate measures of success

In many situations, adopting the wrong indicators can lead to wrong decisions. It can also lead to abandoning tools that actually work. One thriving Polish company had a custom-built predictive AI system that recommended, for each consumer, a personalized offer to include in a mailing. The company had a very broad product portfolio and many competing offers. The system was designed to select a communication that was relevant to the consumer and at the same time maximized the potential profit. It was also meant to avoid bombarding the consumer with too many messages. The main concern was that “spammed” customers would opt out of receiving mailings, and the company boasted that the “unsubscribe” rate remained very low. Under intense pressure to deliver sales results, however, managers began to see limiting the number of messages as an obstacle to achieving their goals. They assumed that increasing the number of messages sent would bring more sales, and their only concern was a possible increase in the unsubscribe rate.

A quasi-experiment was conducted: the number of messages was increased while sales and unsubscribe rates were observed. Sales increased and no increase in the unsubscribe rate was observed. This encouraged further increases in the number of messages, until maintenance of the aforementioned AI tool was abandoned altogether. The company thus took a step backward: the model was discarded in favor of “expert” qualification of consumers for communications. The unsubscribe rate, which remained stable, kept decision-makers convinced that the number of mailings, provided it stayed within, as they put it, “the limits of common sense,” did not discourage customers. The voices of the data science team, which tried to convince them to take a broader view of the problem, were ignored. A schoolboy mistake was made.

Managers ignored the clear downward trend in the open rate. The advanced model was discarded and written off as an unnecessary cost. They failed to consider that consumers may become saturated to the point where they start ignoring messages from that sender: they stop opening them and, as a result, do not even bother to click the “unsubscribe me” link. The possibility of maintaining a long-term relationship, and of generating profits from the communication in the future, was sacrificed for a short-term sales effect.

Mistakes in communication between marketing and AI teams

The common denominator of the two situations described above is, in fact, the lack of adequate communication between the marketing team and the AI team. In the first case, more information could have been given to the AI team about the business objective and context (including budget constraints) of the project. This would have allowed a more adequate definition of the problem and a fuller exploitation of the possibilities offered by advanced modeling, which in turn would have translated into better budget utilization and higher ROI. In the second case, more weight should have been given to the concerns raised by the AI team about the definition of the problem and the measure adopted. This would have avoided the wrong decision, costly in the long run, to return to old methods and reject the potential of AI.

The success of the project and the full realization of the AI opportunity requires good communication and interaction between marketing experts and AI experts. Avoiding the following mistakes can help achieve this:

  • too broadly defined business objective (“we want to reduce the number of departing customers” is several levels of detail short of what is needed),

  • vague definitions of fundamental concepts (sometimes it is a challenge to define what it means that a customer has left),

  • failure to define the context and actual business objective in the brief given to the AI team,

  • concealment by the marketing team from the AI team of deficiencies in understanding the specifics and capabilities of AI solutions,

  • related excessive expectations of the project’s results, or recognition in advance that AI cannot help solve a given marketing problem,

  • the AI team’s concealment from the marketing team of deficiencies in understanding of marketing issues and the project context,

  • the related limitation of the AI team to a literal interpretation of the brief provided,

  • the use of industry “newspeak”,

  • excessive focus on technicalities at the expense of the AI team’s loss of business perspective and the real purpose of the project.

Summary

The examples cited in this article are just the tip of the iceberg. Some readers may see both situations as simple, even schoolboy mistakes. That’s fine: it means they are already at a higher level of understanding of the specifics of AI projects, although pitfalls lurk there as well. For others, even these two examples may be eye-opening, prompting them to reflect and look for similar problems in their own projects. That’s a good thing, too: it means they are taking another important step toward more fully realizing the potential that lies in marketing applications of AI. In any situation, it is important to remember that applying AI successfully requires good communication and cooperation between the data science/AI team and the marketing team.

How to get out of the RFM trap

(R)ecency, (F)requency, (M)onetary value is a classic of marketing analysis. We all know it. Many of us use it. Each of us understands it. But do we really? RFM analysis undoubtedly has many advantages, which is why it has been in use for so many years. However, it is also worth knowing its disadvantages and limitations, so that we use it properly and do not try to solve problems with it that it cannot solve.

RFM in practice

The RFM approach is a customer segmentation method that is based on three main indicators:

  • Recency (last purchase): determines how long ago the customer last made a purchase.
  • Frequency (frequency of purchase): measures how often a customer makes purchases.
  • Monetary (value of purchases): determines how much the customer spends during each purchase.

Assuming we have data on customer transactions, we calculate the value of the three indicators (R, F, M) for each customer. Then we divide the values of each indicator into groups (for example, 3), where group I is the top 1/3 of customers in terms of a given indicator, group II is the middle 1/3, and group III is the weakest 1/3. For better understanding, let’s consider the example of a particular customer. This customer last made a purchase 15 days ago, which is quite recent. He buys on average twice a month, which is a high frequency for this consumer base. His average receipt, however, is a mere £50, which is a very low value compared to other customers.

Thus, our example customer is in the top group in terms of R and F and in the weakest group in terms of M, so he can be labeled R-1, F-1, M-3. With this division, a consumer can end up in 1 of 27 segments (3R x 3F x 3M = 27). It is worth noting that each indicator can also be divided into more than 3 groups; it all depends on how many RFM segments you want to obtain. As mentioned earlier, dividing into 3 gives 27 segments. Dividing into 4 gives 64 segments, into 5 gives 125, into 6 gives 216, and dividing the range of each of the 3 variables into 10 groups gives as many as 1000 segments.
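
As an illustration only, here is a minimal sketch of the tercile-based RFM scoring described above, written in Python with pandas. The transactions file and its column names are hypothetical.

```python
import pandas as pd

tx = pd.read_csv("transactions.csv", parse_dates=["order_date"])   # hypothetical input
snapshot = tx["order_date"].max()

rfm = tx.groupby("customer_id").agg(
    recency=("order_date", lambda d: (snapshot - d.max()).days),   # days since last purchase
    frequency=("order_date", "count"),                             # number of purchases
    monetary=("amount", "mean"),                                   # average receipt value
)

# Group 1 = best third, group 3 = weakest third for each indicator.
rfm["R"] = pd.qcut(rfm["recency"].rank(method="first"), 3, labels=[1, 2, 3])   # low recency is best
rfm["F"] = pd.qcut(rfm["frequency"].rank(method="first"), 3, labels=[3, 2, 1])
rfm["M"] = pd.qcut(rfm["monetary"].rank(method="first"), 3, labels=[3, 2, 1])

# 3 x 3 x 3 = 27 possible segments, e.g. "R1-F1-M3" for the example customer above.
rfm["segment"] = ("R" + rfm["R"].astype(str) + "-F" + rfm["F"].astype(str)
                  + "-M" + rfm["M"].astype(str))
```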

What advantages does RFM have?

Among the advantages of the RFM approach we can point out:

  • Simplicity and ease of interpretation – this is due to the small number of variables (three), the simple to explain manner in which the analysis is carried out and, consequently, the intuitive interpretation of the resulting segments.
  • Relative speed of conducting the analysis.
  • No need for specialized software.
  • Relatively low requirements as to knowledge of statistical methods.

What does the simplicity of RFM analysis lead to?

Unfortunately, the simplicity listed among the advantages is also the source of one of the most significant drawbacks of the RFM approach. Limiting the number of dimensions to 3 (R, F, M) makes the results easier to interpret, but at the same time narrows the resulting consumer profile. Summarizing a consumer’s transactional activity with 3 numbers is often an oversimplification. It can give a very distorted (even falsified) picture of the customer. The following example illustrates this.

The diagram shows four customers with noticeably different buying patterns, but identical characteristics in terms of last purchase, frequency of purchase and seniority (first purchase).

The same problem also applies to the value of spending. A customer who always spends £50 and a customer who spends once £1 and once £99 will have the same average receipt value.

Another limitation of the RFM method is that it ignores non-transactional aspects of consumer behavior, such as interactions with marketing communications directed at the consumer, contacts with the company’s various touchpoints (e.g., complaints, calls to the call center) or demographic characteristics (e.g., in some industries, frequency and spending may change with age).

Among the limitations of RFM, its focus on history is also worth mentioning. RFM summarizes consumer buying behavior over a given period of time and helps to segment customers based on their past transactions. However, RFM says nothing about their future behavior. By itself, it has no predictive power.

There are various variations and extensions of the RFM model. New variables are added (e.g., RFD, RFE, RFM-I) or the way they are calculated is modified, with the goal of overcoming the limitations mentioned earlier. However, they do not change the fundamental problem: the attempt to describe complex consumer behavior with a few aggregated numbers.

What is the RFM trap?

Thus, RFM can become a trap primarily when:

  • it is the main (or even the only) tool we use to analyze and plan communication strategies and activities.
  • RFM results are overinterpreted, i.e., conclusions are drawn from them for which the method provides no basis.

In the first case, we omit an important part of the consumer data that could be used, we work on the basis of a very narrow picture of the consumer, and we risk combining within a single segment consumers with completely different behavior patterns.

The second case is primarily about using RFM to predict future consumer behavior, especially at the micro level. This can manifest itself, among other things, in the assumption that a consumer who achieves the highest values of the RFM indicators will remain in the best “segment” in the future. This, of course, may be true. However, it is not determined solely by past purchases, but by many other factors that RFM, as a rule, does not take into account.

What are the alternatives to RFM?

As we noted earlier, the main limitations of using the RFM approach boil down to:

  • An oversimplified view of the consumer and thus a high risk of segmentation error.
  • Lack of predictive capabilities for consumer behavior.

Multidimensional segmentations based on machine learning are excellent for solving the first problem. The number of factors they take into account can be almost unlimited. Segments are defined algorithmically, and consumers are assigned to them in the same way. The cost of greater detail is a slightly more difficult interpretation; however, an experienced analyst can visualize the segments in such a way that they are easily understood by managers.
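
For illustration, a minimal sketch of such an algorithmic segmentation, assuming a pre-built table of numeric behavioral features per customer (the file and its columns are hypothetical), could look like this with k-means clustering:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

customers = pd.read_csv("customer_features.csv", index_col="customer_id")   # numeric features only

X = StandardScaler().fit_transform(customers)            # dozens of variables, not just R, F, M
kmeans = KMeans(n_clusters=8, n_init=10, random_state=42).fit(X)

customers["segment"] = kmeans.labels_
# Profiling the segments (e.g. the mean of each feature per segment) is what
# makes them interpretable for managers.
profile = customers.groupby("segment").mean()
```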

The prediction problem is best solved with dedicated methods and algorithms for building predictive models. Such models are not limited to the analysis of historical data; they recognize specific patterns of behavior that allow predictions about the future. Such prediction can be carried out at the level of a single consumer (rather than a segment) and allows full personalization of actions. Among the specific methods worth mentioning here are behavioral sequence models based on deep learning.
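
To make the idea of a behavioral sequence model tangible, here is a purely illustrative Keras sketch: each consumer is represented as a sequence of recent event codes (purchase, site visit, complaint, and so on) and the network predicts the probability of a purchase in the next period. The random placeholder data, the event vocabulary and the architecture are all assumptions for the example, not the specific solution of any company mentioned here.

```python
import numpy as np
import tensorflow as tf

n_events, seq_len = 50, 30                                  # hypothetical vocabulary and history length
X = np.random.randint(0, n_events, size=(1000, seq_len))    # placeholder event sequences
y = np.random.randint(0, 2, size=(1000,))                   # placeholder "bought next month" labels

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=n_events, output_dim=16),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
model.fit(X, y, epochs=3, batch_size=64, verbose=0)

p_next_purchase = model.predict(X[:5])   # per-consumer probabilities, usable for personalization
```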

Summary

The well-known and popular RFM model can be a very useful tool, provided that it is properly interpreted and used with an awareness of its limitations. It is a great tool for getting a “bird’s eye view” of the consumer base. However, when you want a more detailed picture and a prediction of future consumer behavior, you should turn to tools specialized in solving such problems.

What can I do to make customers more willing to read my emails? A guide to effective email marketing

When Ray Tomlinson, an American engineer and programmer, sent the first-ever email message in late 1971, he could not have realized what applications his invention would find. He certainly would not have thought that someone would use this type of message to convince others to buy their products. Yet, more than 50 years after that first message, email marketing remains one of the most important channels of marketing communication. The challenge, however, is maintaining the effectiveness of this channel. AI tools can help with this. Before using them, it is worth asking where to start.

Effective email marketing – how to get started?

I know that what I’m about to write, especially read out of context, will sound trite, but sometimes it is worth going back to the fundamentals. And the fundamental principle of email marketing can be summarized like this: if the customer doesn’t open the email, he won’t know what we wanted to communicate to him, and there will be no chance that he performs the action we wanted to convince him to take. The email marketing adventure should therefore start with getting the customer to open our mailing at all. Meanwhile, marketers know from their open-rate statistics that most emails are usually ignored or even deleted without being opened. Why?

In our considerations, we will omit messages that are clearly spam. If I don’t know the sender’s address, didn’t sign up to receive such emails, or they look suspicious, the smartest thing I can do is delete them as soon as possible. We are therefore interested in all the other emails and in the answer to the question of why recipients don’t read them. According to a report by SARE, among the main reasons for deleting emails without opening them, recipients indicate:

  • I get too many messages from one sender (31.9% of responses),
  • the title is not interesting (33.4%).

These two reasons account for more than 65% of all cases. Interestingly, and positively from the point of view of those responsible for marketing communications, the sender has influence over both of these factors and is able to better target and personalize email marketing efforts. So it would seem that it is enough to send fewer messages to customers’ inboxes and write more interesting titles, and the return on investment will be higher. Simple, right? Unfortunately, we all know that it is not quite the case. I would even venture to say that in many cases it will simply be difficult.

Running mailing campaigns. How to improve their effectiveness?

For the first reason, the difficulty lies in determining how much is “too much.” Is once a week too much? Or is even once a day not too much? Or maybe for one user three times a week is too much, while for another even four times a week is still fine? Or maybe… Well, that is exactly it. The possible scenarios are practically endless; it is impossible to list them all, let alone test them. On top of that, we should not assume that a customer’s interest and patience are constant over time. AI models based on machine learning may be the only way to untangle this problem. Based on historical and constantly incoming new data, they are able to make highly accurate predictions and optimize the appropriate sending frequency for each individual consumer. In doing so, they constantly improve and adapt to changes in consumer expectations, and they are able to catch even very subtle signals of “overheating” of target groups and recommend reducing the pace. Someone may say: all this looks complicated and probably expensive to implement and maintain, so we had better be careful and simply send emails less often. Such caution is understandable, but it is not the optimal strategy. With customers inclined to open messages more often and respond with a purchase, you lose a large portion of potential revenue this way and fail to use the potential of your contact base.
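
As a greatly simplified illustration of the idea (not of any production system), the sketch below trains a classifier on the send history to estimate the probability that a consumer opens a message given, among other things, how many messages they have recently received, and then compares candidate weekly frequencies for a single consumer. Every file, column and feature name here is invented for the example.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

sends = pd.read_csv("historical_sends.csv")   # hypothetical: one row per (consumer, message) sent
features = ["msgs_last_7d", "days_since_last_open", "avg_open_rate", "tenure_months"]

clf = GradientBoostingClassifier().fit(sends[features], sends["opened"])

# Compare candidate weekly frequencies for one consumer and keep the one with
# the highest expected number of opens per week.
candidate = pd.DataFrame({
    "msgs_last_7d": [1, 2, 3, 4, 5],
    "days_since_last_open": 10,
    "avg_open_rate": 0.22,
    "tenure_months": 18,
})
p_open = clf.predict_proba(candidate[features])[:, 1]
expected_opens = candidate["msgs_last_7d"] * p_open
best_weekly_frequency = int(candidate["msgs_last_7d"].iloc[expected_opens.values.argmax()])
```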

Let’s move on to the second problem: the uninteresting title. I can already hear the voices in some readers’ heads: “He is about to tell us something about testing and AI, but we already do testing without AI, and a lot of it.”

SARE’s report, cited earlier, shows that about 84% of senders conduct tests before sending campaigns. However, fewer than 17% use A/B/X testing. One of the companies I have been in contact with, for example, tests the message title. Three variants of the title are prepared. Then a group of about 15% of the contact base is randomly selected and divided into three equal parts, each of which receives one title variant. The variant that achieves the highest open rate in the pilot mailing is then sent to the remaining 85% of recipients. So we have testing, we have segmentation, we have optimization; all seems fine. However, the question that comes to mind is: what did you actually test, and what question did you answer? Did you really choose the best variant of the title, or only the best among the three proposed? How do you know there are not 20 other variants, each better than the winner of the three tested? We don’t know, and even if we wanted to test more title variants, we would end up with groups too small to give reliable results. So what can we do to make the title in email marketing effective and translate into higher conversions?
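
For comparison, the pilot procedure described above can be written down in a few lines. The sketch below is purely illustrative: the contact base is synthetic, and send_mailing() is a hypothetical stand-in for the actual sending and open-tracking step.

```python
import numpy as np
import pandas as pd

def send_mailing(group: pd.DataFrame, subject: str) -> pd.DataFrame:
    """Hypothetical stand-in for the ESP call that sends the mailing and later
    returns per-recipient open tracking; here it only simulates random opens."""
    rng = np.random.default_rng(abs(hash(subject)) % 2**32)
    return group.assign(opened=rng.random(len(group)) < 0.2)

contacts = pd.DataFrame({"email": [f"user{i}@example.com" for i in range(10_000)]})   # synthetic base

pilot = contacts.sample(frac=0.15, random_state=1)    # ~15% pilot group
rest = contacts.drop(pilot.index)                     # remaining ~85%

shuffled = pilot.sample(frac=1, random_state=2)
third = len(shuffled) // 3
groups = [shuffled.iloc[:third], shuffled.iloc[third:2 * third], shuffled.iloc[2 * third:]]

variants = ["Title A", "Title B", "Title C"]
open_rates = {}
for title, group in zip(variants, groups):
    results = send_mailing(group, subject=title)
    open_rates[title] = results["opened"].mean()

winner = max(open_rates, key=open_rates.get)   # best of the three tested, not necessarily the best possible
send_mailing(rest, subject=winner)
```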

Email marketing vs. Artificial Intelligence

Yes, some readers have surely guessed: now it is time to write about AI. A model based on deep learning, which accumulates data about currently and historically sent messages, is able to predict the response and estimate the most likely open rate of any title. It will even work for a title you have never sent before. It really will. And if in such a case the model is a little less sure of its prediction, it will let you know. Just remember that the title alone is not enough information. Going back to the fundamentals (or clichés, as you prefer), it matters not only what we say, but to whom we say it, and even when we say it. The same message can be understood completely differently: one recipient will be pleased and another offended.

So an AI model that understands the title of the message but is detached from the context is not enough if it is deprived of information about the recipient groups, the characteristics of the customers, the history of the company’s relationship with them, their purchase history, and the moment of sending. What is needed is a system that integrates data about these phenomena and provides the AI model with the appropriate context. The general scheme can be seen below.

With a properly defined, trained and calibrated model, we can test different variants of titles and obtain information like in the examples below. The number of variants can be arbitrary, as can the number of segments. And best of all, we don’t have to send a single email to our recipients to run the test and estimate the expected open rate: everything can be conducted as a computer simulation.
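
The sketch below illustrates only the shape of such a simulation: a scoring function standing in for the trained model evaluates every combination of candidate title and recipient segment, so titles can be compared per segment without sending a single email. The titles, segments and the stand-in scoring logic are all invented for the example.

```python
import itertools
import pandas as pd

def predict_open_rate(title: str, segment: str) -> float:
    """Hypothetical stand-in for the trained open-rate model, which in reality
    combines a representation of the title text with segment context features."""
    return (hash((title, segment)) % 1000) / 1000 * 0.4   # fake, deterministic-per-run score

candidate_titles = ["Last hours of the sale", "Your personal offer inside", "New arrivals for spring"]
segments = ["new customers", "loyal customers", "at-risk customers"]

simulation = pd.DataFrame(
    [{"title": t, "segment": s, "expected_open_rate": predict_open_rate(t, s)}
     for t, s in itertools.product(candidate_titles, segments)]
)

# Best title per segment according to the simulated open rates.
best = simulation.loc[simulation.groupby("segment")["expected_open_rate"].idxmax()]
print(best)
```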

Conducting effective email marketing – summary

When asked many years later what the content of the first email ever was, Ray Tomlinson said he couldn’t remember. The most important thing was that the message arrived at the recipient’s address; the content did not matter at all. In marketing communications, the exact opposite is true. The mere arrival of an email is not enough. The content of the email is important, and so is an interesting title, because without it a significant portion of recipients will never read the content.

3-fold increase in conversions due to targeting of mailings based on predictive model

Today’s consumers are constantly inundated with messages from various brands. Many brands send multiple messages, through multiple channels. This makes it difficult to attract and keep the consumer’s attention for a long time. At the same time, it is easy for the consumer to become tired of the communication and pay less and less attention to it. Thus, it becomes more important than ever to choose the right content, to send the most tailored message to the consumer, and to limit messages that are not interesting and only increase the risk that the consumer will become insensitive to the message.

Predictive modeling helps to solve this problem. Systems based on machine learning are able to predict, with a high degree of accuracy, consumer interest in a particular type of message or offer. This article uses a concrete and recent example (from May 2023) to show how to apply such tools in practice. To maintain confidentiality, the numbers we present are scaled or shown as indexes; however, they faithfully represent the observed differences and effects.

The problem with the traditional approach to email targeting and the need for change

The organization in this example, like many others, had for many years followed a strategy of “maximizing revenue” from its communication base through broad and frequent mailings. In practice, information about an offer was sent to all consumers who had given permission to communicate through a given channel. In a few cases, expert criteria were used to narrow the communicated base somewhat, but these were simple rules such as: has bought the promoted product before, has not bought product X in the last 6 months, is a woman over 55, and so on. The results were very good for a long time, and no one saw the need to change the process. At some point, however, a slow decline in the email open rate began to be observed, and the downward trend became pronounced. Combined with the declining number of newly acquired consumers, this led the organization to ask whether it was possible to work better with the existing base. What could be done to reverse the trend of declining interest in the communications being sent?

The decision was made to test the integration of machine learning and predictive analytics into the process of selecting consumers for mailing campaigns. We prepared a predictive modeling system that generates “tailor-made” scoring models for each campaign. The general architecture of the system is shown in the diagram below.

Diagram: architecture of the predictive modeling system

Use of predictive modeling in mail targeting

For the purpose of training the model, more than 100 variables from the areas listed in the diagram were used as input data. The model is built with advanced algorithms able to cope with such a multitude of attributes and extract from them as much information as possible about the actual profile of the consumer. The final result is an estimate, for each consumer, of the probability of interest in a given communication, which is then used for the final selection of consumers for the campaign.
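
For illustration, the scoring step could look roughly like the sketch below: a classifier trained on past campaign responses (in the real system, on 100+ input variables) estimates each consumer’s probability of interest, and the campaign is restricted to the highest-scoring consumers. The file names, column names and the 0.30 threshold are assumptions for the example, not the actual implementation.

```python
import pandas as pd
from sklearn.ensemble import HistGradientBoostingClassifier

history = pd.read_parquet("past_campaign_responses.parquet")   # hypothetical training data
features = [c for c in history.columns if c not in ("customer_id", "responded")]

model = HistGradientBoostingClassifier().fit(history[features], history["responded"])

base = pd.read_parquet("eligible_consumers.parquet")           # hypothetical base with the same features
base["p_interest"] = model.predict_proba(base[features])[:, 1]

# Final selection: only consumers above a score threshold enter the campaign.
selected = base[base["p_interest"] >= 0.30].sort_values("p_interest", ascending=False)
```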

The results of the changes in the targeting process met (and in some respects exceeded) expectations. To prove the usefulness of the model, we conducted an experiment. Half of the base was selected the old way, while the other half was selected using the model’s predictions. It should be noted that both groups received exactly the same emails: the same subject line and exactly the same creative, sent at the same time. Therefore, none of these factors could have affected the results of the experiment; the only difference between the groups was the way consumers were selected.
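
A minimal sketch of such a holdout design, continuing the hypothetical names from the previous snippet, might look as follows; the expert rule used for the “old way” half is invented for the example.

```python
import pandas as pd

base = pd.read_parquet("scored_consumers.parquet")   # hypothetical base with expert-rule columns and p_interest

# Random 50/50 split of the eligible base.
old_half = base.sample(frac=0.5, random_state=7)
model_half = base.drop(old_half.index)

# Old expert criteria (illustrative rule) vs. model-based selection.
target_old = old_half[old_half["bought_product_x_before"] == 1]
target_model = model_half[model_half["p_interest"] >= 0.30]

# Both targeted groups receive the identical creative at the same time; conversion,
# average receipt, open rate and CTOR are then compared between the two groups.
```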

Effects of using predictive modeling

In the group targeted with the model, it was possible to reduce the size of the communicated group by nearly 14 times: for every 100 consumers contacted under the traditional criteria, only 7 were contacted under the process based on the predictive model.

At the same time, such a small group generated similar (only about 2% lower) sales.

This was achieved thanks to a significantly higher (3 times) conversion rate in the group assigned to the campaign in the new way, as well as a much higher (4 times) average receipt value in that group.

Narrowing the communicated group made it possible to limit it to those genuinely interested in the offer. This is evidenced by a much higher open rate (3.2 times higher) and click-to-open rate (almost 2 times higher). The click-to-open rate is calculated here as CTOR = LC/LO, where LC is the number of consumers who clicked on the link in the email and LO is the number of consumers who opened the email. While the open rate depends heavily on the subject line of the email, a higher CTOR indicates actual interest in the content and offer included in the email.
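
As a quick numeric illustration of how the two rates relate (the numbers below are made up, not the campaign’s actual figures):

```python
delivered, opened, clicked = 10_000, 2_400, 430   # illustrative counts only

open_rate = opened / delivered    # 24.0% -> mainly driven by the subject line
ctor = clicked / opened           # ~17.9% -> interest in the content and offer itself
print(f"open rate = {open_rate:.1%}, CTOR = {ctor:.1%}")
```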

Targeting mailings based on predictive model – summary

By using an advanced data science tool in the form of a predictive model, it was possible to achieve:

  • better matching of communications to consumer interests and needs
  • a significant reduction in the number of communications in a given campaign with minimal damage to the sales result (just over 2%)
  • reduction of communication “overload” – the consumer will receive communication less frequently but it will be better tailored in the new process

The exact impact of the model and the new targeting process on the trend of open and click-through rates can only be studied over a longer period and requires at least several months of observation. However, the first recorded results look promising and give reason to expect a reversal of the clear negative trend seen in the months before the introduction of the scoring model.

Finally, it is worth noting that an advantage of the system is the openness of its architecture to new data sources. If new variables become available, they will be automatically incorporated into the model training process and used for prediction. Another important feature of the described solution is the model’s ability to update itself as new data arrives, including data on executed campaigns and their effectiveness. As a result, the model automatically adapts to the changing needs and behavior of consumers and their reactions to the communications sent. This guarantees the usefulness of the system over the long term as well.