I have no more than a mathematics A-Level and am certainly not a developer, but I thought it worth writing a very simple blog post to discuss what machine learning brings to the party when compared with plain old segmentation.


Segmentation can be extremely simple, often using only one data point. A website may serve different content to different nationalities, or perhaps different content for men as opposed to women. The reasons for this are obvious – there may be different delivery details or pricing depending on where you live, or gender-specific categories and products.

Every marketer is also familiar with segmentation as a means of optimising email campaigns. Rather than blast everybody with the same message, marketers might switch things up based on attributes such as recency of purchase, frequency of purchase, and monetary value of purchases (traditional RFM analysis).
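To make the manual nature of this concrete, here is a minimal sketch of RFM segmentation in Python. The purchase log, the segment names and every threshold are invented for illustration; the point is only that a person has to choose each rule by hand.

```python
from datetime import date

# Hypothetical purchase log: (customer_id, purchase_date, amount)
purchases = [
    ("alice", date(2017, 5, 28), 120.0),
    ("alice", date(2017, 4, 2), 80.0),
    ("alice", date(2017, 1, 15), 45.0),
    ("bob", date(2016, 11, 3), 30.0),
    ("carol", date(2017, 5, 30), 15.0),
]

def rfm_table(purchases, today):
    """Return {customer: (recency_in_days, frequency, monetary_total)}."""
    table = {}
    for customer, when, amount in purchases:
        days_ago = (today - when).days
        recency, frequency, monetary = table.get(customer, (days_ago, 0, 0.0))
        table[customer] = (min(recency, days_ago), frequency + 1, monetary + amount)
    return table

def segment(recency, frequency, monetary):
    """Hand-written rules; the thresholds are arbitrary illustrations."""
    if recency <= 30 and frequency >= 3:
        return "loyal_high_value" if monetary >= 200 else "loyal"
    if recency <= 30:
        return "new_or_returning"
    return "lapsed"

rfm = rfm_table(purchases, today=date(2017, 6, 1))
segments = {customer: segment(*values) for customer, values in rfm.items()}
print(segments)
```

With this made-up data, alice lands in "loyal_high_value", bob in "lapsed" and carol in "new_or_returning". Each extra rule means another email variation that somebody has to write.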

Using paid social media, such as Facebook ads, marketers may take things a bit further, taking advantage of Facebook’s abundance of data to target, for example, men on the East coast of America over the age of 50 with household earnings above $200,000 and with an interest in sailing.

When segmentation of an audience is taken to its logical conclusion, too much data becomes a problem. The more segments you create, the more difficult it becomes to understand the relationship between each variable, and exactly what the success or failure of a particular campaign means.

The more segments are manually assigned, the more content variations have to be decided upon. The point is that segmentation is often very effective, but it is a manual process with little sophistication.

Collaborative filtering

Collaborative filtering is a type of recommender system that can be used for product or content recommendation. At a simple level it looks at relationships between users and products/content and uses heuristics (assumptions) to predict what users will like. In the simplest terms – a user will like a product that a similar user likes, or a user will like a product that is similar to a product they have already liked.
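As a rough sketch of the "similar user" heuristic, the Python below scores products a user has not rated by looking at what similar users thought of them. The ratings are invented, and cosine similarity over shared items is just one common way of measuring how alike two users are; real systems use many variations.

```python
from math import sqrt

# Hypothetical ratings (1-5); a missing key means "not rated".
ratings = {
    "ann":  {"kettle": 5, "toaster": 4, "blender": 1},
    "ben":  {"kettle": 5, "toaster": 5, "blender": 1, "juicer": 2, "grill": 5},
    "cara": {"kettle": 1, "toaster": 2, "blender": 5, "juicer": 5, "grill": 1},
}

def cosine(a, b):
    """Cosine similarity between two users, using only items both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def recommend(user, ratings):
    """Rank items the user has not rated by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their_ratings)
        for item, rating in their_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {i: scores[i] / weights[i] for i in scores if weights[i] > 0}
    return sorted(predicted, key=predicted.get, reverse=True)

print(recommend("ann", ratings))
```

Ann's tastes match Ben's far more closely than Cara's, so the grill that Ben loved is ranked above the juicer that Cara loved.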

These types of models don’t have to be particularly sophisticated, and in the past may not have strayed into machine learning territory.

However, they can be more sophisticated. A latent factor model is another method of filtering – using many factors (possibly hundreds) about users and products in order to explain user actions (e.g. purchases/ratings).

This method is useful because it can use implicit user feedback, looking at past browsing behaviour for example, and does not need users to state explicitly what they like (through ratings, for example) in order to make predictions about them.

The algorithms involved in latent factor models are examples of machine learning.
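For the curious, here is a toy sketch of one latent factor approach: matrix factorisation trained by stochastic gradient descent. Everything in it is an assumption made for illustration: the tiny table of page views, the use of only two factors, the learning rate and the other settings. Production systems treat implicit feedback with more care than simply calling a non-view a zero.

```python
import random

# Toy implicit-feedback table: rows are users, columns are products.
# 1 = the user viewed the product, 0 = did not, None = held out to predict.
views = [
    [1, 1, 0, 0],
    [1, None, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

def factorise(matrix, n_factors=2, epochs=2000, lr=0.05, reg=0.01, seed=0):
    """Learn a short vector of hidden factors for every user and every product."""
    rng = random.Random(seed)
    n_users, n_items = len(matrix), len(matrix[0])
    users = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in range(n_users)]
    items = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in range(n_items)]
    observed = [(u, i, matrix[u][i]) for u in range(n_users)
                for i in range(n_items) if matrix[u][i] is not None]
    for _ in range(epochs):
        for u, i, target in observed:
            error = target - sum(a * b for a, b in zip(users[u], items[i]))
            for f in range(n_factors):
                pu, qi = users[u][f], items[i][f]
                # Nudge both vectors to shrink the error, with a small penalty on size.
                users[u][f] += lr * (error * qi - reg * pu)
                items[i][f] += lr * (error * pu - reg * qi)
    return users, items

def predict(users, items, u, i):
    """Predicted affinity is the dot product of the user and product vectors."""
    return sum(a * b for a, b in zip(users[u], items[i]))

users, items = factorise(views)
print(predict(users, items, 1, 1), predict(users, items, 1, 2))
```

The second user never told us anything about the second product, yet the model should score it well above the third product, because the factors it has learned group that user with the first user, who viewed it.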

Content analysis

Going further down the route of machine learning, and perhaps the easiest part to understand for the layman (me), is the method of analysing products or content to give them meaning (semantics).

Advances in natural language processing (a much-publicised product of machine learning and deep learning) mean that content can be explored for explicit and implicit meaning. This is a kind of contextual analysis – very simply, if the content talks about pugs, this is linked to dogs, and other parts of that phrase or paragraph may be interpreted in a doggy context.

Fairly obviously, the ability to find meaning in unstructured content will be helpful in recommending things that a user may like.
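A very crude sketch of the pugs-and-dogs idea follows. The concept map is hand-made and hypothetical; the whole point of modern natural language processing is that such links are learned from data rather than typed in. Still, it shows how attaching a broader meaning to a word lets two pieces of content match even when they share no words.

```python
from collections import Counter
from math import sqrt

# Hypothetical hand-made concept map; a real system would learn these links.
CONCEPTS = {
    "pug": "dog", "pugs": "dog", "labrador": "dog", "puppy": "dog",
    "kitten": "cat", "tabby": "cat",
}

def concept_vector(text):
    """Count the words, plus the broader concept each known word implies."""
    words = text.lower().split()
    counts = Counter(words)
    counts.update(CONCEPTS[w] for w in words if w in CONCEPTS)
    return counts

def similarity(a, b):
    """Cosine similarity between the concept-enriched word counts of two texts."""
    va, vb = concept_vector(a), concept_vector(b)
    dot = sum(va[term] * vb[term] for term in va)
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

dog_match = similarity("pugs need short walks", "labrador puppy training")
cat_match = similarity("pugs need short walks", "tabby kitten toys")
print(dog_match, cat_match)
```

The pug article and the labrador article have no words in common, but both pick up the "dog" concept and so score above zero, while the kitten article scores exactly zero.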

Implicit semantics make use of wider learning, allowing characteristics of a product to be implied by looking at third-party information. If a product title includes a reference to James Bond, for example, theoretically a recommender system could analyse the internet (e.g. Wikipedia) for context around James Bond, and incorporate this information into recommendations (Aston Martin, secret agents, Sean Connery etc.).

Semantics are also important in user behaviour – knowing what page of a particular third-party website a user browsed isn’t that useful unless you can understand the content on that page, for example. All this, as you might have gathered, is the kind of technology that search engines have employed for some time now.

Sentiment analysis relies on this kind of semantic analysis too, and it is a big area for new marketing solutions that learn. If you want to optimise an email subject line, your algorithms need to look for the sentiment behind particular words, phrases and sentences, in order to model open rates based on the emotions they evoke.

It’s not just natural language processing that is of interest here. Computer vision means that marketers can potentially use recommender systems that look at the visual affinity between images (e.g. product photographs) and perhaps eventually video (e.g. movies) if Google’s progress is anything to go by.

So, is this the study of everything?

There’s a question to be raised here. If machine and deep learning keep jumping forward and we are able to find meaning in every bit of information and every user action, won’t that make marketers obsolete and everything automated, fully optimised and democratic?

Though big data analysis seemingly has no limits (marvel at the ability of Google to diagnose retinopathy effectively), it’s important to note that efficient recommender systems are not simply the study of everything. There still needs to be an understanding of what it is that is most likely to inform marketing success.

Let me quote at length Quora user Ethan Macdonald, who articulates this well in the context of a logistic regression algorithm (used in machine learning):

‘I think one key thing to recognize is the importance of selecting good features for the task at hand. There are an infinite number of features you could use to describe a user, but some of them are more useful abstractions for the classifier and the task.

‘For example, I could describe a user by the precise location of every molecule in their body, and encode this information as a set of features where one location is one molecule. If I feed this to a logistic regression algorithm the computation will probably be prohibitively slow and the final classifier would likely be useless. On the other hand, I could instead encode the user as features by more important abstractions: their gender, hair color, whether or not they wear glasses, etc.

‘..One key thing to notice is that the features in the first example can not be easily extended to describe multiple users: each molecule is encoded as its own feature and people don’t generally share molecules. Features like “has brown hair”, on the other hand, can be used to describe many different users — even users that have never been encountered in the training set. In a standard machine learning classification task, we want to learn the best setting of the parameters that allows us to use the information in the features to generalize to examples that we have not seen before.’

In short, recommender systems cannot draw meaning from everything. They will always be an extension of the sort of common sense logic we use in very basic segmentation – finding factors (even if latent) that explain the success of a message/product within a particular audience.
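The point in the quote about shared features can be sketched with a bare-bones logistic regression in Python. The users, the two features and the labels (whether somebody clicked on an ad for lens cleaner) are all made up, and the training loop is a plain textbook version rather than anything a production system would use.

```python
from math import exp

# Hypothetical training users described by shared features:
# (has_brown_hair, wears_glasses) -> 1 if they clicked the ad, else 0
training = [
    ((1, 1), 1),
    ((0, 1), 1),
    ((1, 0), 0),
    ((0, 0), 0),
]

def probability(weights, bias, features):
    """Logistic regression: squash a weighted sum of features into 0..1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + exp(-z))

def train(examples, epochs=500, lr=0.5):
    """Fit the weights with simple stochastic gradient descent."""
    weights, bias = [0.0] * len(examples[0][0]), 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = probability(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias

weights, bias = train(training)

# Two people who were never in the training data, described by the same features.
new_user_with_glasses = (0, 1)
new_user_without_glasses = (1, 0)
print(probability(weights, bias, new_user_with_glasses))
print(probability(weights, bias, new_user_without_glasses))
```

Because "wears glasses" is a feature that many people share, the model can say something sensible about a person it has never met. Had each user been described by their own unique set of molecules, there would be nothing to carry over.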

In summary…

I know Econsultancy’s readership includes some heavyweight AI experts and so I am prepared to be called out on some of what I have written above, but it’s what I have drawn from a layman’s reading of some of the literature.

The easiest way to think about machine learning in personalisation is not as a magical, giant leap away from plain segmentation, but simply as a statistically powered refinement of those same instincts, also incorporating some kind of feedback and optimisation.

See, easy.

And if you don’t want to take my shaky word for it, why not attend Econsultancy and Marketing Week’s event about AI in marketing? Supercharged takes place in London on 4 July 2017.