The usefulness of general-purpose LLM-powered chatbots is still a subject for debate. ChatGPT and Google Gemini are described as ‘doomed’ in a recent article in Intelligencer by John Herrman. The thrust of the piece is that these chatbots are ‘unscoped’ (indeed, the interface of ChatGPT hints at ways you might use it) and that this causes issues if people view these tools as a source of objectivity, able to answer any question, despite the limitations of their data or models. Far better to use this type of tech in specialised applications such as customer service, says the author (“…a chatbot for everyone and everything is destined to become a chatbot for nobody and nothing”).

For customer service, the chatbot UI is at least well-established, though the error which earlier this year allowed one customer to successfully prompt the chatbot of parcel delivery firm DPD to swear and compose self-critical poetry shows that these specific applications are not impervious to problems.

In this article, I’ll look at some use cases for LLMs within ecommerce UX, such as AI-powered search and shopping assistants, as well as road testing some brand examples and investigating what might be best practice in this area.

GenAI + GUI (going beyond the chat box)

Stepping back a little, AI chat is clearly not the solution to every problem in business, something my colleague Rebecca Sentance reflected on in her post about the ‘copilot era’, asking “Does every user experience need a conversational assistant?”

This “rush towards AI chat and conversational UI”, as Nielsen Norman Group VPs Sarah Gibbons and Kate Moran describe it in their blog post, has thrown up examples of good and bad use cases. The VPs question the value of LinkedIn’s AI-Powered Premium Experience, which suggests follow-up questions you may want to ask at the bottom of LinkedIn posts, and then opens a chat window displaying the answer. For Gibbons and Moran, this produced content that “did not add much,” with the authors describing this use case as a solution to an “arguably non-existent” issue.

Besides finding a problem that is in need of fixing, a good rule of thumb, according to Will Grant, Gartner analyst (and one-time author of Econsultancy’s UX Best Practice Guide), is that “the best AI experiences often go beyond the chat box, incorporating graphical user interface (GUI) elements to enhance user interaction and engagement.” “In fact, for many use cases, chat is the worst interface of all,” he adds.

Grant points to good examples of blended GenAI and GUI elements in SaaS software such as Adobe Firefly and Salesforce Einstein Copilot (see his LinkedIn post for screenshots).

AI chat as time-saver (and complexity killer?)

One of the most exciting prototypes I’ve read about is GOV.UK Chat, which – given the volume of detailed and technical content on the GOV.UK website (700,000+ pages) – seems like it could eventually complement the excellent work that Government Digital Service (GDS) has pioneered in content design.

GOV.UK Chat prototype. Credit: GOV.UK

Writing on the GOV.UK blog in January, Director Chris Bellamy says the first experiment from the new GOV.UK AI Team “was to see if a LLM-powered chatbot can reduce complexity, save people time and make interactions with government simpler, faster and easier.”

The chatbot was designed to respond to user questions “in the style of GOV.UK, based only on published information on the site.”
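To make “based only on published information” a little more concrete, here is a minimal sketch of the general pattern: retrieve matching pages first, answer only from those, and decline when nothing matches. This is not how GOV.UK Chat is built. The `Page` class, the example paths and the word-overlap scoring are all assumptions made for illustration, and a real system would hand the retrieved pages to an LLM rather than quoting them.

```python
import re
from dataclasses import dataclass


@dataclass
class Page:
    url: str   # hypothetical path, for illustration only
    text: str


def tokens(text: str) -> set[str]:
    """Lowercase word set; a crude stand-in for a real retrieval index."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, pages: list[Page], min_overlap: int = 2) -> list[Page]:
    """Return pages sharing at least `min_overlap` words with the question, best first."""
    q = tokens(question)
    scored = [(len(q & tokens(p.text)), p) for p in pages]
    hits = [(score, p) for score, p in scored if score >= min_overlap]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in hits]


def answer(question: str, pages: list[Page]) -> dict:
    """Answer only from retrieved pages, and always list the sources used."""
    sources = retrieve(question, pages)
    if not sources:
        # Declining is safer than letting a model improvise an answer.
        return {"answer": None, "sources": []}
    # A real system would pass `sources` to an LLM; here we just quote the top page.
    return {"answer": sources[0].text, "sources": [p.url for p in sources]}


# Made-up pages, not real site content.
PAGES = [
    Page("/renew-passport", "Renew your passport online or by post"),
    Page("/council-tax", "Pay your council tax bill online"),
]
```

Asking `answer("How do I renew my passport?", PAGES)` returns the passport page as its single source, while an off-topic question such as “What is the meaning of life?” gets no answer and an empty source list. Returning the sources alongside the answer is what lets the interface show them to the user.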

You can see from the video below how the chatbot includes a prominent dropdown showing source pages for the information presented in the chat, allowing the user to go further.

 

The findings of this first experiment make for interesting reading. A majority (70%) of users surveyed said that the responses were useful, but the GOV.UK blog highlights “known issues associated with the nascent nature of this technology,” adding that, “Overall, answers did not reach the highest level of accuracy demanded for a site like GOV.UK, where factual accuracy is crucial.” Hallucinations were also observed in response to some “ambiguous or inappropriate queries”.

GDS are iterating on this chatbot, and have other AI use cases in the works. Martin Lugton, Head of Product at GDS, in his blog post, ‘What’s next for digital government and Government as a Platform?’, discusses the value of AI for creating better services and shares an image of another prototype, this time for service designers, which uses “a large language model to suggest improvements to make a form question easier to read.”

Clear use cases for LLMs in online retail

AI has been around for years in the advertising and retail industries, powering ad auctions and product recommendations, with even relatively useful (though narrowly focused) chatbots in use for the best part of a decade, such as Bank of America’s Erica, unveiled in 2016.

However, the recent improvement in LLMs and the evolution of cloud-based martech look set to democratise more sophisticated functionality.

For example, head to the website of data and AI platform Databricks (founded in 2013 and last year making $1.6bn in revenue) and you’ll find a ‘Solution Accelerator’ page detailing how retail customers can use “pre-built code, sample data and step-by-step instructions” to take advantage of LLMs for use cases such as:

  • enhancing product search;
  • building an LLM-enabled chatbot;
  • automating product review summarization;
  • and building common sense product recommendations.

Look at Amazon’s AI experiments from the last year, too, and you’ll see similar themes – weeding out fake reviews, helping sellers write product listings, and a chatbot.

Some brand examples of LLM ecommerce experiences

For this article, I thought I would investigate some recent frontend uses of LLMs, such as product discovery tools and LLM chatbots, to see how easy/fun/useful they seem.

Rodo – car marketplace with new ‘AI search’

Rodo is an ecommerce platform that allows its dealer partners to sell and lease vehicles to customers. Last month the marketplace launched the beta version of an AI-driven search interface, which you can ask questions of naturally, as you would a dealership salesperson (see my screenshots below).

The automotive sector seems ripe for this kind of experiment, given the purchasing journey often necessitates lots of research undertaken on third-party publisher websites. An ecommerce ‘search ideas’ experience that helps the customer discover the right car (as well as the right price) is an intriguing concept.

Avoiding the complexity of faceted search

Nathan Hecht, CEO of Rodo, highlights the complexity of faceted search when sifting through thousands of car listings online, saying in a press release, “Conventional interfaces burden users with menus, dropdowns, toggles, and an overwhelming array of choices.” Whilst I agree with this sentiment, it’s also the case that some users will want more choice than others, and this is one of the difficulties in being fairly prescriptive with any kind of AI search results or even chat. Rodo circumvents this issue by offering the option to go back to ‘classic search’, as well as including calls to action further down the page to explore either new or used cars, which again returns a familiar filtered page of listings.

The Rodo beta is intended to allow users to ask questions such as “I’m looking for the best deal on a vehicle for a large family,” and get “immediate results with all relevant discounts and rebates factored into a great monthly price.” Interestingly, Hecht says the tool “mimics the in-store interaction of a customer and a salesperson” without using “an impersonal ‘bot’ or ‘chat’ style interface”. This makes the point that often users don’t like chat interfaces if they only serve to make a search process more laborious/longwinded, or if they (despite their LLM) live within the slightly uncanny valley.
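As a rough illustration of the idea (and certainly not Rodo’s actual implementation), the sketch below turns a plain-language query into the kind of filters a shopper would otherwise set by hand, then sorts by monthly price. The listings, the `RULES` table and the simple phrase matching are all hypothetical; in a real product an LLM would do the interpreting.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    model: str            # made-up listings, not real Rodo data
    seats: int
    monthly_price: int    # discounts and rebates assumed already applied


# Hypothetical phrase-to-filter rules; a real system would have an LLM produce these.
RULES = {
    "large family": lambda car: car.seats >= 7,
    "convertible": lambda car: "convertible" in car.model.lower(),
}


def search(query: str, listings: list[Listing], limit: int = 4) -> list[Listing]:
    """Apply every rule whose phrase appears in the query, then sort by monthly price."""
    active = [rule for phrase, rule in RULES.items() if phrase in query.lower()]
    matches = [car for car in listings if all(rule(car) for rule in active)]
    return sorted(matches, key=lambda car: car.monthly_price)[:limit]


LISTINGS = [
    Listing("Hatchback A", seats=5, monthly_price=250),
    Listing("SUV B", seats=7, monthly_price=420),
    Listing("Minivan C", seats=8, monthly_price=390),
]
```

With these made-up listings, the ‘large family’ query quoted above returns Minivan C and then SUV B, cheapest first. A query that triggers no rules simply returns everything sorted by price, which is roughly the ‘classic search’ fallback.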

I had a go with Rodo and found it a very quick and enjoyable experience.

In this first screenshot you can see how the search field offers suggestions to the user, and these cycle every second or so, and are nicely varied.

Image showing Rodo’s suggested searches. Source: Rodo.com

Don’t cringe too hard, but I started by asking about cars for cool dads. And the results pretty much hit the mark. Four cars were returned – the ‘best match’ was a VW Golf in a hero block at the top with a carousel of great imagery, and the other suggestions included a convertible Mini and a Dodge Charger. Pretty solid ‘cool dad’ territory.

Response to my search for great cars for cool dads using Rodo’s AI-powered search. Source: Rodo.com

I tried testing the LLM a little more and asked about a car with “great build quality” but that “won’t cost the Earth”.

The results were again very useful (a Mazda CX-30). My only slight quibble was with the otherwise nice lightbox explaining why this was the best deal: it ticked the box for the competitive-price part of my query, but the other information was more about practicality than quality. But, hey ho, maybe great build quality isn’t one of the selling points within the features database.

‘Why is this the best deal?’ – a rationale for my returned search results on Rodo’s AI-powered search. Credit: Rodo.com

For their part, the team at Rodo seem upfront about the fact that this beta will still have plenty to learn. Head of Product Daniel Buxbaum wrote on LinkedIn last month: “What we’ve built in under 2.5 months, from design to production, is far from perfect. It’ll get answers wrong some of the time, just like most AI/LLMs. The vehicle comparisons are a bit dirty. The results take a while to load. It’s very much a beta; “v1” of many.”

Zero previous knowledge of AI

What is fairly astonishing though is that Buxbaum, in paying tribute to the Rodo team for making the beta happen, writes that they combined their own dataset with “an existing LLM, and with *zero* previous knowledge of AI.” This shows where the market is currently with AI (advancing rapidly), and why so many are excited about what it could mean for digital products.

Automotive is an interesting sector for software in general, given the industry is getting to grips with connected cars (read more about the transformation in our recent interview with Valtech Mobility).

LLMs have made their way into vehicles, as you’d expect, with Volkswagen, for example, presenting cars at CES 2024 with ChatGPT integrated into their voice assistants (planned for production in Q2). Given the question we explored earlier of unscoped chatbots, it’s interesting that the VW press release says the new chat function will not only be used to control the infotainment, navigation, and air conditioning, but also to answer general knowledge questions.

I’m torn between thinking this would be a dream to help educate my kids, or a nightmare of conversational dead ends. But the likely reality is that ChatGPT will help voice assistants understand a wider range of commands.

As The Verge’s Andrew J. Hawkins writes, as it stands, “Most vehicle voice assistants are [currently] pretty rote, able to do things like turn on seat heaters or window defrosters — but lack conversational skills and typically fall short of more complex navigational requests. False positives and the need to vocally repeat instructions are common.”

Rufus, the expert shopping assistant from Amazon

Rufus, Amazon’s ‘expert shopping assistant’, launched in beta last month, and the announcement summarised its functionality, saying customers can:

  • Learn what to look for while shopping product categories.
  • Shop by occasion or purpose.
  • Get help comparing product categories.
  • Find the best recommendations.
  • Ask questions about a specific product while on a product detail page.

I haven’t used Rufus, as it’s only been rolled out for a small set of users, but TechCrunch has an excellent review. The author asks, “What are the best Valentine’s Day gifts for gay couples?” and is pleasantly surprised to see some LGBTQ+ related items in the product listings. However, Rufus did make some missteps, such as returning a women’s vest in a search for men’s leather jackets and, notably, returning results when asked negative questions such as “worst gifts for parents”.

How does chat work with promoted products?

Given Amazon’s reputation for user experience and the fact that it has already employed chat interfaces successfully for customer service, Rufus seems like it could be a valuable addition. How it deals with promoted products will be interesting, given that arguably some customers might turn to chat to cut out the work of narrowing down their options, which is where Amazon’s Choice or sponsored products would traditionally play a role.

Another interesting dynamic to watch out for with Rufus will be how it deals with more subjective categories such as fashion. In Europe, however, Amazon is a popular place to search for unbranded basics, so in this sense, there may not be much call for Rufus to design whole outfits for an important occasion.

Layla – Instagram chatbot and app for travel inspiration and booking

The Layla app and Instagram chatbot were launched late last year by Beautiful Destinations, a strategy, creative and content studio in the world of travel that boasts some large audiences on social media platforms (26m on Instagram, for example).

Jeremy Jauncey, founder of Beautiful Destinations, told TechCrunch that the journey from travel inspiration to booking via advice can be made less complex, across fewer websites (similar to the point made earlier about car buying). Instagram has played a key role in this regard, and Layla can not only be opened on that platform but also includes content from creators across the Beautiful Destinations network, before pulling information on hotels and flights from partnerships with Booking.com and Skyscanner (all done with easy-to-use GUI elements beside the chat thread).

I asked Layla (using the web app) about “somewhere peaceful in Northern France I can drive to off the ferry”, without realising the engine only returns flight results (and not other forms of travel). I was slightly surprised that although the chatbot picked up on the ferry element, and referenced it when recommending Normandy, it didn’t explain that it only enabled flight booking but just went right ahead. Perhaps this is due to the fact that the LLM is chiefly parsing queries and displaying answers (as per TechCrunch), before handing off to Layla’s own recommendation engine.
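For what it’s worth, the missing explanation looks like a small check at the handoff stage. The sketch below is purely hypothetical and says nothing about how Layla is actually built: `SUPPORTED_MODES`, `MODE_KEYWORDS` and both functions are invented for illustration, with keyword matching standing in for the LLM’s parsing of the query.

```python
SUPPORTED_MODES = {"flight"}  # assumption: the booking engine handles flights only

# Hypothetical keyword map standing in for the LLM's query parsing.
MODE_KEYWORDS = {"ferry": "ferry", "drive": "car", "train": "train", "fly": "flight"}


def parse_modes(query: str) -> set[str]:
    """Pick out any travel modes the user mentions."""
    words = query.lower().split()
    return {mode for keyword, mode in MODE_KEYWORDS.items() if keyword in words}


def handoff(query: str) -> dict:
    """Build the request for the recommendation engine, plus a notice if needed."""
    unsupported = parse_modes(query) - SUPPORTED_MODES
    notice = None
    if unsupported:
        # Tell the user up front rather than silently returning flights.
        notice = (
            "Heads up: I can only book flights, not "
            + ", ".join(sorted(unsupported))
            + " travel."
        )
    return {"query": query, "notice": notice}
```

Run against my Normandy query, this would flag both the drive and the ferry before any results appear, while a query that mentions no other form of transport would pass through with no notice at all.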

The best bit was undoubtedly when the bot served up a useful and detailed itinerary including visits to Bayeux, Mont Saint-Michel and Giverny, with lots of dining suggestions for breakfast, lunch and dinner.

Packed itinerary for Northern France courtesy of Layla. Credit: Justasklayla.com

Chat can be tiring

It may be a matter of taste, but I didn’t particularly enjoy the longer chat elements of the experience. You can see an example below: Layla’s first response to me was a paragraph about Normandy. While none of the copy or information was uncanny or inaccurate, I would have preferred (much like the Rodo interface) a choice of two or three destinations, with some bullet points for each, rather than feeling like I had an interlocutor. Each business will make a call on whether they wish to have a chatbot with its own face and personality.

In the end, I think I got a little tired of Layla. Is too much inspiration a bad thing? I may be missing the point here, as by logging in the user can build a bucket list of favourite destinations in the Layla app, and so the tool is very much a companion to your typical Instagram travel scrolling. And was the Layla app preferable to browsing travel blogs and online travel agents? Yes, without a doubt. In fact, I could see the value of an app like Layla for a big hotel chain or airline – to increase direct bookings, allowing me to explore everything a particular brand can offer.

Example of Layla’s LLM-powered responses, which felt a little longwinded for me. Credit: Justasklayla.com

Carrefour – charming chatbot offering recipes

French retailer Carrefour announced in June 2023 three ways it would be working with GPT-4, one of which is a shopping chatbot on carrefour.fr.

The chatbot, named Hopla, is available on the home page and can help users choose products based on budget, menu ideas or food constraints. Notably, the bot can give you a recipe and then show you a list of products ready to add to your basket.

I liked the fact that Hopla is a little cartoon egg (not a fake human) and that the bot is really clearly marked as ‘in training’, ‘an experiment’ and ‘still learning’. The bot also shares a privacy policy and asks for consent from the user to process their data.

Carrefour’s Hopla assistant making clear it is in testing, and offering its privacy policy. Credit: carrefour.fr

I asked Hopla what ingredients I would need for a croque monsieur and it quickly provided an answer. I thought it slightly strange that Hopla, after offering me a recipe as well as the option to view the products, only gave me a button response for one of these options (‘buy the products’). It was simple enough to type the word recipe into the chat field, and Hopla duly responded, but it left me wondering if the lovable little bot would rather I move to checkout.

Hopla chatbot offering the user ingredients and recipe for their suggested dish. Image: carrefour.fr

When I viewed the products, I hit a very nice landing page with bread, Emmental and butter featured. The zero results returned for ‘ham’ were possibly down to my using Hopla with the help of my browser’s translation function.

The recipe Hopla provided was a nice surprise, delivered in the chat but also with the option of downloading a PDF of the instructions at the click of a button. You can see below that the recipe document wasn’t perfect (missing a list of ingredients, for example), but this feature alone is a solid reason for using Hopla to help with, for example, a click and collect order where you may want to quickly add ingredients for a new or special meal.

Personally speaking, I would use this kind of feature, possibly instead of published recipes (note that the Carrefour website has recipes on traditional web pages, too).

Recipe PDF provided by Hopla on the Carrefour website. Image credit: carrefour.fr

Carrefour is one of several retailers and grocers testing or rolling out this sort of tech. Instacart last year launched a ChatGPT-powered search tool, and Walmart announced an AI-powered search feature for iOS at the CES trade show, which allows shoppers to ask for products that might match party themes, for example.

Conclusion 

There are lots of obvious hygiene factors to get right with AI-powered search and chatbots in ecommerce – privacy, speed (in returning results but also not waffling) and a mixture of UI elements beyond chat.

But my main takeaway from using the tools covered in this post is that if ecommerce sites can take care of product inspiration and research for the customer, then that could be a big timesaver – not just by generating ideas but by doing away (in some cases) with lots of sorting and filtering on listings pages. The only issue is perhaps how willing the average ecommerce brand will be to potentially distract customers from purchasing with ‘too much’ inspiration, and how important it is to offer the right level of choice to shoppers in the UI – too many products or too few and they may go elsewhere.

Feature image: Shutterstock