3. URL structure

A common problem for ecommerce sites is system-generated URLs that contain strings that mean nothing to users.

This is often because the platform generates internal references for site components, for example a numerical code for a product category, and includes them in the URL by default unless they are overwritten with an optimised URL structure.

Your platform may well support an SEO-friendly URL in addition to the system-generated one. You need to make sure that the SEO-friendly URL is the one being served to users, sitemaps and search engines.

The table below is an example of how you can map out the URL structure for your website. It’s important to map out all different page types and create a consistent URL structure.

This also has an SEO benefit as it ensures you are using contextually relevant URLs.

| Page type | URL structure |

Another common issue with URLs is the indexation of URLs that you don’t want in the search index.

For example, a product page may have session IDs appended to its URL, which creates a potentially unlimited number of versions of the same page.

If these URL parameters aren’t identified and managed effectively, the end result is content duplication for search and cluttered analytics reports where multiple URLs exist instead of a single version.

This makes data analysis and reporting complicated and unnecessarily time consuming.

One client had more than 100,000 such URLs showing in their Webmaster Tools account, and they had been sitting there for more than 12 months.

It really pays to keep a close eye on your indexation status and be proactive in addressing issues as they arise, otherwise you risk diluting your SEO efforts by clogging the index with irrelevant URLs.

A note of caution: be very careful when filtering out parameters so you don’t block valid URLs from being indexed. If you don’t know what a parameter does, get someone more technical to evaluate for you.
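As a rough illustration of handling parameters deliberately rather than blocking them wholesale, here is a minimal Python sketch that collapses URL variants into a single version, for example when cleaning up analytics exports. The parameter names in IGNORED_PARAMS and the function name canonical_url are assumptions made for this example; your platform will have its own list, and every entry should be checked with someone who knows what the parameter does.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that never change the page content.
# Only add a name here once you are sure what it does.
IGNORED_PARAMS = {"sessionid", "phpsessid", "jsessionid"}


def canonical_url(url: str) -> str:
    """Return the URL with ignored parameters removed and the rest sorted."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    kept.sort()  # the same parameters in a different order give one URL
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(canonical_url("http://www.mysite.com/category?sessionid=abc123&county=hampshire"))
# http://www.mysite.com/category?county=hampshire
```

Parameters that are not on the list are left untouched, so an unknown parameter is never dropped by accident.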

Google Webmaster Tools is a useful starting point to identify URL parameters. Go to the Crawl > URL parameters report. You’ll notice the helpful warning “Use this feature only if you’re sure how parameters work. Incorrectly excluding URLs could result in many pages disappearing from search”.

Key tips:

  • Create a URL taxonomy that defines how each page type URL is structured.
  • Identify URL parameters that are generated by the website and use the relevant tools to block ones that relate to pages you don’t want indexed by search engines.
  • Use hyphens (“-”) to separate words rather than underscores (“_”), and never use slashes (“/”) as word separators, because a slash denotes a new level in the URL path.
  • For URL parameters generated by faceted navigation, use “key-value” pairs rather than non-standard encoding, e.g. http://www.mysite.com/category?county=hampshire&sid=123
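The tips on word separators and key-value pairs can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names slugify and category_url, and the example category and facet values, are hypothetical and not taken from any particular platform.

```python
import re
from urllib.parse import urlencode


def slugify(text: str) -> str:
    """Lower-case the text and join its words with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def category_url(base: str, category: str, facets: dict) -> str:
    """Build a category URL with facets as standard key=value pairs."""
    url = f"{base}/{slugify(category)}"
    query = urlencode(sorted(facets.items()))  # sorted for a stable order
    return f"{url}?{query}" if query else url


print(slugify("Men's Running_Shoes"))
# men-s-running-shoes
print(category_url("http://www.mysite.com", "Garden Furniture", {"county": "hampshire"}))
# http://www.mysite.com/garden-furniture?county=hampshire
```

Sorting the facets means the same filter combination always produces the same URL, which helps avoid duplicate versions of a page.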

4. Data formats

Where do you begin?

There are many different data types that help a website work. These include:

  • URL structure (see above).
  • Catalogue naming convention.
  • Product numbering structure.
  • Order codes.
  • Returns & refund codes.
  • User personal data.

… to name but a few.

You need to understand what format the data needs to be captured in to enable back-end processing, such as order management, financial reporting and reconciliation.

There’s no single right answer for how to do this; indeed, most clients I work with take a different approach. But there is a methodology you can follow.

Let’s use the example of defining data tables for the checkout, one of the most important processes on a transactional website.

There are four steps to think about:

  1. Map out each data field that is required. Streamline this, because the less data entry there is, the lower the chance of drop-out.
  2. Define the type of data that is being captured, e.g. is it text only, or is it a drop-down field where the user can’t enter any data and must select from the list?
  3. Define data formatting requirements, e.g. what is the maximum number of characters that the data field can support?
  4. Set whether the field is required or not, i.e. does the user have to enter a value into this field to progress?


| Data field | Data type | Data formatting |
| | | 2 parts: (i) house number (valid numerical value only max 4 digits) or house name (text max 30 characters). (ii) postcode (valid UK postcode formats only). |
| | | As above. |
| | Drop down. | List of all card types from the database table for payment methods. No typing from user, can select only 1 option. |
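To make the four steps concrete, here is a minimal Python sketch of how the field definitions might be held as data and used for validation. Everything named here is an assumption for illustration: the FieldSpec class, the field labels, the two card types, and the postcode pattern, which is a simplified approximation rather than the full set of valid UK formats.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Simplified approximation of UK postcode formats (an assumption).
UK_POSTCODE = r"[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}"


@dataclass(frozen=True)
class FieldSpec:
    name: str
    data_type: str                    # "text", "number" or "dropdown"
    required: bool = True
    max_length: Optional[int] = None
    pattern: Optional[str] = None
    options: tuple = ()

    def errors(self, value: str) -> list:
        """Return the problems found in value (an empty list means valid)."""
        value = value.strip()
        if not value:
            return [f"{self.name} is required."] if self.required else []
        problems = []
        if self.data_type == "number" and not (value.isascii() and value.isdigit()):
            problems.append(f"{self.name} must contain digits only.")
        if self.max_length is not None and len(value) > self.max_length:
            problems.append(f"{self.name} must be at most {self.max_length} characters.")
        if self.pattern and not re.fullmatch(self.pattern, value.upper()):
            problems.append(f"{self.name} is not in a recognised format.")
        if self.data_type == "dropdown" and value not in self.options:
            problems.append(f"{self.name} must be one of the listed options.")
        return problems


HOUSE_NUMBER = FieldSpec("House number", "number", max_length=4)
HOUSE_NAME = FieldSpec("House name", "text", max_length=30)
POSTCODE = FieldSpec("Postcode", "text", max_length=8, pattern=UK_POSTCODE)
# Hypothetical options; in practice these come from the payment methods table.
CARD_TYPE = FieldSpec("Card type", "dropdown", options=("Visa", "Mastercard"))


def house_errors(value: str) -> list:
    """Treat an all-digit entry as a house number, anything else as a name."""
    spec = HOUSE_NUMBER if value.strip().isdigit() else HOUSE_NAME
    return spec.errors(value)


print(house_errors("12345"))
# ['House number must be at most 4 characters.']
print(POSTCODE.errors("so23 9lj"))
# []
```

Keeping the rules as data rather than scattering them through form code makes it easier to review them with the people who own order management and reporting.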


Refunds & returns is another interesting data challenge. One client I worked with years ago created a major reconciliation headache because they generated refund codes that weren’t related to the original order.

The refund code had a different structure to the order code and there was no flag in the database to associate the two. This created a big manual mess that took a lot of time (not to mention opportunity cost) to resolve.

A far neater solution is for the refund to use the exact same order number but have a prefix or suffix attached to differentiate. For example, if order number 12345 is cancelled and refunded, the refund code generated is R_12345.

In the admin console, within the order record there is an option to process the refund so that the refund code is flagged against the order and the order status updates accordingly. This helps tally up reports and reduce stress levels.
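A minimal Python sketch of the prefix approach might look like the following. The Order class, the status values and the function names are hypothetical; the only detail taken from the example above is the R_ prefix on the original order number.

```python
from dataclasses import dataclass
from typing import Optional

REFUND_PREFIX = "R_"


@dataclass
class Order:
    order_number: str
    status: str = "paid"               # hypothetical status values
    refund_code: Optional[str] = None


def refund_code_for(order_number: str) -> str:
    """Derive the refund code from the original order number."""
    return f"{REFUND_PREFIX}{order_number}"


def order_number_from(refund_code: str) -> str:
    """Recover the original order number from a refund code."""
    if not refund_code.startswith(REFUND_PREFIX):
        raise ValueError(f"Not a refund code: {refund_code!r}")
    return refund_code[len(REFUND_PREFIX):]


def process_refund(order: Order) -> Order:
    """Flag the refund against the order and update its status."""
    order.refund_code = refund_code_for(order.order_number)
    order.status = "refunded"
    return order


order = process_refund(Order("12345"))
print(order.refund_code, order.status)
# R_12345 refunded
```

Because the refund code can always be traced back to the order number, reconciliation reports can join the two without a manual lookup.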

You quickly realise how much thinking has to go into the data structures and underlying processes. Take the example of the drop down list for the payment card type. The website needs a database table to call on to pull in the available options.

So where does that table sit? What is the data input point for setting up the options? Is there a maximum number of options you can support? If an option becomes unavailable, how easy is it to turn this individual item off without breaking the data field?
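One possible answer to the last question, sketched in Python under my own assumptions, is to give each option an active flag instead of deleting it. The CardOption class, the option codes and the labels are hypothetical stand-ins for rows in the payment methods table.

```python
from dataclasses import dataclass


@dataclass
class CardOption:
    code: str
    label: str
    active: bool = True


# Hypothetical rows standing in for the payment methods table.
CARD_OPTIONS = {
    "visa": CardOption("visa", "Visa"),
    "mc": CardOption("mc", "Mastercard"),
    "amex": CardOption("amex", "American Express", active=False),
}


def dropdown_choices(options: dict) -> list:
    """Only active options are offered to the user at checkout."""
    return [(o.code, o.label) for o in options.values() if o.active]


def label_for(code: str, options: dict) -> str:
    """Old orders can still display a card type that has been switched off."""
    return options[code].label


print(dropdown_choices(CARD_OPTIONS))
# [('visa', 'Visa'), ('mc', 'Mastercard')]
print(label_for("amex", CARD_OPTIONS))
# American Express
```

Switching an option off hides it from the drop-down while historic orders that used it still resolve to a readable label.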

With regard to required data fields, you then need to define the process for handling exceptions.

First of all, is it clear to users what they need to enter? For example, do you make it clear that the password field must be a minimum of eight characters and contain at least one number?

Then you need to define the error scenarios and how the site responds to manage the user experience. For example, if an invalid password is entered, is the error message shown in-line so that it is as contextually relevant as possible?
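As a small illustration of the password example, the Python sketch below returns one message per broken rule, keyed by field name, so that the front end can show each message next to the field it relates to. The function names, the form field name and the message wording are assumptions made for this example.

```python
def password_errors(password: str) -> list:
    """Check the two rules from the example: length and at least one number."""
    errors = []
    if len(password) < 8:
        errors.append("Password must be at least eight characters long.")
    if not any(ch.isdigit() for ch in password):
        errors.append("Password must contain at least one number.")
    return errors


def form_errors(form: dict) -> dict:
    """Map each field name to its messages so they can be rendered in-line."""
    found = {"password": password_errors(form.get("password", ""))}
    return {field: messages for field, messages in found.items() if messages}


print(form_errors({"password": "secret"}))
# {'password': ['Password must be at least eight characters long.', 'Password must contain at least one number.']}
print(form_errors({"password": "longenough1"}))
# {}
```

Returning every failed rule at once saves the user from fixing one problem only to be told about the next.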

I’m going through this process at the moment for a new site. Even though I’ve done it many times for clients (and years ago when I was in-house), it never ceases to amaze me how much time it takes to refine your thinking so that the data structures are logical and the front-end user experience is of a high enough quality.

I’m already on the third iteration of the project creation data structure and it’s still not nailed!

I think it pays to think about the user first. What’s the minimum amount of data you can capture to effectively support the process or feature they’re using? Start with this and you won’t bloat data capture requirements.

You can then look for neat UI design techniques to subtly ask for more data without interrupting the user journey.

What do you think? What tips and tricks have you got to share to help others plan their information architecture?