Guest Post: How AI can solve the problem of a lack of standardisation in travel

Manuel Hilty, founder of travel booking and reservations platform Nezasa, assesses the state of play in travel sub-sectors and asks how more data uniformity can be achieved

One of the fundamental challenges in travel technology is the lack, or outright absence, of standardisation in many areas.

Connecting to supply systems for travel products can be very frustrating, and one of the main reasons is that these systems differ so widely in both their behaviour and the data they provide.

How should we deal with this? Should there be much more standardisation or are there alternative ways to address this problem? I aim to address these questions in this article.

The Status Quo

To start, let’s have a look at how standardised the different product and data types really are.

Flights: One of the most standardised travel products.

A powerful industry organisation (IATA), in combination with an oligopoly of distribution systems (the large GDSs) and a limited number of suppliers, was able to standardise many aspects of flight booking and fulfilment, some via official standards and some via de-facto standards created by the GDSs.

In recent years, those standards did not adapt fast enough to the changing requirements of the airlines, especially regarding increasingly important ancillary sales.

A new standard (New Distribution Capability) was created to address this problem, which resulted in not only one more standard but also one that leaves more room for interpretation and innovation.

Overall standardisation: 3/5

Car Rental: A surprising number of rental car APIs follow the OpenTravel Alliance standard. The ACRISS classification of features is also widely used.

This makes rental cars a category with a high degree of standardisation as long as we do not include other vehicle types like campers or motorbikes.

Overall standardisation: 4/5

Accommodation: Hotels are less standardised than flights.

There is some de-facto standardisation provided by companies like Germany-based GIATA.

It provides basic property data (addresses, coordinates, etc.), Facts (standardised amenities such as Wi-Fi availability, facilities available and much more) and multilingual descriptions.

Facts are quite similar across multiple providers and there are even industry standards such as the Germany-based Global Types.

Some providers offer mapping services that help determine when two hotels are actually the same, which greatly facilitates the use of several suppliers at once and the filtering of duplicates.
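As a rough illustration of how such duplicate detection might work, the Python sketch below treats two hotel records as probably identical when they are close together and similarly named. The record shape (name, lat, lon), the thresholds and the function names are assumptions made for this example; it is not the method of any particular mapping provider.

```python
import math
from difflib import SequenceMatcher


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    r = 6_371_000  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Ratio in [0, 1] between two lower-cased, whitespace-normalised names."""
    def norm(s):
        return " ".join(s.lower().split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()


def probably_same_hotel(h1, h2, max_distance_m=150, min_name_similarity=0.8):
    """Hypothetical rule: the two records are close together AND similarly named."""
    close = distance_m(h1["lat"], h1["lon"], h2["lat"], h2["lon"]) <= max_distance_m
    similar = name_similarity(h1["name"], h2["name"]) >= min_name_similarity
    return close and similar
```

In practice the thresholds would need tuning per market, since dense city centres and chain hotels with near-identical names make both signals less reliable.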

What remains non-standard in the distribution of accommodation is the booking handling.

There are several stages of reservations and bookings (with examples like ‘booked unless not paid by X’) that may vary from supplier to supplier.
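One common way to cope with this is to translate every supplier's status vocabulary into a single internal set of states. The sketch below shows the idea in Python; the state names, the supplier identifiers and their status codes are all invented for illustration and do not correspond to any real supplier API.

```python
from enum import Enum


class BookingState(Enum):
    ON_REQUEST = "on_request"  # supplier has not confirmed yet
    HELD = "held"              # booked, but lapses unless paid by a deadline
    CONFIRMED = "confirmed"
    CANCELLED = "cancelled"


# Hypothetical supplier vocabularies; real suppliers use their own codes.
STATUS_MAPS = {
    "supplier_a": {
        "RQ": BookingState.ON_REQUEST,
        "OPT": BookingState.HELD,
        "OK": BookingState.CONFIRMED,
        "XX": BookingState.CANCELLED,
    },
    "supplier_b": {
        "pending": BookingState.ON_REQUEST,
        "booked_unpaid": BookingState.HELD,
        "booked": BookingState.CONFIRMED,
        "cancelled": BookingState.CANCELLED,
    },
}


def to_canonical(supplier, raw_status):
    """Translate a supplier status; fail loudly on anything unmapped."""
    try:
        return STATUS_MAPS[supplier][raw_status]
    except KeyError:
        raise ValueError(f"unmapped status {raw_status!r} for {supplier!r}") from None
```

Raising an error on unmapped values is a deliberate choice here: silently guessing a booking state is usually worse than stopping and asking a human.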

There are industry players that work on consolidating the hotel landscape.

Hotel switch providers like TravelgateX, Anixe or Gimmonix do the aggregation work for the rest of the industry, so that everyone else does not have to care too much about the differences between the suppliers.

However, it’s not possible to fully abstract from those differences, even with a switch.

Overall standardisation: 3/5 

Activities and Excursions: There are pretty much no official or de-facto standards for activity data. Even worse, the data received is often not even structured.

Getting the duration of an activity in machine-readable form over an API is a frequent challenge with many supply sources, and you can find a vast number of ways to write ‘child’, ‘adult’, ‘teenager’ etc, even within one supplier API.
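To give a feel for the clean-up work involved, here is a minimal Python sketch that normalises free-text durations and age-category labels. The synonym table, the accepted duration formats and the function names are assumptions for this example; real supplier data is considerably messier.

```python
import re

# Illustrative synonym table; a real one would be far larger.
AGE_CATEGORIES = {
    "adult": "adult", "adults": "adult", "adt": "adult",
    "child": "child", "children": "child", "kid": "child", "chd": "child",
    "teen": "teenager", "teenager": "teenager", "youth": "teenager",
}

# A number followed by an hour or minute unit, e.g. "2h", "30 min", "1.5 hours".
_DURATION_PART = re.compile(
    r"(\d+(?:\.\d+)?)\s*(h|hr|hrs|hour|hours|m|min|mins|minute|minutes)\b"
)


def normalise_age_category(label):
    """Return a canonical category, or None if the label is unknown."""
    return AGE_CATEGORIES.get(label.strip().lower())


def parse_duration_minutes(text):
    """Parse strings such as '2h 30min' or '1.5 hours'; None if nothing matches."""
    total = 0.0
    found = False
    for value, unit in _DURATION_PART.findall(text.lower()):
        found = True
        total += float(value) * (60 if unit.startswith("h") else 1)
    return round(total) if found else None
```

Descriptions such as "half day" fall through and return None, which is exactly the kind of case that ends up needing either manual mapping or a more capable language-processing step.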

It’s also very hard to determine whether two activities are actually the same if you receive them through two different channels. With multi-day activities, these problems are amplified even further.

In our experience, this is hands-down the least standardised and least structured product type.

There are players like Magpie Travel who are trying to change this, but today’s picture is still rather bleak.

Overall standardisation: 1/5

Other Product Types: The list above is missing products like transfers and cruises.

Transfers are also not very standardised but here we have the advantage that they are rather simple compared to other products.

Once you know where a transfer starts, where it ends and how long it takes, and you have a short description of the comfort level and vehicle, the most important aspects are covered.
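Because so few fields matter, a common internal representation can stay very small. The Python sketch below is one possible shape; the class and field names are assumptions for illustration, not an existing standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Transfer:
    """Minimal transfer record covering the aspects named above."""
    origin: str                        # where it starts
    destination: str                   # where it ends
    duration_minutes: int              # how long it takes
    description: Optional[str] = None  # short note on comfort and vehicle

    def __post_init__(self):
        # Reject obviously broken supplier data early.
        if self.duration_minutes <= 0:
            raise ValueError("duration_minutes must be positive")
```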

Cruises are also not standardised, but here the number of suppliers is much more limited than with activities.

How much standardisation do we need? 

So, would we rather deal with a lack of standardisation or an absence of standardisation? Should we bother to drive industry-wide efforts for a standardisation of activity data?

Not everything can be solved with official standards. Standards take a very long time to create and they typically evolve very slowly.

Therefore, they are more suited to rather static topics than to fast-changing ones.

What’s more nimble but less inclusive is the creation of de-facto standards by strong industry players, which are either copied by others to a certain degree or even open-sourced so that others can adopt them completely if desired.

The most effective solution is a combination of an industry-wide effort to improve data quality and create structured data and the application of modern technologies like natural language processing, machine learning and similarity matching to deal with the fact that many travel-related products will never be completely standardised.

One possible use case for machine learning is to determine that two similar data structures actually mean the same thing, thereby speeding up system integrations across various suppliers with non-standardised content.

Let’s take a look at a simple example.

Activity provider A uses the following structure:

activity: {
  maxAdults: 2,
  maxChilds: 0,
  price: 320,
  currency: "EUR"
}


Activity provider B uses this structure:

activity: {
  maxClients: 2,
  childAllowed: false,
  price: "320.00€"
}


Currently, this means we have to implement two parsers even though the actual information is identical.
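To make the duplication concrete, here is a minimal Python sketch of what those two hand-written parsers might look like. The common output shape, the function names and the currency symbol table are assumptions for illustration, as is the reading of provider B's maxClients as the adult limit.

```python
from decimal import Decimal
import re

# Assumed symbol table for provider B's price strings.
CURRENCY_SYMBOLS = {"€": "EUR", "$": "USD", "£": "GBP"}


def parse_provider_a(activity):
    return {
        "max_adults": activity["maxAdults"],
        "children_allowed": activity["maxChilds"] > 0,
        "price": Decimal(str(activity["price"])),
        "currency": activity["currency"],
    }


def parse_provider_b(activity):
    # Split a string such as "320.00€" into an amount and a trailing symbol.
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(\D)\s*", activity["price"])
    if match is None:
        raise ValueError(f"unrecognised price: {activity['price']!r}")
    amount, symbol = match.groups()
    return {
        # Assumption: with children not allowed, every client is an adult.
        "max_adults": activity["maxClients"],
        "children_allowed": activity["childAllowed"],
        "price": Decimal(amount),
        "currency": CURRENCY_SYMBOLS[symbol],
    }


a = parse_provider_a({"maxAdults": 2, "maxChilds": 0, "price": 320, "currency": "EUR"})
b = parse_provider_b({"maxClients": 2, "childAllowed": False, "price": "320.00€"})
```

Both calls produce the same normalised record, but only because someone read each provider's documentation and encoded its quirks by hand. Every additional supplier means another function like these.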

Machine learning could allow us to identify the same content and produce the same output without human intervention.

Players like dedupIT are already applying this technique to hotel room descriptions.

By combining the application of modern information processing technologies with an industry-wide effort to structure and clean up data, we will be able to accept the fact that not much travel-related data is standardised and actually live with it quite comfortably.