Finding Answers to Data's Big Questions
Better access to customer data is essential if we want to improve how we travel, but the effective use of such data has often been hampered by practical considerations and concerns over privacy. Fortunately, solutions are now starting to emerge, as Liz Davies, RSSB Professional Lead Data and Modelling, and Ron Oren, Principal Strategy Analyst at the Connected Places Catapult, explain.
In recent years, we have seen a significant increase in the amount of rail-related data that is openly available. Network Rail has released a range of data feeds into the public domain, including static scheduling data, and real-time train positioning and movement data. The Rail Delivery Group has similarly released a range of static railway data sources, such as fares, routeing and timetable data, via its website. It also provides real-time arrival and departure information, including schedule changes, service disruption, and cancellations.
Other organisations from across the wider transport sector, including Transport for London and the Connected Places Catapult, have helped to release datasets containing real-time information such as vehicle movement patterns, or roadwork and disruption data. Even when not directly related to rail, these datasets can be integrated with rail data to support a rail passenger’s end-to-end journey.
Most of these datasets rely on data generated by the transport networks themselves. This is a good start but, if we truly want to make travel more personalised, we need to gain access to more customer-generated data. To understand passenger needs and preferences better, including the type of information and flexibility that they may wish to be offered during the journey, operators need to collect information that goes beyond the simple origin and destination points of travel.
In order to make this kind of personalised service available, transport providers will need to address some remaining issues over ethics, security and reliability – but there are plenty of signs that these challenges can be met.
Early examples of aggregated customer data
It is worth pointing out at this stage that customer data is already being used in a number of trial apps and services. This tends, however, to be aggregated customer data rather than data attributable to any particular individual, and it is used to improve operations and the general transit experience rather than to offer personalised services.
Examples of this ‘passive’ customer data include the use of mobile network data to track phones within a geographical cell, with the aggregated numbers being used to indicate potential overcrowding or clusters of slow movement. This type of data can be used to optimise services, for example by directing passengers to carriages that are less busy.
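As a rough illustration of how aggregated counts of this kind might be turned into an overcrowding indicator, the short Python sketch below counts distinct devices per cell and flags cells that are close to a notional capacity. The function name, the 80% threshold, and the cell and device labels are all hypothetical; no real operator's method or data format is implied.

```python
from collections import Counter


def busy_cells(sightings, capacity, threshold=0.8):
    """Return {cell: device_count} for cells above threshold * capacity.

    sightings: iterable of (cell_id, device_id) pairs (hypothetical format).
    capacity:  dict mapping cell_id to a notional comfortable capacity.
    """
    # Count each device at most once per cell.
    unique_pairs = set(sightings)
    counts = Counter(cell for cell, _device in unique_pairs)
    # Only cell-level totals are returned; device identifiers are discarded.
    return {
        cell: n
        for cell, n in counts.items()
        if cell in capacity and n > threshold * capacity[cell]
    }


sightings = [("A", "d1"), ("A", "d2"), ("A", "d2"), ("A", "d3"), ("B", "d4")]
capacity = {"A": 3, "B": 10}
flagged = busy_cells(sightings, capacity)
```

Only the per-cell totals leave the function, which reflects the aggregated nature of the data described above: the output says where it is busy, not who is there.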
Similar information is being gleaned on the road network by using CCTV combined with automated analysis, in order to identify congestion hotspots or potential accidents. Anonymised people-counting software on platforms can do much the same job in a railway setting.
But while aggregated and anonymised data can help allay privacy concerns, it is less useful for providing a more personalised transport experience, since the transport operator is unable to differentiate the individuals involved. Individuals can, however, be persuaded to share their data more willingly, and there is plenty of evidence from online retail and customer loyalty card schemes that the right incentives enable this.
So, are we doing enough in rail to encourage and reward customers who help us to provide a better service for them and for all?
Barriers to sharing
Even if individuals share their travel-related data more willingly, there is still the challenge of encouraging more sharing by the companies that gather data.
The first – not insubstantial – barrier may be corporate resistance to freeing up data, or perhaps just a lack of motivation to do so. Making data available can, of course, take effort, and there is sometimes also a misapprehension that ‘available’ means ‘without cost’.
Companies need to balance the effort of releasing data (whether generated by customers or the company itself) against the benefits they think it could bring. Sometimes the benefits may be obvious from the perspective of the industry as a whole or for society in general, but it may be harder to quantify the potential return on investment for the company that is being asked to share the data.
A second barrier might affect organisations that are already fully on board in terms of wanting to make their data available, but are not quite sure of the best way of actually going about it.
Thirdly, as already touched upon above, there is the real issue of protecting customer privacy and security, especially since aggregated travel data could quickly become personal data if the data analyst (or analysis software) were able to deduce personal identifiers, such as the home addresses of those providing the data.
… and bridges
None of these three barriers is insurmountable, however. When it comes to those companies who are unwilling to make their datasets available free of charge, it may be possible to sell the data as a direct revenue stream in addition to the company’s core business. Alternatively, there may be ways of opening up a shared data source with partners, in a mutually beneficial way.
For companies that are committed to data-sharing but have limited knowledge or resources to do so, it is usually more a matter of altering business processes or culture. External technological solutions are readily available, so the difficulty lies more often in the need to generate ‘buy-in’ across the organisation, especially if senior management is not clear on the business case and benefits. Again, solutions to smooth the actual process of sharing data are becoming more and more commonplace, and the work being planned under the Joint Rail Data Action Plan and Rail Sector Deal will further open up opportunities in this space.
Regarding privacy and security, the concerns here are genuine, but great strides have already been made in sectors where highly sensitive data is handled, the banking sector being the most obvious example. It is therefore important that the perception of difficulty does not overshadow the potential opportunities that can arise from successful data sharing. It is possible to win ‘hearts and minds’ and make it attractive for individual travellers to share their data under the right, secure circumstances in return for a much-improved travel experience.
While individual companies will have different thoughts and approaches for meeting these challenges, there are industry-wide actions that could be taken now to smooth the way for everyone:
- Utilisation of neutral, independent, trusted and fully secure platforms for discovery and exchange of data, particularly for pre-commercial development, and specifically designed to combine data from different ‘silos’.
- Drawing up of tangible case studies of data-sharing in practice (based on real services or products that have been developed and run through shared data), including their impact on the partners' bottom lines.
- Establishment of a community of practice, where data science experts and transport ‘problem owners’ can share knowledge and experience, in order to build trust and reduce inefficiencies.
The best way to handle data – both the ‘big’ kind and the highly personal – will continue to be a cross-sector challenge for years to come, but it’s also important to remember the potentially enormous rewards, with a 2017 report by the Transport Systems Catapult estimating that the UK economy stands to gain £15bn per year through the sharing of transport data alone.
From that future perspective, today’s data issues will be seen not just as a challenge that had to be overcome, but one that was worth overcoming in order for us to enjoy the greener, more efficient and more customer-focused transport systems that are now within our grasp.