Transport for Greater Manchester (TfGM) is the local government body responsible for delivering Greater Manchester’s transport strategy and commitments. TfGM delivers the transport policies set by the Greater Manchester Mayor and the Greater Manchester Combined Authority, and invests in improving transport services and facilities to support the regional economy. TfGM owns Metrolink (the UK’s largest light rail network) and Greater Manchester’s bus stations, stops, shelters and transport interchanges. It subsidises fares to help older people, children and disabled people get around, and pays for bus services at times and in areas where no commercial bus services are provided. It also promotes and invests in walking and cycling as safe, healthy and sustainable ways to travel, and works closely with bus, tram and train operators to help improve the full journey experience.
With the launch of contactless imminent in summer 2019 on Greater Manchester’s Metrolink network, TfGM needed to be able to use network and contactless data to understand and improve the network, services and products provided to customers. To do this, TfGM needed to ingest data from its service providers, via various methods, and transform it into usable insight for operational and informational purposes.
TfGM wanted the solution to be expandable, delivered as a service wherever possible, and cost effective at scale. All tasks were to be automated, including verification of data quality before any ETL processes were run. The solution would also need to scale to several hundred users without any changes to the architecture to support the new load.
Why Amazon Web Services
TfGM adopted AWS because it had all the services and providers needed to do the initial tasks as well as any envisioned for the solution in the future. They wanted their environments to be fully scripted from end to end to allow repeatable patterns and so Terraform was used not only to create all the native AWS services but was extended to create all database objects too.
To enable maximum automation, flat file imports would have the row counts, column counts and data types verified using AWS Lambda functions (failures would automatically raise help desk tickets with the supplier without any ETL jobs commencing on the TfGM side). Once verified, the ETL environment would be initiated using SQS; this orchestrated the instantiation of Matillion as well as other tasks.
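The pre-ETL checks described above (row counts, column counts and data types) can be illustrated with a minimal Python sketch. This is not TfGM's actual Lambda code: the column layout in `EXPECTED_COLUMNS`, the function name `validate_flat_file` and the idea that the supplier declares an expected row count are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical layout for one supplier feed: (column name, parser).
# The real feeds and their schemas are not described in the case study.
EXPECTED_COLUMNS = [
    ("tap_id", int),
    ("stop_code", str),
    ("tapped_at", datetime.fromisoformat),
    ("fare_pence", int),
]


def validate_flat_file(text, expected_rows, columns=EXPECTED_COLUMNS):
    """Return a list of problems found; an empty list means the file passed."""
    problems = []
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return ["file is empty"]

    names = [name for name, _ in columns]
    if header != names:
        problems.append(f"header {header} does not match expected {names}")

    row_count = 0
    for line_no, row in enumerate(reader, start=2):
        row_count += 1
        # Column count check: skip type checks if the row is the wrong width.
        if len(row) != len(columns):
            problems.append(
                f"line {line_no}: expected {len(columns)} columns, got {len(row)}"
            )
            continue
        # Data type check: each value must parse with its column's parser.
        for (name, parse), value in zip(columns, row):
            try:
                parse(value)
            except ValueError:
                problems.append(
                    f"line {line_no}: column {name!r} has invalid value {value!r}"
                )

    # Row count check against the figure the supplier is assumed to declare.
    if row_count != expected_rows:
        problems.append(f"expected {expected_rows} data rows, got {row_count}")
    return problems
```

In a setup like the one described, a Lambda handler might call a function of this kind on each incoming file, raise a help desk ticket when the returned list is non-empty, and only send the message that starts the ETL environment when it is empty.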
Outside of the native AWS tools, other AWS partners would be used including Matillion for ETL, Snowflake (which is hosted on AWS) for the data warehouse and Tableau Online (which is also hosted on AWS) for reporting and visualisation.
Crimson Macaw were able to bootstrap full production, test and development environments in weeks, allowing TfGM to ingest data feeds, transform the data and create reports. The full scripting of these environments allowed the creation of completely new end-to-end environments in hours rather than days. This enabled TfGM to speed up delivery whilst still having certainty over their environments and testing.
TfGM were able to use Snowflake and Tableau to gain new insights into travel patterns and behaviours, allowing TfGM to adapt the network and improve contactless services. These new insights came first from the ability to store and analyse any volume of data required in the new Snowflake EDW, and secondly from the new visualisation capability delivered via Tableau.
Malcolm Lowe, Head of IT at Transport for Greater Manchester said,
We have lots of data and information on the transport network and the challenge is capturing all this information, the dependencies and getting insight from it. When we don’t have to worry about technology, infrastructure, storage or performance, it means we can concentrate and focus on getting insight to improve the transport network and travel of people in Greater Manchester.