How traffic forecasting works in Upper Austria

Knowing Where Traffic Jams Will Be Before You Leave

by DI Dr. Károly Bósa

Have you ever wished you could foresee traffic before leaving your house? In Upper Austria, this is made possible through innovative technologies. The real-time traffic system of Upper Austria provides reliable predictions for the coming hour – every day, whether it’s a weekday or a holiday. But how does it work?

Understanding Real-Time Traffic: Upper Austria’s Path to Technological Independence

Information about the current traffic situation is fundamentally important for private commuters as well as for the route planning of logistics companies. Private individuals usually obtain this information from large US providers such as Google. To avoid technological dependence on these providers and to strengthen Austria’s technological sovereignty, RISC Software GmbH developed a real-time traffic situation map for Upper Austria as part of the ITS Upper Austria project, which is also used within the nationwide project EVIS.AT.

A core technological element is the forecasting of the traffic situation on the Upper Austrian road network. It is based on anonymized traffic data collected in real time from fleet vehicles (for example, service vehicles of the state of Upper Austria, the ÖAMTC, or the Upper Austrian Transport Association), as well as on data from stationary traffic counters (side radars, induction loops) and Bluetooth sensors.

The traffic forecasts provided by RISC Software for Upper Austria can also be accessed, for example, in the ASFINAG route planner, with the necessary steps illustrated in Figure 1.

Figure 1: The traffic forecast for Upper Austria in the ASFINAG route planner

Short-term traffic forecasts are made by estimating the expected traffic conditions in the near future based on historical and current traffic information. These forecasts refer to a period of up to one and a half hours into the future.

A key characteristic of short-term traffic forecasting is that it must quickly respond to current events such as traffic jams or construction sites. The forecast described here achieves this, among other things, through the efficient integration of current vehicle data.

Special Requirements for Real-Time Traffic Forecasting in Upper Austria

Although the traffic forecasting solution from RISC Software GmbH uses Autoregressive Integrated Moving Average (ARIMA), an approach that is well documented in the literature, it is still unique. This is primarily because it must deliver short-term traffic forecasts for the Upper Austrian road network as a production service from at least 5 AM to 11 PM every day of the year. The underlying server infrastructure must therefore be designed for reliable continuous operation, which requires redundant servers and redundant software installations.

Essentially, it is a mass forecasting solution that must calculate short-term traffic forecasts for about 100,000 road sections in Upper Austria. Thus, the performance of the underlying calculations and database queries is essential for the effective operation of the system, as the system must provide five forecasts every 15 minutes for each of these road sections for 15, 30, 45, 60, and 75 minutes into the future.
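To get a feel for this scale, the figures quoted above can be turned into a quick back-of-the-envelope calculation. The sketch below uses only numbers from the text (about 100,000 road sections, five horizons, a 15-minute cycle, service hours of at least 5 AM to 11 PM). The variable names are illustrative, and the time budget assumes, purely for the sake of the estimate, that the sections are processed one after another.

```python
# Back-of-the-envelope workload estimate using the figures quoted in the text.
ROAD_SECTIONS = 100_000              # approximate number of road sections
HORIZONS_MIN = [15, 30, 45, 60, 75]  # forecast horizons in minutes
CYCLE_MIN = 15                       # a new forecast run every 15 minutes
SERVICE_HOURS = 23 - 5               # at least 5 AM to 11 PM

values_per_cycle = ROAD_SECTIONS * len(HORIZONS_MIN)
cycles_per_day = SERVICE_HOURS * 60 // CYCLE_MIN
values_per_day = values_per_cycle * cycles_per_day

# Average time available per section if one run must finish within one cycle
# and sections were processed sequentially (an assumption for this estimate).
budget_ms_per_section = CYCLE_MIN * 60 * 1000 / ROAD_SECTIONS

print(values_per_cycle)       # 500000
print(cycles_per_day)         # 72
print(values_per_day)         # 36000000
print(budget_ms_per_section)  # 9.0
```

Even at this rough level, the calculation shows why fast database queries and parallel computation matter: without parallelism, each road section would have to be handled in a few milliseconds.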

The actual number of reported forecasts fluctuates and depends heavily on the amount of available input data. This, in turn, directly depends on the number of fleet vehicles and other sensors that have provided data in the relevant section and time frame. For example, in February 2024, the number of road sections for which the service actually provided short-term forecasts varied between 65,000 and 80,000 during peak times. If no precise current data is available for a road section, the corresponding values of the daily average speeds are used for the forecast.

An additional challenge is that traffic follows different seasonal and temporal patterns. Daily traffic differs from weekday to weekday, between school terms and holiday periods, and on national holidays. Therefore, the historical speed values that match the current day must be used when calculating the traffic forecast for the current 15-minute interval. Because of these complex seasonal requirements, a pre-calculation based on the relevant historical data for that day is carried out daily. It serves as the basis for taking seasonal effects into account as well as possible in the calculation of the 15-minute forecasts.
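How a day might be mapped to a seasonal type can be sketched in a few lines. The article does not describe the actual classification, so the function name, the key format, and the calendar entries below are hypothetical placeholders, not the categories used by the service.

```python
from datetime import date

WEEKDAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]

def seasonal_type(day, public_holidays, school_holiday_ranges):
    """Return a hypothetical key that selects the matching historical data."""
    if day in public_holidays:
        return "public_holiday"
    in_school_holidays = any(
        start <= day <= end for start, end in school_holiday_ranges
    )
    period = "school_holidays" if in_school_holidays else "school_term"
    return f"{WEEKDAYS[day.weekday()]}_{period}"

# Placeholder calendar for illustration only, not an official list.
public_holidays = {date(2024, 5, 1), date(2024, 10, 26)}
school_holiday_ranges = [(date(2024, 7, 6), date(2024, 9, 8))]

print(seasonal_type(date(2024, 2, 14), public_holidays, school_holiday_ranges))
print(seasonal_type(date(2024, 7, 15), public_holidays, school_holiday_ranges))
print(seasonal_type(date(2024, 10, 26), public_holidays, school_holiday_ranges))
```

A key of this kind could then be used once per day to load only the historical speed values recorded on comparable days.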

Selection of ARIMA as an Appropriate Forecasting Method

In general, creating forecasts from past and current measurements has been a well-researched area of machine learning for over a decade, often using approaches based on time series data. Many forecasting methods have been developed for this problem area, such as Auto-Regressive Integrated Moving Average (ARIMA), Support Vector Regression (SVR), the Grey System Model, k-Nearest-Neighbor, and neural networks, with Long Short-Term Memory (LSTM) being a special case.

However, the size of the problem and the computational complexity of the preparatory phases required for the individual algorithms limit the selection. Therefore, in a feasibility study prior to integration into the forecasting service, various performance factors of the aforementioned forecasting methods were compared based on a small sample dataset.

The results showed that Long Short-Term Memory (LSTM) had the lowest error rate, slightly outperforming ARIMA. However, training such a model would take several hours every day and would place a heavy load on the hardware resources intended for this service. Without additional hardware, the system might not be ready to deliver forecast values in time for the morning rush hour. Therefore, the ARIMA method was chosen, as it can prepare the necessary time series for the production system in just over half an hour. In practice, this means that the software can complete the preparations for the daily forecasts by 12:40 AM at the latest.
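To illustrate the basic idea behind ARIMA, here is a deliberately minimal sketch of an ARIMA(1,1,0) model in plain Python: the series is differenced once (d = 1), a single autoregressive coefficient is fitted to the differences by least squares (p = 1), and no moving-average term is used (q = 0). The model order, the function name, and the sample speeds are assumptions made for illustration; the article does not state which order or library the production service uses.

```python
def arima_110_forecast(speeds, steps):
    """Forecast `steps` future values with a minimal ARIMA(1,1,0) model."""
    if len(speeds) < 3:
        raise ValueError("need at least three observations")

    # d = 1: model the changes between consecutive 15-minute intervals.
    diffs = [b - a for a, b in zip(speeds, speeds[1:])]

    # p = 1: least-squares estimate of phi in diff[t] = phi * diff[t-1] + noise.
    numerator = sum(prev * cur for prev, cur in zip(diffs, diffs[1:]))
    denominator = sum(prev * prev for prev in diffs[:-1])
    phi = numerator / denominator if denominator else 0.0

    # Forecast the differences recursively and integrate them back to speeds.
    forecasts = []
    level, last_diff = speeds[-1], diffs[-1]
    for _ in range(steps):
        last_diff = phi * last_diff
        level += last_diff
        forecasts.append(level)
    return forecasts

# Hypothetical average speeds (km/h) for the last five 15-minute intervals.
recent_speeds = [90, 74, 66, 62, 60]

# Five steps correspond to the 15, 30, 45, 60, and 75 minute horizons.
print(arima_110_forecast(recent_speeds, steps=5))
# [59.0, 58.5, 58.25, 58.125, 58.0625]
```

In this toy series the drop in speed halves from interval to interval, so the fitted coefficient is 0.5 and the forecast levels off just above 58 km/h. Fitting such a small model is cheap, which fits the article's argument that ARIMA can be prepared for all road sections overnight.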

Before developing the mass forecasting service, a prototype of the forecasting logic was integrated into Traffic Evaluation and Sensor Statistics (TESS), a GUI-based application by RISC Software GmbH. In the forecasting module of this application, a single road section on the map of Upper Austria can be selected and a traffic forecast for that section can be calculated. One of the main ideas behind this TESS module was to test and optimize all planned methods and algorithms together. The forecasting module of TESS can also visualize the deviations between the forecasts and the actual traffic flow for any time within the last two days. The finalized forecasting logic in this module is now used as the core of the production system.

Figure 2: Forecasting module in TESS

Integration of the Forecast into the Overall System

To meet the above requirements and deliver the forecasts on time, the input data is provided via an optimized Citus PostgreSQL cluster hosted by RISC Software GmbH.

This data includes, among other things, the current road network of Upper Austria, based on the data from the Graph Integration Platform (GIP.AT). In addition, the database provides the average speeds, so-called gang lines (from the German "Ganglinien", i.e., typical daily speed profiles), which are based on traffic data from recent weeks aggregated in 15-minute intervals. Furthermore, the current real-time traffic sensor data is stored in the database.

For the calculation of a current forecast, only the subset of gang lines corresponding to the seasonal type of the current day is loaded. Within the forecast, the gang lines for the current day are used to restore the continuity of incomplete time series by using a smoothing technique, while also considering current traffic trends. Additionally, the gang line data is used if there is insufficient current measurement data available for a current short-term forecast.
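The article does not name the smoothing technique used to repair incomplete time series. As one possible reading, the following sketch fills each missing measurement with the gang line value for that interval, shifted by an exponentially smoothed estimate of how far today's traffic currently deviates from the typical profile. The function name, the parameter `alpha`, and the sample values are assumptions.

```python
def fill_gaps(measured, gang_line, alpha=0.5):
    """Fill None entries in `measured` using the gang line plus the recent trend."""
    filled = []
    offset = 0.0  # smoothed deviation of today's speeds from the typical profile
    for value, typical in zip(measured, gang_line):
        if value is None:
            # No measurement: use the typical speed adjusted by the current trend.
            value = typical + offset
        else:
            # Measurement available: update the smoothed deviation.
            offset = alpha * (value - typical) + (1 - alpha) * offset
        filled.append(value)
    return filled

# Hypothetical speeds in km/h for five consecutive 15-minute intervals.
gang_line = [100, 100, 90, 80, 80]
measured = [90, None, 80, None, 72]

print(fill_gaps(measured, gang_line))  # [90, 95.0, 80, 72.5, 72]
```

Because the offset follows the most recent measurements, a section that is slower than usual today is also filled with slower-than-usual values, which keeps the repaired series continuous.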

Every 15 minutes, the service provides forecasts for the average speed values 15, 30, 45, 60, and 75 minutes ahead for each main road section in Upper Austria. As mentioned above, the number of forecast calculations in each 15-minute interval heavily depends on the amount of available real-time and historical traffic data. However, EVIS.AT requires that the forecasting service provides an estimate of the near-future traffic situation on all predefined road sections.

Therefore, the application still provides an estimate of the traffic situation if a time series forecast for a road is not possible due to certain missing traffic information. This can go through several stages of accuracy depending on data availability:

If a previously calculated short-term forecast is already available, this estimate is re-published as a forecast for the same timestamp (e.g., if forecast calculation is no longer possible and the system predicted a speed value for 60 minutes ahead 45 minutes ago, this value is re-published as a 15-minute forecast).
If no older forecasts are available, the corresponding average speed values based on historical data from recent weeks are used instead.
If these are also not available for a specific road section, the corresponding values of the daily gang lines are published as the forecast.
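These fallback stages can be sketched as a simple cascade. The function and label names below are hypothetical, and the sketch assumes that a missing value is represented by None; it only mirrors the order of the stages described above.

```python
from datetime import datetime, timedelta

def choose_published_value(fresh, earlier_for_same_target, historical_avg, gang_line):
    """Return (speed, source) for one road section and one target timestamp."""
    if fresh is not None:
        return fresh, "new_forecast"
    if earlier_for_same_target:
        # Re-publish the most recent earlier forecast for the same timestamp.
        return earlier_for_same_target[-1], "republished_forecast"
    if historical_avg is not None:
        return historical_avg, "historical_average"
    return gang_line, "gang_line"

# The re-publishing example from the text: a 60-minute forecast issued
# 45 minutes ago targets the same timestamp as a 15-minute forecast issued now.
now = datetime(2024, 2, 14, 7, 45)
issued_earlier = now - timedelta(minutes=45)
same_target = issued_earlier + timedelta(minutes=60) == now + timedelta(minutes=15)
print(same_target)  # True

print(choose_published_value(None, [64.0, 61.5], 70.0, 72.0))
# (61.5, 'republished_forecast')
```

Returning the source together with the value is a design choice of this sketch; it would make it easy to exclude re-published and fallback values from later validation.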

Figure 3: Screenshots of the forecast service logs (left) and the server performance monitor (right), showing CPU usage.

Although the production service currently keeps 14 of the 16 CPUs on its host server continuously busy (as seen on the right side of the figure above), this does not impair the software’s performance. If more computing power is needed in the near future (e.g., due to a significant increase in traffic sensor coverage on the Upper Austrian road network), the forecasting service can easily be redeployed on a machine with more CPUs to scale performance. The software detects changes in the number of available computing resources and automatically distributes the computational tasks across them.
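How such automatic scaling might look in principle is shown in the following sketch, which splits the road sections into one balanced chunk per detected CPU. The article does not say how or in which language the service does this, so the function name and the round-robin split are assumptions.

```python
import os

def partition(section_ids, workers=None):
    """Split section ids into one balanced chunk per available CPU."""
    # Detect the CPU count at runtime unless a worker count is given.
    workers = workers or os.cpu_count() or 1
    # Never create more chunks than there are sections (but at least one).
    workers = min(workers, len(section_ids)) or 1
    # Round-robin assignment keeps the chunk sizes within one item of each other.
    return [section_ids[i::workers] for i in range(workers)]

chunks = partition(list(range(10)), workers=4)
print(chunks)  # [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
```

Each chunk could then be handed to its own worker process, for example with `concurrent.futures.ProcessPoolExecutor`, so that moving the service to a machine with more CPUs increases throughput without any configuration change.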

Quality Metrics of the Forecast

The forecasting service is equipped with a mechanism that continuously caches the 15- and 30-minute forecast values until it can compare them with the measured real-time traffic data, and then calculates the average daily difference between them for each main traffic route. This self-validation is only applied to newly calculated, current short-term forecasts, not to other published values (i.e., not to re-published earlier forecasts or daily curve values).
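The two quality metrics used in the appendices can be written down in a few lines. This sketch assumes that the "average daily difference" is the mean absolute deviation between forecast and measurement, which the article does not state explicitly; the sample values are made up.

```python
import math

def mean_abs_deviation(forecast, measured):
    """Average absolute difference between forecast and measured speeds (km/h)."""
    return sum(abs(f - m) for f, m in zip(forecast, measured)) / len(forecast)

def rmse(forecast, measured):
    """Root mean squared error between forecast and measured speeds (km/h)."""
    squared = [(f - m) ** 2 for f, m in zip(forecast, measured)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical 30-minute forecasts and the speeds measured later (km/h).
forecast = [50, 60, 70, 80]
measured = [47, 64, 70, 80]

print(mean_abs_deviation(forecast, measured))  # 1.75
print(rmse(forecast, measured))                # 2.5
```

The RMSE is larger than the mean absolute deviation here because squaring gives more weight to the larger errors, which is why the two appendices report different distributions for the same forecasts.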

Appendix I and Appendix II briefly summarize a preliminary analysis of the validation data of the 30-minute forecast values collected in October 2019. Appendix I summarizes the average daily deviations between the forecast and the measured traffic data in km/h, while Appendix II does the same in terms of the Root Mean Squared Error (RMSE). The analysis of this data shows that for about 93% of the road sections, the average daily deviation is less than 5 km/h, and for about 56% of the road sections, the RMSE is less than 3 km/h.

Appendix I: Internal Validation I: Error in km/h

This is a summary of the average daily deviations in km/h between the 30-minute forecasts and the measured speed values from October 2019:

In the case of 243 road sections, the deviation was between 15 and 37 km/h.
In the case of 964 road sections, the deviation was between 10 and 15 km/h.
In the case of 8,923 road sections, the deviation was between 5 and 10 km/h.
In the case of 74,926 road sections, the deviation was between 1 and 5 km/h.
In the case of 61,781 road sections, the deviation was less than 1 km/h.

Summary: In the case of about 93% of the road sections, the average deviation is less than 5 km/h.

Appendix II: Internal Validation II: Root Mean Squared Error (RMSE)

This is a summary of the average daily RMSE (in km/h) between the 30-minute forecasts and the measured speed values from October 2019:

In the case of 112 road sections, the RMSE was over 20.
In the case of 2,605 road sections, the RMSE was between 10 and 20.
In the case of 62,048 road sections, the RMSE was between 3 and 10.
In the case of 67,432 road sections, the RMSE was between 1 and 3.
In the case of 14,640 road sections, the RMSE was less than 1.

Summary: In the case of about 56% of the road sections, the RMSE is less than 3.
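The two summary percentages can be reproduced directly from the counts listed in Appendix I and Appendix II. The dictionary keys below are just labels for the ranges given there.

```python
# Road section counts per range, copied from Appendix I and Appendix II.
deviation_buckets = {"15-37": 243, "10-15": 964, "5-10": 8_923, "1-5": 74_926, "<1": 61_781}
rmse_buckets = {">20": 112, "10-20": 2_605, "3-10": 62_048, "1-3": 67_432, "<1": 14_640}

total_dev = sum(deviation_buckets.values())
total_rmse = sum(rmse_buckets.values())

share_dev_below_5 = (deviation_buckets["1-5"] + deviation_buckets["<1"]) / total_dev
share_rmse_below_3 = (rmse_buckets["1-3"] + rmse_buckets["<1"]) / total_rmse

print(total_dev, total_rmse)        # 146837 146837
print(f"{share_dev_below_5:.1%}")   # 93.1%
print(f"{share_rmse_below_3:.1%}")  # 55.9%
```

Both appendices cover the same 146,837 road sections, and the rounded shares match the 93% and 56% quoted in the summaries.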

References

Cao, L. J. & Tay, F. E. H. (2003) Support vector machine with adaptive parameters in financial time series forecasting. IEEE Transactions on Neural Networks, 14 (6), 1506-1518.
GIP, An Intermodal Traffic Graph Austria, 2021, https://www.gip.gv.at/assets/downloads/GIP_Standardbeschreibung_2.3.2_FINAL.pdf
Kim, K. J. (2003) Financial time series forecasting using support vector machines. Neurocomputing, 55 (1-2), 307-319.
Lu, C. J., Lee, T. S. & Chiu, C. C. (2009) Financial time series forecasting using independent component analysis and support vector regression. Decision Support Systems, 47 (2), 115-125.
Majhi, R., Panda, G. & Sahoo, G. (2009) Efficient prediction of exchange rates with low complexity artificial neural network models. Expert Systems with Applications, 36 (1), 181-189.
Pai, P. F. & Lin, C. S. (2005) A hybrid ARIMA and support vector machines model in stock price forecasting. Omega, 33 (6), 497-505.
Saad, E. W., Prokhorov, D. V. & Wunsch, D. C. (1998) Comparative study of stock trend prediction using time delay, recurrent and probabilistic neural networks. IEEE Transactions on Neural Networks, 9 (6), 1456-1470.
Sharma, S. K. & Sharma, V. (2012) Time series prediction using kNN algorithms via euclidian distance function: a case of foreign exchange rate prediction. Asian Journal of Computer Science and Information Technology, 2 (7), 219-221.
Wang, Y. F. (2002) Predicting stock price using fuzzy grey prediction system. Expert Systems with Applications, 22 (1), 33-38.