Scalable weather database for wetter.at and wetter-deutschland.com
Reliable and accurate weather forecasts are essential for planning personal leisure activities as well as for many companies and organizations. Therefore, Mobile World Information Systems GmbH (MOWIS) provides weather data services, including the operation of wetter.at and wetter-deutschland.com.
RISC Software GmbH developed the central data management system for MOWIS. Since 2011, the system has been continuously maintained and expanded as needed. Its core is the NoSQL database HBase, which is based on a Hadoop cluster. Current weather data from several sources – such as forecast models and measurements from weather stations – are continuously fed into this database. Consequently, users can retrieve interactive weather forecasts worldwide up to fourteen days in advance.
The system dynamically selects the most suitable data source for each forecast and combines data from different sources. In addition, data that has already been retrieved is cached to accelerate queries. An XML-based interface enables web services to retrieve forecasts. Moreover, Apache Hive provides SQL access for flexible queries on stored location data, which MOWIS mainly uses for interactive data quality control.
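The selection and caching steps described above can be illustrated with a minimal Python sketch. It is not the MOWIS implementation: the source names, resolutions, lead times and bounding boxes are hypothetical values chosen for illustration, and "most suitable" is assumed here to mean "finest resolution among the sources that cover the request".

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Source:
    """Hypothetical description of one forecast source."""
    name: str
    resolution_km: float
    max_lead_days: int
    bbox: tuple  # (lat_min, lat_max, lon_min, lon_max)

    def covers(self, lat, lon, lead_days):
        lat_min, lat_max, lon_min, lon_max = self.bbox
        return (lat_min <= lat <= lat_max
                and lon_min <= lon <= lon_max
                and lead_days <= self.max_lead_days)


# Assumed example sources; all numbers are illustrative only.
SOURCES = (
    Source("regional_model", 2.5, 3, (46.0, 49.5, 9.0, 17.5)),
    Source("global_model", 25.0, 14, (-90.0, 90.0, -180.0, 180.0)),
)


@lru_cache(maxsize=10_000)  # repeated requests are answered from the cache
def select_source(lat, lon, lead_days):
    """Return the name of the finest-resolution source covering the request."""
    candidates = [s for s in SOURCES if s.covers(lat, lon, lead_days)]
    if not candidates:
        raise LookupError("no source covers this request")
    return min(candidates, key=lambda s: s.resolution_km).name
```

In this sketch a short-range request inside the regional bounding box resolves to the regional model, while longer lead times or locations outside the box fall back to the global model.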
Ongoing data import and export
The data held in HBase is continuously updated by importing forecast model results and weather station measurements. Each dataset is converted into a structured text representation, which allows the use of Hadoop MapReduce and HBase bulk imports. As a result, a high-resolution Austrian weather forecast model can be imported within five minutes and made available almost immediately. A global weather data model with daily forecasts can be imported in fifteen minutes. By comparison, the SQL database that was replaced required several hours for similar imports.
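As a rough illustration of the conversion step, the following Python sketch turns forecast records into tab-separated lines of the kind a bulk import could consume. The record fields (location_id, valid_time, temperature_c, precip_mm) and the row key layout are assumptions made for this example; the actual text format used in the project is not described here.

```python
import csv
import io


def to_tsv(records):
    """Serialize forecast records as tab-separated lines, one per row key.

    Field names and the 'location#time' key layout are hypothetical.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for r in records:
        # Row key first, followed by one column per measured value.
        row_key = f"{r['location_id']}#{r['valid_time']}"
        writer.writerow([row_key, r["temperature_c"], r["precip_mm"]])
    return buf.getvalue()
```

A line-oriented text format like this is easy to split across workers, which is what makes it a natural input for parallel import jobs.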
Design of a suitable data model
To support interactive queries, the data model was tailored to the queries, allowing, for example, a quick lookup for a single location. A data model optimized for the planned queries is essential for using a NoSQL database effectively. At the beginning of the project, the planned queries were therefore defined together with the domain experts at MOWIS. On this basis, the HBase data model was designed so that it allows fast queries on individual locations and also enables automated removal of data that is no longer required. To enable efficient access via other attributes, numerous lookup tables were also implemented. Because a Big Data system is used, the data model can be flexibly adapted or extended if new queries are needed.
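The idea of a query-driven data model can be sketched in Python. HBase stores rows sorted by row key, so a key that starts with the location keeps all forecasts for that location next to each other and lets them be read with a single prefix scan. The key layout, the in-memory SortedTable class and the name lookup below are hypothetical stand-ins for illustration, not the project's actual schema.

```python
from bisect import bisect_left, insort


def row_key(location_id, valid_time):
    """Assumed key layout: zero-padded location id, then the forecast time."""
    return f"{location_id:08d}|{valid_time}"


class SortedTable:
    """Tiny in-memory stand-in for a table kept sorted by row key."""

    def __init__(self):
        self._keys = []
        self._rows = {}

    def put(self, key, value):
        if key not in self._rows:
            insort(self._keys, key)  # keep keys in sorted order
        self._rows[key] = value

    def prefix_scan(self, prefix):
        """Yield (key, value) for all keys starting with prefix, in key order."""
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._rows[self._keys[i]]
            i += 1


def forecasts_for(table, name_lookup, name):
    """Resolve a place name via a lookup table, then scan that location's rows."""
    location_id = name_lookup[name]
    prefix = f"{location_id:08d}|"
    return [(key.split("|", 1)[1], value)
            for key, value in table.prefix_scan(prefix)]
```

The name_lookup dictionary plays the role of a lookup table: it maps another attribute (here a place name) to the location id that the main table is keyed by.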
Acceleration and cost savings compared to the legacy SQL database
Switching to a Hadoop-based NoSQL solution made it possible to introduce an additional forecast model for worldwide weather data and accelerated data imports and exports by a factor of seven. As a result, worldwide weather forecasts can be retrieved interactively, and current weather forecasts for the whole of Austria and Germany can be exported for the above-mentioned websites. For this purpose, both the imports of the different weather models and the exports of the data updates for wetter.at use Hadoop MapReduce jobs, so that the current weather forecasts for all of Austria and Germany are created in parallel on the Hadoop cluster.
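The map and reduce phases of such an export can be illustrated with a small single-process Python sketch. It only mimics the programming model: on a real cluster the map and reduce calls run in parallel on different nodes. The record fields and the shape of the output are assumptions for this example, not the project's export format.

```python
from collections import defaultdict


def map_record(record):
    """Map phase: emit (location_id, (valid_time, temperature)) pairs."""
    yield record["location_id"], (record["valid_time"], record["temperature_c"])


def reduce_location(location_id, values):
    """Reduce phase: build one time-ordered forecast per location."""
    return {"location_id": location_id, "forecast": sorted(values)}


def run_job(records):
    """Run map, shuffle (group by key) and reduce sequentially."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_record(record):
            groups[key].append(value)
    return [reduce_location(key, values) for key, values in sorted(groups.items())]
```

Because each location is reduced independently of all others, the work can be distributed across compute nodes without coordination between them.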
The use of Hadoop brings the following advantages:
- As the amount of data grows, the system can be scaled simply and inexpensively by adding new compute nodes to the cluster.
- Using open-source technology saves significant licensing costs compared to classic commercial database offerings.
Project details
- Project partner: Mobile World Information Systems GmbH (MOWIS)
- Duration: 2010 - ongoing
Contact
Project management
DI Paul Heinzlreiter
Senior Data Engineer