{"id":2812,"date":"2023-05-24T08:35:43","date_gmt":"2023-05-24T06:35:43","guid":{"rendered":"https:\/\/www.risc-software.at\/data-engineering-die-solide-basis-fuer-eine-effektive-datennutzung\/"},"modified":"2024-11-06T17:36:51","modified_gmt":"2024-11-06T16:36:51","slug":"technical-article-data-engineering-the-solid-basis-for-effective-data-utilization","status":"publish","type":"publication","link":"https:\/\/www.risc-software.at\/en\/technicalarticles\/technical-article-data-engineering-the-solid-basis-for-effective-data-utilization\/","title":{"rendered":"Data Engineering \u2013 the solid basis for effective data utilization"},"content":{"rendered":"\n<h2 class=\"wp-block-heading is-style-v2-telegrafico\">The path of data from sources to the integrated data lake<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">by DI Paul Heinzlreiter<\/h3>\n\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<p><em>Data engineering integrates data from a wide variety of sources and makes it usable effectively. 
This makes it a prerequisite for effective data analysis, machine learning and artificial intelligence, especially in the Big Data area.<\/em><br><\/p>\n\n\n\n<div style=\"height:100px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-media-text has-media-on-the-right is-stacked-on-mobile is-vertically-aligned-center\"><div class=\"wp-block-media-text__content\">\n<p><strong>Table of contents<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Delimitation<\/li>\n\n\n\n<li>Data cleansing and integration<\/li>\n\n\n\n<li>Data storage and data modeling for Big Data<\/li>\n\n\n\n<li>Use Cases<\/li>\n\n\n\n<li>Author<\/li>\n<\/ul>\n<\/div><figure class=\"wp-block-media-text__media\"><img decoding=\"async\" width=\"1024\" height=\"649\" src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-1047824676-1-1024x649.jpg\" alt=\"Data network\" class=\"wp-image-2509 size-full\" srcset=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-1047824676-1-1024x649.jpg 1024w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-1047824676-1-300x190.jpg 300w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-1047824676-1-768x486.jpg 768w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-1047824676-1-1536x973.jpg 1536w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-1047824676-1.jpg 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:66.66%\">\n<p>In recent years, the topic of extracting information from big data has become increasingly important for more 
and more businesses in a wide range of economic sectors. Examples include historical sales data, which can be used to optimize the product range of an online store, and sensor data from a production line, which can help to increase product quality or to replace machine parts in good time as part of preventive maintenance. Beyond the direct use of an integrated database in day-to-day operations, the current interest in artificial intelligence (AI) and machine learning (ML) is a strong motivation, with its promise of, for example, continuously optimizing a production process.<\/p>\n\n\n\n<p>However, when the process of information acquisition is considered in its entirety, it quickly becomes clear that AI and ML are only the proverbial tip of the iceberg. These methods require large, consistent and complete data sets, especially for model training and model validation. Such data sets can be generated, for example, by sensor networks or by sensors in production.<\/p>\n\n\n\n<p>Transferring, storing and processing this data so that it can be used effectively is the central task of data engineering. This holds regardless of whether the goal is effective company-wide reporting, data science to improve the production process, or AI. A solid data basis is necessary in all cases. 
The integration of data into a common database can additionally form a reliable ground truth for a wide variety of use cases in the company: for effective day-to-day business, for strategic planning based on solid data and facts, or for model training in the field of AI.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:33.33%\">\n<figure class=\"wp-block-image size-large is-style-rounded\"><img decoding=\"async\" width=\"1024\" height=\"580\" sizes=\"(max-width: 1024px) 100vw, 1024px\" src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-966899060-1-1024x580.jpg\" alt=\"Binary code\" class=\"wp-image-2183\" srcset=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-966899060-1-1024x580.jpg 1024w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-966899060-1-300x170.jpg 300w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-966899060-1-768x435.jpg 768w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-966899060-1-1536x870.jpg 1536w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-966899060-1.jpg 1920w\" \/><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<h3 class=\"wp-block-heading\">Delimitation<\/h3>\n\n\n\n<p>The overarching goal of data engineering is thus to increase the <a href=\"http:\/\/ris.w4.at\/en\/technical-article-data-quality\" target=\"_blank\" rel=\"noreferrer noopener\">quality<\/a> and usability of the available data. It essentially follows the data science hierarchy of needs, which describes the stages from raw data to AI. 
Analogous to Maslow&#8217;s hierarchy of needs, the lower levels of the pyramid are a necessary prerequisite for the levels that build on them.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-style-default\"><img decoding=\"async\" width=\"642\" height=\"242\" sizes=\"(max-width: 642px) 100vw, 642px\" src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/2021-05-27-DataEngineering_RISC-Software-GmbH_Fachbeitrag-Data-Engineering_Abbildung1.jpg\" alt=\"Data Science Hierarchy of Needs\" class=\"wp-image-2782\" srcset=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/2021-05-27-DataEngineering_RISC-Software-GmbH_Fachbeitrag-Data-Engineering_Abbildung1.jpg 642w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/2021-05-27-DataEngineering_RISC-Software-GmbH_Fachbeitrag-Data-Engineering_Abbildung1-300x113.jpg 300w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/2021-05-27-DataEngineering_RISC-Software-GmbH_Fachbeitrag-Data-Engineering_Abbildung1-640x242.jpg 640w\" \/><\/figure>\n\n\n\n<p class=\"has-text-align-center\"><em>Data Science Hierarchy of Needs<\/em><\/p>\n\n\n\n<p>At the top of the pyramid are the activities of data science, which are based on integrated and cleansed data sets. These can then be used to train ML models, for example. The levels colored in blue represent the data engineering activities, with the focus on the move\/store and explore\/transform levels. While the levels above, with AI, deep learning and ML, are the domain of data scientists, activities such as data labeling and data aggregation are borderline areas that can be performed by data scientists or data engineers, depending on the exact task and personnel availability.<\/p>\n\n\n\n<p>The data collection activities at the base of the pyramid fall only partially within the scope of data engineering, which usually takes over the data at a defined interface: via files, external databases or a network protocol. 
This is also because data engineering is a sub-field of computer science and software engineering, and thus does not usually deal with building or operating data collection hardware such as sensors.<\/p>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<h3 class=\"wp-block-heading\">Data cleansing and integration<\/h3>\n\n\n\n<p>As part of the data engineering process, the raw data is prepared in several steps after transfer and finally stored in the data store in a consistent, fully prepared form:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data cleansing<\/li>\n\n\n\n<li>Data integration<\/li>\n\n\n\n<li>Data transformation<\/li>\n<\/ul>\n\n\n\n<p>These steps are carried out one after the other. The technical implementation can take the form of data stream processing (the consecutive processing of many small data packets) or batch processing, in which the entire data set is processed at once. 
An appropriately dimensioned data store, the data lake, makes it possible to persist the data in its various states during processing.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-style-rounded\"><img decoding=\"async\" width=\"506\" height=\"284\" sizes=\"(max-width: 506px) 100vw, 506px\" src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/2021-05-27-DataEngineering_RISC-Software-GmbH_Fachbeitrag-Data-Engineering_Abbildung2_EN.jpg\" alt=\"Data engineering\" class=\"wp-image-2814\" srcset=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/2021-05-27-DataEngineering_RISC-Software-GmbH_Fachbeitrag-Data-Engineering_Abbildung2_EN.jpg 506w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/2021-05-27-DataEngineering_RISC-Software-GmbH_Fachbeitrag-Data-Engineering_Abbildung2_EN-300x168.jpg 300w\" \/><\/figure>\n\n\n\n<p>Data cleansing includes, for example, checking the imported data records for completeness and syntactic correctness. Data errors such as incorrect sensor values can also be detected in this step by predefined rules.<\/p>\n\n\n\n<p>If these criteria are violated, the following options are available, depending on the application:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Improve raw data quality<\/strong>: If the raw data can be redelivered in improved quality, the new delivery replaces the faulty data.<\/li>\n\n\n\n<li><strong>Discard data<\/strong>: Incorrect data can be discarded, for example, if the data set is to be used for training purposes in ML and sufficient correct data is available.<\/li>\n\n\n\n<li><strong>Automatically correct errors during import<\/strong>: For example, if the data can be obtained from an additional data source, errors can be corrected during data integration.<\/li>\n<\/ul>\n\n\n\n<p>In practice, discarding the incorrect data is the easiest solution to implement. 
However, if every single data point may be relevant to the planned evaluations, erroneous data must be corrected wherever possible. This can happen, for example, during quality assessment in production, when the production data for a defective workpiece is incorrect due to a sensor error. The data can either be corrected manually by domain experts, or the correct data can be supplied at a later point in time.<\/p>\n\n\n\n<p>Involving domain experts is essential here: they know the criteria that data such as sensor values must meet to count as correct, and they know how to deal with incorrect or incomplete data.<\/p>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:66.66%\">\n<p>Data integration deals with the automated linking of data from different data sources. Depending on the application domain and the type of data, records can be linked by different methods, such as:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Unique identifiers, analogous to foreign keys in the relational model<\/li>\n\n\n\n<li>Geographical or temporal proximity<\/li>\n\n\n\n<li>Domain-specific interrelationships such as sequences in manufacturing processes or in production lines<\/li>\n<\/ul>\n\n\n\n<p>After the data cleansing and data integration steps, data engineers can provide a dataset suitable for further use by data scientists. 
The data transformation step mentioned above refers to ongoing adjustments to the data model to improve the performance of queries by Data Scientists.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:33.33%\">\n<figure class=\"wp-block-image size-large is-style-rounded\"><img decoding=\"async\" width=\"1024\" height=\"683\" sizes=\"(max-width: 1024px) 100vw, 1024px\" src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/shutterstock_1667832958-1024x683.jpg\" alt=\"Programming\" class=\"wp-image-2359\" srcset=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/shutterstock_1667832958-1024x683.jpg 1024w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/shutterstock_1667832958-300x200.jpg 300w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/shutterstock_1667832958-768x512.jpg 768w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/shutterstock_1667832958-1536x1024.jpg 1536w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/shutterstock_1667832958.jpg 1920w\" \/><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:66.66%\">\n<h3 class=\"wp-block-heading\">Data storage and data modeling for Big Data<\/h3>\n\n\n\n<p>The cleaned and integrated data can be stored in a suitable data storage solution. In the application area of Industry 4.0, for example, data is generated continuously by sensors, which often leads to data volumes in the terabyte range within months. Such data volumes are often no longer manageable with a classic relational database. 
Although scalable databases that use the relational model are available on the market, their high licensing costs rule them out for many implementation projects, especially in the SME sector.<\/p>\n\n\n\n<p>As an alternative, horizontally scalable NoSQL systems are available; the term is an abbreviation for \u201cNot only SQL\u201d and covers data stores that use non-relational data models. Horizontal scalability means that such systems can be expanded by adding hardware, in principle to handle practically unlimited data volumes. Typical NoSQL systems are also often released under liberal licenses such as the Apache license and can therefore be used commercially without license costs. In addition, these systems place no special requirements on the hardware used, which further reduces acquisition costs. Thus, NoSQL systems such as <a href=\"https:\/\/hadoop.apache.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">Apache Hadoop<\/a> and related technologies represent a cost-effective way of executing queries on data volumes in the terabyte range.<\/p>\n\n\n\n<p>Particularly in the Big Data area, selecting a suitable NoSQL database and a suitable data model is of central importance, because both determine the performance of the overall system. This applies both to data ingestion and to queries against the NoSQL system.<\/p>\n\n\n\n<p>Both the choice of technology and the design of the data model are driven by the system requirements:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>What data volumes and data rates need to be imported?<\/li>\n\n\n\n<li>Which queries and evaluations are to be performed with the data?<\/li>\n\n\n\n<li>What are the performance requirements for the queries? 
Is it a real-time system?<\/li>\n<\/ul>\n\n\n\n<p>A central question is, for example, whether the system should support only fixed queries or also allow flexible queries, for example using SQL.<\/p>\n\n\n\n<p>When selecting a technology, one distinction is whether data is always accessed via a known key or whether queries also filter on the values of other attributes. In the first case, a system with the semantics of a distributed hash map, such as <a href=\"https:\/\/hbase.apache.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">Apache HBase<\/a>, is suitable; in the second, an in-memory analysis solution such as <a href=\"https:\/\/spark.apache.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">Apache Spark<\/a> may be the better fit. If the use of the data is primarily aimed at the links between data items, a <a href=\"http:\/\/ris.w4.at\/en\/technical-article-graphdatabases-1\" target=\"_blank\" rel=\"noreferrer noopener\">graph database<\/a> should be considered.<\/p>\n\n\n\n<p>In a Big Data system, data is stored in denormalized form for performance reasons, i.e., all data relevant to a query result should be stored together. The reason is that joins are very resource-intensive and time-consuming. The planned queries are therefore central to the design of the data model. For example, the attributes that mainly appear as parameters in the queries should be used as key attributes. 
This is also the reason why the data model often has to be extended when new queries are added to ensure their effective execution, and thus data engineering activities are continuously required even after the data has been introduced.<\/p>\n\n\n\n<p><em>With its expertise in the field of open source NoSQL databases built up over more than ten years, RISC Software GmbH represents a reliable consulting and implementation partner for the introduction or expansion of a solid database in your company, regardless of the area of application.<\/em><\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:33.33%\">\n<figure class=\"wp-block-image size-large is-style-rounded\"><img decoding=\"async\" width=\"1024\" height=\"670\" sizes=\"(max-width: 1024px) 100vw, 1024px\" src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-155667482-1024x670.jpg\" alt=\"Hall\" class=\"wp-image-1430\" srcset=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-155667482-1024x670.jpg 1024w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-155667482-300x196.jpg 300w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-155667482-768x502.jpg 768w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-155667482-1536x1005.jpg 1536w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-155667482.jpg 1920w\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large is-style-rounded\"><img decoding=\"async\" width=\"1024\" height=\"683\" sizes=\"(max-width: 1024px) 100vw, 1024px\" src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-183805661-1-1024x683.jpg\" alt=\"Database\" class=\"wp-image-2355\" srcset=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-183805661-1-1024x683.jpg 1024w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-183805661-1-300x200.jpg 300w, 
https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-183805661-1-768x512.jpg 768w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-183805661-1-1536x1024.jpg 1536w, https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-183805661-1.jpg 1920w\" \/><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<h3 class=\"wp-block-heading\">Use Cases for Data Engineering<\/h3>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div data-aos=\"fade-up\"  data-aos-offset=\"0\" data-aos-anchor-placement=\"top-bottom\" class=\"icon-box is-style-bg-blue\">\n  <div class=\"icon-overlay\">\n          <picture>\n        \n        \n        \n        \n        <img decoding=\"async\"  class=\"\" width=\"44\" height=\"44\"\n             src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/timeline-solid.png\"\n             alt=\"\">\n      <\/picture>\n      <\/div>\n  \n\n<h5 class=\"has-text-align-center wp-block-heading\">Use Case 1: Corporate data integration<\/h5>\n\n\n\n<p class=\"has-text-align-center\">Data from different sources can be merged and used effectively in an integrated data model<\/p>\n\n\n\n<div style=\"height:30px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div data-aos=\"fade-up\"  data-aos-offset=\"0\" data-aos-anchor-placement=\"top-bottom\" class=\"icon-box is-style-bg-blue\">\n  <div class=\"icon-overlay\">\n          <picture>\n        \n        \n        \n        \n        <img decoding=\"async\"  class=\"\" 
width=\"44\" height=\"44\"\n             src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/brain-solid.png\"\n             alt=\"\">\n      <\/picture>\n      <\/div>\n  \n\n<h5 class=\"has-text-align-center wp-block-heading\">Use Case 2: Data preparation for AI \/ ML<\/h5>\n\n\n\n<p class=\"has-text-align-center\">Data engineering methods can be used to provide a large amount of consistent and complete training data for AI and ML<\/p>\n\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div data-aos=\"fade-up\"  data-aos-offset=\"0\" data-aos-anchor-placement=\"top-bottom\" class=\"icon-box is-style-bg-blue\">\n  <div class=\"icon-overlay\">\n          <picture>\n        \n        \n        \n        \n        <img decoding=\"async\"  class=\"\" width=\"44\" height=\"44\"\n             src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/database-solid.png\"\n             alt=\"\">\n      <\/picture>\n      <\/div>\n  \n\n<h5 class=\"has-text-align-center wp-block-heading\">Use Case 3: Transformation of the data model to improve data understanding<\/h5>\n\n\n\n<p class=\"has-text-align-center\">Data engineering can significantly increase data understanding by better adapting the data model to the use case. 
An example could be the introduction of a graph database.<\/p>\n\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div data-aos=\"fade-up\"  data-aos-offset=\"0\" data-aos-anchor-placement=\"top-bottom\" class=\"icon-box is-style-bg-blue\">\n  <div class=\"icon-overlay\">\n          <picture>\n        \n        \n        \n        \n        <img decoding=\"async\"  class=\"\" width=\"44\" height=\"44\"\n             src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/forward-fast-solid.png\"\n             alt=\"\">\n      <\/picture>\n      <\/div>\n  \n\n<h5 class=\"has-text-align-center wp-block-heading\">Use Case 4: Improved (faster) data usage<\/h5>\n\n\n\n<p class=\"has-text-align-center\">Data engineering can help significantly speed up interactive queries by adapting the data storage and data model.<\/p>\n\n\n\n<div style=\"height:23px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<h3 class=\"wp-block-heading\">Sources<\/h3>\n\n\n\n<p>[1]&nbsp;Blasch, Erik &amp; Sung, James &amp; Nguyen, Tao &amp; Daniel, Chandra &amp; Mason, Alisa. (2019). Artificial Intelligence Strategies for National Security and Safety Standards.<\/p>\n\n\n\n<p>[2] Abraham Maslow:&nbsp;<em>A Theory of Human Motivation.<\/em>&nbsp;In&nbsp;<em>Psychological Review.<\/em>&nbsp;1943, Vol. 
50 #4, pp. 370\u2013396.<\/p>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:33.33%\">\n<h3 class=\"wp-block-heading\">Author<\/h3>\n\n\n<div class=\"contact-person\">\n      <picture>\n      \n      \n      \n      \n      <img decoding=\"async\" data-aos=\"fade-zoom-in\"\n           data-aos-offset=\"0\" class=\"w-full\" width=\"212\" height=\"293\"\n       
    src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/pheinzlr1-removebg-preview.png\"\n           alt=\"\">\n    <\/picture>\n    \n\n<h5 class=\"wp-block-heading\">DI Paul Heinzlreiter<\/h5>\n\n\n\n<p>Senior Data Engineer<\/p>\n\n  <\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignwide is-layout-constrained wp-block-group-is-layout-constrained\"><div class=\"posts-slider-block\" data-aos=\"fade-up\" data-aos-offset=\"0\" data-aos-anchor-placement=\"top-bottom\">\n        <section class=\"splide posts-slider\" aria-label=\"Gallery Slides\">\n            <div class=\"splide__arrows\">\n                <button class=\"splide__arrow splide__arrow--prev\">\n                    <span class=\"sr-only\">Previous<\/span>\n                    <img decoding=\"async\" loading=\"lazy\" width=\"25\" height=\"21\" src=\"https:\/\/www.risc-software.at\/app\/themes\/risc-theme\/public\/images\/icon-arrow.35d2ec.svg\"\n                         alt=\"Previous\">\n                <\/button>\n                <button class=\"splide__arrow splide__arrow--next\">\n                    <span class=\"sr-only\">Next<\/span>\n                    <img decoding=\"async\" loading=\"lazy\" width=\"25\" height=\"21\" src=\"https:\/\/www.risc-software.at\/app\/themes\/risc-theme\/public\/images\/icon-arrow.35d2ec.svg\"\n                         alt=\"Next\">\n                <\/button>\n            <\/div>\n            <div class=\"inner\">\n                <div class=\"splide__track\">\n                    <div class=\"splide__list\">\n\n                                                    <a href=\"https:\/\/www.risc-software.at\/en\/technicalarticles\/technical-article-graphdatabases-1\/\" class=\"splide__slide blog-post-teaser mb-1 lg:mb-3\">\n                                <div class=\"blog-image\">\n                                                                                                     
                           <picture>\n                                                                                        <img decoding=\"async\" src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-1194430859-1-360x214.jpg\"\n                                                 alt=\"Data Understanders: Leveraging enterprise data through intelligent Graph Databases\">\n                                        <\/picture>\n                                                                    <\/div>\n                                <div class=\"blog-content px-2 py-3 xl:px-4 xl:py-5\">\n                                    <h3>Data Understanders: Leveraging enterprise data through intelligent Graph Databases<\/h3>\n                                    <div class=\"blog-post-excerpt mt-2\">\n                                        Graph databases enable intuitive mapping of many real-world scenarios such as industrial manufacturing, traffic data analysis or IT infrastructure monitoring. 
This not only allows data to be stored more efficiently but also makes it much more usable.\n                                    <\/div>\n                                    <span class=\"inline-block mt-2 more\">learn more <span class=\"ml-1 icon-more\"><\/span><\/span>\n\n                                <\/div>\n                            <\/a>\n                                                    <a href=\"https:\/\/www.risc-software.at\/en\/technicalarticles\/technical-article-can-data-science-lead-industrial-companies-out-of-the-crisis\/\" class=\"splide__slide blog-post-teaser mb-1 lg:mb-3\">\n                                <div class=\"blog-image\">\n                                                                                                                                <picture>\n                                                                                        <img decoding=\"async\" src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-1170740969-360x214.jpg\"\n                                                 alt=\"Can data science lead industrial companies out of the crisis?\">\n                                        <\/picture>\n                                                                    <\/div>\n                                <div class=\"blog-content px-2 py-3 xl:px-4 xl:py-5\">\n                                    <h3>Can data science lead industrial companies out of the crisis?<\/h3>\n                                    <div class=\"blog-post-excerpt mt-2\">\n                                        How to minimize costs, respond flexibly to fluctuations in demand, and avoid production downtime caused by disruptions.\n                                    <\/div>\n                                    <span class=\"inline-block mt-2 more\">learn more <span class=\"ml-1 icon-more\"><\/span><\/span>\n\n                                <\/div>\n                            <\/a>\n                                          
          <a href=\"https:\/\/www.risc-software.at\/en\/technicalarticles\/technical-article-data-quality\/\" class=\"splide__slide blog-post-teaser mb-1 lg:mb-3\">\n                                <div class=\"blog-image\">\n                                                                                                                                <picture>\n                                                                                        <img decoding=\"async\" src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-1200230502-360x214.jpg\"\n                                                 alt=\"Data quality: From information flow to information content\">\n                                        <\/picture>\n                                                                    <\/div>\n                                <div class=\"blog-content px-2 py-3 xl:px-4 xl:py-5\">\n                                    <h3>Data quality: From information flow to information content<\/h3>\n                                    <div class=\"blog-post-excerpt mt-2\">\n                                        Making decisions is not always easy &#8211; a possible quick win for companies can be derived from company data: future-oriented data quality management, a process that unfortunately often receives far too little attention.\n                                    <\/div>\n                                    <span class=\"inline-block mt-2 more\">learn more <span class=\"ml-1 icon-more\"><\/span><\/span>\n\n                                <\/div>\n                            <\/a>\n                                                    <a href=\"https:\/\/www.risc-software.at\/en\/technicalarticles\/technical-article-graphdatabases-2\/\" class=\"splide__slide blog-post-teaser mb-1 lg:mb-3\">\n                                <div class=\"blog-image\">\n                                                                                                                
                <picture>\n                                                                                        <img decoding=\"async\" src=\"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-1047824676-1-360x214.jpg\"\n                                                 alt=\"Graph databases in practice\">\n                                        <\/picture>\n                                                                    <\/div>\n                                <div class=\"blog-content px-2 py-3 xl:px-4 xl:py-5\">\n                                    <h3>Graph databases in practice<\/h3>\n                                    <div class=\"blog-post-excerpt mt-2\">\n                                        Graph databases enable the mapping of many real-world scenarios, such as traffic data analysis or IT infrastructure monitoring.\n                                    <\/div>\n                                    <span class=\"inline-block mt-2 more\">learn more <span class=\"ml-1 icon-more\"><\/span><\/span>\n\n                                <\/div>\n                            <\/a>\n                                            <\/div>\n                <\/div>\n            <\/div>\n        <\/section>\n    <\/div>\n<\/div>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>Data engineering integrates data from a wide variety of sources and makes it effectively usable. 
This makes it a prerequisite for effective data science, machine learning and artificial intelligence, especially in the big data area.<\/p>\n","protected":false},"featured_media":2184,"template":"","publication-category":[50,72],"class_list":["post-2812","publication","type-publication","status-publish","has-post-thumbnail","hentry","publication-category-data-science-and-a-i","publication-category-industrie-4-0"],"acf":[],"portrait_thumb_url":"https:\/\/www.risc-software.at\/app\/uploads\/2023\/06\/iStock-966899060-1-360x214.jpg","_links":{"self":[{"href":"https:\/\/www.risc-software.at\/en\/wp-json\/wp\/v2\/publication\/2812","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.risc-software.at\/en\/wp-json\/wp\/v2\/publication"}],"about":[{"href":"https:\/\/www.risc-software.at\/en\/wp-json\/wp\/v2\/types\/publication"}],"version-history":[{"count":9,"href":"https:\/\/www.risc-software.at\/en\/wp-json\/wp\/v2\/publication\/2812\/revisions"}],"predecessor-version":[{"id":4880,"href":"https:\/\/www.risc-software.at\/en\/wp-json\/wp\/v2\/publication\/2812\/revisions\/4880"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.risc-software.at\/en\/wp-json\/wp\/v2\/media\/2184"}],"wp:attachment":[{"href":"https:\/\/www.risc-software.at\/en\/wp-json\/wp\/v2\/media?parent=2812"}],"wp:term":[{"taxonomy":"publication-category","embeddable":true,"href":"https:\/\/www.risc-software.at\/en\/wp-json\/wp\/v2\/publication-category?post=2812"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}