Project News-Extracted Evolving European Datasphere (NEEED)
An evolutionary graph of local, national and international news events.
In the FFG-funded research project NEEED (“News-Extracted Evolving European Datasphere”), researchers and developers from RISC Software GmbH, SCCH GmbH and Newsadoo GmbH worked together to further develop the Newsadoo platform. Newsadoo collects, analyzes and sorts news from local, national and international sources fully automatically and enables the personalized, topic-specific and decentralized display of relevant news.
The cooperation partners had already optimized the automatic processing of news articles and the underlying recommendation algorithm in the pre-project TIDE. With NEEED, the Newsadoo technology was taken to the next stage of development: the collected data can now be further structured and used in the form of a tag graph (News Datasphere). Every day, information from news articles is merged into a dynamic network of related tags (keywords) whose development can be tracked over time. This allows both long-term correlations (e.g. “Rome” and “Vatican”) and short-term trends (e.g. “ChatGPT” and “Natural Language Processing”, “Queen Elizabeth II” and “funeral”) to be derived and analyzed from the high-quality news content produced daily.
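To illustrate how such a daily tag network could be assembled and tracked over time, here is a minimal Python sketch. It is not the NEEED implementation: the function names `daily_cooccurrence` and `edge_history`, the input format and the toy articles are assumptions made for this example. It simply counts, per day, how often two tags appear in the same article.

```python
from collections import Counter, defaultdict
from itertools import combinations


def daily_cooccurrence(articles):
    """Count tag pairs per day.

    `articles` is an iterable of (day, tags) tuples (hypothetical format).
    Returns {day: Counter({(tag_a, tag_b): count})} with sorted tag pairs.
    """
    per_day = defaultdict(Counter)
    for day, tags in articles:
        # Sorting makes each pair order-independent: (a, b) == (b, a).
        for pair in combinations(sorted(set(tags)), 2):
            per_day[day][pair] += 1
    return per_day


def edge_history(per_day, tag_a, tag_b):
    """Return [(day, count)] for one tag pair, in chronological order."""
    key = tuple(sorted((tag_a, tag_b)))
    return [(day, per_day[day][key]) for day in sorted(per_day)]


# Toy data, invented for illustration only.
articles = [
    ("2022-09-08", {"Queen Elizabeth II", "London"}),
    ("2022-09-19", {"Queen Elizabeth II", "funeral"}),
    ("2022-09-19", {"Queen Elizabeth II", "funeral", "London"}),
    ("2022-09-19", {"Rome", "Vatican"}),
]

per_day = daily_cooccurrence(articles)
history = edge_history(per_day, "Queen Elizabeth II", "funeral")
print(history)
```

A pair whose count suddenly rises on certain days would correspond to a short-term trend, while a pair that co-occurs steadily over long periods would correspond to a long-term correlation.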
Big Data in the Datasphere: Analysis of millions of news articles
More than 30,000 news articles are published by German- and English-language news sources every day, and the Datasphere is updated daily with the latest topics. At a scale of several million news articles, from which over a million tags are generated, conventional data processing methods quickly reach their limits. Big data technologies make it possible to analyze the current data stock at all, and because the system was designed with a special focus on scalability, it can also handle the continuously growing data volume in the long term.
Various approaches from the fields of artificial intelligence (AI), quantitative statistics and association analysis were evaluated and combined to calculate the relationships between tags. The resulting network of tag relationships is made accessible in the form of a graph database. Using a dedicated query language, both the relationship between two tags and the most relevant neighboring tags of a tag can be retrieved within a few seconds. In the long term, the Datasphere will serve as a basis for further applications, such as the definition of subject areas or the exploration of new tags.
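As an example of what one association measure between tags can look like, the following Python sketch computes the lift of tag pairs and returns the strongest neighbors of a tag. It is a sketch under stated assumptions: lift is only one common measure from association analysis and is not confirmed as the measure used in NEEED, the in-memory dictionary stands in for the graph database, and the names `lift_scores` and `top_neighbors` as well as the toy tag sets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def lift_scores(tag_sets):
    """Compute lift for every co-occurring tag pair.

    lift(a, b) = P(a and b) / (P(a) * P(b)), estimated from article counts.
    Values above 1 mean the tags co-occur more often than chance predicts.
    """
    n = len(tag_sets)
    tag_count = Counter()
    pair_count = Counter()
    for tags in tag_sets:
        unique = sorted(set(tags))
        tag_count.update(unique)
        pair_count.update(combinations(unique, 2))
    return {
        (a, b): (count / n) / ((tag_count[a] / n) * (tag_count[b] / n))
        for (a, b), count in pair_count.items()
    }


def top_neighbors(scores, tag, k=3):
    """Return up to k (neighbor, lift) pairs for `tag`, strongest first."""
    neighbors = []
    for (a, b), score in scores.items():
        if a == tag:
            neighbors.append((b, score))
        elif b == tag:
            neighbors.append((a, score))
    # Sort by descending lift; break ties alphabetically for stable output.
    return sorted(neighbors, key=lambda item: (-item[1], item[0]))[:k]


# Toy tag sets, one per article, invented for illustration only.
tag_sets = [
    {"Rome", "Vatican"},
    {"Rome", "Vatican", "Pope"},
    {"Rome", "Italy"},
    {"Vienna", "Austria"},
]

scores = lift_scores(tag_sets)
vatican_neighbors = top_neighbors(scores, "Vatican")
print(vatican_neighbors)
```

Lift tends to favor rare tags, which is one reason why combining several measures, as described above, can be more robust than relying on a single score.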
Image: Simplified representation of the News Datasphere – a dynamic network of news tags, (C) Newsadoo
Image: Simplified representation of the News Datasphere for “Vienna” – a dynamic network of news tags, (C) Newsadoo
This project was funded by the Austrian Research Promotion Agency (FFG).
Project details
- Short Project Title: NEEED
- Full Project Title: News-Extracted Evolving European Datasphere
- Project Partners:
  - Newsadoo GmbH
  - Software Competence Center Hagenberg GmbH
- Funding Call: FFG Basic Programme
- Duration: 03/2022 – 12/2023 (22 months)
Contact person
Project management
Sandra Wartner, MSc
Data Scientist