Anthropogenic impact and global climate change

Objectives

The validation scenario “Anthropogenic impact and global climate change” (abbreviated “MU scenario”) is led by MU. It focuses on the correlation between environmental pollutants (POPs, Persistent Organic Pollutants) and the health of the human population, and on the impact of global climate change on the atmospheric transport of these pollutants. The aim of the MU scenario is to create a central place where researchers, domain experts and decision makers can discover and access interdisciplinary knowledge more efficiently and more usably than the current state of the art allows. Because the amount of information resources in scientific fields is enormous and steadily growing, available search mechanisms such as search engines, scientific networks and similar technologies are not sufficient to meet the complex requirements of today’s researchers and scientists. The results of conventional discovery processes often do not match the domain context of the users and force them into the tedious task of filtering large result sets to find the object they were originally looking for. The need therefore arises for an improved discovery method that incorporates domain knowledge and additional semantic information into the search in order to obtain results that better fit the specific context of the user.

The MU scenario aims not only to use the TaToo framework to improve the discovery of scientific resources within one particular domain, but also to discover and create new relationships between different domains (environmental pollution and tumour epidemiology). The correlation between environmental pollutants, including their transport due to global climate change, and their impact on the health of the human population is only one significant example of such a cross-domain relationship. These dependencies could yield new scientific insights from already available resources and connect the knowledge of the individual domains. The relationships should also help subsequent discovery processes to deliver matching resources from multiple domains.

The MU scenario is therefore used to evaluate the tagging and discovery framework produced by the TaToo project. Since the primary scope of the TaToo project is to facilitate the discovery of environmental pollution resources, this scenario offers an ideal opportunity to validate the resulting solution against challenging real-world problems. Numerous scientific domains are actively researched at MU, and two important ones have been carefully chosen to demonstrate and validate the envisioned functionality of the TaToo project. The vision of the MU scenario is that other scientific domains could follow the initial institutes and extend this new kind of knowledge network, delivering a new generation of tools and methods that effectively and conveniently support scientific users in their daily work.

 

Validation Sources

System for Visualizing of Oncological Data (SVOD)

The creation of the SVOD web portal (http://www.svod.cz) (System for Visualizing of Oncological Data) on tumour epidemiology in the Czech Republic was primarily motivated by the effort to make these representative and valuable data available to a wide spectrum of users. We believe that general epidemiological data about these serious diseases and the related population risks should be freely available to everybody in the Czech Republic. Another ambition of the web portal is to provide relevant information about tumour epidemiology in the Czech Republic to audiences abroad.

The SVOD web portal works mainly with data from the Czech National Cancer Registry (http://www.linkos.cz) (NOR), which is managed by the Institute of Health Information and Statistics (http://www.uzis.cz) (UZIS CR). It offers validated epidemiological data for the years 1977-2008 (currently 1,617,809 records). This is a representative data set that is unique at least within the European region. UZIS CR is therefore cited as the data manager in all outputs and is listed among the scientific guarantors of the project.

The SVOD web portal was created by a team of authors from the Faculty of Medicine in Brno (Institute of Biostatistics and Analyses) of MU and the Masaryk Memorial Cancer Institute (MMCI) in Brno. The creation of the SVOD portal is vitally supported by the Ministry of Health of the Czech Republic in the context of the National Healthcare Quality Programme. Further development is supported by a research programme of MMCI (Functional diagnostics of tumours, MZO 00209805) and a research programme of the Faculty of Science of MU (INCHEMBIOL - RECETOX project, No. 0021622412). These grant projects guarantee the long-term viability of the portal and ensure regular data updates and continued development under the supervision of its administrators.

Global Environmental Assessment Information System (GENASIS)

The GENASIS web portal (http://www.genasis.cz) (Global ENvironmental ASsessment Information System) provides information support for the implementation of the Stockholm Convention (http://chm.pops.int) (SC) on Persistent Organic Pollutants (POPs) at the international level. GENASIS is developed in accordance with the objectives of the Single Information System of the Environment (http://www.mzp.cz) of the Ministry of Environment of the Czech Republic and its new strategy for eGovernment implementation. Its connection with other data sources creates the potential for a comprehensive assessment of anthropogenic impact on the environment and of the associated ecological and human health risks. GENASIS contains data collected since 1988 by the Research Centre for Toxic Compounds in the Environment (http://www.recetox.muni.cz) (RECETOX) of MU and its partners in various types of monitoring (long-term, short-term, research studies, etc.) (Holoubek, 2011).

GENASIS also offers analytical tools, which are one of the most important parts of the web portal. These tools allow basic processing of measured environmental data by “statistical” program units. On the introductory screen the user determines which data enter the analysis by selecting various parameters (e.g. project name, sampling time, matrix, chemical compound, etc.). This procedure yields the core data set. With the tools implemented in this part of the GENASIS system it is possible to visualise the location of each sampled site on synoptic maps and to examine general and/or detailed information about sampling frequency. It is also possible to sort, select and deselect localities and to view the measured concentrations of selected compounds at those localities. Additional modules provide descriptive statistics for the selected data set, show changes in the concentration of user-selected chemicals over a time period, and depict seasonal and long-term trends. Each module includes an option to apply additional criteria that restrict the input data (e.g. the selection of explicit altitudes). Another integral part of the analytical modules is the stratification of localities according to various parameters (land use, altitude, distance to roads, sources of pollution, inhabited areas), which enables a more detailed view and discrimination between localities. More complex analyses and models are continuously being prepared.
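The workflow described above (parameter-based selection of a core data set, stratification of localities, descriptive statistics and a simple trend over time) can be illustrated with a minimal Python sketch. It is not the GENASIS implementation: the `Measurement` record, the function names and all sample values are hypothetical and serve only to make the steps concrete.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, median


@dataclass(frozen=True)
class Measurement:
    """Hypothetical record layout; the real GENASIS schema is not given in the text."""
    site: str
    land_use: str
    altitude_m: float
    compound: str
    year: int
    concentration: float  # arbitrary units, illustrative only


def select(records, compound, min_altitude_m=None):
    """Build the core data set: one compound, optionally restricted by altitude."""
    return [
        r for r in records
        if r.compound == compound
        and (min_altitude_m is None or r.altitude_m >= min_altitude_m)
    ]


def describe_by(records, key):
    """Descriptive statistics of concentration per stratum (e.g. land use)."""
    groups = defaultdict(list)
    for r in records:
        groups[key(r)].append(r.concentration)
    return {
        stratum: {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            "min": min(values),
            "max": max(values),
        }
        for stratum, values in groups.items()
    }


def yearly_means(records):
    """Mean concentration per year, sorted by year, as a simple long-term trend."""
    by_year = defaultdict(list)
    for r in records:
        by_year[r.year].append(r.concentration)
    return [(year, mean(values)) for year, values in sorted(by_year.items())]


# Made-up sample records (assumption: values and sites are invented).
SAMPLE = [
    Measurement("A", "urban", 250, "PCB 153", 2005, 4.0),
    Measurement("A", "urban", 250, "PCB 153", 2006, 3.0),
    Measurement("B", "rural", 600, "PCB 153", 2005, 2.0),
    Measurement("B", "rural", 600, "PCB 153", 2006, 1.0),
    Measurement("B", "rural", 600, "DDT", 2005, 9.0),
]

core = select(SAMPLE, "PCB 153")                      # core data set
stats = describe_by(core, key=lambda r: r.land_use)   # stratified statistics
trend = yearly_means(core)                            # yearly trend
```

Passing a different `key` function to `describe_by` (for example one that bins `altitude_m`) would stratify the same core data set by another parameter without changing the rest of the sketch.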

To reach its goals, the GENASIS project uses data collected both within the National Implementation Plan for the Implementation of the Stockholm Convention in the Czech Republic (http://www.pops.int) (NIP CR) and within international projects. The Czech Republic has a long tradition of monitoring POPs in the environment, and its monitoring networks cover all environmental components. A basic description, supplemented by the outputs used within the frame of the NIP CR, is available for each monitoring network.

 

Validation Use Cases

The MU Scenario builds on the GENASIS and SVOD web portals, which collect the required data. We focus on these two portals as representatives, but other external resources will be included as well. For the validation of the TaToo tools we have proposed eight use cases (UC) in the MU Scenario:

  • UC1: Discover resources with existing tools - SVOD and GENASIS users will be able to use the TaToo functionality indirectly to discover similar resources based on the objects analysed in the SVOD/GENASIS web portals. The TaToo discovery can be started from within the web analysis tools. The information needed for the search has already been entered via the web interface during the analysis and can be submitted to the TaToo framework.
  • UC2: Generic discovery - the goal of this use case is to improve result relevance compared to conventional search engines. The resources returned should match the search criteria more closely, so that the user receives more potentially interesting results. This should reduce the tedious effort of scanning the search results for matching entries.
  • UC3 + UC4: Persistent Organic Pollutant resource discovery and Oncological resource discovery. UC3 and UC4 provide extended domain-specific search. The user is interested in discovering cancer- or POPs-related resources and can use extended search criteria (measurement methods, measured compounds, cancer incidence and mortality rates, etc.) to specify the desired domain-specific resources more precisely. Domain experts can also enrich the resources by using TaToo tools.
  • UC5: Define resource uncertainty - domain experts can define quality criteria (annotations) for resources, such as the reputation of the publishing institute, the measurement methods, and the norms and standards used. The user should be able to assign a value to each of these criteria.
  • UC6: Compare resources - enables users to compare found resources on the fly after the discovery. This use case should help in finding connections between different resources, either within the same domain or across domains.
  • UC7: Find similar resources. UC7 provides the functionality to search for similar resources based on an interesting resource that has already been found. If the user finds a resource that matches their needs, a new search is started based on that resource. The resources found should have a high probability of matching the template resource used for the search.
  • UC8: Find related resources. The last use case provides a search for related resources in other knowledge domains based on an already found resource. For example, the user wants to find pollutant monitoring data for a specific time range and geospatial region, based on the values of a discovered cancer analysis. The geospatial extent and temporal extent of the cancer analysis are used to perform a new search. The user only has to specify the domain of interest in which new resources should be discovered.
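The UC8 example (reusing the geospatial and temporal extent of a discovered cancer analysis to search another domain) can be sketched as follows. This is only an illustration under stated assumptions: the `Extent` and `Resource` types, the `find_related` function, the bounding-box overlap test and the sample catalogue are all hypothetical, and the actual TaToo framework may represent extents and perform matching quite differently.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Extent:
    """Hypothetical extent: lon/lat bounding box plus a closed date interval."""
    west: float
    south: float
    east: float
    north: float
    start: date
    end: date

    def overlaps(self, other):
        """True if both the bounding boxes and the date intervals intersect."""
        spatial = (
            self.west <= other.east and other.west <= self.east
            and self.south <= other.north and other.south <= self.north
        )
        temporal = self.start <= other.end and other.start <= self.end
        return spatial and temporal


@dataclass(frozen=True)
class Resource:
    title: str
    domain: str
    extent: Extent


def find_related(template, target_domain, catalogue):
    """Return resources of the target domain whose extent overlaps the template's."""
    return [
        r for r in catalogue
        if r.domain == target_domain and r.extent.overlaps(template.extent)
    ]


# Invented sample data: one cancer analysis and three candidate resources.
cancer_analysis = Resource(
    "Cancer incidence analysis (sample)", "oncology",
    Extent(16.0, 48.6, 17.6, 49.6, date(2000, 1, 1), date(2005, 12, 31)),
)
catalogue = [
    Resource("Air monitoring, same region and period", "pollution",
             Extent(16.5, 49.0, 16.8, 49.3, date(2003, 1, 1), date(2008, 12, 31))),
    Resource("Air monitoring, same region, earlier period", "pollution",
             Extent(16.5, 49.0, 16.8, 49.3, date(1990, 1, 1), date(1995, 12, 31))),
    Resource("Soil monitoring, different region", "pollution",
             Extent(12.0, 50.0, 13.0, 50.5, date(2001, 1, 1), date(2004, 12, 31))),
]

related = find_related(cancer_analysis, "pollution", catalogue)
```

In this sketch the user supplies only the target domain (`"pollution"`); the spatial and temporal constraints are taken from the template resource, mirroring the division of labour described in UC8.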