ИСТИНА
Today science, governance, consulting and many other fields rely on large volumes of data, and Big Data and Machine Learning have become mainstream. Data preparation, including the search for gaps and errors, is a complicated and time-consuming task, and the geosciences are no exception. Geoinformation technologies (GIS) and the related software work with environmental data, which we use to design maps and perform spatio-temporal analysis. An important source of environmental data is field survey or monitoring data. We studied such data using Bali island as an example and found several problems that reduce data availability: gaps and missing values, inconsistent data entry patterns, mismatches in the amount of data from year to year, and others. These data input issues become critical when the data are converted into a GIS format, because the converted data are the source for thematic maps that visualize the environmental situation on the island. We estimate that up to 30% of table rows may be lost because of missing or inappropriate data, which in turn affects the maps and the conclusions that the competent authorities draw from them. We propose a script-based approach that supports each stage of data utilization. The most significant stages are data pre-processing and data processing. Data pre-processing includes data input, pattern and format verification, tests for limit values, and a check of point locations; its result is a table that is convenient to convert into a GIS database. Data processing covers the required calculations and the derived values that are placed on the maps. The described solution is not aimed at GIS specialists, because it restricts the capabilities of GIS analysis. On the other hand, it produces predictable, uniformly designed results. This is a great advantage for geographers, soil scientists and authorities, who no longer need to spend time on data preparation and mapping.
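The pre-processing checks listed above (format verification, limit-value tests, point-location check) can be illustrated with a minimal sketch. It is not the authors' script: the column names (`station`, `lat`, `lon`, `ph`), the limit values, the approximate bounding box for Bali and the sample rows are all assumptions made up for the illustration.

```python
import csv
import io

# Assumption: a rough bounding box around Bali, in decimal degrees.
BALI_BBOX = {"lat": (-8.9, -8.0), "lon": (114.4, 115.8)}
# Assumption: hypothetical limit values for one measured parameter.
LIMITS = {"ph": (0.0, 14.0)}
# Assumption: hypothetical column names of the monitoring table.
REQUIRED = ("station", "lat", "lon", "ph")
NUMERIC = ("lat", "lon", "ph")


def parse_number(text):
    """Return a float (a decimal comma is accepted) or None if malformed."""
    try:
        return float(text.strip().replace(",", "."))
    except ValueError:
        return None


def check_row(row):
    """Return the list of problems in one row; an empty list means usable."""
    problems = []
    values = {}
    # Gap check: every required cell must be filled in.
    for name in REQUIRED:
        if not (row.get(name) or "").strip():
            problems.append(f"missing {name}")
    # Format check: filled numeric cells must parse as numbers.
    for name in NUMERIC:
        cell = (row.get(name) or "").strip()
        if cell:
            values[name] = parse_number(cell)
            if values[name] is None:
                problems.append(f"bad format in {name}")
    # Limit-value and point-location checks on the parsed numbers.
    for name, (low, high) in {**BALI_BBOX, **LIMITS}.items():
        value = values.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name} out of range")
    return problems


def preprocess(csv_text):
    """Split a CSV table into usable rows and (row, problems) rejects."""
    clean, rejected = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        problems = check_row(row)
        if problems:
            rejected.append((row, problems))
        else:
            clean.append(row)
    return clean, rejected


# Made-up sample rows, one per kind of problem.
SAMPLE = """station,lat,lon,ph
A1,-8.65,115.22,7.1
A2,-8.40,"115,10",6.8
A3,,115.30,7.4
A4,-8.50,115.20,19.0
A5,8.50,115.20,7.0
A6,-8.30,115.00,n/a
"""

clean, rejected = preprocess(SAMPLE)
for row, problems in rejected:
    print(row["station"], problems)
```

Keeping the rejected rows together with the reasons for rejection, rather than dropping them silently, makes it possible to see how many rows are lost and why before the table is converted into a GIS database.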