Metody i modele oceny jakości danych przestrzennych
Abstract
The quality of data collected in official spatial databases is crucial for making strategic decisions and for carrying out planning and design work. Awareness of the quality level of these data is also important for individual users of official spatial data. The author presents methods and models for describing and evaluating the quality of spatial data collected in public registers. The analysis covers the data that describe space at the highest level of detail, collected in three databases: the land and buildings registry (EGiB), the geodetic registry of the land infrastructure network (GESUT) and the database of topographic objects (BDOT500). The research results concern selected aspects of spatial data quality: assessing the accuracy of data collected in official spatial databases; determining the uncertainty of the area of registry parcels; analysing the risk of damage to the underground infrastructure network due to the quality of spatial data; constructing a quality model for data collected in official databases; and visualizing uncertainty in spatial data. The accuracy of data collected in official, large-scale spatial databases was evaluated on a representative sample of data. The test sample was a set of coordinate deviations described by three variables, dX, dY and Dl: the deviations in the X and Y coordinates and the length of the offset vector of a test-sample point relative to its position regarded as error-free. The agreement of the empirical accuracy distributions with models (theoretical distributions of random variables) was investigated, and the accuracy of the spatial data was also assessed using methods resistant to outliers.
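As an illustration of the quantities dX, dY and Dl described above, the following minimal sketch computes them for pairs of measured and reference points and summarizes Dl with two standard outlier-resistant statistics, the median and the normalized median absolute deviation (MAD). This is not the author's resistant method of relative frequency, which the abstract does not specify; the function names and the choice of median/MAD are assumptions made only for this example.

```python
import math
import statistics


def offset_vectors(measured, reference):
    """Return a list of (dX, dY, Dl) for paired measured/reference points.

    Each point is an (x, y) tuple; the reference position is treated as
    error-free, as in the abstract. Function name is hypothetical.
    """
    deviations = []
    for (xm, ym), (xr, yr) in zip(measured, reference):
        dx = xm - xr
        dy = ym - yr
        # Dl is the length of the offset vector (dX, dY).
        deviations.append((dx, dy, math.hypot(dx, dy)))
    return deviations


def robust_summary(lengths):
    """Return (median, normalized MAD) of the offset lengths Dl.

    The factor 1.4826 scales the MAD so that it estimates the standard
    deviation when the data are normally distributed. This is a generic
    robust summary, assumed here for illustration only.
    """
    med = statistics.median(lengths)
    mad = statistics.median(abs(v - med) for v in lengths)
    return med, 1.4826 * mad
```

A single gross error in Dl shifts the mean and standard deviation strongly but leaves the median and MAD almost unchanged, which is the motivation for outlier-resistant estimators in this kind of assessment.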
In determining the accuracy of spatial data collected in public registers, the author's own solution was used: the resistant method of relative frequency. Weight functions were proposed that modify, to varying degrees, the values of Dl, the lengths of the offset vectors of test-sample points relative to their positions regarded as error-free. With regard to the uncertainty of the estimated area of registry parcels, the author determined the impact of the errors of geodetic network points (reference points and points of higher-class networks) and the effect of the correlation between the coordinates of the same point on the accuracy of the determined parcel area. The range of corrections to parcel areas in the EGiB database was determined for areas calculated from re-measurements performed with techniques of equivalent accuracy. The analysis of the risk of damage to the underground infrastructure network due to the low quality of spatial data is another research topic presented in the paper. Three main factors that influence this risk were identified: incompleteness of spatial data sets, insufficient accuracy of the horizontal position of underground infrastructure, and insufficient accuracy of its vertical position. A method for estimating project risk, both quantitative and qualitative, was developed, and the author's own risk estimation technique based on fuzzy logic was proposed. 2D and 3D maps of the risk of damage to the underground infrastructure network were developed as large-scale thematic maps presenting the design risk in qualitative and quantitative form. The data quality model is a set of rules used to describe the quality of these data sets. The proposed model defines a standardized approach to assessing and reporting the quality of the EGiB, GESUT and BDOT500 spatial databases.
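The dependence of parcel area uncertainty on coordinate errors, including the correlation between the X and Y coordinates of the same point, can be sketched with first-order error propagation applied to the standard shoelace area formula. The sketch below assumes that all boundary points share the same standard deviations `sigma_x` and `sigma_y` and the same correlation coefficient `rho`, and that different points are uncorrelated; it does not model the errors of the geodetic network points, which the abstract says the author also considered. All names are hypothetical.

```python
import math


def parcel_area(vertices):
    """Area of a simple polygon from its (x, y) vertices (shoelace formula)."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def area_std(vertices, sigma_x, sigma_y, rho=0.0):
    """First-order standard deviation of the polygon area.

    Assumptions (illustrative only): every vertex has the same sigma_x,
    sigma_y and x-y correlation rho; different vertices are uncorrelated.
    """
    n = len(vertices)
    variance = 0.0
    for i in range(n):
        x_prev, y_prev = vertices[i - 1]
        x_next, y_next = vertices[(i + 1) % n]
        # Partial derivatives of the signed area with respect to x_i and y_i.
        da_dx = (y_next - y_prev) / 2.0
        da_dy = (x_prev - x_next) / 2.0
        variance += (
            (da_dx * sigma_x) ** 2
            + (da_dy * sigma_y) ** 2
            + 2.0 * rho * sigma_x * sigma_y * da_dx * da_dy
        )
    return math.sqrt(variance)
```

Under these assumptions the correlation term cancels for symmetric shapes such as a square, but in general it can increase or decrease the area uncertainty depending on the parcel geometry.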
Quantitative and qualitative rules for data set control (automatic, office and field) were defined. The minimum sample size and the acceptable number of nonconformities in random samples were determined. The data quality elements were described using the following descriptors: scope, measure, result, value type and value unit. Data quality studies were performed according to users' needs. The values of the impact weights were determined using the analytic hierarchy process (AHP). The harmonization of the conceptual models of the EGiB, GESUT and BDOT500 databases with the BDOT10k database was also analysed. It was found that the possibilities of drawing information from the analysed registers when creating and updating BDOT10k are limited. Cartographic visualization techniques are an effective way of providing users of spatial data sets with information on data uncertainty. Based on the author's own experience and research on the quality of data in official spatial databases, a set of methods for visualizing the uncertainty of the EGiB, GESUT and BDOT500 databases was defined. This set includes visualization techniques designed to present three types of uncertainty: positional, attribute-value and temporal. Positional uncertainty was represented (for area, line and point objects) using several (three to five) visual variables. Attribute-value uncertainty and temporal uncertainty, which describe, for example, the completeness or timeliness of data sets, are presented by means of three graphical variables. The research problems presented in the paper are of both cognitive and practical importance. They indicate that the quality of spatial data collected in public registers can be evaluated effectively, and the results may be an important element of an expert system.
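The determination of impact weights with AHP, mentioned above, can be sketched as follows. The example derives weights from a pairwise comparison matrix using the row geometric mean approximation and checks consistency with Saaty's consistency ratio. The abstract does not give the author's comparison matrices or criteria, so the matrix below and the three criteria it stands for (for example positional accuracy, completeness and timeliness) are purely hypothetical.

```python
import math

# Saaty's random consistency index for matrices of order 3 to 5.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Approximate AHP priority weights by the row geometric mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI, with lambda_max estimated from A*w."""
    n = len(matrix)
    a_times_w = [
        sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)
    ]
    lambda_max = sum(a_times_w[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical pairwise comparisons of three data quality elements.
comparisons = [
    [1.0, 2.0, 6.0],
    [1.0 / 2.0, 1.0, 3.0],
    [1.0 / 6.0, 1.0 / 3.0, 1.0],
]
weights = ahp_weights(comparisons)
ratio = consistency_ratio(comparisons, weights)
```

A consistency ratio below about 0.1 is conventionally taken to mean that the pairwise judgements are acceptably consistent; the matrix above is perfectly consistent by construction, so its ratio is effectively zero.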
|Other language title versions||Methods and models for the assessment of spatial data quality|
|Publisher||Uniwersytet Rolniczy im. Hugona Kołłątaja w Krakowie, MNiSW |
|Publishing place (Publisher address)||Kraków|
|Book series / Journal (in case of Journal special issue)||Zeszyty Naukowe Uniwersytetu Rolniczego im. Hugona Kołłątaja w Krakowie. Rozprawy, ISSN 1899-3486, (0 points)|
|Publication size in sheets||10.25|