Back to first pageBack to first page Centre for Artificial Intelligence of UNL
Browse our site
You are here:

Publication details

Publication details
Main information
Implementation for Spatial Data of the Shared Nearest Neighbour with Metric Data Structures
November 2012
bfaustino
The Shared Nearest Neighbour (SNN) is a data clustering algorithm that identifies noise in data and finds clusters with different densities, shapes and sizes. making the SNN a good candidate to deal with spatial data. The SNN time complexity can be a bottleneck in spatial data clustering, since it has a time complexity in the worst case evaluated in O(n2). In this thesis, it is proposed to use metric data structures to index spatial data and support the SNN in querying for the k-nearest neighbours. When dealing with spatial data, the time complexity in the average case of the SNN, using a metric data structure in primary storage (kd-Tree), is improved to at most O(n × log n). Furthermore, using a strategy to reuse the k-nearest neighbours between consecutive runs, it is possible to obtain a time complexity in the worst case of O(n). The experimental results were done using the kd-Tree and the DF-Tree, which work in primary and secondary storage, respectively.
M. Sc. dissertation
Bruno Faustino
João Moura Pires
UNL
-
-
Publication files
- click here to download - pdf 5242 KB
Export formats
Bruno Faustino, Implementation for Spatial Data of the Shared Nearest Neighbour with Metric Data Structures, João Moura Pires (superv.), UNL, November 2012.
<b>Bruno Faustino</b>, <u>Implementation for Spatial Data of the Shared Nearest Neighbour with Metric Data Structures</u>, <a href="/people/members/view.php?code=542b14e1830dcf7566974fd36b6fccc7" class="supervisor">João Moura Pires</a> (superv.), UNL, November 2012.
@mastersthesis {bfaustino, author = {Bruno Faustino}, title = {Implementation for Spatial Data of the Shared Nearest Neighbour with Metric Data Structures}, school = {UNL}, note = {Jo{\~a}o Moura Pires (superv.); }, abstract = {The Shared Nearest Neighbour (SNN) is a data clustering algorithm that identifies noise in data and finds clusters with different densities, shapes and sizes. making the SNN a good candidate to deal with spatial data. The SNN time complexity can be a bottleneck in spatial data clustering, since it has a time complexity in the worst case evaluated in O(n2). In this thesis, it is proposed to use metric data structures to index spatial data and support the SNN in querying for the k-nearest neighbours. When dealing with spatial data, the time complexity in the average case of the SNN, using a metric data structure in primary storage (kd-Tree), is improved to at most O(n × log n). Furthermore, using a strategy to reuse the k-nearest neighbours between consecutive runs, it is possible to obtain a time complexity in the worst case of O(n). The experimental results were done using the kd-Tree and the DF-Tree, which work in primary and secondary storage, respectively.}, keywords = {df-tree,kd-tree,sharednearestneighbour,snn,spatialdata}, month = {November}, year = {2012}, }
Publication's urls
/publications/view.php?code=4ad9d35f8453cb17d6455049d54ac481
/publications/view.php?code=bfaustino

Centre for Artificial Intelligence of UNL
Departamento de Informática, FCT/UNL
Quinta da Torre 2829-516 CAPARICA - Portugal
Tel. (+351) 21 294 8536 FAX (+351) 21 294 8541

Fundacao_FCT