SPADES
Spatio-textual Data Exploration at Scale


SPADES is a research project funded by the Hellenic Foundation for Research and Innovation (HFRI) and the General Secretariat for Research and Technology (GSRT), under grant agreement No [1667]. The funding instrument aims to support post-doctoral research, and the principal investigator of SPADES is Akrivi Vlachou. The project is hosted at the Department of Digital Systems of the University of Piraeus.

SPADES won the Best Research Paper Award at SSTD'21 with the paper: A Novel Indexing Method for Spatial-Keyword Range Queries

Description of Work


Several research challenges are associated with the efficient support of spatio-textual processing at scale. Two factors are critical for location-based services and determine their overall performance: the efficiency of query processing for spatio-textual queries and the quality of the retrieved points of interest. The efficiency of query processing directly influences query throughput, which is very important in the context of scalable applications.

Consequently, this project proposal identifies a set of research and technological challenges that need to be effectively addressed, in order to support spatio-textual and spatio-temporal queries over massive data:

1. Support of advanced spatio-textual query types

A spatio-textual query returns objects that satisfy a spatial constraint and whose textual descriptions best match the query keywords. The simplest form of spatial constraint requires the location of the retrieved object to lie within a given region of the current map, similar to a range query. Nevertheless, other spatial constraints, such as the nearest points of interest or those that are better than any other point of interest closer to the user location, also make sense for spatial-keyword search and are not supported to date. Even though a wide variety of spatial queries has been studied for spatial data, only a limited set of basic spatial constraints has been studied with respect to spatio-textual search. To alleviate this shortcoming, SPADES will define new advanced query types that allow the formulation of useful queries with complex constraints. It is also important to support queries that rank points of interest based on the textual descriptions of other interesting facilities in their spatial neighborhood.
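As an illustration of the simplest case described above, the following minimal sketch evaluates a spatial-keyword range query by a linear scan: it keeps the objects inside a rectangle and ranks them by keyword overlap. The `POI` class, the `range_keyword_query` function, the sample data, and the choice of Jaccard similarity as the textual relevance measure are all assumptions made for this example and are not taken from the SPADES publications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class POI:
    """Hypothetical point of interest with a location and a keyword set."""
    name: str
    x: float
    y: float
    keywords: frozenset


def jaccard(a, b):
    """Jaccard similarity between two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def range_keyword_query(pois, rect, query_keywords, k):
    """Return the names of the k POIs inside rect that best match the keywords."""
    x_min, y_min, x_max, y_max = rect
    q = frozenset(query_keywords)
    # Spatial constraint: keep only objects located inside the rectangle.
    candidates = [
        (jaccard(p.keywords, q), p)
        for p in pois
        if x_min <= p.x <= x_max and y_min <= p.y <= y_max
    ]
    # Textual constraint: drop objects that share no keyword with the query.
    matches = [(score, p) for score, p in candidates if score > 0]
    # Best textual match first; ties are broken by name for determinism.
    matches.sort(key=lambda sp: (-sp[0], sp[1].name))
    return [p.name for _, p in matches[:k]]


pois = [
    POI("Taverna A", 1.0, 1.0, frozenset({"greek", "seafood"})),
    POI("Cafe B", 2.0, 2.0, frozenset({"coffee", "brunch"})),
    POI("Taverna C", 9.0, 9.0, frozenset({"greek", "seafood", "view"})),
    POI("Grill D", 1.5, 2.5, frozenset({"greek", "grill"})),
]
result = range_keyword_query(pois, (0, 0, 5, 5), {"greek", "seafood"}, k=2)
print(result)
```

A linear scan like this is only a reference point; the index structures discussed in the next challenge exist precisely to avoid examining every object.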

2. Novel distributed index structures for spatio-textual search

Efficient multi-dimensional access methods (such as R-trees) for spatial queries have been well researched in spatio-temporal databases. Similarly, index structures for storing textual information (documents) have been successfully developed by the IR community. Most of these index structures have been developed independently. However, supporting efficient spatial-keyword search requires combining the merits of both worlds, and this has not yet been adequately studied for massive datasets that exceed the computational capacity of a single node. Distributed indexing is therefore necessary, combining global with local indexes suitable for spatio-textual data. This is also tightly related to issues such as data partitioning and load balancing, which also need to be investigated for spatio-textual data under the respective query workloads.
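The idea of combining a global index with local indexes can be sketched in a few lines. In the example below, a uniform grid plays the role of the global index that routes a query to the relevant partitions, and each partition holds a small inverted index from keywords to object locations. The class names `GlobalGridIndex` and `LocalIndex`, the uniform grid, and the single-process setting are assumptions chosen for brevity; they are not the index structures proposed by SPADES.

```python
from collections import defaultdict


class LocalIndex:
    """Per-partition inverted index: keyword -> list of (x, y, object id)."""

    def __init__(self):
        self.postings = defaultdict(list)

    def insert(self, obj_id, x, y, keywords):
        for kw in keywords:
            self.postings[kw].append((x, y, obj_id))

    def search(self, rect, keyword):
        x_min, y_min, x_max, y_max = rect
        return {
            oid
            for x, y, oid in self.postings.get(keyword, [])
            if x_min <= x <= x_max and y_min <= y <= y_max
        }


class GlobalGridIndex:
    """Global index: maps uniform grid cells to partitions (one LocalIndex each)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.partitions = defaultdict(LocalIndex)

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, obj_id, x, y, keywords):
        self.partitions[self._cell(x, y)].insert(obj_id, x, y, keywords)

    def cells_for(self, rect):
        """Non-empty cells overlapping the query rectangle (partition pruning)."""
        x_min, y_min, x_max, y_max = rect
        cx0, cy0 = self._cell(x_min, y_min)
        cx1, cy1 = self._cell(x_max, y_max)
        return [
            (cx, cy)
            for cx in range(cx0, cx1 + 1)
            for cy in range(cy0, cy1 + 1)
            if (cx, cy) in self.partitions
        ]

    def search(self, rect, keyword):
        result = set()
        # Only partitions selected by the global index are visited.
        for cell in self.cells_for(rect):
            result |= self.partitions[cell].search(rect, keyword)
        return result


index = GlobalGridIndex(cell_size=10.0)
index.insert("p1", 1.0, 1.0, {"hotel", "pool"})
index.insert("p2", 12.0, 3.0, {"hotel"})
index.insert("p3", 25.0, 25.0, {"hotel"})
index.insert("p4", 4.0, 6.0, {"museum"})
print(index.cells_for((0, 0, 15, 9)))
print(index.search((0, 0, 15, 9), "hotel"))
```

In a distributed deployment each partition would reside on a different node, so the skew in how many objects and queries fall into each cell is what makes partitioning and load balancing a research question of its own.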

3. Abstractions for parallel spatio-textual data processing

To provide a generic and portable framework for parallel processing of spatio-textual data, SPADES will design a framework consisting of generic operators (e.g., filter, scan, index, distribute) that work on spatio-textual data. In this way, SPADES will put in place an abstraction for defining parallel processing algorithms, which can be customized for a specific parallel processing engine by providing the necessary implementation of the operators. Our intention is to decouple the algorithm specification from the underlying parallel processing engine, a methodology that is expected to offer added value to our research. To demonstrate its feasibility, SPADES will provide implementations of all operators for a specific data-parallel processing engine.
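One way to picture this decoupling is an abstract operator interface that an algorithm is written against, with one concrete implementation per engine. The sketch below is purely illustrative: the `Engine` interface, its method signatures, the in-memory `LocalEngine`, and the `keyword_counts_per_cell` algorithm are hypothetical names invented for this example, and only three of the operators mentioned above (scan, filter, distribute) are shown.

```python
from abc import ABC, abstractmethod


class Engine(ABC):
    """Hypothetical operator interface; each processing engine implements it."""

    @abstractmethod
    def scan(self, source):
        """Read all records from a data source."""

    @abstractmethod
    def filter(self, data, predicate):
        """Keep only the records that satisfy the predicate."""

    @abstractmethod
    def distribute(self, data, key_fn):
        """Group records by partition key."""


class LocalEngine(Engine):
    """Single-process stand-in, used only to make the sketch runnable."""

    def scan(self, source):
        return list(source)

    def filter(self, data, predicate):
        return [r for r in data if predicate(r)]

    def distribute(self, data, key_fn):
        groups = {}
        for r in data:
            groups.setdefault(key_fn(r), []).append(r)
        return groups


def keyword_counts_per_cell(engine, source, keyword, cell_size):
    """Algorithm expressed only through operators, independent of the engine."""
    records = engine.scan(source)
    matching = engine.filter(records, lambda r: keyword in r["keywords"])
    groups = engine.distribute(
        matching,
        lambda r: (int(r["x"] // cell_size), int(r["y"] // cell_size)),
    )
    return {cell: len(rs) for cell, rs in groups.items()}


data = [
    {"x": 1.0, "y": 2.0, "keywords": {"beach", "bar"}},
    {"x": 3.0, "y": 4.0, "keywords": {"beach"}},
    {"x": 15.0, "y": 2.0, "keywords": {"beach", "hotel"}},
    {"x": 16.0, "y": 3.0, "keywords": {"museum"}},
]
counts = keyword_counts_per_cell(LocalEngine(), data, "beach", 10.0)
print(counts)
```

Porting the algorithm to a data-parallel engine would then mean supplying another `Engine` subclass, while `keyword_counts_per_cell` itself stays unchanged.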

Objectives


  • Novel rank-aware query types complying with the paradigm of spatial-keyword search that cover a wide variety of information needs, targeting the mobile user.
     
  • Provision of more expressive querying mechanisms for points of interest that combine spatial information and textual relevance with temporal information and user preferences.
     
  • Advanced distributed indexing structures capable of supporting complex spatial-keyword queries effectively, by harnessing the merits of spatial data structures and text indexes.
     
  • Novel partitioning mechanisms and load balancing techniques for spatio-textual queries, aiming at efficient parallel data processing.
     
  • Efficient query processing algorithms that drastically prune the search space, capitalizing on the available indexes, and enabling ranked retrieval of results.
     

People


Akrivi Vlachou

Principal Investigator, Associate Professor

Christos Doulkeridis

Scientific Host, Assistant Professor

Christos Kalyvas

PhD Candidate

Dimitris Poulopoulos

PhD Candidate

Kjetil Nørvåg

Professor

Alexandros Fakis

Researcher

Spyros Kasdaglis

Researcher

Konstantinos Platis

Researcher

Antonis Psarros

Researcher

Panagiotis Tampakis

Researcher

Aikaterini Ntzelepi

Researcher


Results

The results of the project are expected to be exploited by the local tourism business, which could benefit from providing innovative location-based services over vast quantities of data to tourists. Such services entail queries that are processing-intensive, typically run for minutes rather than seconds, and often produce results that are not truly useful to the end user. SPADES promises to facilitate access to location-based information, a task of particular importance to tourists. In this respect, the research results of SPADES are also expected to benefit society at large.

Impact


SPADES aims to address the limitations of spatio-textual data analysis and processing when applied in the context of Big Spatial Data, as witnessed by the lack of existing systems and techniques for this purpose. Achieving this goal constitutes a substantial step forward in dealing with the challenges emerging from the management of Big Spatial Data. At a practical level, the research outcome will benefit applications such as spatio-textual search and retrieval, mining of spatio-textual data, next-generation location-based services, and tourism-oriented applications, to name a few.

By exploiting SPADES, the analysis of massive spatio-textual datasets (typically encountered in the aforementioned domains and especially in social networks) will be accelerated significantly. As a consequence, applications will be able to query and analyze larger quantities of spatio-textual data in less time, thus speeding up the process of making new discoveries and aiding the interpretation of heterogeneous data (spatial or multidimensional data and unstructured textual data).

Publications


Journals

  1. Christos Doulkeridis, Akrivi Vlachou, Nikos Pelekis, Yannis Theodoridis: A Survey on Big Data Processing Frameworks for Mobility Analytics. In SIGMOD Record, 2021 (to appear)
  2. Panagiotis Nikitopoulos, Georgios A. Sfyris, Akrivi Vlachou, Christos Doulkeridis, Orestis Telelis: Pruning Techniques for Parallel Processing of Reverse Top‑k Queries. In Distributed and Parallel Databases Journal, Springer, 39(1): 169-199, 2021

Conferences & Workshops

  1. Nikolaos Koutroumanis, Nikolaos Kousathanas, Christos Doulkeridis, Akrivi Vlachou. A Demonstration of NoDA: Unified Access to NoSQL Stores. In Proceedings of the 47th International Conference on Very Large Data Bases (VLDB'21), Copenhagen, Denmark - August 16-20, 2021.
  2. Panagiotis Tampakis, Dimitris Spyrellis, Christos Doulkeridis, Nikos Pelekis, Christos Kalyvas and Akrivi Vlachou. A Novel Indexing Method for Spatial-Keyword Range Queries. In Proceedings of the 17th International Symposium on Spatial and Temporal Databases (SSTD'21), August 2021.
  3. Andreas Tritsarolis, Christos Doulkeridis, Nikos Pelekis, Yannis Theodoridis. ST_VISIONS: A Python Library for Interactive Visualization of Spatio-temporal Data. In Proceedings of 22nd International Conference on Mobile Data Management (MDM'21) - demo track, Toronto, Canada (Virtual), June 15-18, 2021
  4. Akrivi Vlachou, Christos Doulkeridis, Nikolaos Koutroumanis, Dimitrios Poulopoulos, Kjetil Norvag. The SPADES Framework for Scalable Management of Spatio-textual Data. In Proceedings of 24th Pan-Hellenic Conference on Informatics (PCI'20), Athens, Greece, November 2020
  5. Stella Maropaki, Sean Chester, Christos Doulkeridis, Kjetil Norvag: Diversifying Top-k Point-of-Interest Queries via Collective Social Reach. In Proceedings of 29th ACM Conference on Information and Knowledge Management (CIKM'20), October 19-23, 2020.
  6. Nikolaos Koutroumanis, Panagiotis Nikitopoulos, Akrivi Vlachou, Christos Doulkeridis: NoDA: Unified NoSQL Data Access Operators for Mobility Data. In Proceedings of the 16th International Symposium on Spatial and Temporal Databases (SSTD’19), Vienna, Austria, August 2019.
  7. Georgios M. Santipantakis, Apostolos Glenis, Christos Doulkeridis, Akrivi Vlachou, George A. Vouros: stLD: Towards a Spatio-temporal Link Discovery Framework. In Proceedings of the International Workshop on Semantic Big Data (SBD'19) (workshop held in conjunction with SIGMOD'19), Amsterdam, The Netherlands, July 2019, pp.4:1-4:6
  8. Panagiotis Nikitopoulos, Georgios A. Sfyris, Akrivi Vlachou, Christos Doulkeridis, Orestis Telelis: Parallel and Distributed Processing of Reverse Top-k Queries. In Proceedings of the 35th IEEE International Conference on Data Engineering (ICDE'19), Macau SAR, China, April 2019, pp.1586-1589
  9. Akrivi Vlachou, Christos Doulkeridis, Apostolos Glenis, Georgios M. Santipantakis, George A. Vouros: Efficient Spatio-temporal RDF Query Processing in Large Dynamic Knowledge Bases. In Proceedings of the 34th ACM/SIGAPP Symposium On Applied Computing (SAC'19), Limassol, Cyprus, April 2019, pp.439-447

Technical Reports

  1. Andreas Tritsarolis, Christos Doulkeridis, Nikos Pelekis, Yannis Theodoridis. ST_VISIONS: A Python Library for Interactive Visualization of Spatio-temporal Data. Technical Report (2020).
  2. Akrivi Vlachou, Christos Doulkeridis, Nikolaos Koutroumanis, Dimitrios Poulopoulos, Kjetil Norvag. The SPADES Framework for Scalable Management of Spatio-textual Data. Technical Report (2020).

Contact Us


For more information please contact Akrivi Vlachou.
For more information about the research group and the department, please visit the respective home pages: Department of Digital Systems