DataScientia Services
collecting pervasive data on human behavior
Engage with participants
Collect data using iLog
Monitor project progress
Prepare ready-to-share data
Distribute privacy-conscious data

Abstract

Big Data is increasingly recognized for its substantial potential to enhance research on human behavior. Data from diverse sources, such as social media platforms, telecommunications companies, and mapping services like OpenStreetMap (OSM), provide a wealth of information. However, merely having this data is insufficient to fully grasp individuals’ socio-economic and personal contexts. While Big Data is effective for in-depth studies of specific aspects of human behavior, it often falls short of capturing the overall complexity of daily life.

For example, social media data supports sophisticated analyses of interactions within a platform but does not adequately reflect the intricacies of real-world activities. Conversely, traditional social science methods (questionnaires, interviews, and participant observation) capture the nuances of daily routines and individual perspectives, but they operate at a much smaller scale than Big Data.

In this context, the DataScientia (DS) initiative provides tools for gathering real-time behavioral data, combining insights from both individual interactions and sensor streams from personal devices, particularly smartphones. This approach is beneficial for researchers across various fields, from social sciences to computer science. The impact of this data is further amplified by integrating it with other secondary data, such as merging GPS data from smartphones with location descriptions and Points of Interest obtained from OSM.
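
As an illustration of the kind of integration mentioned above, the following minimal sketch labels each smartphone GPS fix with the nearest Point of Interest within a distance threshold. It is not the DS implementation: the function names, the record fields, the 100 m threshold, and the sample coordinates are all assumptions made for this example, and the POIs are assumed to have already been exported from OSM as plain records.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def label_fixes(fixes, pois, max_distance_m=100.0):
    """Attach the nearest POI within max_distance_m to each GPS fix (hypothetical helper)."""
    labelled = []
    for fix in fixes:
        best, best_d = None, None
        for poi in pois:
            d = haversine_m(fix["lat"], fix["lon"], poi["lat"], poi["lon"])
            if d <= max_distance_m and (best_d is None or d < best_d):
                best, best_d = poi, d
        labelled.append({
            **fix,
            "poi_name": best["name"] if best else None,
            "poi_category": best["category"] if best else None,
            "poi_distance_m": round(best_d, 1) if best else None,
        })
    return labelled


# Made-up sample values, for illustration only.
pois = [
    {"name": "Library", "category": "amenity=library", "lat": 46.0660, "lon": 11.1500},
    {"name": "Cafe", "category": "amenity=cafe", "lat": 46.0700, "lon": 11.1200},
]
fixes = [
    {"timestamp": "2024-01-01T09:00:00Z", "lat": 46.0661, "lon": 11.1501},  # near the library
    {"timestamp": "2024-01-01T12:00:00Z", "lat": 46.2000, "lon": 11.3000},  # far from both POIs
]

for row in label_fixes(fixes, pois):
    print(row["timestamp"], row["poi_name"], row["poi_distance_m"])
```

The brute-force loop keeps the sketch dependency-free. For realistic volumes of GPS fixes and POIs, a spatial index would likely be preferable.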

However, collecting this type of data presents several challenges, including technological issues and privacy concerns, particularly regarding GDPR compliance. Researchers must also navigate potential errors due to participant interactions, such as response bias and participant burden, as well as prepare the data for distribution, addressing issues like data loss and lack of anonymization.

To help address these challenges, DS is equipped with several essential tools:

a community component to engage participants and communicate research and privacy concerns
a configurable smartphone application for data collection
a dashboard for real-time monitoring of data collection
a pipeline for cleaning the data and ensuring GDPR compliance
a catalog for distributing data while respecting privacy and copyright
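
To give a flavor of what a privacy-oriented preparation step might involve, the sketch below drops direct identifiers, replaces the participant ID with a keyed hash, and coarsens GPS coordinates. It is a hypothetical example, not the DS pipeline: the field names, the key handling, and the two-decimal rounding are assumptions. Pseudonymized data is still personal data under the GDPR, so steps like these reduce risk but do not by themselves make a dataset compliant.

```python
import hashlib
import hmac

# Fields treated as direct identifiers in this example (an assumption).
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymize(participant_id: str, secret_key: bytes) -> str:
    """Keyed hash of the participant ID: stable for one key, not reversible without it."""
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def prepare_record(record: dict, secret_key: bytes, decimals: int = 2) -> dict:
    """Return a copy of the record that is safer to share (hypothetical helper)."""
    out = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "participant_id"
    }
    out["pid"] = pseudonymize(record["participant_id"], secret_key)
    # Two decimal places of latitude correspond to roughly 1 km.
    for coord in ("lat", "lon"):
        if coord in out:
            out[coord] = round(out[coord], decimals)
    return out


# Made-up record and placeholder key, for illustration only.
raw = {
    "participant_id": "p-001",
    "email": "someone@example.org",
    "lat": 46.0661,
    "lon": 11.1501,
    "steps": 42,
}
print(prepare_record(raw, secret_key=b"replace-with-a-real-secret"))
```

Using a keyed hash rather than a plain hash means that someone who knows a participant's original ID cannot recompute the pseudonym without the key.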

This tutorial proposal aims to introduce interested researchers to the services offered by DS through an interactive and hands-on approach. Participants will have the opportunity to experiment with the various services, replicating the user journey of a participant in a data collection process to better understand the impact and potential of these technologies.

Tutorial Schedule

| Time | Title | Activity | Presenter |
|------|-------|----------|-----------|
| TBD | Introduction | | Matteo |
| TBD | The DataScientia Community | Community registration and project joining: participants will explore the DS Community features, such as profile creation, project management, and participant recruitment. Those interested can register in the Community and, from there, access the data collection project designed for the tutorial. | Ali |
| TBD | The data collection system and services | Download the iLog app and collect data: participants will be invited to download the GDPR-compliant iLog app, start collecting their smartphone data, and interact with the app to explore its capabilities. Monitor data collection (demo): during the collection, participants will explore the dashboard, which lets researchers monitor the collected data in real time. | Leonardo |
| TBD | The data preparation pipeline | Data transformation (demo): participants will learn how the pipeline transforms and prepares the data and consolidates it from a privacy point of view. | Andrea |
| TBD | The LivePeople Catalog | Download data*: participants can download the collected data onto their PC. Upload and data documentation*: once the data has been obtained, participants will be invited to upload the descriptions to their personal Catalog provided by DS, thus exploring ways of describing metadata and distributing data that are designed to be ethically and legally sustainable. | Andrea |
| TBD | Wrap-up and conclusion | | Matteo |

* The proposed activity will be carried out if the number of participants allows it. Alternatively, the LivePeople Catalog and the ways to access it and request data will be shown.

Presenters

Matteo Busso

University of Trento – DISI matteo.busso@unitn.it

Andrea Bontempelli

University of Trento – DISI andrea.bontempelli@unitn.it

Ali Hamza

University of Trento – DISI ali.hamza@unitn.it

Leonardo Javier Malcotti

University of Trento – DISI leonardo.malcotti@unitn.it