LiveLanguage Kinship Study Featured in Frontiers
- Posted by Tetiana Bihun
- Categories News
- Date January 14, 2025
On November 20, 2023, the journal Frontiers in Psychology published an article titled “Lexical Diversity in Kinship Across Languages and Dialects” by members of DataScientia who specialize in creating high-quality, diversity-aware lexical databases for natural language processing and machine translation. The article’s authors are Hadi Khalilia, Gábor Bella, Fausto Giunchiglia, Abed Alhakim Freihat, and Shandy Darma.
The research is part of DataScientia’s LiveLanguage initiative, which prioritises the role of human contributors from various communities, including linguists, AI engineers, and educators, in managing diversity-aware data. Using this approach, DataScientia members expanded an open dataset integrated into our Universal Knowledge Core with 223 kin terms and 1,619 lexical gaps across seven Arabic dialects and three Indonesian languages.
The paper was published as part of the research topic The Adaptive Value of Languages: Non-Linguistic Causes of Language Diversity. The topic's goal was to explore how natural environments or human-made influences might shape features of language structure. Given the complexity of the task, researchers from various fields, including linguistics, psychology, biology, sociology, and anthropology, were invited to contribute.
The paper focuses on identifying language-specific concepts and lexical gaps across languages and dialects. The authors propose a method for capturing linguistic diversity through systematic data collection in narrowly defined domains rather than across a general topic.
This method is designed to represent concepts specific to different languages and dialects while also recording gaps in their vocabularies. It has already been tested in two case studies on kinship terms: one covering seven Arabic dialects and one covering three Indonesian languages.
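To make the idea of a lexical gap concrete, here is a minimal Python sketch of how a concept, its words, and its gaps might be recorded. It is only an illustration: the `Concept` class, the `gap_counts` helper, and the three sample entries are hypothetical and are not taken from the LiveLanguage dataset or the paper. The sample entries rely on well-known textbook contrasts (English has no single word for "father's brother", "mother's brother", or "older sibling").

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """A language-independent kinship concept (hypothetical schema)."""
    gloss: str
    # Maps a language code to its word for the concept.
    # None marks a lexical gap: the language has no single word for it.
    words: Dict[str, Optional[str]] = field(default_factory=dict)

    def gaps(self) -> List[str]:
        """Return the languages that lack a word for this concept."""
        return sorted(lang for lang, word in self.words.items() if word is None)


def gap_counts(concepts: List[Concept]) -> Dict[str, int]:
    """Count how many of the given concepts are gaps in each language."""
    counts = Counter(lang for c in concepts for lang in c.gaps())
    return dict(counts)


# Illustrative entries only, not data from the published resource.
concepts = [
    Concept("father's brother", {"arb": "عم", "eng": None}),
    Concept("mother's brother", {"arb": "خال", "eng": None}),
    Concept("older sibling", {"ind": "kakak", "eng": None}),
]

if __name__ == "__main__":
    for concept in concepts:
        print(concept.gloss, "-> gaps in:", concept.gaps())
    print(gap_counts(concepts))
```

Storing a gap explicitly as `None`, rather than simply omitting the language, separates "this language has no word for the concept" from "nobody has checked yet". That distinction is what lets gaps be counted alongside words.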
The paper’s abstract follows:
Languages are known to describe the world in diverse ways. Across lexicons, diversity is pervasive, appearing through phenomena such as lexical gaps and untranslatability. However, in computational resources, such as multilingual lexical databases, diversity is hardly ever represented. In this paper, we introduce a method to enrich computational lexicons with content relating to linguistic diversity.
The method is verified through two large-scale case studies on kinship terminology, a domain known to be diverse across languages and cultures: one case study deals with seven Arabic dialects, while the other one with three Indonesian languages. Our results, made available as browseable and downloadable computational resources, extend prior linguistics research on kinship terminology, and provide insight into the extent of diversity even within linguistically and culturally close communities.
Read the full paper on Frontiers in Psychology.
Download the PDF of the article.
Keywords: linguistic diversity, lexical gaps, untranslatability, computational lexicons, multilingual lexical databases, kinship terminology, Arabic dialects, semantic diversity, cross-cultural analysis, terminology databases.
