An Ontology of Segregation

From Segregation Wiki
Revision as of 18:15, 12 October 2024 by Nettoworks (talk | contribs)

Network representation of the segregation ontology: the inner ring contains the 32 types, and the outer ring contains the 804 identified segregation forms (SFs). Colors represent the types. Lines connect each type to every SF associated with it. The colored dots in the SF ring show the types associated with each SF. Two randomly selected SFs and their corresponding types are highlighted. Navigate the zoomable high-resolution figure for more detail, or explore the interactive ontology, which is currently under development.

Hundreds of forms of segregation have been identified and mapped across a diverse scientific literature encompassing 169 disciplinary fields (see Netto et al., 2024), revealing the extraordinary connectivity between these forms. Given the complexity of this mosaic, how can we make it more comprehensible and valuable to the multidisciplinary community of researchers studying segregation and its many dimensions? Our approach has been to identify segregation forms and their relationships across over a century of literature. This search for a systemic understanding of segregation in its multiple manifestations is akin to an ontology.

The term “ontology” originates from philosophy, where it refers to the study of existence. For example, Aristotle’s ontology defines primitive categories like substance and quality, used to account for existing entities. In the early 1980s, Artificial Intelligence (AI) researchers adopted the term in computer and information science to describe both a theory of a modeled world and a component of knowledge systems. An ontology defines a set of representational primitives, such as classes (or sets), attributes (or properties), and relationships, to model a domain of knowledge. This includes information about their meaning and the constraints on their logical application, similar to relational models for representing individuals, their attributes, and their relationships.

We propose an inductive approach to ontology creation. Additionally, we define the nature of the relationships and typological positions that various forms of segregation occupy within this conceptual space, based on the following definitions:

  • Segregation form refers to a specific act, practice, or process of separating or restricting interaction between individuals or social groups based on distinguishing characteristics, such as race, income, religion, or other social attributes. These forms are manifested through observable social, spatial, material, or economic patterns. Segregation forms are context-dependent, reflecting how segregation forces manifest within a particular environment, time, or population, shaping and being shaped by the surrounding societal, cultural, and economic conditions.
  • Segregation type is a broader conceptual category that encompasses multiple related forms of segregation. Types represent a generalization of shared underlying structures, processes or properties that may manifest through distinct but related forms. For example, residential segregation might be considered a type that encompasses various forms, such as income-based or ethnic-based residential segregation. The defining feature of a type is its ability to group specific forms based on common socio-economic, spatial, or institutional mechanisms, allowing for general patterns of segregation to be identified across various contexts.

Segregation forms can intersect and belong to multiple types. For instance, 'metropolitan Hispanic segregation' cuts across ethnic, geographic, urban and spatial segregation. The ontological method therefore avoids a strictly hierarchical structure in which relationships are exclusively vertical, as in biological taxonomies, and opts instead for a richer relational approach. The method should also identify typological relationships between segregation forms and types from the bottom up, meaning that such relationships emerge from information produced or latent in the literature.

We employed a natural language processing (NLP) approach to group and rank SFs based on their semantic similarity using hierarchical clustering. Methodological procedures can be seen in Netto et al. (2024).
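As a rough illustration of hierarchical clustering over SF labels, the sketch below merges labels bottom-up using average linkage. It is not the procedure from Netto et al. (2024): token overlap (Jaccard distance) stands in for the semantic similarity measure, and the sample labels, the stopword set, and the function names are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical sample of SF labels, chosen for illustration only.
FORMS = [
    "ethnic residential segregation",
    "economic residential segregation",
    "racial school segregation",
    "gender school segregation",
]

# "segregation" appears in every label, so it carries no signal here.
STOPWORDS = {"segregation"}


def tokens(label):
    """Return the informative words of a label as a set."""
    return frozenset(w for w in label.lower().split() if w not in STOPWORDS)


def jaccard_distance(a, b):
    """1 minus shared tokens over all tokens (a stand-in for semantic distance)."""
    ta, tb = tokens(a), tokens(b)
    return 1.0 - len(ta & tb) / len(ta | tb)


def average_linkage(c1, c2):
    """Mean pairwise distance between the members of two clusters."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


def agglomerate(labels, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[label] for label in labels]
    merges = []  # (left cluster, right cluster, distance) in merge order
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((clusters[i], clusters[j], average_linkage(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges


clusters, merges = agglomerate(FORMS, n_clusters=2)
for cluster in clusters:
    print(cluster)
```

The merge history in `merges` plays the role of a dendrogram: cutting it at different levels yields coarser or finer groupings, which is what makes hierarchical clustering suited to ranking as well as grouping.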

The complexity of semantic clustering for SFs lies in the fact that some SFs can theoretically belong to multiple clusters. For instance, ethnic residential segregation could cluster with both economic residential segregation (as both address residential segregation) and ethnic school segregation (as both involve ethnicity). Meanwhile, economic residential segregation and ethnic school segregation do not share a semantic commonality. This overlap in thematic relationships made it difficult to rely solely on traditional clustering metrics such as the silhouette score or the Davies–Bouldin index, which fail to account for such intersections. Manual evaluation was therefore necessary to assess the coherence and interpretability of the clusters. Multiple authors with expert knowledge qualitatively associated SFs with relevant labels, following predefined criteria developed in the coding phase of this research (see the codebook in SI) to reduce subjectivity and ensure consistent assessment. Clusters were assessed on their ability to group SFs that shared similar meanings or contexts, and were assigned labels accordingly. We identified 32 labels that sufficiently represent these clusters of common features, and we treat them as segregation types (STs).
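The overlap problem can be made concrete with the three SFs named above. The sketch below enumerates every way of splitting them into disjoint clusters and counts, for each split, the related pairs it separates plus the unrelated pairs it joins. The `violations` count and the keyword sets are illustrative assumptions, not a metric or coding used in the study.

```python
from itertools import combinations

# Keyword sets for the three SFs discussed in the text (hand-coded for illustration).
FORMS = {
    "ethnic residential segregation": {"ethnic", "residential"},
    "economic residential segregation": {"economic", "residential"},
    "ethnic school segregation": {"ethnic", "school"},
}


def related(x, y):
    """Two SFs are treated as related when they share at least one keyword."""
    return bool(FORMS[x] & FORMS[y])


names = list(FORMS)
a, b, c = names

# Every partition of three items into disjoint clusters.
PARTITIONS = [
    [{a, b, c}],
    [{a, b}, {c}],
    [{a, c}, {b}],
    [{b, c}, {a}],
    [{a}, {b}, {c}],
]


def violations(partition):
    """Count related pairs that are split plus unrelated pairs that are merged."""
    cluster_of = {name: i for i, cluster in enumerate(partition) for name in cluster}
    count = 0
    for x, y in combinations(names, 2):
        together = cluster_of[x] == cluster_of[y]
        if related(x, y) != together:
            count += 1
    return count


for partition in PARTITIONS:
    print(violations(partition), [sorted(cluster) for cluster in partition])
```

No partition reaches zero violations, because ethnic residential segregation is related to both of the other SFs while those two are unrelated to each other. Any scheme that forces each SF into exactly one cluster must therefore misrepresent at least one relationship, which is consistent with the need for overlapping type assignments.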

After clustering, we finalized the ontology. Each SF could belong to one or more types, up to a maximum of eight, which only one SF reached. Each ST is associated with a cluster and acts as a node in a local network of directly related SFs. Because SFs can belong to multiple clusters, SFs and their types form a network of relationships, culminating in an integrated ontology comprising 32 distinct segregation types. This method allows us to identify key groupings and map the overall relational structure of SFs. We produced a network graph (above) that uses color and size to distinguish between SFs and types, making it easier to interpret how different forms of segregation correspond to specific ontological categories. This visualization highlights the differences in complexity among SFs, offering a clear view of their categorization and relationships. A full exploration of the segregation ontology requires an interactive graph.
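The relational structure described here can be sketched as a bipartite network linking SFs to types. The example below is a minimal illustration under stated assumptions: the three entries are a hypothetical excerpt, the type assignments are illustrative (the first reads the four categories named earlier for 'metropolitan Hispanic segregation' as types), and the function names are invented for this sketch.

```python
from collections import defaultdict

# Hypothetical excerpt of SF -> types assignments (illustrative, not the real data).
FORM_TYPES = {
    "metropolitan Hispanic segregation": {"ethnic", "geographic", "urban", "spatial"},
    "ethnic residential segregation": {"ethnic", "residential"},
    "economic residential segregation": {"economic", "residential"},
}


def build_network(form_types):
    """Return the SF-type edge list and the inverse index from type to SFs."""
    edges = sorted((form, t) for form, types in form_types.items() for t in types)
    type_to_forms = defaultdict(set)
    for form, t in edges:
        type_to_forms[t].add(form)
    return edges, dict(type_to_forms)


def type_counts(form_types):
    """Number of types per SF, a simple proxy for how complex an SF is."""
    return {form: len(types) for form, types in form_types.items()}


edges, type_to_forms = build_network(FORM_TYPES)
counts = type_counts(FORM_TYPES)

print(len(edges), "edges")
print(sorted(type_to_forms["ethnic"]))
print(max(counts, key=counts.get), counts[max(counts, key=counts.get)])
```

Storing the ontology as an edge list rather than a tree keeps multi-type membership explicit: each type's local network is a lookup in `type_to_forms`, and SFs that share a type are connected through it without any parent-child ordering.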


Reference

Netto, V.M., Krenz, K., Fiszon, M., Peres, O., & Rosalino, D. (2024). Decoding segregation: Navigating a century of segregation research across disciplines and introducing a bottom-up ontology. ArXiv. https://doi.org/[DOI]