Structural Analysis of Biomedical Ontologies Center (SABOC)

The Structural Analysis of Biomedical Ontologies Center, located in the Department of Computer Science at the New Jersey Institute of Technology (NJIT), is devoted to research exploring structural issues in biomedical ontologies (e.g., SNOMED CT, NCIt, NDF-RT, and ChEBI). Our research interests include ontology quality assurance methodologies, ontology summarization techniques, ontology change analysis, and analysis of families of ontologies.

SABOC is currently funded by the National Cancer Institute of the National Institutes of Health under Award Number R01 CA190779: A family-based framework of quality assurance for biomedical ontologies (P.I. Yehoshua Perl).

Current Project at a Glance: A family-based framework of quality assurance for biomedical ontologies

We are currently developing a family-based Quality Assurance (QA) framework for biomedical ontologies. Ontology QA is critical for increasing the use of ontologies in interdisciplinary research and in electronic health records (EHRs). We are developing computational techniques for identifying concepts with a high probability of errors, in order to improve the efficiency and effectiveness of ontology QA. Biomedical ontologies are large, complex knowledge representation systems that enable the integration of knowledge from different fields. The largest, best-known ontology repository is the National Center for Biomedical Ontology (NCBO) BioPortal, containing more than 580 biomedical ontologies. However, errors have been discovered in BioPortal's ontologies, and QA in BioPortal has mostly focused on use cases and ad hoc techniques. Our computational techniques will automatically identify sets of concepts with a high likelihood of errors to empower ontology QA.


The general process of creating an abstraction network from an ontology. We have developed different types of abstraction networks that capture different aspects of an ontology's structure. These abstraction networks are applicable to families of structurally similar ontologies. (a) Represents a subhierarchy of concepts (classes) from an ontology. (b) Represents the abstraction network summarizing (a).

In past research, we designed QA techniques for individual ontologies and showed that sets of complex and uncommonly classified concepts have significantly higher percentages of errors. The theoretical basis for our QA techniques is a type of ontology summary called an Abstraction Network (AbN). Using AbNs, we identified error-prone concepts. In this project, we are performing QA for entire families of structurally similar ontologies. We have identified several important families based on structural properties. If a classification of concepts yields higher-than-usual error rates in several ontologies of a family F, then we hypothesize that the same classification will yield higher-than-usual error rates in most ontologies of F. Our primary test beds are cancer-related ontologies with different properties and purposes, e.g., the National Cancer Institute Thesaurus (NCIt). Several non-cancer ontologies are also being analyzed.
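As a rough illustration of the family hypothesis, the Python sketch below flags classifications whose error rate exceeds a per-ontology baseline in several ontologies of a family. It is a simplification under stated assumptions, not the project's actual statistical procedure: the ontology names, classification names, audit counts, and the `min_ontologies` threshold are all hypothetical, and a real study would apply a significance test rather than a plain comparison of rates.

```python
def error_rate(errors, audited):
    """Fraction of audited concepts found to be erroneous."""
    return errors / audited if audited else 0.0


def error_prone_classifications(audits, baseline, min_ontologies=2):
    """Return classifications whose error rate exceeds the baseline
    in at least `min_ontologies` ontologies of the family.

    audits:   {ontology: {classification: (errors, audited)}}
    baseline: {ontology: (errors, audited)} for control concepts
    """
    counts = {}
    for ontology, by_classification in audits.items():
        base = error_rate(*baseline[ontology])
        for classification, (errors, audited) in by_classification.items():
            if error_rate(errors, audited) > base:
                counts[classification] = counts.get(classification, 0) + 1
    return sorted(c for c, n in counts.items() if n >= min_ontologies)


# Made-up audit counts for a hypothetical family of three ontologies.
AUDITS = {
    "Ontology A": {"classification X": (12, 50), "classification Y": (4, 50)},
    "Ontology B": {"classification X": (9, 40), "classification Y": (3, 40)},
    "Ontology C": {"classification X": (2, 30), "classification Y": (5, 30)},
}
BASELINE = {
    "Ontology A": (5, 50),
    "Ontology B": (4, 40),
    "Ontology C": (3, 30),
}

flagged = error_prone_classifications(AUDITS, BASELINE)
print(flagged)
```

In this toy data, "classification X" exceeds the baseline in two of the three ontologies, so under the hypothesis it would be a candidate for focused QA in the remaining ontologies of the family.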

The summarization of NCIt concepts according to their defining semantic relationships. (a) A subhierarchy of NCIt concepts. (b) A summary of the concepts from (a) according to their types of semantic relationships (an "area taxonomy"). (c) A summary identifying the subhierarchies of concepts that are modeled with the same types of semantic relationships (a "partial-area taxonomy").
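The grouping described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the OAF implementation: the concepts, IS-A links, and relationship type names (`has_site`, `has_cell_type`, `has_agent`) are hypothetical toy data rather than real NCIt content. It assumes that an area is the set of concepts modeled with exactly the same set of relationship types, and that a partial-area consists of a root concept (one with no parent in the same area) together with its descendants inside that area.

```python
from collections import defaultdict

# Hypothetical toy ontology: concept -> set of parent concepts (IS-A links).
PARENTS = {
    "Disease": set(),
    "Neoplasm": {"Disease"},
    "Carcinoma": {"Neoplasm"},
    "Sarcoma": {"Neoplasm"},
    "Infection": {"Disease"},
    "Viral Infection": {"Infection"},
}

# Hypothetical relationship types used to model each concept.
REL_TYPES = {
    "Disease": frozenset(),
    "Neoplasm": frozenset({"has_site"}),
    "Carcinoma": frozenset({"has_site", "has_cell_type"}),
    "Sarcoma": frozenset({"has_site", "has_cell_type"}),
    "Infection": frozenset({"has_agent"}),
    "Viral Infection": frozenset({"has_agent"}),
}


def derive_areas(rel_types):
    """Group concepts by their exact set of relationship types."""
    areas = defaultdict(set)
    for concept, types in rel_types.items():
        areas[types].add(concept)
    return dict(areas)


def derive_partial_areas(parents, rel_types):
    """Map each area root to the concepts it subsumes within its area."""
    children = defaultdict(set)
    for concept, concept_parents in parents.items():
        for parent in concept_parents:
            children[parent].add(concept)

    partial_areas = {}
    for members in derive_areas(rel_types).values():
        # A root has no parent inside its own area.
        roots = [c for c in members if not (parents[c] & members)]
        for root in roots:
            seen, stack = {root}, [root]
            while stack:  # walk down IS-A links, staying inside the area
                node = stack.pop()
                for child in children[node]:
                    if child in members and child not in seen:
                        seen.add(child)
                        stack.append(child)
            partial_areas[root] = seen
    return partial_areas


areas = derive_areas(REL_TYPES)
partial_areas = derive_partial_areas(PARENTS, REL_TYPES)
```

In this toy example the area for `{has_site, has_cell_type}` contains two concepts but splits into two partial-areas (rooted at "Carcinoma" and "Sarcoma"), because neither concept is a descendant of the other inside that area.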

Project Objectives

  • Identify families of ontologies in the NCBO BioPortal, based on the structure of the ontologies
  • Design a unified methodology for deriving abstraction networks for families of ontologies
  • Build a software tool, the Ontology Abstraction Framework (OAF), to create abstraction networks for the ontologies in each family
  • Investigate classifications that can indicate erroneous concepts in a family of ontologies
  • Perform evaluation of our QA methodologies and usability studies for OAF
  • Develop additional abstraction-network-based techniques and tools to support ontology development

Software

As part of our research, we are developing several software tools to enable the derivation, visualization, and exploration of summaries of ontologies. The software system supporting our abstraction-network-based studies is named the Ontology Abstraction Framework (OAF), and it is available as free and open source software. To download the Ontology Abstraction Framework, see our Software page.

Selected Publications

Below is a selected list of the most relevant publications for the family-based QA project. For a complete list of publications associated with this project, see our Publications page.

Ochs, C., He, Z., Zheng, L., Geller, J., Perl, Y., Hripcsak, G., & Musen, M. A. (2016).
Utilizing a structural meta-ontology for family-based quality assurance of the BioPortal ontologies.
Journal of Biomedical Informatics, 61, 63-76.

Halper, M., Perl, Y., Ochs, C., & Zheng, L. (2017).
Taxonomy-Based Approaches to Quality Assurance of Ontologies.
Journal of Healthcare Engineering. Accepted for publication.

Ochs, C., Geller, J., Perl, Y., & Musen, M. A. (2016).
A unified software framework for deriving, visualizing, and exploring abstraction networks for ontologies.
Journal of Biomedical Informatics, 62, 90-105.

Halper, M., Gu, H., Perl, Y., & Ochs, C. (2015).
Abstraction networks for terminologies: supporting management of “big knowledge”.
Artificial Intelligence in Medicine, 64(1), 1-16.

Ochs, C., Perl, Y., Geller, J., Haendel, M., Brush, M., Arabandi, S., & Tu, S. (2015).
Summarizing and visualizing structural changes during the evolution of biomedical ontologies using a Diff Abstraction Network.
Journal of Biomedical Informatics, 56, 127-144.

Ochs, C., Perl, Y., Geller, J., Arabandi, S., Tudorache, T., & Musen, M. A. (2017).
An Empirical Analysis of Ontology Reuse in BioPortal.
Journal of Biomedical Informatics, 71, 165-177.

Contact Us

Please send any questions to Dr. James Geller (geller@njit.edu).

The SABOC Team

Dr. Yehoshua Perl, FACMI, and Dr. James Geller, FACMI, have directed SABOC for over 20 years. The SABOC team consists of researchers, PhD students, external collaborators, undergraduate student developers, and alumni.

Research

SABOC's research focuses on ontology quality assurance (QA). We utilize ontology summaries called abstraction networks to support ontology QA, ontology change analysis, and ontology development, among other use cases.

Software

We have developed several software tools for browsing and visualizing ontologies. We are currently developing the Ontology Abstraction Framework (OAF), an open source software tool for creating and visualizing summaries of ontologies.

Ⓒ 2017 - 2023
Structural Analysis of Biomedical Ontologies Center (SABOC)
Department of Computer Science, Ying Wu College of Computing
New Jersey Institute of Technology
University Heights, Newark, New Jersey 07102