Context-Aware Metadata Enrichment in Enterprise Master Data Management: A Natural Language Processing Approach for EBX Repositories
Abstract
Organizations that rely on enterprise master data platforms often face persistent limitations in metadata quality, particularly in semantic clarity, contextual relevance, and cross-domain interpretability. This study examines the use of natural language processing for context-aware metadata enrichment within EBX repositories, addressing the challenge of transforming fragmented descriptive fields into structured, meaningful knowledge assets. Its purpose is to design and evaluate a systematic enrichment approach that interprets textual attributes, infers relationships, and improves metadata usability for governance, integration, and analytics. A mixed-methods design was applied, combining architectural modeling, a controlled prototype implementation, and a qualitative assessment of stewardship workflows in simulated enterprise scenarios. The observed outcomes show measurable improvements in classification consistency, metadata coverage, and retrieval efficiency, along with reduced dependence on manual interpretation. The proposed framework introduces a scalable enrichment pipeline that integrates linguistic analysis, semantic mapping, and governance-driven validation within the operational lifecycle of EBX master data. The study argues that embedding language-aware intelligence into metadata management practices can significantly strengthen data reliability and transparency. The findings provide a foundation for future research on semantic infrastructure in enterprise data ecosystems and offer practical guidance for organizations seeking to modernize metadata governance in complex master data environments.
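The three pipeline stages named in the abstract (linguistic analysis, semantic mapping, and governance-driven validation) can be illustrated with a minimal sketch. It is not the authors' implementation and does not use any EBX API. The vocabulary, the `Enrichment` record, the function names, the keyword-overlap score, and the 0.2 review threshold are all hypothetical stand-ins, chosen only to show how a free-text field description might move through the stages and how a simple coverage ratio could be computed.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical controlled vocabulary: business concept -> trigger keywords.
VOCABULARY = {
    "Customer": {"customer", "client", "buyer"},
    "Supplier": {"supplier", "vendor"},
    "Product": {"product", "item", "sku"},
}


@dataclass
class Enrichment:
    field_name: str
    tokens: List[str]
    concept: Optional[str]
    score: float
    status: str


def analyze(description: str) -> List[str]:
    """Stage 1, linguistic analysis (stand-in): lowercase word tokens."""
    return re.findall(r"[a-z]+", description.lower())


def map_concept(tokens: List[str]) -> Tuple[Optional[str], float]:
    """Stage 2, semantic mapping (stand-in): concept with the highest
    share of tokens that match its keywords."""
    best, best_score = None, 0.0
    for concept, keywords in VOCABULARY.items():
        hits = sum(1 for token in tokens if token in keywords)
        score = hits / len(tokens) if tokens else 0.0
        if score > best_score:
            best, best_score = concept, score
    return best, best_score


def validate(concept: Optional[str], score: float, threshold: float = 0.2) -> str:
    """Stage 3, governance validation (stand-in): route weak matches
    to a data steward instead of accepting them automatically."""
    if concept is None:
        return "unmapped"
    return "auto_approved" if score >= threshold else "needs_steward_review"


def enrich(field_name: str, description: str) -> Enrichment:
    tokens = analyze(description)
    concept, score = map_concept(tokens)
    return Enrichment(field_name, tokens, concept, score, validate(concept, score))


def coverage(results: List[Enrichment]) -> float:
    """Share of fields that received any concept annotation."""
    if not results:
        return 0.0
    return sum(1 for r in results if r.concept is not None) / len(results)


if __name__ == "__main__":
    samples = [
        ("cust_desc", "Primary client or buyer of the vendor"),
        ("sku_note", "Internal code used by the warehouse team for each item"),
        ("misc", "Free text notes"),
    ]
    results = [enrich(name, text) for name, text in samples]
    for r in results:
        print(r.field_name, r.concept, round(r.score, 3), r.status)
    print("coverage:", round(coverage(results), 3))
```

In a real deployment the tokenizer would presumably be replaced by a full NLP toolkit and the keyword table by a governed business glossary or ontology; the sketch only shows the separation between inference and steward-controlled acceptance.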
Copyright (c) 2026 International Journal of Sustainable Development in Computing Science

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
A Double-Blind Peer Reviewed Journal