Metadata Accelerator: Improving scientific data descriptions with Natural Language Processing methods (NLP) and Instant Feedback

Maria Juliana Rodriguez Cubillos; Andrew J. Millar; Ian Simpson; Jason Swedlow; Tomasz Zieliński

doi:10.2218/eor.2024.9660

Authors

Maria Juliana Rodriguez Cubillos, University of Edinburgh
Andrew J. Millar, University of Edinburgh
Ian Simpson, University of Edinburgh
Jason Swedlow, University of Edinburgh
Tomasz Zieliński, University of Edinburgh

Abstract

Promoting data availability and accessibility is a foundational principle of FAIR data guidance. However, knowledge dissemination also depends on better metadata, which highlights the vital role of documenting research studies.

Aim: Develop an AI metadata enrichment tool that focuses on named entities in unstructured text. My strategic goal is to use text mining, machine learning, and NLP models such as GPT and BERT to offer feedback on free-text descriptions, improving metadata quality and dataset reusability.
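
To make the idea of entity-based feedback concrete, here is a minimal sketch in Python. It is not the tool described in the abstract: the entity categories, the vocabularies, and the function names `find_entities` and `feedback` are all hypothetical, and a simple dictionary lookup stands in for the trained NER models (such as BERT-based ones) that the project aims to use. It only illustrates the general pattern of detecting which kinds of entities a free-text description mentions and suggesting the ones that appear to be missing.

```python
import re

# Hypothetical entity categories and vocabularies, invented for illustration.
# A real tool would rely on a trained NER model rather than fixed word lists.
VOCAB = {
    "organism": {"arabidopsis thaliana", "homo sapiens", "mus musculus"},
    "technique": {"rna-seq", "microscopy", "qpcr"},
    "tissue": {"leaf", "root", "liver"},
}


def find_entities(text):
    """Return {category: sorted list of vocabulary terms found in text}."""
    lowered = text.lower()
    found = {}
    for category, terms in VOCAB.items():
        # Match whole terms only, so "leaf" does not match inside "leaflet".
        found[category] = sorted(
            t for t in terms
            if re.search(r"(?<!\w)" + re.escape(t) + r"(?!\w)", lowered)
        )
    return found


def feedback(text):
    """Return one suggestion per category with no recognised entity."""
    return [
        f"Consider naming the {category} used in the study."
        for category, hits in find_entities(text).items()
        if not hits
    ]


description = "RNA-seq time series of leaf samples grown under short days."
print(find_entities(description))
print(feedback(description))
```

For this example description, the sketch recognises a technique and a tissue but no organism, so the only suggestion returned is to name the organism. Swapping the dictionary lookup for a model-based recogniser would leave the feedback step unchanged.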

