| Nanopublication | Part | Subject | Predicate | Object | Published By | Published On |
|---|---|---|---|---|---|---|
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | | abstract | There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders—representing academia, industry, funding agencies, and scholarly publishers—have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community. | DOI bot | 2026-03-05T10:45:18.000Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | | abstract | Scientific data analyses often combine several computational tools in automated pipelines, or workflows. Thousands of such workflows have been used in the life sciences, though their composition has remained a cumbersome manual process due to a lack of standards for annotation, assembly, and implementation. Recent technological advances have returned the long-standing vision of automated workflow composition into focus. This article summarizes a recent Lorentz Center workshop dedicated to automated composition of workflows in the life sciences. We survey previous initiatives to automate the composition process, and discuss the current state of the art and future perspectives. We start by drawing the “big picture” of the scientific workflow development life cycle, before surveying and discussing current methods, technologies and practices for semantic domain modelling, automation in workflow development, and workflow assessment. Finally, we derive a roadmap of individual and community-based actions to work toward the vision of automated workflow development in the forthcoming years. A central outcome of the workshop is a general description of the workflow life cycle in six stages: 1) scientific question or hypothesis, 2) conceptual workflow, 3) abstract workflow, 4) concrete workflow, 5) production workflow, and 6) scientific results. The transitions between stages are facilitated by diverse tools and methods, usually incorporating domain knowledge in some form. Formal semantic domain modelling is hard and often a bottleneck for the application of semantic technologies. However, life science communities have made considerable progress here in recent years and are continuously improving, renewing interest in the application of semantic technologies for workflow exploration, composition and instantiation. Combined with systematic benchmarking with reference data and large-scale deployment of production-stage workflows, such technologies enable a more systematic process of workflow development than we know today. We believe that this can lead to more robust, reusable, and sustainable workflows in the future. | DOI bot | 2026-03-05T10:40:38.000Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | | abstract | Linking the biomedical literature to other data resources is notoriously difficult and requires text mining. Text mining aims to automatically extract facts from literature. Since authors write in natural language, text mining is a great natural language processing challenge, which is far from being solved. We propose an alternative: If authors and editors summarize the main facts in a controlled natural language, text mining will become easier and more powerful. To demonstrate this approach, we use the language Attempto Controlled English (ACE). We define a simple model to capture the main aspects of protein interactions. To evaluate our approach, we collected a dataset of 459 paragraph headings about protein interaction from literature. 56% of these headings can be represented exactly in ACE and another 23% partially. These results indicate that our approach is feasible. | DOI bot | 2026-02-25T09:39:45.000Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | | abstract | In this paper, we present Coral, an interface in which complex corpus queries can be expressed in a controlled subset of natural English. With the help of a predictive editor, users can compose queries and submit them to the Coral system, which then automatically translates them into formal AQL statements. We give an overview of the controlled natural language developed for Coral and describe the functionalities of the predictive editor provided for it. We also report on a user experiment in which the system was evaluated. The results show that, with Coral, corpora of annotated texts can be queried more easily and more quickly than with the existing ANNIS interface. Our system demonstrates that complex corpora can be accessed without the need to learn a complicated, formal query language. | DOI bot | 2026-02-25T09:27:16.000Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | | abstract | In response to the increasing volume of research data being generated, more and more data portals have been designed to facilitate data findability and accessibility. However, a significant portion of this data remains confidential or restricted due to its sensitive nature, such as patient data or census microdata. While maintaining confidentiality prohibits its public release, the emergence of portals supporting rich metadata can help enable researchers to at least discover the existence of restricted access data, empowering them to assess the suitability of the data before requesting access. Existing standards, such as CSV on the Web and RDF Data Cube, have been adopted to facilitate data management, integration, and re-use of data on the Web. However, the current landscape still lacks adequate standards not only to effectively describe restricted access data while preserving confidentiality but also to facilitate its discovery. In this work, we investigate the relationship between the structural, statistical, and semantic elements of restricted access tabular data, and we explore how such relationship can be formally modeled in a way that is Findable, Accessible, Interoperable, and Reusable. We introduce the DataSet-Variable Ontology (DSV), which, by combining the CSV on the Web and RDF Data Cube standards, leveraging semantic technologies and Linked Data principles, and introducing variable-level metadata, aims to capture high-quality metadata to support the management and re-use of restricted access data on the Web. As evaluation, we conducted a case study where we applied DSV to four different datasets from different statistical governmental agencies. We employed a set of competency questions to assess the ontology's ability to support knowledge discovery and data exploration. By describing high-quality metadata at both the dataset and variable levels while maintaining data privacy, this novel ontology facilitates data interoperability, discovery, and re-use, and empowers researchers to manage, integrate, and analyze complex restricted access data sources. | DOI bot | 2026-02-22T19:45:15.000Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | | abstract | Beginning in 1995, early Internet pioneers proposed Digital Objects as encapsulations of data and metadata made accessible through persistent identifier resolution services. In recent years, this Digital Object Architecture has been extended to include the FAIR Guiding Principles, resulting in the concept of a FAIR Digital Object (FDO), a minimal, uniform container making any digital resource machine-actionable. Beginning in 2009, nanopublications were independently conceived as a minimal, uniform container making individual semantic assertions and their associated provenance metadata machine-actionable. These two technologies share the same vision of a data infrastructure, and act as instances of Machine-Actionable Containers (MACs) that make use of minimal uniform standards to enable FAIR operations. Here, we compare the structure and computational behaviors of the existing nanopublication infrastructure to those in the proposed FAIR Digital Object Framework. Although developed independently, there are clear parallels between the vision and the approach of nanopublications and FDOF. We find a remarkable congruence between the currently proposed FDO requirements and the existing nanopublication infrastructure, including several FDO-like qualities already embodied in the nanopublication ecosystem. | DOI bot | 2026-02-22T19:38:10.000Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | | abstract | Climate change, vaccination, abortion, Trump: Many topics are surrounded by fierce controversies. The nature of such heated debates and their elements have been studied extensively in the social science literature. More recently, various computational approaches to controversy analysis have appeared, using new data sources such as Wikipedia, which now help us better understand these phenomena. However, compared to what social sciences have discovered about such debates, the existing computational approaches mostly focus on just a few of the many important aspects around the concept of controversies. In order to link the two strands, we provide and evaluate here a controversy model that is both rooted in the findings of the social science literature and strongly linked to computational methods. We show how this model can lead to computational controversy analytics that have full coverage over all the crucial aspects that make up a controversy. | DOI bot | 2026-02-22T18:57:18.000Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | | abstract | Scientific communication still mainly relies on natural language written in scientific papers, which makes the described knowledge very difficult to access with automatic means. We can therefore only make limited use of formal knowledge organization methods to support researchers and other interested parties with features such as automatic aggregations, fact checking, consistency checking, question answering, and powerful semantic search. Existing approaches to solve this problem by improving the scientific communication methods have either very restricted coverage, require formal logic skills on the side of the researchers, or depend on unreliable machine learning for the formalization of knowledge. Here, I propose an approach to this problem that is general, intuitive, and flexible. It is based on a unique kind of controlled natural language, called AIDA, consisting of English sentences that are atomic, independent, declarative, and absolute. Such sentences can then serve as nodes in a network of scientific claims linked to publications, researchers, and domain elements. I present here some small studies on preliminary applications of this language. The results indicate that it is well accepted by users and provides a good basis for the creation of a knowledge graph of scientific findings. | DOI bot | 2026-02-22T18:44:42.000Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | | abstract | The number of scientific articles has grown rapidly over the years and there are no signs that this growth will slow down in the near future. Because of this, it becomes increasingly difficult to keep up with the latest developments in a scientific field. To address this problem, we present here an approach to help researchers learn about the latest developments and findings by extracting in a normalized form core claims from scientific articles. This normalized representation is a controlled natural language of English sentences called AIDA, which has been proposed in previous work as a method to formally structure and organize scientific findings and discourse. We show how such AIDA sentences can be automatically extracted by detecting the core claim of an article, checking for AIDA compliance, and – if necessary – transforming it into a compliant sentence. While our algorithm is still far from perfect, our results indicate that the different steps are feasible and they support the claim that AIDA sentences might be a promising approach to improve scientific communication in the future. | DOI bot | 2026-02-22T18:34:21.000Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | | abstract | Scientific publishing seems to be at a turning point. Its paradigm has stayed basically the same for 300 years but is now challenged by the increasing volume of articles that makes it very hard for scientists to stay up to date in their respective fields. In fact, many have pointed out serious flaws of current scientific publishing practices, including the lack of accuracy and efficiency of the reviewing process. To address some of these problems, we apply here the general principles of the Web and the Semantic Web to scientific publishing, focusing on the reviewing process. We want to determine if a fine-grained model of the scientific publishing workflow can help us make the reviewing processes better organized and more accurate, by ensuring that review comments are created with formal links and semantics from the start. Our contributions include a novel model called Linkflows that allows for such detailed and semantically rich representations of reviews and the reviewing processes. We evaluate our approach on a manually curated dataset from several recent Computer Science journals and conferences that come with open peer reviews. We gathered ground-truth data by contacting the original reviewers and asking them to categorize their own review comments according to our model. Comparing this ground truth to answers provided by model experts, peers, and automated techniques confirms that our approach of formally capturing the reviewers' intentions from the start prevents substantial discrepancies compared to when this information is later extracted from the plain-text comments. In general, our analysis shows that our model is well understood and easy to apply, and it revealed the semantic properties of such review comments. | DOI bot | 2026-02-22T17:05:54.000Z |
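
Every row above uses the `http://www.nanopub.org/nschema#hasAssertion` predicate, which points from a nanopublication to the named graph holding its actual claim. The following is a minimal sketch, in plain Python, of that named-graph layout; all URIs and the abstract text here are hypothetical placeholders, not identifiers taken from the table.

```python
# Sketch of a nanopublication as a mapping of named-graph URI -> triples.
# The head graph links the nanopublication to its assertion graph via
# nschema#hasAssertion; the assertion graph carries the claim itself
# (here, an article's abstract, mirroring the rows in the table above).

HAS_ASSERTION = "http://www.nanopub.org/nschema#hasAssertion"
ABSTRACT = "http://purl.org/dc/terms/abstract"  # assumed predicate for "abstract"

def make_nanopub(np_uri: str, article_uri: str, abstract_text: str) -> dict:
    """Return a nanopublication as {graph URI: [(subject, predicate, object)]}."""
    assertion_uri = np_uri + "#assertion"
    return {
        # Head graph: declares where the assertion lives.
        np_uri + "#Head": [(np_uri, HAS_ASSERTION, assertion_uri)],
        # Assertion graph: the statement being published.
        assertion_uri: [(article_uri, ABSTRACT, abstract_text)],
    }

nanopub = make_nanopub("http://example.org/np1",
                       "http://example.org/article1",
                       "Example abstract text.")
for graph_uri, triples in nanopub.items():
    for s, p, o in triples:
        print(graph_uri, "|", s, p, o)
```

In the real nanopublication model the head graph also points to provenance and publication-info graphs (which would carry the "Published By" and "Published On" columns); this sketch shows only the assertion link that the table's rows display.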