VSTO


Project Description

The goal of this project is to develop a prototype Virtual Solar-Terrestrial Observatory (VSTO) at the National Center for Atmospheric Research (NCAR). The work is based within the High Altitude Observatory (HAO) and carried out in collaboration with the Scientific Computing Division (SCD).

This project addresses key science areas such as solar-terrestrial physics datasets and the highly interdisciplinary Center for Integrated Space-Weather Modeling (CISM) model intercomparison, providing a framework for collaboration and a basis for building and distributing advanced data assimilation tools for the solar-terrestrial physics community. It also directly addresses key needs in Cyberinfrastructure (CI) such as software tools and services, interdisciplinary data integration, representation, metadata, documentation, quality control and user community building.

The term Cyberinfrastructure has been given to the set of reliable, well-specified and interoperable connections of electronic hardware and software that allow people to discover, learn, teach, collaborate, disseminate, access and preserve knowledge in their domain. Solar and solar-terrestrial physics rely on a balance of observational data, theoretical models and analysis/interpretation to make effective progress. For some years the concept of the Virtual Observatory (much like the emergence of digital libraries in the 1990s) has attracted attention in a number of specific scientific disciplines, notably and successfully in nighttime astronomy after the decadal study for Astronomy and Astrophysics made the Virtual Observatory a priority.

Because many of the data collections are growing in volume and complexity, making them a research resource that is easy to find, access, compare and utilize remains a significant challenge of high merit to discipline researchers.

The plan for VSTO is to address the next logical and intellectual challenge: that of an interdisciplinary virtual observatory which requires research in computer science areas such as knowledge representation and ontology development.

Another strategic focus for the community and NCAR is data assimilation set within the world of distributed models and data. Addressing this need requires a coordinated, strongly metadata-enabled approach to data collections, one that does not exist today, although key cyberinfrastructure elements are being assembled. An underlying premise is that meeting the requirements for making interdisciplinary data available within the VSTO will enable a set of data assimilation tools to be built into it.

Definition

The prototype Virtual Solar-Terrestrial Observatory (VSTO) is a distributed, scalable education and research environment for searching, integrating, and analyzing observational, experimental and model databases in fields of solar, solar-terrestrial and space physics (hereafter referred to as SSTSP).

VSTO comprises a framework that provides virtual access to specific SSTSP data, model, tool and material archives containing items from a variety of space- and ground-based instruments and experiments, as well as individual and community modeling and software efforts bridging research and educational use. The prototype will be a fully functional system directly addressing the immediate and substantial needs within the SSTSP community, allowing science projects to advance more rapidly. For example, in solar coronal physics there is a need to cohesively assemble multiwavelength images of the dynamic solar upper atmosphere.

Space weather model intercomparisons and Assimilative Mapping of Ionospheric Electrodynamics results need to be distributed to their communities. Solar activity indicators and solar total and spectral irradiance data and models need to be made available to terrestrial atmospheric researchers for use in their studies.

We plan to further specify and document these substantial needs early in the project in conjunction with community input/meetings and a steering committee comprising data providers, users, agency representatives and domain experts in computer science.

What problem does VSTO address?

In discussions with data providers and users, the needs are clear: "Fast access to 'portable' data, in a way that works with the tools they have; information must be easy to access, retrieve and work with."

Too often users (and data providers) have to deal with the organizational structure of the data sets, which varies significantly: data may be stored at one site in a small number of large files, while similar data may be stored at another site in a large number of relatively small files.

There is an equally large problem with the range of metadata descriptions for the data. Users and providers are still frustrated with using data and with knowing or specifying its heritage. Users often want only subsets of the data and struggle to retrieve them efficiently. One user summarizes it as: "(Please) solve the interface problem." VSTO addresses this specific problem.
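As a rough illustration of the "interface problem", the sketch below shows one request returning the same subset whether a site keeps its data in one large file or in many small ones. It is not VSTO's interface: the `Record` and `Archive` layouts, the file names and the `subset` function are all hypothetical, chosen only to make the idea concrete.

```python
# Hypothetical in-memory stand-ins for two sites holding similar data.
Record = tuple[float, float]        # (time, value); assumed layout
Archive = dict[str, list[Record]]   # file name -> records stored in that file


def subset(archive: Archive, start: float, stop: float) -> list[Record]:
    """Return records with start <= time < stop, sorted by time.

    The caller never sees how the records are split across files.
    """
    selected = [
        record
        for records in archive.values()
        for record in records
        if start <= record[0] < stop
    ]
    return sorted(selected)


# Site A: a small number of large files (here, one).
site_a: Archive = {
    "all.dat": [(0.0, 1.0), (1.0, 1.5), (2.0, 2.5), (3.0, 3.5)],
}

# Site B: the same records spread over many small files.
site_b: Archive = {
    "t0.dat": [(0.0, 1.0)],
    "t1.dat": [(1.0, 1.5)],
    "t2.dat": [(2.0, 2.5)],
    "t3.dat": [(3.0, 3.5)],
}

# The same request gives the same answer for both organizations.
print(subset(site_a, 1.0, 3.0))
print(subset(site_b, 1.0, 3.0))
```

A real system would read from remote archives and use metadata to locate the relevant files; the point here is only that the subset request itself does not mention files.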

Because an increasing number of discipline-specific virtual observatories are either operating or under development, we propose to scope this interdisciplinary virtual observatory as follows: the research fields of solar, solar-terrestrial and space physics are already substantially interdisciplinary; the demand for an interoperable information exchange infrastructure is increasing; the practical needs of data providers and data consumers are clear; and this project provides an opportunity to address a fundamental computer science question of knowledge representation across disciplines.

It is our expectation that progress in building a VSTO will shed light on how virtual observatory principles can be applied to a variety of other scientific disciplines and, in fact, bridge disciplines.

It is also our premise that this prototype will have a substantially positive impact on the SSTSP community, because a significant amount of scaffolding already exists and because the next steps and tools to address the interdisciplinary problem are now clear and available (and are presented within this proposal).

The specific tasks proposed are divided into two areas: cyberinfrastructure development, and the demonstration of this infrastructure in the current science data and modeling environment of HAO.

Examples of use - what a user would see

A student browsing the educational materials within the VSTO may find a composite picture of the outer solar atmosphere built up from full disk, near limb prominence data, and coronameter images for a typical representation of a coronal mass ejection (CME). That student could then browse the current day's images, assemble a similarly constructed composite image and note that a visually much larger CME has just occurred. Further, the analysis model that measures CME speed is listed as a tool, and the student is able to connect the relevant data sets to that model and estimate the speed. All of this is performed without the student needing to know the details of the dataset or how to run the model, but descriptions of the images and the model accompany everything the student sees.

As a second example:

A researcher has just joined the CISM project, contributing a new model of magnetospheric response to solar wind forcing during a time when the interplanetary magnetic field (IMF) is southward pointing. The researcher is able to register the model outputs with VSTO, filling in some forms to describe the meaning of the data variables, units, etc., and is very quickly able to intercompare with at least two other models in the holdings. The comparison is provided graphically as well as in RMS variances from the other models. The researcher is also able to read the other model output directly into a local copy of IDL and make more detailed comparisons to determine the underlying physics differences that affect the model results. All of this is performed without the need for details of the dataset or how it is organized.
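The numerical part of such an intercomparison can be sketched as a root-mean-square difference between the new model and each model already in the holdings. This is a minimal sketch under stated assumptions: all models are taken to report the same variable on the same grid, and the function names, model names and values are hypothetical rather than part of VSTO or CISM.

```python
import math
from typing import Sequence


def rms_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-mean-square difference between two equally sampled series."""
    if len(a) != len(b):
        raise ValueError("series must be sampled on the same grid")
    if len(a) == 0:
        raise ValueError("series must not be empty")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def intercompare(
    new_model: Sequence[float],
    holdings: dict[str, Sequence[float]],
) -> dict[str, float]:
    """RMS difference of the new model against each model in the holdings."""
    return {
        name: rms_difference(new_model, series)
        for name, series in holdings.items()
    }


# Hypothetical values of one output variable on a shared time grid.
new_model = [1.0, 2.0, 3.0, 4.0]
holdings = {
    "model_a": [1.0, 2.0, 3.0, 4.0],
    "model_b": [2.0, 3.0, 4.0, 5.0],
}

for name, value in intercompare(new_model, holdings).items():
    print(f"{name}: RMS difference = {value:.3f}")
```

In practice the models would first need to be mapped onto a common grid and common units, which is where the registered variable descriptions would come in.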

Examples of use - what a data provider would do

A new synoptic stream of images of the solar atmosphere has just become available, one that substantially adds to a space weather forecaster's ability to predict significant events. The data provider is able to register the image stream with VSTO, filling in some forms based on a template from a similar/earlier dataset to document the meaning and use of the data quantities, units, derivation, etc. Shortly afterwards, at a space weather forecast center whose forecast system interfaces to VSTO, this image stream appears in a list of resources of interest in event forecasting. Most importantly, neither the data provider nor the ultimate data consumer need be troubled with the data formats or application interfaces, or with whether the use at either end is correct.
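The registration step, filling in a form that starts from a template of an earlier dataset, might look like the following sketch. It is an assumption-laden illustration, not VSTO's registration schema: the class names, field names, `from_template`, `missing_fields` and all example values are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VariableDescription:
    """One data quantity as a provider might describe it (hypothetical fields)."""
    name: str
    meaning: str
    units: str
    derivation: str = ""


@dataclass(frozen=True)
class DatasetRegistration:
    """A registration form for one dataset (hypothetical fields)."""
    title: str
    provider: str
    variables: tuple[VariableDescription, ...] = ()


def from_template(template: DatasetRegistration, **changes) -> DatasetRegistration:
    """Start a new form from an earlier one, overriding only what differs."""
    return replace(template, **changes)


def missing_fields(form: DatasetRegistration) -> list[str]:
    """List the entries a provider still has to fill in."""
    problems = []
    if not form.title:
        problems.append("title")
    if not form.provider:
        problems.append("provider")
    if not form.variables:
        problems.append("variables")
    for variable in form.variables:
        if not variable.units:
            problems.append(f"units for {variable.name}")
    return problems


# Hypothetical earlier dataset used as the template.
earlier = DatasetRegistration(
    title="Earlier image series",
    provider="Example Observatory",
    variables=(
        VariableDescription("intensity", "image brightness", "arbitrary units"),
    ),
)

# The new stream reuses the variable descriptions and changes only the title.
new_stream = from_template(earlier, title="New synoptic image stream")
print(new_stream.title, missing_fields(new_stream))
```

Starting from a template keeps the descriptions of similar datasets consistent, which is what would let a downstream system recognize the new stream as relevant without the provider knowing who will consume it.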
