CBI-DNR maintains a central repository for all data collected by TCOON and other CBI environmental networks. The varied applications of CBI's environmental data sets impose diverse data management requirements. For example, computing the tidal datums used to determine property boundaries requires detailed analysis of long-term data sets in accordance with NOS standards, along with sufficient audit capability to defend the accuracy of the datums in legal contexts. Recreational and lay users want easy-to-understand presentations of data (e.g., graphics or summaries), while scientists need access to the raw data in a form that can be easily imported into models or research projects. Applications such as marine navigation and weather forecasting need near-real-time access to data sets and automated quality-control systems.
Since 1991, the Conrad Blucher Institute for Surveying and Science has treated data management as a "mission critical" component of its observation networks. The Division of Nearshore Research at CBI (CBI-DNR) is principally responsible for data management at CBI. CBI-DNR recognizes that the success of its observation network efforts depends on the quality of the end products. Because many of CBI-DNR's products are used to determine property boundaries and support engineering designs, these products may need to be defended in a court of law. Therefore, CBI-DNR maintains detailed records and audit trails for every step used in the creation of its data products. Electronic data management and highly automated systems have been the keys that allow CBI-DNR to achieve these results within limited budgets. The data management strategy used by CBI-DNR can be summarized by these design principles.
Rigorous adherence to these design principles has produced a system that is robust, stable, and flexible enough to accommodate a wide variety of observational-data needs and changes in requirements. In its nine-year history, CBI-DNR's data acquisition and reporting system has been able to adapt quickly and cleanly to changes in sensor packages, hardware environments, operating systems, database management systems, and communications environments. The present data management system runs on a 1 GHz Pentium-based personal computer using the Linux operating system and open-source software packages such as Perl, Apache, and MySQL. The overall architecture of the data management system can be divided into loosely integrated subsystems.
Each business morning, one or more CBI-DNR personnel perform additional quality control by visually inspecting recently received data in the online database. This is facilitated by a Web-based interface that automatically graphs the data collected over the previous fourteen days for each station in the network. An analyst who detects a potential problem in the network can use the same interface to enter a message into the online system indicating that a problem or anomaly needs to be investigated or corrected. These quality-control messages are distributed daily to field operations staff and management, who arrange for the necessary repairs and the recovery of missing data. Operators can also suppress distribution of erroneous data that the automated quality-control systems did not detect.
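The daily inspection workflow described above can be illustrated with a minimal Python sketch. It is not CBI-DNR's implementation (the text mentions Perl and MySQL but gives no schema). Every name here (`Observation`, `QCMessage`, `InspectionLog`, `inspection_window`, `suppress`, `distributable`) is hypothetical and chosen only for illustration; the one detail taken from the text is the fourteen-day window.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

INSPECTION_WINDOW = timedelta(days=14)  # "previous fourteen days" from the text


@dataclass
class Observation:  # hypothetical record layout
    station: str
    time: datetime
    value: float
    suppressed: bool = False  # True once an operator withholds it


@dataclass
class QCMessage:  # hypothetical quality-control message
    station: str
    entered: datetime
    text: str


@dataclass
class InspectionLog:
    messages: list = field(default_factory=list)

    def flag(self, station, text, now):
        """Record that a problem or anomaly needs investigation."""
        self.messages.append(QCMessage(station, now, text))

    def daily_digest(self, day):
        """Messages entered on the given date, for field staff and management."""
        return [m for m in self.messages if m.entered.date() == day]


def inspection_window(observations, station, now):
    """Observations for one station from the last fourteen days, oldest first."""
    start = now - INSPECTION_WINDOW
    recent = [o for o in observations
              if o.station == station and start <= o.time <= now]
    return sorted(recent, key=lambda o: o.time)


def suppress(observations, station, start, end):
    """Mark a station's observations in [start, end] as withheld; return the count."""
    count = 0
    for o in observations:
        if o.station == station and start <= o.time <= end and not o.suppressed:
            o.suppressed = True
            count += 1
    return count


def distributable(observations):
    """Observations that have not been suppressed by an operator."""
    return [o for o in observations if not o.suppressed]
```

In this sketch, suppression only sets a flag and never deletes the record. That is an assumption, chosen because it would keep the withheld data available for the audit trail.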
CBI-DNR's extensive use of automation has resulted in a cost-effective, reliable, and flexible implementation of data management. Data acquisition, archiving, and distribution take place autonomously with only occasional operator intervention in cases of platform malfunction or data transmission errors. The daily data inspection results in timely platform repairs and excellent data quality. A CBI-DNR staff member can generally perform a complete inspection of the data from all stations in the network in less than an hour.
Furthermore, the use of automated systems for the majority of the data processing tasks makes it possible to provide environmental data to end-users in near-real time. Observations that pass the automated quality-control features of the system are generally made available to end-users within seconds of the data's arrival at CBI. For stations equipped with radio transmission facilities, this means that data are typically available to end-users within fifteen minutes of the actual time of measurement.
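As a rough illustration of the release path described above, the following Python sketch gates incoming observations on an automated check and adds up the end-to-end delay. The text does not say which checks the automated quality control applies, so the simple range test, the `VALID_RANGE` limits, the series name `water_level_m`, and all function names are assumptions made for this example only.

```python
from datetime import datetime, timedelta

# Hypothetical plausibility limits; the text does not list the actual checks.
VALID_RANGE = {"water_level_m": (-5.0, 5.0)}


def passes_automated_qc(series, value):
    """Assumed range check: unknown series or out-of-range values fail."""
    limits = VALID_RANGE.get(series)
    if limits is None:
        return False
    low, high = limits
    return low <= value <= high


def release(batch):
    """Split a batch of (series, measured_at, value) tuples into released and held."""
    released, held = [], []
    for obs in batch:
        series, _measured_at, value = obs
        (released if passes_automated_qc(series, value) else held).append(obs)
    return released, held


def end_to_end_latency(measured_at, arrived_at, processing):
    """Time from measurement to availability: transmission delay plus processing."""
    return (arrived_at - measured_at) + processing
```

With a processing time of a few seconds, the total delay in this sketch is dominated by the gap between measurement and arrival. That is consistent with the statement that radio-equipped stations typically deliver data within fifteen minutes of measurement.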