
1. Introduction

This section will provide an outline of the purpose of this document, the scope of the product, and an overview of the rest of the Project Requirements.

1.1. Purpose

The Project Requirements will outline the functional and non-functional requirements of the Photizo project. It will also outline the user and system requirements. The intended audience for this document includes the Photizo development team; the project stakeholders (EE Internet and GW Scientific); as well as the members of my graduate committee: Dr. Knoke, Dr. Nance, and Dr. Roth.

1.2. Product Scope

GW Scientific currently collects sensor data from a wide range of data collection networks. This data is collected and displayed (in both textual and graphical representations) on a variety of web pages, and is distributed to a variety of clients. Quality assurance (QA) and quality control (QC) on this data, however, when performed at all, must be done by manual inspection, which is a labor-intensive process. The goal of the Photizo project is to create a framework that will provide data processing, QA/QC, and data visualization, all in an automated fashion.

1.3. Document Organization

Section 2 will give an overall description of the software, including the product perspective, product functions, user characteristics, constraints, and assumptions. Section 3 details the software's interface requirements. Section 4 gives detailed descriptions of each of the product's functional requirements and features. Section 5 describes the product's nonfunctional requirements, including performance, maintainability, quality, documentation, security, and re-usability.

2. Overall Description

This section will give an overview of the product being specified and the environment in which it will be used, the anticipated users of the product, and the known constraints, assumptions and dependencies.

2.1. Product Perspective

This project is being developed as an original piece of software designed to run on a modern computer. While being an original piece of software, it will most likely use many Open Source components in its construction. As such, it will be released under an Open Source-compatible license. After the initial development of the software, its maintenance will be the responsibility of EEI/GWS, probably by the primary author.

2.2. Product Functions

2.2.1. Sensor/Station Metadata

The software will maintain a collection of metadata on the stations and sensors, and will include facilities for editing this metadata.

2.2.2. Data Import

The software will import sensor data into the database.

2.2.3. Data Processing

The software will perform the required post-collection processing using the metadata collection.

2.2.4. Data QA/QC

The software will perform quality assurance/quality control on the collected data using the metadata collection as well as testing modules. This includes activating alerts if certain tests fail.

2.3. Data Output

The software will produce web pages and graphs to visualize the data that has been imported.

2.4. User Classes and Characteristics

There will be two primary classes of users: the operators, who will maintain the reporting templates and graphs, as well as the sensor and station metadata; and the end users of the data, who will obtain the data either by reading generated reports or through custom exports of the collected data.

2.5. Operating Environment

The software will intentionally be designed to be cross-platform. Our current choice of programming language is Python, so the software should run on any platform with a Python implementation. In places where speed is a concern, such as numerical processing, optimized libraries (such as NumPy) will be used, the goal being that all speed limitations lie with the CPU and the database back end.

2.6. Design and Implementation Constraints

2.6.1. Data Formats

The data coming from the data loggers can be in a variety of formats. The format of the file itself can vary (such as how many header lines there are, if any), as well as the meanings of the various columns of data. Accommodation must be made for the currently used formats, as well as having the flexibility to be adapted to future formats.

2.6.2. Data QC/QA tests

The data coming in must be run through a variety of checks. The software must have a way of defining and executing these tests outside of modifying the core code of the software (e.g. test plugins).

2.6.3. Speed

The software must process the incoming data, and produce the desired output, in an acceptable amount of time. Since data usually comes in from a station once per hour, it is assumed that a worst-case scenario would be the software taking 59 minutes to process the data. For purposes of requirements, however, an "outside threshold" will be considered to be 15 minutes. A time much lower than this is anticipated.

2.6.4. The Data Store Back-end

The imported data will most likely be stored in a database; thus, a general facility must be provided to map incoming sensor data onto a database schema that is both efficient (for the database) and easy to understand (for the programmer/user).
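To illustrate the kind of mapping facility intended here, the sketch below stores rows from a logger file into a simple long/narrow readings table. The table layout, column names, and the COLUMN_MAP entries are all illustrative assumptions, not the actual Photizo schema:

```python
import sqlite3

# Hypothetical mapping from logger column names to (sensor_id, unit);
# the names here are illustrative, not the actual schema.
COLUMN_MAP = {
    "AirTC_Avg": ("air_temp", "degC"),
    "BattV":     ("battery_voltage", "V"),
}

def store_readings(conn, station, timestamp, row):
    """Map a raw logger row onto a simple long/narrow readings table."""
    for column, value in row.items():
        if column not in COLUMN_MAP:
            continue  # unmapped columns are skipped, not an error
        sensor_id, unit = COLUMN_MAP[column]
        conn.execute(
            "INSERT INTO readings (station, sensor, unit, ts, value) "
            "VALUES (?, ?, ?, ?, ?)",
            (station, sensor_id, unit, timestamp, value),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings "
             "(station TEXT, sensor TEXT, unit TEXT, ts TEXT, value REAL)")
store_readings(conn, "STN01", "2007-04-04T12:00:00",
               {"AirTC_Avg": -3.2, "BattV": 12.6, "Extra": 99})
rows = conn.execute("SELECT sensor, value FROM readings ORDER BY sensor").fetchall()
print(rows)  # [('air_temp', -3.2), ('battery_voltage', 12.6)]
```

The long/narrow form keeps the schema fixed as new sensors are added; whether that is the most efficient layout for the back end is one of the questions this facility will have to answer.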

2.7. Assumptions and Dependencies

2.7.1. Compatibility

It is assumed the software will run on any reasonable platform. Since Python has been ported to most Unixes, Windows, Mac OS, OS/2, and even the Nokia Series 60 cell phone, we do not foresee the destination platform being a problem.

2.7.2. Database Connectivity

It is assumed the location in which the software is installed will have the connectivity required to utilize the needed back-end database.

3. External Interface Requirements

This section will describe how the software will connect to external components.

3.1. User Interface

3.1.1. Data Output

The data will be output, most likely as web pages, data files, and graphs. These will display in any compatible browser.

3.1.2. Metadata Editing

There will be an interface for editing the station and sensor metadata. It is as yet undecided whether this will be a GUI (local) application or a web-based application, although preference is currently toward a web-based application, mainly because metadata can then be edited from anywhere.

3.2. Hardware Interfaces

The software will not use any hardware interfaces directly, as it sits much higher up the communications stack. Actual data collection will be done by the data loggers, and the data will be accessed by reading the data files, not by any direct query of stations or sensors.

3.3. Software Interfaces

The software will connect to a back-end database via a database library, and will utilize other libraries for the processing of data, and the generation of data graphs. A browser will be used for editing metadata and viewing the output of the software.

4. Detailed Requirements (aka System Features)

This section contains the system features required in the software program, and includes detailed descriptions of each feature.

4.1. Station and Sensor Metadata Entry/Editing

4.1.1. Description:

The software will be processing data from a variety of stations and sensors. The software must "know" certain information about these data points in order to make educated decisions regarding the processing and testing of the data. Thus, the software must have facilities for entering and editing the metadata about the stations and sensors.

4.1.2. Priority:

High. The sensors are central to what the whole project is about, so this requirement must be implemented before any of the other requirements can be put to full use.

4.1.3. Functional Requirements:

- Facilities for entering and editing metadata about stations
- Facilities for defining meta-sensors and sensor selection tests
- Facilities for sensor data tests
- Required Station Metadata
- Required Sensor Metadata

4.2. Data import

4.2.1. Description:

The software must obtain data to process. Thus, there must be facilities to import a variety of data formats, most of them probably CSV-like, though that is not guaranteed. This data must be stored in a format that will be easily retrieved later.

4.2.2. Priority:

High. Importing data is another central item of the project, so it is depended on by many other aspects.

4.2.3. Functional Requirements:

- Import from a wide range of formats must be supported. Initial formats will include the CSV-like formats of the Campbell data loggers and the Outback power systems.
- A plugin system that can be used to dynamically define import formats without modifying the core import code. This feature will have to be investigated for feasibility. While I'm sure we can do it, whether it can be done in a way that is efficient, elegant, and easy to understand is still open for debate.
- Some formats do not have headers that define column names. Thus, there must be a way to define column names externally from the imported file.
- In the files that do have column headers, there are an arbitrary number of "header" lines; the data may not necessarily begin on line two. Thus, there must be the ability to define on which line the desired "column header" lies, as well as on which line the data actually begins (i.e., which lines to ignore).
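As a rough sketch of the header-line and data-start flexibility described above (the function name and parameters are illustrative, not the final import API):

```python
import csv

def import_logger_file(text, header_line=None, data_start=1, column_names=None):
    """Parse CSV-like logger output.

    header_line: 1-based line holding the column names, or None if the
    file has no header (in which case column_names must be supplied).
    data_start: 1-based line on which the data begins; earlier lines
    are ignored.
    """
    lines = text.splitlines()
    if header_line is not None:
        column_names = next(csv.reader([lines[header_line - 1]]))
    records = []
    for line in lines[data_start - 1:]:
        values = next(csv.reader([line]))
        records.append(dict(zip(column_names, values)))
    return records

# Campbell-style sample: column names on line 2, data starting on line 5.
sample = ('"TOA5","STN01"\n'
          '"TIMESTAMP","AirTC_Avg"\n'
          '"TS","Deg C"\n'
          '"","Avg"\n'
          '"2007-04-04 12:00",-3.2\n')
print(import_logger_file(sample, header_line=2, data_start=5))
```

A plugin for a new format would supply its own header_line/data_start settings (or its own parser entirely) without touching this core routine.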

4.3. Data Processing

4.3.1. Description:

Data that comes in may be in a form that is not "human readable." The usual example is that of temperature sensors reading out in resistance values. This is especially seen in older data loggers as processing these values to human readable values at the data logger would take too much processing power. These values must then be converted once the readings arrive in the data files. In addition, the functions/equations needed to convert the values must be defined in the sensor metadata.

4.3.2. Priority:

High. Checking the sensor readings is central to the software.

4.3.3. Functional Requirements:

- The metadata for the sensors must contain the equation (if needed) to convert the readings from sensor values to human-readable values.
- The equations must be easy to define and use, ideally allowing the operator to simply enter the equation; the software will substitute the variable, such as "x".
- If an equation cannot be reduced to "x substitution," it will be implemented as a value-conversion plugin that is called with the given raw value.
- If data conversion is required, the original value will be retained, and the converted value will be stored as well.
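A minimal sketch of the "x substitution" idea, assuming the equation is stored in the metadata as a Python expression in x. The restricted-eval approach shown here is one possible implementation, not a settled design:

```python
import math

def convert(raw, equation=None):
    """Apply an "x substitution" conversion equation from sensor metadata.

    The equation is a Python expression in x; evaluation is restricted
    to the math module so metadata cannot invoke arbitrary code. Both
    the original and the converted value are returned, since the
    requirements call for retaining the raw reading.
    """
    if equation is None:
        return raw, raw
    converted = eval(equation, {"__builtins__": {}, "math": math}, {"x": raw})
    return raw, converted

# Hypothetical linear calibration; the equation itself is illustrative.
print(convert(1000, "x / 10 - 40"))  # (1000, 60.0)
```

A conversion that cannot be written as a single expression (e.g. a table lookup) would instead be registered as a value-conversion plugin exposing the same (raw, converted) contract.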

4.4. QA/QC Tests

4.4.1. Description:

Data coming in must go through a series of tests to make sure it is current; within operational parameters for the sensor; makes sense in context (e.g. seasonal temperatures); makes sense in a certain time period (e.g. a one-hour temperature swing of 40 degrees Celsius, while possible, probably isn't likely); and other criteria.

4.4.2. Priority:

High. This relates directly to the reason for this project: quality assurance on the data.

4.4.3. Functional Requirements:

- A sensor or group of sensors (most likely a station) can have a test defined on it. This test must be able to be defined without modifying the core code of the application.
- A test plugin architecture is needed so tests can be devised and defined outside of the core code.
- Two kinds of tests can be defined:
  - A "push" test. This test will most likely be tied to a specific sensor, and will only be fed one value, which will be tested against a series of criteria. The criteria might be passed to the test itself, or the test may request the criteria it needs (such as high/low values).
  - A "pull" test. This will most likely be defined at the station or meta-sensor level (see Meta-sensors below). This test will be queried by the software to determine what sensors, what values, and over what range (time or number) it wants to test.
- These tests must tie in with the diagnostics facility (see below) to report abnormal operation via the web page reporting and the out-of-band reporting.
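A "push" test of the kind described above might look like the following sketch; the class and method names are assumptions, not the final plugin API:

```python
class RangeTest:
    """A "push" test plugin sketch: fed one value at a time, it checks
    the value against high/low criteria supplied from the sensor
    metadata. Returning None means the test passed; a string is a
    failure message that the diagnostics facility could turn into an
    alert."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def check(self, value):
        if not (self.low <= value <= self.high):
            return "value %s outside [%s, %s]" % (value, self.low, self.high)
        return None

# Operational range for a hypothetical air temperature sensor.
test = RangeTest(low=-60.0, high=50.0)
print(test.check(-3.2))   # None: within range
print(test.check(120.0))  # failure message: would trigger an alert
```

A "pull" test would instead be handed a query interface and decide for itself which sensors and what time range to examine; the pass/fail reporting contract could stay the same.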

4.5. Meta-sensors

4.5.1. Description:

In some cases, readings for the same information (e.g. temperature) may come from more than one sensor. This may be due to redundancy requirements (three sensors buried on concrete), or sensor range requirements (one of the sensor's readings is only valid in a certain range). Thus, there must be a way to define which reading takes precedence.

4.5.2. Priority:

High. While this feature will be needed once we go into production, its absence can be worked around. However, the "hooks" for this feature must be built into the software from the ground up, as not doing so will require a major rewrite of parts of the core when the feature is needed.

4.5.3. Functional Requirements:

- Facility to define/edit meta-sensors, which will be groups of other sensors.
- A plugin architecture that can define tests which will determine which of the sensor values to use. These will most likely be "push" tests, as the tests already expect certain values and/or a certain number of values, and will return which value (or which sensor) to use when the test is called.
- The software must be able to treat a meta-sensor as a normal sensor, i.e., a sensor value that can be used in a reporting medium such as a web page.
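The meta-sensor idea can be sketched as follows; a median vote is used as an example selection test for the three-redundant-sensor case, and all names are illustrative rather than final:

```python
import statistics

class MetaSensor:
    """Sketch of a meta-sensor: a group of sensors plus a selection
    test that picks which reading to use. The selection test is a
    plain callable here; a median vote suits the redundant-sensor
    case, while a range-based test would suit the sensor-range case."""

    def __init__(self, sensors, select=statistics.median):
        self.sensors = sensors
        self.select = select

    def read(self, readings):
        # readings: {sensor_name: value}. From the caller's point of
        # view this behaves like reading a single normal sensor.
        return self.select([readings[s] for s in self.sensors])

# Three redundant temperature sensors; the outlier is voted out.
soil_temp = MetaSensor(["t1", "t2", "t3"])
print(soil_temp.read({"t1": 4.1, "t2": 4.0, "t3": 9.9}))  # 4.1
```

Because read() returns a single value, the rest of the system (processing, QA/QC, reporting) can consume a meta-sensor without knowing it is composite, which is the "treat as a normal sensor" requirement above.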

4.6. Public-facing reporting

4.6.1. Description:

Web pages with sensor readings and graphed sensor data will be produced and placed on public-facing web sites.

4.6.2. Priority:

Medium. These will need to be produced, but many pieces will have to be in place before these pages can be produced. Thus, the "medium" priority is more an indication of needed dependencies.

4.6.3. Functional Requirements:

- Data must be extracted from the data store and displayed on web pages.
- Data must be extracted from the data store and rendered into graphs of readings.
- All web interfaces must have editable headers and footers that can be maintained by non-programmers, either through web forms or by editing server-side includes (preferred).
- The software must create pages that are usable by search engines, user bookmarks, and web statistics programs.
- Text configuration files must exist to allow non-programmers to edit graphs on a basic level.
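As an example of the kind of text configuration file intended for non-programmers, the sketch below uses Python's configparser; the section and option names are assumptions about a possible format, not a defined one:

```python
import configparser

# Hypothetical graph definition a non-programmer could edit by hand.
GRAPH_CONF = """
[air_temp]
title = Air Temperature
y_label = Degrees C
sensors = air_temp_1, air_temp_2
days = 7
"""

def load_graph_defs(text):
    """Parse graph definitions from an INI-style config string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    defs = {}
    for name in parser.sections():
        section = parser[name]
        defs[name] = {
            "title": section["title"],
            "y_label": section["y_label"],
            "sensors": [s.strip() for s in section["sensors"].split(",")],
            "days": section.getint("days"),
        }
    return defs

print(load_graph_defs(GRAPH_CONF)["air_temp"]["days"])  # 7
```

The graphing code would then read these definitions when rendering, so changing a title, axis label, or time window never requires touching core code.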

4.7. Private-facing reporting

4.7.1. Description:

Web pages with diagnostic data, such as last report, panel temperatures, battery voltages, sensor-out-of-range warnings, as well as other pertinent data must be generated and placed on private-facing web sites.

4.7.2. Priority:

High. Diagnostics are an important part of the reason this project exists.

4.7.3. Functional Requirements:

- Data must be extracted from the data store and displayed on web pages.
- Facilities must exist for defining/editing important sensors and ranges (e.g. the important ranges and error states for battery voltage).
- In addition to defining groups (such as projects and networks), it must also be possible to define the order in which stations are displayed on the diagnostics page.

4.8. Out-of-band reporting

4.8.1. Description:

In addition to the web-based reporting, the software must also support error reporting by e-mail and/or SMS.

4.8.2. Priority:

High. Alerts of aberrant operation are desired.

4.8.3. Functional Requirements:

- As part of the sensor ranges and error states, a facility must exist for defining what messages are generated on what error states, and how those messages are disseminated. Example: on low battery (a condition defined via the diagnostics facility), generate a "Low Battery" message and send it via the e-mail plugin to the designated addresses, etc.
- A plugin architecture to facilitate several notification back-ends. E-mail, SMS, even Windows popups come to mind; adding more notification methods without having to touch the core code seems highly desirable.
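A sketch of how the message/back-end routing described above might fit together; the in-memory EmailNotifier stands in for a real smtplib-based plugin, and all names are assumptions about a possible design:

```python
class EmailNotifier:
    """Illustrative notification back-end; a real one would use
    smtplib (or an SMS gateway, etc.) behind the same send() method."""

    def __init__(self):
        self.sent = []  # recorded instead of delivered, for the sketch

    def send(self, recipient, message):
        self.sent.append((recipient, message))

class Dispatcher:
    """Routes error conditions to (message, backend, recipient)
    entries. New back-ends are added by registering routes, never by
    touching this core dispatch code."""

    def __init__(self):
        self.routes = {}

    def on(self, condition, message, backend, recipient):
        self.routes.setdefault(condition, []).append((message, backend, recipient))

    def fire(self, condition):
        for message, backend, recipient in self.routes.get(condition, []):
            backend.send(recipient, message)

email = EmailNotifier()
dispatcher = Dispatcher()
# Hypothetical route: the condition name and address are illustrative.
dispatcher.on("low_battery", "Low Battery", email, "ops@example.org")
dispatcher.fire("low_battery")
print(email.sent)  # [('ops@example.org', 'Low Battery')]
```

An SMS or popup plugin would register additional routes on the same condition, so one error state can fan out to several notification methods.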

4.9. Security/Login Facilities

4.9.1. Description:

The parts of this application that edit important data (station metadata, sensor metadata, test definitions, and the like) must be protected by a login facility to restrict these actions to known individuals.

4.9.2. Priority:

High. This is an integral part of the overall system, as the integrity of the data is very important to the project.

4.9.3. Functional Requirements:

- Facility to define accounts, passwords, and the account's capabilities in the system.
- A plugin architecture for authentication. This would, for example, allow different username/password stores, such as LDAP.
- A plugin architecture for authorization is also an intriguing idea. This would require defining a structure for capabilities in the system, and determining how to query these capabilities in a generic way. This may be implemented, but possibly with only a single plugin for local authorization information, as it would be much more complicated than an authentication plugin; even designing the architecture seems like it could be quite involved.
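One possible shape for an authentication plugin is sketched below as a local salted-hash password store; an LDAP plugin would expose the same authenticate() method. The class name, salting scheme, and hash choice are all illustrative:

```python
import hashlib
import hmac

class LocalPasswordStore:
    """Sketch of one authentication plugin: a local username/password
    store with salted hashes. Any other plugin (e.g. LDAP) would
    implement the same authenticate(username, password) contract."""

    def __init__(self):
        self._users = {}

    def add_user(self, username, password, salt="s"):
        digest = hashlib.sha256((salt + password).encode()).hexdigest()
        self._users[username] = (salt, digest)

    def authenticate(self, username, password):
        if username not in self._users:
            return False
        salt, digest = self._users[username]
        candidate = hashlib.sha256((salt + password).encode()).hexdigest()
        # constant-time comparison to avoid timing side channels
        return hmac.compare_digest(candidate, digest)

store = LocalPasswordStore()
store.add_user("operator", "secret")
print(store.authenticate("operator", "secret"))  # True
print(store.authenticate("operator", "wrong"))   # False
```

Authorization (what an authenticated account may do) would sit on top of this, which is why the capability structure mentioned above is the harder design problem.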

5. Nonfunctional Requirements

This section details the other, nonfunctional requirements of the software (some of these categories are drawn from Wikipedia).

5.1. Availability

The software must be available to process data whenever new data is received.

5.2. Documentation

The software must be documented. This includes all non-obvious code being thoroughly commented, as well as a guide for the end users who will be maintaining station and sensor metadata. The output pages and graphs should be intuitive enough that anyone who knows how to use a browser and is familiar with the data being reported needs no additional documentation to use the web site.

5.3. Durability/Stability

The software must be written in such a way that when an error is encountered, the error is properly reported, and all transactions/operations are cancelled cleanly.

5.4. Maintainability

The software will be written in such a way that continuing maintenance will be straightforward. This relates to the above point regarding documentation.

5.5. Modifiability/Adaptability

The software will maintain a high enough level of abstraction, where feasible and practical, such that subsequent changes to the behavior of the software will require the minimum amount of disruption to the core code.

5.6. Modularity

The software's functionality will be divided into modules with well-defined interfaces. Plugin architectures will be implemented in areas where underlying rules or mechanisms may change.

5.7. Performance

As alluded to earlier, performance is an important part of this software, but not a critical part. Operations must be kept within a "reasonable time" in order to support timely processing of incoming data. Ideally, data will be available for public viewing within five to ten minutes after arriving from the data loggers.

5.8. Reliability

The software must produce correct results, ideally, 100% of the time. This will require the metadata and constructed tests to be correct.

5.9. Security

While the data will be available to anyone that comes to the web site to view the various pages produced, the editing interface to the metadata must be protected in such a way that only authorized users will be able to make changes to the metadata.

5.10. Simplicity/Understandability

The code should be written in such a way that it is easily understandable, without the use of coding "tricks" or shortcuts. For example: given a choice between a shorter (or even better-performing) piece of code that is opaque and a longer, easier-to-understand piece of code, the longer code will be used, unless there is a significant (order of magnitude or better) performance penalty.

5.11. Testability

As the phrase goes, "Who tests the testers?" The software will be written using Test Driven Development (TDD). All modules will have tests written against them (preferably even before the modules themselves are written). These tests will make sure the modules behave correctly under a range of input values, both valid and invalid.

Page last modified on April 04, 2007, at 11:15 PM