Traditionally, files have been kept in a variety of time formats: per orbit, per day, per month - whatever. Originally PaPCo was written with CRRES in mind, where everything went by orbit. In fact, that is still the only way you can access CRRES data - and for CRRES, it makes little sense to plot the data any other way.
For ISTP the decision has been made to have data in files per day. Consequently, there is the need for a generic read-routine that will take as input either a time range or an orbit number; this routine will then internally and automatically select and concatenate the required files and hand back a continuous data array of the required length.
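The selection and concatenation step can be illustrated with a minimal sketch. It is written in Python purely for illustration (PaPCo read routines are IDL), and the file-name pattern, the function names and the (time, value) record layout are all assumptions, not part of PaPCo.

```python
from datetime import datetime, timedelta


def daily_file_names(start, end, pattern="data_%Y%m%d.dat"):
    """Return one file name per calendar day from start to end, inclusive.

    The pattern is a hypothetical naming scheme for per-day files.
    """
    first, last = start.date(), end.date()
    n_days = (last - first).days
    return [(first + timedelta(days=i)).strftime(pattern)
            for i in range(n_days + 1)]


def read_time_range(start, end, read_one_day):
    """Concatenate per-day records into one continuous list.

    read_one_day is a caller-supplied function that maps a file name to a
    list of (time, value) records for that day.
    """
    records = []
    for name in daily_file_names(start, end):
        records.extend(read_one_day(name))
    # Trim the first and last day down to the requested interval.
    return [rec for rec in records if start <= rec[0] <= end]
```

The caller only passes a start and an end time; which files are opened, and how many, stays hidden inside the read routine.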
PaPCo provides the required time range in a common block containing the start and end times in MJDT format (a structure holding the modified Julian date plus the seconds since the start of that day).
common mjdt, mjdt_start, mjdt_end
There are conversion routines in
convert this to other time formats.
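As an illustration of what such a conversion involves, here is a minimal Python sketch (not one of the PaPCo routines). It assumes an MJDT value is simply a pair of numbers, the modified Julian day and the seconds since the start of that day; the field names of the actual IDL structure are not given here, so none are used. Day 0 of the modified Julian date is 17 November 1858, 00:00.

```python
from datetime import datetime, timedelta

# Modified Julian day 0 starts at 1858-11-17 00:00.
MJD_EPOCH = datetime(1858, 11, 17)


def mjdt_to_datetime(mjd, seconds):
    """Convert (modified Julian day, seconds of day) to a calendar time."""
    return MJD_EPOCH + timedelta(days=mjd, seconds=seconds)


def datetime_to_mjdt(when):
    """Convert a calendar time to (modified Julian day, seconds of day)."""
    delta = when - MJD_EPOCH
    return delta.days, delta.seconds + delta.microseconds / 1e6
```

For example, `mjdt_to_datetime(51544, 3600)` gives 1 January 2000, 01:00.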
All that PaPCo will provide is this common block, and the orbit number which is passed by parameter. It's up to the user to provide a read-routine that can handle this input.
Your read routine needs to know where the data is - in which directory, or even at which site. PaPCo provides some functionality for this. In each module there is a defaults.config file that contains a list of environment variables and their default settings. These hold the data paths needed by the read routine, and can be modified interactively through the module's panel editor (see Section 6.4.1). So it makes sense not to hard-code any site-dependent paths, but rather to use these environment variables.
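A minimal sketch of the idea, in Python for illustration only: the read routine asks the environment for its data directory instead of carrying a fixed path. The variable name used in the usage note is hypothetical, and the format of defaults.config is not reproduced here.

```python
import os


def data_path(env_name, default=None):
    """Return the data directory held in an environment variable.

    Falls back to the given default; raises if neither is available, so a
    missing setting is reported rather than silently replaced by a
    hard-coded path.
    """
    value = os.environ.get(env_name, default)
    if not value:
        raise RuntimeError("data path variable %s is not set" % env_name)
    return value


def data_file(env_name, file_name, default=None):
    """Build the full path of a data file below the configured directory."""
    return os.path.join(data_path(env_name, default), file_name)
```

A call such as `data_file("EXAMPLE_DATA_DIR", "19970101.dat")` then works unchanged at any site that sets the variable.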
Further, it is desirable to have data in a format which can be read as quickly as possible (this problem has already been discussed in Section 1.6.1).
A good example of how this problem is solved in practice is the use of Los Alamos geostationary data (courtesy of Dick Belian and Geoff Reeves), which was used as part of a joint study. Here CRRES data was being plotted by orbit, and the need arose to have Los Alamos data for the corresponding period. Los Alamos data is supplied as ASCII files per day, which are then compressed using gzip to save disk space.
The following procedure was adopted and implemented in code. Zipped raw data was kept in mass storage, and IDL-binary data for fast-read in a local directory. When the routine was called to return data for a given time period the following actions were performed:
Using this procedure, only wrapper routines (using existing ASCII read-routines) had to be written for the Los Alamos data to be integrated into PaPCo. The whole data compatibility problem is solved by the simple method of reading slowly once only, and fast thereafter, using IDL-binaries. This does produce an overhead and needs extra disk space. In practice, we batch-process a given data set to produce the required IDL-binaries and archive the original data.
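The "read slowly once, then fast" idea can be sketched as follows. This is a Python illustration under stated assumptions, not the original IDL code: a pickle file stands in for the IDL-binary, the file naming is hypothetical, and `parse_ascii` is a made-up parser for lines holding two numbers.

```python
import gzip
import os
import pickle


def parse_ascii(text):
    """Hypothetical parser: one 'time value' pair of numbers per line."""
    return [tuple(float(x) for x in line.split())
            for line in text.splitlines() if line.strip()]


def read_day(day, raw_dir, fast_dir):
    """Return one day of data, using the fast copy when it exists.

    day is a string such as '19910704' (an assumed naming scheme).
    """
    fast_file = os.path.join(fast_dir, day + ".pkl")
    if os.path.exists(fast_file):
        # Fast path: the binary copy was written by an earlier call.
        with open(fast_file, "rb") as handle:
            return pickle.load(handle)
    # Slow path: unzip and parse the raw ASCII file once...
    raw_file = os.path.join(raw_dir, day + ".txt.gz")
    with gzip.open(raw_file, "rt") as handle:
        data = parse_ascii(handle.read())
    # ...and keep a binary copy for all later calls.
    with open(fast_file, "wb") as handle:
        pickle.dump(data, handle)
    return data
```

Calling `read_day` for every day of a data set in a loop is the batch-processing step mentioned above; afterwards the raw files can be archived.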
The procedure described here, however, is not prescribed. The user may do things any which way he/she wants. The current scenario has been adopted to maximize the speed of reading in large data sets - but any read-routine which presents to the corresponding plotting routine data for the requested time interval will do. Since both are user-written, PaPCo has no place in prescribing how things are done. All we can do is show how things have been done in PaPCo history and suggest reasons for why this might have been advantageous.
With PaPCo use becoming more widespread, modules need to be written with portability to other platforms / architectures in mind.
One way of doing this is to use a data format which is portable - such as CDF or IDL savesets. At least the data which are made public through a PaPCo module should be portable - or the read routine in the module should be able to read the data no matter from which platform or architecture.
This is particularly important if the module makes use of the remote get data facility, which should be encouraged.
As a further extension, PaPCo now provides a routine to "fetch" data from a remote site using the GNU freeware program "wget". You can now write your read routine in such a way that if the data is not found locally, it is copied via ftp from a remote site (see Section C.2)!
This feature is under development and has so far only been implemented under UNIX. The GNU Wget facility was chosen as a "vehicle" because it is free and available for most platforms. It has so far not been tested under VMS or Windows 95.
PaPCo provides a set of routines in papco_XX/papco to interface with the Wget program and to provide status reports to PaPCo of data being downloaded. This interface depends on interrogating the log files that Wget produces. The version of Wget used is:
GNU Wget/1.4.5 by Hrvoje Niksic <firstname.lastname@example.org>
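The fallback logic (use the local file if present, otherwise fetch it) can be sketched as below. This is a Python illustration, not the PaPCo interface routines; the function names, the log-file naming and the remote URL layout are assumptions. The options `-o` (write messages to a log file) and `-P` (download directory) are those of current GNU Wget and have not been checked against version 1.4.5.

```python
import os
import subprocess


def wget_command(url, local_dir, log_file):
    """Build the wget call: -o names the log file, -P the target directory."""
    return ["wget", "-o", log_file, "-P", local_dir, url]


def ensure_local(file_name, local_dir, remote_base, run=subprocess.call):
    """Return the local path of file_name, fetching it first if needed.

    run executes the command list and returns the exit status; it is a
    parameter so that the fetch step can be replaced, e.g. for testing.
    """
    local_path = os.path.join(local_dir, file_name)
    if os.path.exists(local_path):
        return local_path  # found locally, nothing to fetch
    url = remote_base.rstrip("/") + "/" + file_name
    log_file = os.path.join(local_dir, file_name + ".wget.log")
    status = run(wget_command(url, local_dir, log_file))
    if status != 0 or not os.path.exists(local_path):
        raise IOError("could not fetch %s (see %s)" % (url, log_file))
    return local_path
```

A read routine would call `ensure_local` before opening each file; the log file written by wget is what a status display would then inspect.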
As this is considered to be an extremely powerful feature of PaPCo, further development in this area will take place with the aim of full portability.
Instead of obtaining data via an ftp utility such as Wget, the preferred way would be to provide remote mount points for data on your system, so that a remote site "looks" like a normal directory path which you can configure your module to use.