PapcoDoc/documents/CommentBlock
PaPCo IDLDoc conventions
The purpose of this document is to identify and define conventions for comment blocks in PaPCo IDL codes. These conventions are important because they allow automatic tools to use the documentation and, of course, because they aid developers. IDL's own comment block conventions have not been used in papco, so the hope is to find a convention that adopts some of them. There are also shortcomings in the IDL conventions that we can address.
Here is an example comment block from one of the papco codes:
;******************************************************************************
;* NAME: PAPCO_DS_COLLAPSE
;*
;* DESCRIPTION: collapse a dataset along one or more dimensions, reducing rank, by
;* averaging or totalling along the dimension.
;*
;* INPUTS:
;* DS_IN, {papco dataset}, a papco dataset of rank N to be reduced.
;* DIMS, strarr, the dimensions to collapse
;*
;* KEYWORDS:
;* TOTAL, int, if keyword_set don't normalize the results. By default, the collapsed dimension is averaged.
;* SL_TIMERANGE, {papco_timerange}, is ignored as input and used for output.
;*
;* OUTPUTS:
;* NAME, type, positional parameter description. Note may also appear in INPUTS if input as well.
;*
;* OUTPUT KEYWORDS:
;* NAME, type, description. Note may also appear in KEYWORDS if input as well.
;*
;* RETURNS:
;* {papco dataset}, a papco dataset of rank N-1. This dataset will have
;* additional properties '<dim>_range' where <dim> is each dimension
;* reduced.
;*
;* SIDE EFFECTS:
;* SL_TIMERANGE, {papco_timerange}, contains papco_timerange object if a collapsed dimension had time units.
;*
;* EXCEPTIONS:
;* IDL_M_BADARRSUB, if the dimensions are specified and they are out-of-bounds
;*
;* EXAMPLES:
;* x= papco_ds_collapse( ds, 'time' )
;* x= papco_ds_collapse( ds, time=[40, 50] )
;*
;* UNIT TEST: test_papco_ds_collapse.pro
;*
;* CURATOR: Jeremy Faden
;*
;* CATEGORY: datasets.model
;*
;* CVSTAG:
;* $Name: $
;* $Revision: $
;*
;* HISTORY:
;* March 2006, 1.3, written by Jeremy Faden
;* Jan 2007, 1.7, added total keyword
;*
;******************************************************************************
Sections
- NAME: identifies the name of the function or procedure.
- FUNCTION: <name>[, <parameter list>] also identifies the name, or may be followed on the next line by the entire signature.
- PROCEDURE: <name>[, <parameter list>] also identifies the name, or may be followed on the next line by the entire signature.
- DESCRIPTION: a short description of the purpose of the procedure. Indentation implies line continuation.
- CALLING SEQUENCE: including the calling sequence is typically discouraged, since it is redundant when all the parameters are inputs. Tools that use the doc blocks should use "ROUTINE_INFO" to derive the calling sequence. This section should be used when the argument list contains both inputs and outputs and separate INPUTS and OUTPUTS sections would be ambiguous. If used, it should look something like this: Result = BILINEAR(P, IX, JY).
- INPUTS: a list of the positional parameters for the procedure. Name of the parameter should be CAPITALIZED, so that references to the parameter will be clear. Also, the type of the parameter should follow the name. Types include string, strarr, int for example. Structures should be identified by struct or colloquial type, such as "papco dataset". Alternatively, an example value can be provided. Last, a short description of the parameter should follow. It seems that we should identify keywords and parameters as input, output, and optional.
- NAME, [optional|required] TYPE[|TYPE|...][=DEFAULT][, DESCRIPTION]
- see types below
- KEYWORDS: a list of the keyword parameters for the procedure. These should be formatted just as in inputs.
- RETURNS: for functions, describe the returned value. Just as with the inputs, the type is identified, along with a description.
- OUTPUTS: a list of positional parameters that output data. Note a parameter can be both an input and an output (though this is probably bad form).
- OUTPUT KEYWORDS: results returned by keyword.
- STRUCT TAGS: tag usage, no distinction between input and output (read and write).
- INPUTS STRUCT TAGS: tag usage
- NAME.TAG, [optional|required] TYPE[|TYPE|...][=DEFAULT][, DESCRIPTION]
- NAME1.TAG1, NAME1.TAG2, etc.
- OUTPUTS STRUCT TAGS: tag usage
- RETURNS TAG
- EXCEPTIONS: conditions that will cause the routine to stop and that can be caught with a catch command. Refer to !error_state.name to uniquely identify the error type. To find the name, the developer should generate the error, then do "help, /struct, !error_state".
- IDL_M_USER_ERR is when the message command is used without /cont or /info
- IDL_M_BADARRSUB is index out of bounds.
- SIDE EFFECTS: indicate changes to common blocks, input parameters, or object state.
- EXAMPLES: example code that shows parameter usage.
- UNIT TEST: indicates code that is the automated test for this routine. These make good, working, use examples.
- USAGES: list of routines using this code. This should serve only as a reference and must not be considered a complete list.
- CATEGORY: group identifier for generating a book of documentation. See "categories" below.
- CVS TAGS: Use the CVS keywords here so that revision identification number is put in. This is useful for identifying patches and local modifications to supplement CVS capabilities.
- CURATOR: knowledgeable person responsible for the maintenance of the code.
- HISTORY: description of interface changes over time. Do not identify implementation changes (such as bug fixes) here; that's what CVS is for. Entries should include version numbers so compatibility can be checked.
- DEPRECATED: mark the routine as deprecated, and suggest an alternative.
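As a rough illustration of the CALLING SEQUENCE note above, a tool could build the calling sequence from ROUTINE_INFO rather than from the doc block. The sketch below is not part of papco: demo_calling_sequence is a hypothetical helper, and it assumes the target routine is already compiled and that the structure returned with /PARAMETERS carries NUM_ARGS, NUM_KW_ARGS, ARGS and KW_ARGS.

```idl
; Hypothetical helper (not a papco routine): builds a calling-sequence string
; for a compiled routine from ROUTINE_INFO.
function demo_calling_sequence, name, is_function=is_function
  compile_opt idl2
  isfun = keyword_set(is_function)
  p = routine_info(name, /parameters, functions=isfun)

  ; ARGS and KW_ARGS are only read when the matching count is nonzero.
  arglist = ''
  if p.num_args gt 0 then arglist = strjoin(p.args, ', ')
  if p.num_kw_args gt 0 then begin
    kw = strjoin(p.kw_args + '=value', ', ')
    arglist = (arglist eq '') ? kw : arglist + ', ' + kw
  endif

  uname = strupcase(name)
  if isfun then return, 'Result = ' + uname + '(' + arglist + ')'
  return, (arglist eq '') ? uname : uname + ', ' + arglist
end
```

For a function, a call might look like print, demo_calling_sequence('papco_ds_collapse', /is_function), provided that routine has been compiled first.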
Name, Procedure, Function
IDL's doc convention calls for a tag "NAME" which identifies the routine the documentation belongs to. Legacy documentation blocks in papco use "PROCEDURE" and "FUNCTION" as well, which contain both the name and the parameter list. Both will be supported by papco_make_html_documentation. Also, IDL's convention puts the name on the line following the NAME tag; we support both this and the inline version.
Inputs, Keywords, Outputs, Output Keywords and Returns
- Inputs are required unless optional is indicated.
- Keywords are optional unless required is indicated.
- Output keywords should be used when keywords return values. This indicates that a named variable (not an expression) must be used in the procedure call.
- f= findfile( '*', count=c )
- OUTPUTS should be used when positional parameters return values. Note that good form dictates that output parameters be last in the parameter list, and that they generally be avoided.
- RETURNS is used for functions and this tag should not exist for procedures.
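The difference between input and output keywords can be seen in a small sketch. demo_range is a hypothetical routine used only for illustration; its MIN and MAX keywords would be listed under OUTPUT KEYWORDS because the caller receives values through them.

```idl
; Hypothetical routine: MIN and MAX are output keywords.
pro demo_range, data, min=lo, max=hi
  compile_opt idl2
  ; IDL's min() returns the minimum and hands back the maximum via MAX=
  lo = min(data, max=hi)
end

; Usage (at the IDL prompt):
;   demo_range, [3, 1, 2], min=a, max=b   ; a and b are named variables
;   print, a, b
;   demo_range, [3, 1, 2], min=0          ; an expression: nothing comes back
```

Passing an expression instead of a named variable does not raise an error here; the value is simply lost, which is why the documentation should flag the keyword as an output.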
Argument Lists
The INPUTS, OUTPUTS, KEYWORDS, and OUTPUT KEYWORDS sections specify parameter lists. These specify parameter name, type, and a description. IDL is a loosely typed language, but often it is helpful to constrain parameter type in the documentation to avoid runtime errors.
NAME, TYPE, DESCRIPTION
For example:
DATA, dblarr(3,3), the matrix to reduce.
When the parameter name is surrounded by angle brackets (<>) and the keyword _extra is used, then these are extra keywords, and their use should be described as in the EXTRA KEYWORDS section below.
Optional and Required Parameters
Keywords are optional unless required is indicated:
KEYWORDS:
AVG_TYPE, required string, the averaging type used to reduce data. Values are 'log' 'mod24' or 'linear'
"optional" can be used where required might otherwise be assumed, and a default value is indicated like this:
KEYWORDS:
AVG_TYPE, optional string='linear', the averaging type used to reduce data. Values are 'log' 'mod24' or 'linear'
Positional parameters (under the INPUTS section) are required unless optional is indicated.
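A routine documented this way might enforce the required positional parameter and apply the documented default as in the following sketch. demo_reduce is hypothetical, and only the 'linear' and 'log' averaging types are handled, to keep the example short.

```idl
; Hypothetical routine matching the doc line
;   AVG_TYPE, optional string='linear', the averaging type used to reduce data.
function demo_reduce, data, avg_type=avg_type
  compile_opt idl2
  ; positional parameters are required unless marked optional
  if n_params() lt 1 then message, 'DATA is required'
  ; apply the documented default without modifying the caller's variable
  type = (n_elements(avg_type) eq 0) ? 'linear' : avg_type
  case type of
    'linear': return, mean(data)
    'log': return, 10d^mean(alog10(data))
    else: message, 'unsupported AVG_TYPE: ' + type
  endcase
end
```

n_elements is used rather than keyword_set because AVG_TYPE is a string with a default, not a boolean switch.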
types
IDL is a loosely typed language. This has advantages, but relies more on documentation when variable types are required. This is a list of types that should be used to constrain variable types in documentation. Any input that can be automatically converted to the required type is valid.
- byte, int, long, float, double, complex, etc.: a scalar of the given type.
- intarr: a 1D array of unconstrained dimension.
- intarr[3]: a 1D array of constrained dimension.
- intarr[*,*]: a 2D array with no constraints on dimension sizes.
- boolarr: an array of booleans.
- boolean: an integer where 0 is false, 1 is true.
- keyw: keyword_set(kw) is used, explicitly indicating <undefined> is valid. This is equivalent to "optional boolean=false" or just "boolean" in the KEYWORDS list.
- struct: a structure; tag descriptions should follow.
- { structure_name }: a named structure.
- { struct name }: a colloquial structure name, e.g. { papco dataset }. See the section below on the convention for identifying tags that are used.
- *{structure_name}: a pointer to a structure.
- object_name:: an object type name.
- <object_name>: an object type name.
- *TYPE: a pointer to the type.
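The keyw type can be illustrated with a short sketch. demo_keyw is a hypothetical procedure; it shows that keyword_set treats an undefined keyword and a zero value the same way, which is why keyw is equivalent to "optional boolean=false".

```idl
; Hypothetical procedure with a keyw-type keyword.
pro demo_keyw, total=total
  compile_opt idl2
  if keyword_set(total) then print, 'totalling' else print, 'averaging'
end

; Usage (at the IDL prompt):
;   demo_keyw            ; TOTAL undefined -> prints averaging
;   demo_keyw, total=0   ; explicit false  -> prints averaging
;   demo_keyw, /total    ; set             -> prints totalling
```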
structure tag use
It's useful to identify which parts of a structure (such as papco datasets) are used. This should be identified using the STRUCT TAGS section:
INPUTS:
DS, { papco dataset }, the dataset identifying the bins.
INPUTS STRUCT TAGS:
DS.data, DS.bins, DS.bin_width
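Documented tag usage can also be checked at runtime. The sketch below is an assumption-laden illustration, not papco code: demo_check_tags is hypothetical, and the structure in the usage comment is a stand-in, not the real papco dataset schema.

```idl
; Hypothetical helper: stops with a message if a documented tag is missing.
pro demo_check_tags, ds, required
  compile_opt idl2
  tags = tag_names(ds)                      ; tag names come back upper case
  for i = 0, n_elements(required) - 1 do begin
    w = where(tags eq strupcase(required[i]), count)
    if count eq 0 then message, 'missing tag: ' + required[i]
  endfor
end

; Usage (at the IDL prompt), with a stand-in structure:
;   ds = { data: findgen(4), bins: findgen(4), bin_width: 1.0 }
;   demo_check_tags, ds, ['data', 'bins', 'bin_width']
```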
Special Notes for PaPCo DataSets
Papco datasets are papco's internal data model, and often appear in argument lists. They are a structure with a prescribed schema and conventions.
When they appear in argument lists, they should indicate the allowed rank of the dataset, for example:
INPUTS:
DS, { papco dataset }, a rank N dataset. N=1,2,3,4+
In this example, datasets of rank 1,2,3, and 4 are known to work, and the plus (+) indicates that the code doesn't appear to introduce any limitations on rank.
Specific dataset tags that are supported should be documented in the "INPUT STRUCT TAGS" and "OUTPUT STRUCT TAGS" sections.
EXTRA KEYWORDS
Extra keywords should be documented in the KEYWORDS section along with the internal name, prefixed by _extra=<NAME>. This is because the internal name often hints at its use, and it allows further documentation via the INPUT STRUCT TAGS section.
KEYWORDS:
_extra=E, struct, extra keywords are passed to IDL's plot command
INPUT STRUCT TAGS:
E.xrange, fltarr(2), the xaxis range.
E.xtype, boolean, 1 if the axis should be log.
This is from papco_ds_trim, where the extra keywords identify dimensions to trim:
INPUTS:
DS_IN, { papco dataset }, rank N papco dataset, N=1,2,3,4
KEYWORDS:
_extra=E, struct, extra keywords identify tags.
INPUT STRUCT TAGS:
E.*, intarr(2), dimension to be trimmed; the first element is the start, the second element is the end (inclusive)
DS_IN.CORRELATE_*, { papco dataset }, correlated datasets are trimmed as well.
This is also intended to support IDEs (e.g. Netbeans and nbidl) with automatic completion. Since the code treats the extra parameters as a structure, this is consistent.
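The reason extra keywords are documented as structure tags is that IDL delivers them to the routine as a structure. A minimal sketch, using a hypothetical procedure demo_extra:

```idl
; Hypothetical procedure: shows that keywords collected by _extra arrive
; as an anonymous structure whose tags are the keyword names.
pro demo_extra, _extra=e
  compile_opt idl2
  if n_elements(e) eq 0 then begin
    print, 'no extra keywords'
    return
  endif
  print, tag_names(e)      ; the keyword names, as structure tags
  help, e, /struct         ; types and values of each tag
end

; Usage (at the IDL prompt):
;   demo_extra, xrange=[0., 10.], /xtype
```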
exceptions
This is a place for people to put in runtime exceptions that may occur. Any time "message" is called in a routine in a way that would stop IDL's execution (no /info, etc.), this should be documented here. Also, clients who run into an undocumented exception thrown by a code should feel welcome to add the exception to the list.
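To find the name to list under EXCEPTIONS, the error can be generated and caught, then inspected through !error_state. The following is a minimal sketch with a hypothetical procedure demo_catch; it uses the message command without /cont or /info, the case described above as IDL_M_USER_ERR.

```idl
; Hypothetical procedure: generates an error, catches it, and reports the
; error name that would be listed in the EXCEPTIONS section.
pro demo_catch
  compile_opt idl2
  catch, err
  if err ne 0 then begin
    catch, /cancel
    print, 'caught: ' + !error_state.name
    help, /struct, !error_state
    return
  endif
  message, 'something went wrong'   ; no /cont or /info, so execution stops
end
```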
categories
To give some order to the routines, we identify categories. Here they are:
- datasets.model: Build up the internal model for papco datasets and provide utility functions.
- datasets.operators: Functions that operate on datasets, such as slice and collapse.
- datasets.plotting: Routines for plotting papco datasets.
- opsys: operating system utilities
- lib.timetags: time utilities
- lib.logger:
- lib.monitor:
- lib.plotting:
- lib.misc: miscellaneous bin
- lib.string: string functions
- core.model:
- core.ui: user interface routines
- developer: routines to aid in papco development
- developer.runtime: routines for supporting runtime papco development
prose voice
Prose descriptions should be written in 3rd person (descriptive) not 2nd person (prescriptive). For example:
Gets the label. (preferred)
Get the label. (avoid)
Incomplete sentences are preferred for brevity, and the subject is implied:
ROT90, keyw, rotates output string by 90 degrees
(ROT90) rotates output string by 90 degrees.
And sometimes "is a" is also implied:
PANEL, intarr[3], vector for position of panel
(PANEL is a) vector for position of panel.
(Taken from java doc writing conventions, http://java.sun.com/j2se/javadoc/writingdoccomments/#styleguide)
public vs private
It's useful to be able to define "private" routines that are only for use within a module. Private routines don't need a documentation block as described above, since they aren't for general use. In fact, having a documentation comment would generally imply that the routine is public and useful (and worth the burden of reading through the documentation). However, if you wish to document a private routine using the above conventions, add a doc block field "ACCESS: private". Also, these routines should be prefixed "p_<name>".
unit test
A note on "unit tests." A unit test is a code that tests another code. The UNIT TEST field names a routine that ideally exercises every branch of the tested code and quickly indicates when bugs are introduced. Unit tests are extremely important in a loosely typed language like IDL, where the compiler does little to enforce code correctness and almost every coding error is detected only at runtime.
Also, unit tests make excellent, working code examples, since they precisely test a code. This is why a unit test, if available, should be mentioned in the documentation.
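As a sketch of what such a test might look like, the example below pairs a hypothetical function, demo_collapse, with its test procedure, test_demo_collapse. Neither is a papco routine; the function only imitates the averaging and totalling behaviour described for papco_ds_collapse on a plain 2D array.

```idl
; Hypothetical function under test: collapses the first dimension of a
; 2D array by averaging, or by totalling when /TOTAL is set.
function demo_collapse, data, total=do_total
  compile_opt idl2
  sum = total(data, 1)
  if keyword_set(do_total) then return, sum
  return, sum / (size(data, /dimensions))[0]
end

; Hypothetical unit test: exercises both branches and stops on failure.
pro test_demo_collapse
  compile_opt idl2
  data = [[1., 3.], [5., 7.]]
  if ~array_equal(demo_collapse(data), [2., 6.]) then $
    message, 'average branch failed'
  if ~array_equal(demo_collapse(data, /total), [4., 12.]) then $
    message, 'total branch failed'
  print, 'test_demo_collapse: ok'
end
```

The doc block for demo_collapse would then carry the line "UNIT TEST: test_demo_collapse.pro", following the naming used in the example block at the top of this page.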
Creating HTML Documentation
Since Papco's document blocks don't follow IDL's plus/minus convention, a hacked version of mk_html_help is found in PAPCO_HOME/papco_lib/papco_make_html_help.pro. Here is its documentation block:
;*******************************************************************************
;* NAME: PAPCO_MAKE_HTML_HELP
;* DESCRIPTION: scrapes through IDL files creating HTML documentation for papco's doc block conventions.
;* INPUTS:
;* SOURCES, string|strarr, directories to search
;* OUTFILE, string, html output file.
;* KEYWORDS:
;* VERBOSE, keyw, print version progress info
;* TITLE, string, provide a title
;* STRICT, keyw, html symbols are escaped, e.g. '<' -> '&lt;'
;* STOPS, strarr, array of places to stop for debugging, each element in the format "<strmatch>:<int linenum>"
;* RECURSIVE, keyw, recurse through sub directories as well.
;* EXCLUDE, string|strarr, patterns that if strmatch is true, is excluded.
;* SIDE EFFECTS:
;* EXCEPTIONS:
;* EXAMPLES:
;* papco_make_html_help, '/media/mini/nbprojects/idl/papco/working_mar09/papco_lib/dataset', '/home/jbf/temp/papco_dataset.html', /recurs
;* UNIT TEST:
;* CVSTAG:
;* $Name: $
;* $Revision: 1.4 $
;* CURATOR: jbfaden
;* HISTORY:
;* Apr 15 2007, 1.1, hacked from IDL mk_html_help by Jeremy Faden
;* May 17, 2007, 1.4, added EXCLUDE and RECURSIVE keywords, jbf
;*******************************************************************************
parting thoughts
- brevity is paramount. The fewer words the better. Documentation is worthless if it cannot be consumed by a human. But,
- completeness is required.
examples
where
The command "where" should be familiar and is used to show papco documentation conventions. Compare it to IDL's documentation.
;*******************************************************************************
;* NAME: where
;* DESCRIPTION: The WHERE function returns a vector that contains the
;* one-dimensional subscripts of the nonzero elements of Array_Expression. The length
;* of the resulting vector is equal to the number of nonzero elements in Array_Expression.
;* Frequently the result of WHERE is used as a vector subscript to select elements of an
;* array using given criteria.
;* INPUTS:
;* ARRAY_EXPRESSION, any type of array, The array to be searched. Both the real and imaginary
;* parts of a complex number must be zero for the number to be considered zero.
;* KEYWORDS:
;* L64, keyw, 1 indicates the result should be a 64 bit integer
;* thread pool keywords, see thread pool keywords.
;* OUTPUTS:
;* COUNT, long, the number of nonzero elements found in Array_Expression.
;* OUTPUT KEYWORDS:
;* COMPLEMENT, lonarr|int, returns the subscripts of the zero elements or -1 if no elements are found
;* NCOMPLEMENT, int, return the number of zero elements found in array
;* RETURNS:
;* lonarr|int, the subscripts of non-zero array elements or -1 if no elements are found.
;* SIDE EFFECTS:
;* EXCEPTIONS:
;* EXAMPLES:
;* array = INDGEN(10)
;* B = WHERE(array GT 5, count, COMPLEMENT=B_C, NCOMPLEMENT=count_c)
;* print, array[B]
;* UNIT TEST:
;* CVSTAG:
;* $Name: $
;* $Revision: $
;* CURATOR: NAME
;* HISTORY:
;* DATE, REV, written by NAME
;*******************************************************************************
papco_make_choice_names
This example has a keyword TITLE that is both input and output. This is bad practice, but is well-documented with papco convention:
;******************************************************************************
;* FUNCTION: papco_make_choice_names, INSTR
;*
;* DESCRIPTION: converts the "_info" tag information from a module's
;* control structure to a list of choice names for button widgets.
;*
;* INPUTS:
;* INSTR, string, _info string of control block. For example:
;* 'HEO Satellite Name; 0: heo_1, 1: heo_3'
;*
;* KEYWORDS:
;* TITLE, boolean, return title of choice
;*
;* OUTPUT KEYWORDS:
;* TITLE, string, the title of the choice
;*
;* RETURNS:
;* strarr, list of choice names
;*
;* HISTORY: written June 2003, Reiner Friedel
;******************************************************************************

