.ds CP 2003-2004
.ds TC \fCtconfpy\fP
.TH TCONFPY 3 "TundraWare Inc."
.SH NAME
tconfpy.py Configuration File Support For Python Applications
.SH SYNOPSIS
It is common to provide an external "configuration file" when writing
sophisticated applications. This gives the end-user the ability to
easily change program options by editing that file.

\*(TC is a Python module for parsing such configuration files.
\*(TC understands and parses a configuration "language" which has a
rich set of string-substitution, variable name, conditional, and
validation features.

By using \*(TC, you relieve your program of the major responsibility
of configuration file parsing and validation, while providing your
users a rich set of configuration features.

If you run \*(TC directly, it will dump version and copyright
information, as well as the value of the current predefined System
Variables:

.ft C \" courier
.nf
    python tconfpy.py
.fi
.ft \" revert

.SH DOCUMENT ORGANIZATION
This document is divided into 4 major sections:

.B PROGRAMMING USING THE \*(TC API
discusses how to call the configuration file parser, the options
available when doing this, and what the parser returns. This is the
"Programmer's View" of the module and provides in-depth descriptions
of the API, data structures, and options available to the programmer.

.B CONFIGURATION LANGUAGE REFERENCE
describes the syntax and semantics of the configuration language
recognized by \*(TC. This is the "User's View" of the package, but
both programmers and people writing configuration files will find
this helpful.

.B ADVANCED TOPICS FOR PROGRAMMERS
describes some ways to combine the various \*(TC features to do some
fairly nifty things.

.B INSTALLATION
explains how to install this package on various platforms. This
information can also be found in the \fCREAD-1ST.txt\fP file
distributed with the package.

.SH PROGRAMMING USING THE \*(TC API
\*(TC is a Python module and thus available for use by any Python program.
This section discusses how to invoke the \*(TC parser, the options
available when doing so, and what the parser returns to the calling
program.

One small note is in order here. As a matter of coding style and
brevity, the code examples here assume the following Python import
syntax:

.ft C \" Courier
.nf
    from tconfpy import *
.fi
.ft \" revert

If you prefer the more pedestrian:

.ft C \" Courier
.nf
    import tconfpy
.fi
.ft \" revert

you will have to prepend all references to a \*(TC object with
\fCtconfpy.\fP. So \fCretval = ParseConfig(...\fP becomes
\fCretval = tconfpy.ParseConfig(...\fP and so on.

You will also find the test driver code provided in the \*(TC package
helpful as you read through the following sections.
\fCtest-tc.py\fP is a utility to help you learn and exercise the
\*(TC API. Its code serves as a working example of the topics
discussed below.

.SS API Overview
The \*(TC API consists of a single call. Only the configuration file
to be processed is a required parameter; all the others are optional
and default as described below:

.ft C \" Courier
.nf
    from tconfpy import *

    retval = ParseConfig(cfgfile,
                         InitialSymTable={},
                         AllowNewVars=True,
                         Templates={},
                         TemplatesOnly=False,
                         LiteralVars=False,
                         ReturnPredefs=True,
                         Debug=False
                        )
.fi
.ft \" revert

where:
.TP
.B cfgfile (Required Parameter - No Default)
The name of a file containing configuration information
.TP
.B InitialSymTable (Default: \fC{}\fP)
A prepopulated symbol table (a Python dictionary). As described
below, this must contain valid \fCVarDescriptor\fP entries for each
symbol in the table.
.TP
.B AllowNewVars (Default: \fCTrue\fP)
Allow the user to create new variables in the configuration file.
.TP
.B Templates (Default: \fC{}\fP)
This option is used to pass variable templates to the parser. If
present, \*(TC expects this option to pass a data structure in the
same format as a symbol table.
That is, you must pass a Python dictionary whose keys are the names
of the variable templates and each entry must be a
\fCVarDescriptor\fP object. See the section below entitled,
.B Using Variable Templates
for all the details. By default, no variable templates are passed to
the parser.
.TP
.B TemplatesOnly (Default: \fCFalse\fP)
If this option is set to \fCTrue\fP, \*(TC will not permit a new
variable to be created unless a variable template exists for it. By
default, \*(TC will use a variable template if one is present for a
new variable, but it does not require one. If a new variable is
created, and no Template exists for it, the variable is just created
as a string type with no restrictions on content or length. When
this option is set to \fCTrue\fP, then a template
.B must
exist for each newly created variable.
.TP
.B LiteralVars (Default: \fCFalse\fP)
If set to \fCTrue\fP, this option enables variable substitutions
within \fC.literal\fP blocks of a configuration file. See the
section in the language reference below on \fC.literal\fP usage for
details.
.TP
.B ReturnPredefs (Default: \fCTrue\fP)
\fCtconfpy\fP "predefines" some variables internally. By default,
these are returned in the symbol table along with the variables
actually defined in the configuration file. If you want a "pure"
symbol table - that is, a table with
.B only
your variables in it - set this option to \fCFalse\fP.
.TP
.B Debug (Default: \fCFalse\fP)
If set to \fCTrue\fP, \*(TC will provide detailed debugging
information about each line processed when it returns.
.TP
.B retval
An object of type \fCtconfpy.RetObj\fP used to return parsing results.
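.P
The options above are ordinary Python keyword arguments, so a calling
program can collect them in one place. What follows is a minimal
sketch, not part of the \*(TC distribution: the helper names
\fCstrict_options\fP and \fCparse_strict\fP are hypothetical, and it
assumes that \fCtconfpy\fP is importable and that \fCParseConfig()\fP
accepts the keyword names listed above.
.ft C \" Courier
.nf
```python
def strict_options(symtable, templates=None):
    """Return ParseConfig keyword arguments for a locked-down parse.

    Hypothetical helper: it only builds a dictionary and never
    touches tconfpy itself.
    """
    return {
        "InitialSymTable": symtable,   # variables the program predefines
        "AllowNewVars": False,         # config file may not add variables
        "Templates": templates or {},  # optional variable templates
        "TemplatesOnly": False,
        "LiteralVars": False,
        "ReturnPredefs": False,        # strip tconfpy's own predefined variables
        "Debug": False,
    }


def parse_strict(cfgfile, symtable):
    """Parse cfgfile with the strict options (assumes tconfpy is installed)."""
    from tconfpy import ParseConfig    # imported lazily, only needed here
    return ParseConfig(cfgfile, **strict_options(symtable))
```
.fi
.ft \" revert
Keeping the option set in a single dictionary makes it easy to log or
compare the settings a program parses with, at the cost of one extra
level of indirection.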
.SS The Initial Symbol Table API Option
The simplest way to parse a configuration file is just to call the
parser with the name of that file:

.ft C \" Courier
.nf
    retval = ParseConfig("MyConfigFile")
.fi
.ft \" revert

Assuming your configuration file is valid, \fCParseConfig()\fP will
return a symbol table populated with all the variables defined in the
file and their associated values. This symbol table will have
.B only
the symbols defined in that file (plus a few built-in and predefined
symbols needed internally by \*(TC).

However, the API provides a way for you to pass a "primed" symbol
table to the parser that contains predefined symbols/values of your
own choosing. Why on earth would you want to do this? There are a
number of reasons:
.IP \(bu 4
You may wish to write a configuration file which somehow depends on a
predefined variable that only the calling program can know:

.ft C \" Courier
.nf
    .if [APPVERSION] == 1.0
        # Set configuration for original application version
    .else
        # Set configuration for newer releases
    .endif
.fi
.ft \" revert

In this example, only the calling application can know its own
version, so it sets the variable APPVERSION in a symbol table which
is passed to \fCParseConfig()\fP.
.IP \(bu 4
You may wish to "protect" certain variable names by creating them
ahead of time and marking them as "Read Only". This is useful when
you want a variable to be available for use within a configuration
file, but you do not want users to be able to change its value. In
this case, the variable can be referenced in a string substitution or
conditional test, but cannot be changed.
.IP \(bu 4
You may want to place limits on what values can be assigned to a
particular variable. When a variable is newly defined in a
configuration file, it just defaults to being a string variable
without any limits on its length or content (unless you are using
Variable Templates). But variables that are created by a program
have access to the variable's "descriptor".
By setting various attributes of the variable descriptor you can
control variable type, content, and range of values. In other words,
you can have \*(TC "validate" what values the user assigns to
particular variables. This substantially simplifies your application
because no invalid variable value will ever be returned from the
parser.

.SS How To Create An Initial Symbol Table
A \*(TC "Symbol Table" is really nothing more than a Python
dictionary. The key for each dictionary entry is the variable's name
and the value is a \*(TC-specific object called a "variable
descriptor". Creating new variables in the symbol table involves
nothing more than this:

.ft C \" Courier
.nf
    from tconfpy import *

    # Create an empty symbol table
    MySymTable = {}

    # Create descriptor for new variable
    MyVarDes = VarDescriptor()

    # Code to fiddle with descriptor contents goes here
    MyVarDes.Value = "MyVal"

    # Now load the variable into the symbol table
    MySymTable["MyVariableName"] = MyVarDes

    # Repeat this process for all variables, then call the parser
    retval = ParseConfig("MyConfigFile", InitialSymTable=MySymTable)
.fi
.ft \" revert

The heart of this whole business is the \fCVarDescriptor\fP object.
It "describes" the value and properties of a variable. These
descriptor objects have the following attributes and defaults:

.ft C \" Courier
.nf
    VarDescriptor.Value     = ""
    VarDescriptor.Writeable = True
    VarDescriptor.Type      = TYPE_STRING
    VarDescriptor.Default   = ""
    VarDescriptor.LegalVals = []
    VarDescriptor.Min       = None
    VarDescriptor.Max       = None
.fi
.ft \" revert

When \*(TC encounters a new variable in a configuration file, it just
instantiates one of these descriptor objects with these defaults for
that variable. That is, variables newly-defined in a configuration
file are entered into the symbol table as string types, with an
initial value of "" and with no restriction on content or length.
But, when you create variables under program control to "prime" an
initial symbol table, you can modify the content of any of these
attributes for each variable. These descriptor attributes are what
\*(TC uses to validate subsequent attempts to change the variable's
value in the configuration file. In other words, modifying a
variable's descriptor tells \*(TC just what you'll accept as "legal"
values for that variable.

Each attribute has a specific role:
.TP
.B VarDescriptor.Value (Default: \fCEmpty String\fP)
Holds the current value for the variable.
.TP
.B VarDescriptor.Writeable (Default: \fCTrue\fP)
Sets whether or not the user can change the variable's value.
Setting this attribute to \fCFalse\fP makes the variable
.B Read Only.
.TP
.B VarDescriptor.Type (Default: \fCTYPE_STRING\fP)
One of TYPE_BOOL, TYPE_COMPLEX, TYPE_FLOAT, TYPE_INT, or TYPE_STRING.
This defines the type of the variable. Each time \*(TC sees a value
being assigned to a variable in the configuration file, it checks to
see if that variable already exists in the symbol table. If it does,
the parser checks the value being assigned and makes sure it matches
the type declared for that variable. For example, suppose you did
this when defining the variable, \fCfoo\fP:

.ft C \" Courier
.nf
    VarDescriptor.Type = TYPE_INT
.fi
.ft \" revert

Now suppose the user puts this in the configuration file:

.ft C \" Courier
.nf
    foo = bar
.fi
.ft \" revert

This will cause a type mismatch error because \fCbar\fP cannot be
coerced into an integer type - it is a string.

As a general matter, for existing variables, \*(TC attempts to coerce
the right-hand-side of an assignment to the type declared for that
variable. The least fussy operation here is when the variable is
defined as TYPE_STRING because pretty much everything can be coerced
into a string.
For example, here is how \fCfoo = 3+8j\fP is treated for different
type declarations:

.ft C \" Courier
.nf
    VarDescriptor.Type        VarDescriptor.Value
    ------------------        -------------------
    TYPE_BOOL                 Type Error
    TYPE_COMPLEX              3+8j    (A complex number)
    TYPE_FLOAT                Type Error
    TYPE_INT                  Type Error
    TYPE_STRING               "3+8j"  (A string)
.fi
.ft \" revert

This is why the default type for newly-defined variables in the
configuration file is TYPE_STRING: they can accept pretty much
.B any
value.
.TP
.B VarDescriptor.Default (Default: \fCEmpty String\fP)
This is a place to store the default value for a given variable.
When a variable is newly-defined in a configuration file, \*(TC
places the first value assigned to that variable into this attribute.
For variables already in the symbol table, \*(TC does nothing to this
attribute. This attribute is not actually used by \*(TC for
anything. It is provided as a convenience so that the calling
program can easily "reset" every variable to its default value if
desired.
.TP
.B VarDescriptor.LegalVals (Default: \fC[]\fP)
Sometimes you want to limit a variable to a specific set of values.
That's what this attribute is for. \fCLegalVals\fP explicitly lists
every legal value for the variable in question. If the list is
empty, then this validation check is skipped.

The exact semantics of \fCLegalVals\fP vary depending on the type of
the variable.

.ft C \" Courier
.nf
    Variable Type             What LegalVals Does
    -------------             -------------------
    Boolean                   Nothing - Ignored

    Integer, Float, Complex   List of numeric values the user
                              can assign to this variable

                              Examples: [1, 2, 34]
                                        [3.14, 2.73, 6.023e23]
                                        [3.8-4j, 5+8j]

    String                    List of Python regular expressions.
                              User must assign a value to this
                              variable that matches at least
                              one of these regular expressions.

                              Example: [r'a+.*', r'^AnExactString$']
.fi
.ft \" revert

The general semantic here is "If \fCLegalVals\fP is not an empty
list, the user must assign a value that matches one of the items in
\fCLegalVals\fP."
One special note applies to \fCLegalVals\fP for string variables.
\*(TC always assumes that this list contains Python regular
expressions. For validation, it grabs each entry in the list,
attempts to compile it as a regex, and checks to see if the value the
user wants to set matches. If you define an illegal regular
expression here, \*(TC will catch it and produce an appropriate
error.

You may also want to specify a set of legal strings that are
.B exact matches
not open-ended regular expressions. For example, suppose you have a
variable, \fCCOLOR\fP, and you want the user to be able to set it
only to one of \fCRed\fP, \fCWhite\fP, or \fCBlue\fP. In that case,
use the Python regular expression metacharacters that indicate "Start
Of String" and "End Of String" to do this:

.ft C \" Courier
.nf
    des = VarDescriptor()
    des.LegalVals = [r'^Red$', r'^White$', r'^Blue$']
    ...
    SymTable['COLOR'] = des
.fi
.ft \" revert

.B NOTE:
If you want this test to be skipped, then set \fCLegalVals\fP to an
empty list, []. (This is the default when you first create an
instance of \fCtconfpy.VarDescriptor\fP.) Do not set it to a Python
None or anything else. \*(TC expects this attribute to be a list in
every case.
.TP
.B VarDescriptor.Min and VarDescriptor.Max (Default: \fCNone\fP)
These set the minimum and maximum legal values for the variables, but
the semantics vary by variable type:

.ft C \" Courier
.nf
    Variable Type             What Min/Max Do
    -------------             ---------------
    Boolean, Complex          Nothing - Ignored
    Integer, Float            Set Minimum/Maximum allowed values.
    String                    Set Minimum/Maximum string length
.fi
.ft \" revert

In all cases, if you want either of these tests skipped, set
\fCMin\fP or \fCMax\fP to the Python None.
.P
All these various validations are logically "ANDed" together.
That is, a new value for a variable must be allowed AND of the
appropriate type AND one of the legal values AND within the min/max
range.

\*(TC makes no attempt to harmonize these validation conditions with each other.
If you specify a value in \fCLegalVals\fP that is, say, lower than
allowed by \fCMin\fP, you will always get an error when the user sets
the variable to that value: it passes the \fCLegalVals\fP validation
but fails the \fCMin\fP check.

.SS The Initial Symbol Table And Lexical Namespaces
The
.B CONFIGURATION LANGUAGE REFERENCE
section below discusses lexical namespaces in some detail from the
user's point-of-view. However, it is useful for the programmer to
understand how they are implemented.

\*(TC is written to use a predefined variable named \fCNAMESPACE\fP
as the place where the current namespace is kept. If you do not
define this variable in the initial symbol table passed to the
parser, \*(TC will create it automatically with an initial value of
"".

From a programmer's perspective, there are a few important things to
know about namespaces and the \fCNAMESPACE\fP variable:
.IP \(bu 4
You can manually set the initial namespace to something other than
"". You do this by creating the \fCNAMESPACE\fP variable in the
initial symbol table passed to the parser, and setting the
\fCValue\fP attribute of its descriptor to whatever you want as the
initial namespace.

At startup \*(TC will check this initial value to make sure it
conforms to the rules for properly formed names. That is, it will
check for blank space, a leading \fC$\fP, the presence of square
brackets, and so on. If the initial namespace value you provide is
illegal, \*(TC will produce an error and reset the initial namespace
to "".
.IP \(bu 4
Because lexical namespaces are implemented by treating
\fCNAMESPACE\fP as just another variable, all the type and value
validations available for string variables can be applied to
\fCNAMESPACE\fP. As discussed above, this means you can limit the
length and content of what the user assigns to \fCNAMESPACE\fP. In
effect, this means you can limit the number and name of namespaces
available for use by the user.
There is one slight difference here compared to other variables.
.B The root namespace is always legal,
regardless of what other limitations you may impose via the
\fCLegalVals\fP, \fCMin\fP, and \fCMax\fP attributes of the
\fCNAMESPACE\fP variable descriptor.
.IP \(bu 4
When the call to \fCParseConfig()\fP completes, the \fCValue\fP
attribute of the \fCNAMESPACE\fP variable descriptor will contain the
namespace that was in effect when the parse completed. That is, it
will contain the last namespace used.

.SS How The \*(TC Parser Validates The Initial Symbol Table
When you pass an initial symbol table to the parser, \*(TC does some
basic validation that the table contents properly conform to the
\fCVarDescriptor\fP format and generates error messages if it finds
problems. However, the program does
.B not
check your specifications to see if they make sense. For instance,
if you define an integer with a minimum value of 100 and a maximum
value of 50, \*(TC cheerfully accepts these limits even though they
are impossible. You'll just be unable to do anything with that
variable - any attempt to change its value will cause an error to be
recorded. Similarly, if you put a value in \fCLegalVals\fP that is
outside the range of \fCMin\fP to \fCMax\fP, \*(TC will accept it
quietly.

.SS The \fCAllowNewVars\fP API Option
By default, \*(TC lets the user define any new variables they wish in
a configuration file, merely by placing a line in the file in this
form:

.ft C \" Courier
.nf
    Varname = Value
.fi
.ft \" revert

However, you can disable this capability by calling the parser like
this:

.ft C \" Courier
.nf
    retval = ParseConfig("myconfigfile", AllowNewVars=False)
.fi
.ft \" revert

This means that the configuration file can "reference" any predefined
variables, and even change their values (if they are not Read-Only),
but it cannot create
.B new
variables.
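.P
The effect of \fCAllowNewVars=False\fP can be pictured with a small
model. This is a sketch of the behavior described above, not \*(TC's
actual code: the function \fCassign\fP and the variable names are
hypothetical, and a plain dictionary of strings stands in for a real
symbol table of \fCVarDescriptor\fP objects. It also ignores the
\fCWriteable\fP flag and all type and value validation.
.ft C \" Courier
.nf
```python
def assign(symtable, name, value, allow_new_vars=True):
    """Model one 'name = value' line; return an error string or None."""
    if name not in symtable and not allow_new_vars:
        return "cannot create new variable: " + name
    symtable[name] = value  # existing names are updated, new ones created
    return None


# The calling program predefines exactly one variable.
table = {"LogFile": ""}

errors = [
    assign(table, "LogFile", "/tmp/app.log", allow_new_vars=False),
    assign(table, "Logfile", "/tmp/app.log", allow_new_vars=False),
]
```
.fi
.ft \" revert
Because names are case-sensitive, the second assignment refers to a
variable that does not exist, so the model reports it instead of
silently creating it.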
This feature is primarily intended for use when you pass an initial
symbol table to the parser and you do not want any other variables
defined by the user. Why? There are several possible uses for this
option:
.IP \(bu 4
You know every configuration variable name the calling program will
use ahead of time. Disabling new variable names keeps the
configuration file from getting cluttered with variables that the
calling program will ignore anyway, thereby keeping the file more
readable.
.IP \(bu 4
You want to insulate your user from silent errors caused by
misspellings. Say your program looks for a configuration variable
called \fCMyEmail\fP but the user enters something like
\fCmyemail = foo@bar.com\fP. \fCMyEmail\fP and \fCmyemail\fP are
entirely different variables and only the former is recognized by
your calling program. By turning off new variable creation, the
user's inadvertent misspelling of the desired variable name will be
flagged as an error.
.P
Note, however, that there is one big drawback to disabling new
variable creation. \*(TC processes the configuration file on a
line-by-line basis. No line "continuation" is supported. For really
long variable values and ease of maintenance, it is sometimes helpful
to create "intermediate" variables that hold temporary values used to
construct a variable actually needed by the calling program. For
example:

.ft C \" Courier
.nf
    inter1 = Really, really, really, really, long argument #1
    inter2 = Really, really, really, really, long argument #2

    realvar = command [inter1] [inter2]
.fi
.ft \" revert

If you disable new variable creation you can't do this anymore unless
all the variables \fCinter1\fP, \fCinter2\fP, and \fCrealvar\fP are
predefined in the initial symbol table passed to the parser.

.SS Using Variable Templates
By default, any time a new variable is encountered in a configuration
file, it is created as a string type with no restrictions on its
content or length.
As described above, you can predefine the variable in the initial
symbol table you pass to the parser. This allows you to define that
variable's type and to optionally place various restrictions on the
values it may take. In other words, you can "declare" the variable
ahead of time and \*(TC will do so-called "type and value
enforcement".

"Variable Templates" are a related kind of idea, with a bit of a
twist. They give you a way to "declare" variable type and content
restrictions for selected
.B new
variables discovered in the configuration file. In other words, by
using Variable Templates, you can make sure that a new variable also
has restrictions placed on its type and/or values.

The obvious question here is, "Why not just do this by predefining
every variable of interest in the initial symbol table passed to the
parser?" There are several answers to this:
.IP \(bu 4
The \*(TC configuration language has very powerful "existential"
conditional tests. These test to see if a variable "exists". If you
predefine every variable you will ever need, then the kinds of
existential tests you can do will be somewhat limited (since every
variable
.B does
already exist).

With Variable Templates, you can define the type and value
constraints of a variable which will be applied,
.B but only if you actually bring that variable into existence.
This allows constructs like this to work:

.ft C \" Courier
.nf
    .if [.PLATFORM] == posix
        posix = True
    .endif

    .if [.PLATFORM] == nt
        nt = True
    .endif

    .ifall posix
        ...
    .endif

    .ifall nt
        ...
    .endif

    .ifnone posix nt
        ...
    .endif
.fi
.ft \" revert

In this example, notice that the variables \fCposix\fP and \fCnt\fP
may or may not actually be created, depending on the value of
\fC.PLATFORM\fP. The logic later in the example depends upon this.
If you were to predefine these two variables (to specify type and/or
value restrictions), this type of logical flow would not be possible.
By providing Variable Templates for \fCposix\fP and \fCnt\fP, you can
define their type (likely Boolean in this case) ahead of time
.B and this will be applied if the variable does come into existence.
.IP \(bu 4
The other reason for Variable Templates is more subtle, but gives
\*(TC tremendous versatility beyond just processing configuration
files. Variable Templates give you a way to use \*(TC to build data
validation tools. Suppose you have a list of employee records
exported in this general format (easy to do with most databases):

.ft C \" Courier
.nf
    [Employee#]
    LastName  = ...
    FirstName = ...
    Address   = ...
    City      = ...

    ... and so on
.fi
.ft \" revert

By using the employee's ID as a lexical namespace, we end up creating
new variables for each employee. Say the employee number is
\fC1234\fP. Then we would get \fC1234.LastName\fP,
\fC1234.FirstName\fP, and so on.

Now, here's the subtle part. Notice that the type and content
restrictions of these variables are likely to be the
.B same
for each different employee. By defining Variable Templates for each
of the variables we intend to use over and over again in different
namespace contexts, we can
.B validate
each of them to make sure their content, type, length, and so forth
are correct. This makes it possible to use \*(TC as the
underpinnings of a "data validation" or "cleansing" program.
.IP \(bu 4
Another way to look at this is that Variable Templates give you a way
to define type/value restrictions on an entire "class" of variables.
Instead of having to explicitly predefine variables for every
employee in our example above, you just define templates for the
variable set that is common to all employees. This is
.B way
simpler than predefining every possible variable combination ahead of
time.

.SS The \fCTemplates\fP And \fCTemplatesOnly\fP API Options
Variable Templates are supported with two API options:
\fCTemplates\fP and \fCTemplatesOnly\fP.
\fCTemplates\fP is used to pass a symbol table (separate from the
main symbol table) containing the Variable Templates. By default,
this option is set to \fC{}\fP which means no templates are defined.

So what exactly is a "Variable Template"? It is the
.B exact same thing
as a predefined variable you might pass in the initial symbol table.
In other words, it is a Python dictionary entry where the key is the
variable name and the entry is in \fCVarDescriptor\fP format. The
only difference is that a templated variable does not come into
existence in the main symbol table until a variable by that
.B name
is encountered in the configuration file. Then the variable is
created using the template as its entry in the main symbol table.

For example:

.ft C \" Courier
.nf
    [1234]
    LastName  = Jones
    FirstName = William
    Address   = 123 Main Street
    City      = Anywhere
    State     = WI
    ZIP       = 00000-0000

    [1235]
    LastName  = Jones
    FirstName = Susan
    Address   = 123 Main Street
    City      = Anywhere
    State     = WI
    ZIP       = 00000-0000
.fi
.ft \" revert

Suppose you define variable templates for \fCLastName\fP,
\fCFirstName\fP, \fCAddress\fP, and so on. That is, you define
variables by these names, and define whatever type and content
restrictions you want in each of their \fCVarDescriptor\fPs. You
then pass these to the parser via the \fCTemplates=\fP option.

As \*(TC parses the file and encounters the new variables
\fC1234.LastName\fP ... \fC1235.ZIP\fP, it uses the following "rules"
when creating new variables:
.IP \(bu 4
See if there is a template variable whose name is the same as the
"base" name of the new variable. (The "base" name is just the
variable name without the prepended namespace.)

If there is a template with a matching name, see if the value the
user wants to assign to that variable passes all the type/validation
rules. If so, load the variable into the symbol table and set its
value as requested,
.B using the \fCVarDescriptor\fP object from the template.
(This ensures that future attempts to change the variable's value
will also be type/validation checked.)

If the assignment fails the validation tests, issue an appropriate
error and do
.B not
create the variable in the symbol table.
.IP \(bu 4
If there is no template with a matching name, then just create a new
variable as usual - string type with no restrictions,
.B unless \fCTemplatesOnly\fP is set to \fCTrue\fP.
Setting this option to \fCTrue\fP tells the program that you want to
allow the creation of
.B only
those variables for which templates are defined. This is a way to
restrict just what new variables can be created in any namespace.
\fCTemplatesOnly\fP defaults to \fCFalse\fP which means you can
create new variables even when no template for them exists.
.P
In summary, Variable Templates give you a way to place restrictions
on variable type and content
.B in the event that the variable actually comes into existence.
They also give you a way to define such restrictions for an entire
class of variables without having to explicitly name each such
variable ahead of time. Finally, Variable Templates are an
interesting way to use \*(TC as the basis for data validation
programs.

.SS The \fCLiteralVars\fP API Option
\*(TC supports the inclusion of literal text anywhere in a
configuration file via the \fC.literal\fP directive. This directive
effectively tells the \*(TC parser to pass every line it encounters
"literally" until it sees a corresponding \fC.endliteral\fP
directive. By default, \*(TC does
.B exactly
this. However, \*(TC has very powerful variable substitution
mechanisms. You may want to embed variable references in a "literal"
block and have them replaced by \*(TC.

Here is an example:

.ft C \" Courier
.nf
    MyEmail = me@here.com   # This defines variable MyEmail

    .literal
        printf("[MyEmail]");  /* A C Statement */
    .endliteral
.fi
.ft \" revert

By default, \fCParseConfig()\fP will leave everything within the
\fC.literal\fP/\fC.endliteral\fP block unchanged.
In our example, the string:

.ft C \" Courier
.nf
    printf("[MyEmail]");  /* A C Statement */
.fi
.ft \" revert

would be in the list of literals returned by \fCParseConfig()\fP.

However, we can ask \*(TC to do variable replacement
.B within
literal blocks by setting \fCLiteralVars=True\fP in the
\fCParseConfig()\fP call:

.ft C \" Courier
.nf
    retval = ParseConfig("myconfigfile", LiteralVars=True)
.fi
.ft \" revert

In this example, \*(TC would return:

.ft C \" Courier
.nf
    printf("me@here.com");  /* A C Statement */
.fi
.ft \" revert

At first glance this seems only mildly useful, but it is actually
very handy. As described later in this document, \*(TC has a rich
set of conditional operators and string substitution facilities. You
can use these features along with literal block processing and
variable substitution within those blocks. This effectively lets you
use \*(TC as a preprocessor for
.B any
other language or text.

.SS The \fCReturnPredefs\fP API Option
As described below, \fCtconfpy\fP internally "predefines" a number of
variables. These include variables that describe the current runtime
environment as well as variables that substitute for language
keywords.

These predefined variables are just stored in the symbol table like
any other variable. By default, they are returned with all the
"real" variables discovered in the configuration file. If you want
.B only
the variables actually encountered in the configuration file itself,
set \fCReturnPredefs=False\fP in the \fCParseConfig()\fP API call.
This will cause \fCtconfpy\fP to strip out all the predefined
variables before returning the final symbol table.

Note that this option also removes the \fCNAMESPACE\fP variable since
it is understood to also be outside the configuration file (even
though you may have passed an initial version of \fCNAMESPACE\fP to
the parser).

Note also that this option applies only to the variables predefined
by
.B \fCtconfpy\fP
itself.
Any variables
.B you
predefine when passing an initial symbol table will be returned as
usual, regardless of the state of this option.

.SS The \fCDebug\fP API Option
\*(TC has a fairly rich set of debugging features built into its
parser. It can provide some detail about each line parsed as well as
overall information about the parse. By default, debugging is turned
off. To enable debugging, merely set \fCDebug=True\fP in the API
call:

.ft C \" Courier
.nf
    retval = ParseConfig("myconfigfile", Debug=True)
.fi
.ft \" revert

.SS How \*(TC Processes Errors
As a general matter, when \*(TC encounters an error in the
configuration file currently being parsed, it does two things.
First, it adds a descriptive error message into the list of errors
returned to the calling program (see the next section). Second, in
many cases, notably during conditional processing, it sets the parser
state so the block in which the error occurred is logically
\fCFalse\fP. This does not happen in every case, however. If you
are having problems with errors, enable the Debugging features of the
package and look at the debug output. It provides detailed
information about what caused the error and why.

.SS \*(TC Return Values
When \*(TC is finished processing the configuration file, it returns
an object that contains the entire results of the parse. This
includes a symbol table, any relevant error or warning messages,
debug information (if you requested this via the API), and any
"literal" lines encountered in the configuration.

The return object is an instance of the class \fCtconfpy.RetObj\fP
which is nothing more than a container to hold return data. In the
simplest case, we can parse and extract results like this:

.ft C \" Courier
.nf
    from tconfpy import *

    retval = ParseConfig("myconfigfile", Debug=True)
.fi
.ft \" revert

\fCretval\fP now contains the results of the parse:
.TP
.B retval.Errors
A Python list containing error messages.
If this list is empty, you can infer that there were no parsing errors - i.e., The configuration file was OK. .TP .B retval.Warnings A Python list containing warning messages. These describe minor problems not fatal to the parse process, but that you really ought to clean up in the configuration file. .TP .B retval.SymTable A Python dictionary which lists all the defined symbols and their associated values. A "value" in this case is always an object of type tconfpy.VarDescriptor (as described above). .TP .B retval.Literals As described below, the \*(TC configuration language supports a \fC.literal\fP directive. This directive allows the user to embed literal text anywhere in the configuration file. This effectively makes \*(TC useful as a preprocessor for any other language or text. retval.Literals is a Python list containing all literal text discovered during the parse. The lines appear there in the order they were discovered in the configuration file. .TP .B retval.Debug A Python list containing detailed debug information for each line parsed as well as some brief summary information about the parse. retval.Debug defaults to an empty list and is only populated if you set \fCDebug=True\fP in the API call that initiated the parse (as in the example above). .SH CONFIGURATION LANGUAGE REFERENCE \*(TC recognizes a full-featured configuration language that includes variable creation and value assignment, a full preprocessor with conditionals, type and value enforcement, and lexical namespaces. This section of the document describes that language and provides examples of how each feature can be used. .SS \*(TC Configuration Language Syntax \*(TC supports a fairly simple and direct configuration language syntax: .IP \(bu 4 Each line is treated independently. There is no line "continuation". .IP \(bu 4 The \fC#\fP can begin a comment anywhere on a line. This is done blindly. 
If you need to embed this symbol somewhere within a variable value, use the \fC[HASH]\fP variable reference. .IP \(bu 4 Whitespace is (mostly) insignificant. Leading and trailing whitespace is ignored, as is whitespace around comparison operators. However, there are some places where whitespace matters: - Variable names may not contain whitespace - Directives must be followed by whitespace if they take other arguments. - When assigning a value to a string variable, whitespace within the value on the right-hand-side is preserved. Leading and trailing whitespace around the right-hand-side of the assignment is ignored. - Whitespace within both the left- and right-hand-side arguments of a conditional comparison (\fC.if ... == / != ...\fP) is significant for purposes of the comparison. .IP \(bu 4 Case is always significant except when assigning a value to Booleans (described in the section below entitled, .B Some Further Notes On Boolean Variables ). .IP \(bu 4 Regardless of a variable's type, all variable references return .B a string representation of the variable's value! This is done so that the variable's value can be used for comparison testing and string substitution/concatenation. In other words, variables are stored in their native type in the symbol table that is returned to the calling program. However, they are treated as strings during the parsing of the configuration file whenever they are used in a comparison test or in a substitution. .IP \(bu 4 Text inside a literal block (see section below on the \fC.literal\fP directive) is left untouched. Whitespace, the \fC#\fP symbol, and so on are not interpreted in any way and are passed back to the calling program as-is. The one exception to this rule is when variable substitution inside literal blocks is enabled. This is discussed in a later section of this document as well.
.IP \(bu 4 Any line which does not conform to these rules and/or is not in the proper format for one of the operations described below, is considered an error. .SS Creating Variables And Assigning A Value The heart of a configuration file is a "variable". Variables are stored in a "Symbol Table" which is returned to the calling program once the configuration file has been processed. The calling program can predefine any variables it wishes before processing a configuration file. You can normally also define your own new variables in the configuration file as desired (unless the programmer has inhibited new variable creation). Variables are assigned values like this: .ft C \" Courier .nf MyVariable = Some string of text .fi .ft \" revert If \fCMyVariable\fP is a new variable, \*(TC will create it on the spot. If it already exists, \*(TC will first check and make sure that \fCSome string of text\fP is a legal value for this variable. If not, it will produce an error and refuse to change the current value of \fCMyVariable\fP. Anytime you create a new variable, the first value assigned to it is also considered its "default" value. This may (or may not) be meaningful to the application program. Variables which are newly-defined in a configuration file are always understood to be .B string variables - i.e., They hold "strings" of text. However, it is possible for the applications programmer to predefine variables with other types and place limitations on what values the variable can take and/or how short or long a string variable may be. (See the previous section, .B PROGRAMMING USING THE \*(TC API for all the gory details.) The programmer can also arrange for the configuration file to only have access to variables predefined by the program ahead of time. In that case, if you try to create a new variable, \*(TC will produce an appropriate error and the new variable will not be created. 
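The assignment rules described in this section can be modeled in a few lines of Python. The following is only an illustrative sketch and is not how \*(TC itself is implemented: the names \fCToyVar\fP, \fCassign\fP, \fCallow_new\fP, and \fCis_legal\fP are hypothetical and exist only for this example. It shows a new variable being created as a string whose first value becomes its default, a later assignment replacing the value, and an attempt to create a variable being refused when new variables are not allowed.

```python
from dataclasses import dataclass


@dataclass
class ToyVar:
    """Hypothetical stand-in for a symbol table entry (not a tconfpy class)."""
    value: str
    default: str


def assign(symtable, name, value, allow_new=True,
           is_legal=lambda name, value: True):
    """Apply one 'name = value' statement; return an error message or None."""
    value = value.strip()          # leading/trailing whitespace is ignored
    if name not in symtable:
        if not allow_new:
            return f"cannot create new variable '{name}'"
        # The first value assigned to a new variable is also its default.
        symtable[name] = ToyVar(value=value, default=value)
        return None
    if not is_legal(name, value):
        # A failed check leaves the current value unchanged.
        return f"illegal value '{value}' for variable '{name}'"
    symtable[name].value = value
    return None


table = {}
errors = [
    assign(table, "MyVariable", "Some string of text"),
    assign(table, "MyVariable", "Another string"),
    assign(table, "Other", "x", allow_new=False),
]
```

After this runs, \fCMyVariable\fP holds the second string while its default is still the first one, and \fCOther\fP was never created because the last call disallowed new variables.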
.SS Variable Names Variables can be named pretty much anything you like, with certain restrictions: .IP \(bu 4 Variable names may not contain whitespace. .IP \(bu 4 Variable names may not begin with the \fC$\fP character. The one exception to this is when you are referencing the value of an environment variable. References to environment variables begin with \fC$\fP: .ft C \" Courier .nf # A reference to an environment variable is legal x = [$TERM] # Attempting to create a new variable starting with $ is illegal $MYVAR = something .fi .ft \" revert .IP \(bu 4 Variable names cannot have the \fC#\fP character anywhere in them because \*(TC sees that character as the beginning of a comment. .IP \(bu 4 Variable names cannot begin with the \fC.\fP character. \*(TC understands a leading period in a variable name to be a "namespace escape". This is discussed in a later section on lexical namespaces. .IP \(bu 4 Variable names cannot contain the \fC[\fP or \fC]\fP characters. These are reserved symbols used to indicate a variable .B reference. .IP \(bu 4 You cannot have a variable whose name is the empty string. This is illegal: .ft C \" Courier .nf = String .fi .ft \" revert .IP \(bu 4 The variable named \fCNAMESPACE\fP is not available for your own use. \*(TC understands this variable to hold the current lexical namespace as described later in this document. If you set it to a new value, it will change the namespace, so be sure this is what you wanted to do. .SS Getting And Using The Value Of A Variable You can get the value of any currently defined variable by .B referencing it like this: .ft C \" Courier .nf .... [MyVariable] ... .fi .ft \" revert The brackets surrounding any name are what indicate that you want that variable's value. You can also get the value of any Environment Variable on your system by naming the variable with a leading \fC$\fP: .ft C \" Courier .nf ... [$USER] ...
# Gets the value of the USER environment variable .fi .ft \" revert However, you cannot set the value of an environment variable: .ft C \" Courier .nf $USER = me # This is not permitted .fi .ft \" revert This ability to both set and retrieve variable content makes it easy to combine variables through "substitution": .ft C \" Courier .nf MYNAME = Mr. Tconfpy MYAGE = 101 Greeting = Hello [MYNAME], you look great for someone [MYAGE]! .fi .ft \" revert Several observations are worth noting here: .IP \(bu 4 The substitution of variables takes place as soon as the parser processes the line \fCGreeting = ...\fP. That is, variable substitution happens as it is encountered in the configuration file. The only exception to this is if an attempt is made to refer to an undefined/non-existent variable. This generates an error. .IP \(bu 4 The variable \fCGreeting\fP now contains the .B string "Hello Mr. Tconfpy, you look great for someone 101!" This is true even if variable \fCMYAGE\fP has been defined by the calling program to be an integer type. To repeat a previously-made point: All variable substitution and comparison operations in a configuration file are done with .B strings regardless of the actual type of the variables involved. .IP \(bu 4 Variables must be .B defined before they can be .B referenced. \*(TC does not support so-called "forward" references. .IP \(bu 4 Unless a variable has been marked as "Read Only" by the application program, you can continue to change its value as you go. Simply adding another line at the end of our example above will change the value of \fCGreeting\fP to something new: .ft C \" Courier .nf Greeting = Generic Greeting Message .fi .ft \" revert In other words, the last assignment statement for a given variable "wins". This may seem sort of pointless, but it actually has great utility. You can use the \fC.include\fP directive to get, say, a "standard" configuration provided by the system administrator for a particular application.
You can then selectively override the variables you want to change in your own configuration file. .SS Indirect Variable Assignment The dereferencing of a variable's value can take place on either the right- or .B left-hand-side of an assignment statement. This means so-called "indirect" variable assignments are permitted: .ft C \" Courier .nf CurrentTask = HouseCleaning [CurrentTask] = Dad .fi .ft \" revert To understand what this does you need to realize that before \*(TC does anything with a statement in a configuration file, it replaces every variable reference with its associated value (or produces an error for references to non-existent variables). So the second statement above is first converted to: .ft C \" Courier .nf HouseCleaning = Dad .fi .ft \" revert i.e., The value \fCDad\fP is assigned to a (new) variable called \fCHouseCleaning\fP. In other words, putting a variable reference on the left-hand-side of an assignment like this allows you to access another variable which is named "indirectly". You have to be careful when doing this, though. Consider a similar, but slightly different example: .ft C \" Courier .nf CurrentTask = House Cleaning # This is fine [CurrentTask] = Dad # Bad! .fi .ft \" revert The reason this no longer works is that the indirect reference causes the second line to parse to: .ft C \" Courier .nf House Cleaning = Dad .fi .ft \" revert This is illegal because whitespace is not permitted in variable names. \*(TC will produce an error if it sees such a construct. As a general matter, any variable you construct through this indirection method must still conform to all the rules of variable naming: It cannot contain whitespace, begin with \fC$\fP, contain \fC#\fP, \fC[\fP, or \fC]\fP and so on. Another example of how indirection can "bite" you is when the value of the variable begins with a period. 
As you'll see in the following section on Lexical Namespaces, a variable name beginning with a period is understood to be an "absolute" variable name reference (relative to the root namespace). This can cause unexpected (though correct) behavior when doing indirect variable access: .ft C \" Courier .nf NAMESPACE = NS1 foo = .bar # Creates variable NS1.foo with value .bar [foo] = baz # Means [NS1.foo] = baz .fi .ft \" revert The second assignment statement in this example does not do what you might initially think. Remember, \*(TC always does variable dereferencing before anything else, so the second statement becomes: .ft C \" Courier .nf .bar = baz .fi .ft \" revert As you'll see in the section on Lexical Namespaces below, this actually means, "Set the variable \fCbar\fP in the root namespace to the value \fCbaz\fP." In other words, if you do indirect variable assignment, and the content of that variable begins with a period, you will be creating/setting a variable in .B the root namespace. Be sure this is what you intended to do. Get into the habit of reading \fC[something]\fP as, "The current value of \fCsomething\fP". See if you understand what the following does (if you don't, try it out with \fCtest-tc.py\fP): .ft C \" Courier .nf foo = 1 bar = 2 [foo] = bar [bar] = [foo] .fi .ft \" revert You can get pretty creative with this since variable references can occur pretty much anywhere in an assignment statement. The only place they cannot appear is .B within another variable reference. That is, you cannot "nest" references: .ft C \" Courier .nf # The Following Is Fine FOO = Goodness BAR = Me Oh[FOO][BAR] = Goodness Gracious Me! # But This Kind Of Nesting Attempt Causes An Error [FOO[BAR]] = Something Or Other .fi .ft \" revert .SS Introducing Lexical Namespaces So far, the discussion of variables and references has conveniently ignored the presence of another related \*(TC feature, "Lexical Namespaces."
Namespaces are a way to automatically group related variables together. Suppose you wanted to describe the options on your car in a configuration file. You might do this: .ft C \" Courier .nf MyCar.Brand = Ferrari MyCar.Model = 250 GTO MyCar.Color = Red # And so on ... .fi .ft \" revert You'll notice that every variable starts with the "thing" that each item has in common - they are features of \fCMyCar\fP. We can simplify this considerably by introducing a lexical namespace: .ft C \" Courier .nf [MyCar] Brand = Ferrari Model = 250 GTO Color = Red .fi .ft \" revert The first statement looks like a variable reference, but it is not. .B A string inside square brackets by itself on a line introduces a namespace. The first statement in this example sets the namespace to \fCMyCar\fP. From that point forward until the namespace is changed again, every variable assignment .B and reference is "relative" to the namespace. What this really means is that \*(TC sticks the namespace plus a \fC.\fP in front of every variable assigned or referenced. It does this automatically and invisibly, so \fCBrand\fP is turned into \fCMyCar.Brand\fP and so on. You can actually check this by loading the example above into a test configuration file and running the \fCtest-tc.py\fP program on it. You will see the "fully qualified" variable names that actually were loaded into the symbol table, each beginning with \fCMyCar.\fP and ending with the variable name you specified. Realize that this is entirely a naming "trick". \*(TC has no clue what the namespace .B means, it just combines the current namespace with the variable name to create the actual variable name that will be returned in the symbol table. You're likely scratching your head wondering why on earth this feature is present in \*(TC. There are several good reasons for it: .IP \(bu 4 It reduces typing repetitive information throughout the configuration file. In turn, this reduces the likelihood of a typographical or spelling error.
.IP \(bu 4 It helps visibly organize the configuration file. A namespace makes it clear which variables are related to each other somehow. This is no big deal in small configurations, but \*(TC was written with the idea of supporting configuration files that might contain thousands or even tens of thousands of entries. .IP \(bu 4 It simplifies the application programmer's job. Say I want to write a program that extracts all the information about your car from the configuration file, but I don't know ahead of time how many things you will describe. All I really have to know is that you are using \fCMyCar\fP as the namespace for this information. My program can then just scan the symbol table after the configuration file has been parsed, looking for variables whose name begins with \fCMyCar.\fP. So if you want to add other details about your auto like, say, \fCAge\fP, \fCPrice\fP, and so on, you can do so later .B and the program does not have to be rewritten. .IP \(bu 4 It helps enforce correct configuration files. By default, you can introduce new namespaces into the configuration file any time you like. However, as described in the previous section on the \*(TC API, the application programmer can limit you to a predefined set of legal namespaces (via the \fCLegalVals\fP attribute of the \fCNAMESPACE\fP variable descriptor). By doing this, the programmer is helping you avoid incorrect configuration file entries by limiting just which namespaces you can enter to reference or create variables. .SS Rules For Using Lexical Namespace Creating and using lexical namespaces is fairly straightforward, but there are a few restrictions and rules: .IP \(bu 4 The default initial namespace is the empty string, "". In this one case, \*(TC does nothing to variables assigned or referenced. That's why our early examples in the previous section worked. 
When we assigned a value to a variable and then referenced that variable value, we did so while in the so-called "root" namespace, "". When the namespace is "", nothing is done to the variable names. Bear in mind that the programmer can change this default namespace to something other than "" before the configuration file is ever processed. If they do this, they would be well advised to let their users know this fact. .IP \(bu 4 There are two ways to change to a new namespace: .ft C \" Courier .nf [NewNameSpace] # May optionally have a comment OR NAMESPACE = NewNamespace # May optionally have a comment .fi .ft \" revert If, at any point, you want to return to the root namespace, you can use one of these two methods: .ft C \" Courier .nf [] OR NAMESPACE = .fi .ft \" revert So, why are there two ways to do the same thing? The first way is the more common, and the more readable way to do it. It appears on a line by itself and makes it clear that the namespace is being changed. However, because variable references cannot be "nested", you can only use strings of text here. Suppose you want to change the namespace in a way that depends on the value of another variable. For instance: .ft C \" Courier .nf LOCATION = Timbuktu NAMESPACE = [LOCATION]-East .fi .ft \" revert In other words, the second form of a namespace change allows you to employ the \*(TC string substitution and variable referencing features. Bear in mind that \*(TC is case-sensitive so this will not work as you expect: .ft C \" Courier .nf Namespace = something .fi .ft \" revert This just sets the value of the variable \fCNamespace\fP to \fCsomething\fP and has nothing whatsoever to do with lexical namespaces. .IP \(bu 4 Whichever method you use to change it, .B the new namespace must follow all the same rules used for naming variables.
For example, both of the following will cause an error: .ft C \" Courier .nf [$FOO] OR x = $FOO NAMESPACE = [x] .fi .ft \" revert .IP \(bu 4 By default, all variable assignments and references are .B relative to the currently active namespace: .ft C \" Courier .nf [MyNameSpace] foo = 123 # Creates a variable called MyNameSpace.foo x = [bar] # Means: MyNameSpace.x = [MyNameSpace.bar] .fi .ft \" revert .IP \(bu 4 If you want to set or reference a variable in a namespace different than the current namespace, you must use a so-called "absolute" variable name. You do this by "escaping" the variable name. To escape the name, begin it with a \fC.\fP and then use the .B full name (including namespace) of that variable. (This is called the "fully qualified variable name".) For example: .ft C \" Courier .nf [NS1] # Switch to the namespace NS1 foo = 14 # Creates NS1.foo [NS2] # Switch to the NS2 namespace foo = [.NS1.foo] # Sets NS2.foo = 14 .fi .ft \" revert There is another clever way to do this without using the escape character. \*(TC has no understanding whatsoever of what a lexical namespace actually is. It does nothing more than "glue" the current namespace to any variable names and references in your configuration file. Internally, all variables are named .B relative to the root namespace. This means that you can use the fully qualified variable name without any escape character any time you are in the root namespace: .ft C \" Courier .nf [NS1] # Switch to the namespace NS1 foo = 14 # Creates NS1.foo [] # Switch to the root namespace foo = [NS1.foo] # Sets foo = 14 - no escape needed .fi .ft \" revert .IP \(bu 4 Lexical namespaces are implemented by having \fCNAMESPACE\fP just be nothing more than (yet) another variable in the symbol table. \*(TC just understands that variable to be special - it treats it as the repository of the current lexical namespace.
This means you can use the value of NAMESPACE in your own string substitutions: .ft C \" Courier .nf MyVar = [NAMESPACE]-Isn't This Cool? .fi .ft \" revert You can even use the current value of NAMESPACE when setting a new namespace: .ft C \" Courier .nf NAMESPACE = [NAMESPACE]-New .fi .ft \" revert One final, but very important point is worth noting here. The \fCNAMESPACE\fP variable itself is always understood to be .B relative to the root namespace. No matter what the current namespace actually is, \fC[NAMESPACE]\fP or \fCNAMESPACE = ...\fP always set a variable by that name in the root namespace. Similarly, when we use a variable reference to get the current namespace value (as we did in the example above), \fCNAMESPACE\fP is understood to be relative to the root namespace. That's why things like this work: .ft C \" Courier .nf [MyNewSpace] x = 100 # MyNewSpace.x = 100 y = [NAMESPACE]-1 # MyNewSpace.y = MyNewSpace-1 NAMESPACE = NewSpace # .NAMESPACE = NewSpace .fi .ft \" revert .SS Predefined Variables \*(TC predefines a number of variables. The \fCNAMESPACE\fP variable we discussed in the previous section is one of them, but there are a number of others of which you should be aware. Note that all predefined variables .B are relative to the root namespace. Except for the \fCNAMESPACE\fP variable, they are all Read Only and cannot be modified in your configuration file. The first group of predefined variables are called "System Variables". As the name implies, they provide information about the system on which you're running. These are primarily useful when doing conditional tests (described later in this document). For example, by doing conditional tests with System Variables you can have one configuration file that works on both Unix and Windows operating systems. The System Variables are: .ft C \" Courier .nf Variable Name Contains ------------- -------- .MACHINENAME - The name of the computer on which you are running. 
May also include full domain name, depending on system. .OSDETAILS - Detailed information about the operating system in use. .OSNAME - The name of the operating system in use. .OSRELEASE - The version of the operating system in use. .OSTYPE - Generic name of the operating system in use. .PLATFORM - Generic type of the operating system in use. .PYTHONVERSION - The version of Python in use. .fi .ft \" revert By combining these System Variables as well as the content of selected Environment Variables, you can create complex conditional configurations that "adapt" to the system on which a Python application is running. For example: .ft C \" Courier .nf .if [.MACHINENAME] == foo.bar.com BKU = tar .else BKU = [$BACKUPPROGRAM] .endif .fi .ft \" revert The other kind of predefined variables are called "Reserved Variables". \*(TC understands a number of symbols as part of its own language. For example, the string \fC#\fP tells \*(TC to begin a comment until end-of-line. There may be times, however, when .B you need these strings for your own use. In other words, you would like to use one of the strings which comprise the \*(TC language for your own purposes and have \*(TC ignore them. The Reserved Variables give you a way to do this. The Reserved Variables are: .ft C \" Courier .nf Variable Name Contains ------------- -------- DELIML [ DELIMR ] DOLLAR $ ELSE .else ENDIF .endif ENDLITERAL .endliteral EQUAL = EQUIV == HASH # IF .if IFALL .ifall IFANY .ifany IFNONE .ifnone INCLUDE .include LITERAL .literal NOTEQUIV != PERIOD . .fi .ft \" revert For instance, suppose you wanted to include the \fC#\fP symbol in the value of one of your variables. This will not work, because \*(TC interprets it as the beginning of a comment, which is not what you want: .ft C \" Courier .nf MyJersey = Is #23 .fi .ft \" revert So, we use one of the Reserved Variables to get what we want: .ft C \" Courier .nf MyJersey = Is [HASH]23 .fi .ft \" revert One word of warning, though.
At the end of the day, you still have to create variable names or namespace names that are legal. You can't "sneak" illegal characters into these names using Reserved Variables: .ft C \" Courier .nf foo = [DOLLAR]MyNewNamespace # No problem NAMESPACE = [foo] # No way - namespace cannot start with $ .fi .ft \" revert .SS Type And Value Enforcement By default, any variable (or namespace) you create in a configuration file is understood to just hold a string of characters. There are no limits to what that string may contain, how long it is, and so on. However, \*(TC gives the programmer considerable power to enforce variable types and values, if they so choose. (See the section above entitled, .B PROGRAMMING USING THE \*(TC API for the details.) The programmer can set all kinds of limitations about a variable's type, permissible values, and (in the case of strings) how long or short it may be. The programmer does this by defining these limitations for each variable of interest .B prior to calling \*(TC to parse your configuration file. In that case, when \*(TC actually processes the configuration file, it "enforces" these restrictions any time you attempt to change the value of one of these variables. If you try to assign a value that fails one of these "validation" tests, \*(TC will produce an error and leave the variable's value unchanged. For instance, suppose the programmer has defined variable "Foo" to be a floating point number, and that it must have a value between -10.5 and 100.1. In that case: .ft C \" Courier .nf Foo = 6.023E23 # Error - Value is out of range Foo = MyGoodness # Error - Value must be a FP number, not a string Foo = -2.387 # Good - Value is both FP and in range .fi .ft \" revert .SS What Specific Validations Are Available? The programmer has several different restrictions they can place on a variable's value. You do not need to understand how they work, merely what they are so that any error messages you see will make sense.
.IP \(bu 4 The programmer may declare any variable to be .B Read Only. This means you can still use references to that variable to extract its value, but any attempt to change its value within the configuration file will fail and produce an error. .IP \(bu 4 The programmer may specify the variable's .B type as string (the default), integer, floating point, complex, or boolean. .IP \(bu 4 The programmer may specify .B the set of all legal values that can be assigned to a variable. For instance, the programmer might specify that the floating point variable \fCTranscend\fP can only be set to either 3.14 or 2.73. Similarly, the programmer might specify that the string variable \fCCOLOR\fP can only ever be set to \fCRed\fP, \fCGreen\fP, or \fCBlue\fP. In fact, in the case of string variables, the programmer can actually specify a set of patterns (regular expressions) the value has to match. For instance, they can demand that you only set a particular string variable to strings that begin with \fCa\fP and end with \fCbob\fP. .IP \(bu 4 For integer and floating point variables, the programmer can specify a legal .B value range for the variable. If you change the value of such a variable, that value must be within the defined range or you'll get an error. .IP \(bu 4 For string variables, the programmer can specify a minimum and maximum .B length for the strings you assign to the variable in question. .IP \(bu 4 The programmer can limit you to .B only being able to use existing variables. (i.e., the Predefined variables and any variables the programmer has defined ahead of time.) In that case, any attempt to create a new variable in the configuration file will fail and produce an error. .IP \(bu 4 The programmer can limit you to .B only being able to use namespaces they have defined ahead of time. In that case, any attempt to enter a namespace not on the list the programmer created ahead of time will fail and produce an error.
.IP \(bu 4 The programmer can enable or prevent .B the substitution of variable references in literal blocks (see below). If they disable this option, something like \fC[Foo]\fP is left unchanged within the literal block. i.e., It too, is treated "literally". .SS Notes On Variable Type/Value Enforcement There are a few other things you should know about how \*(TC enforces restrictions on variables: .IP \(bu 4 For purposes of processing the configuration file, .B variable references are always converted to strings regardless of the actual type of the variable in question. (Variables are stored in the symbol table in their actual type.) For instance, suppose the programmer defines variable \fCFoo\fP to be floating point. Then: .ft C \" Courier .nf Foo = 1.23 Bar = Value is [Foo] # Creates a new *string* variable with the # value: "Value is 1.23" .fi .ft \" revert In other words, variable values are "coerced" into strings for the purposes of substitution and conditional testing within a configuration file. This is primarily an issue with the conditional comparisons below. For example, the following conditional is \fCFalse\fP because the string representations of the two numbers are different. Assume \fCf1\fP and \fCf2\fP have been defined as floating point variables by the calling program: .ft C \" Courier .nf f1 = 1.0 f2 = 1.00 .if [f1] == [f2] # False because "1.0" is not the same string as "1.00" ... .fi .ft \" revert .IP \(bu 4 You cannot create anything but a string variable within a configuration file. This variable will have no restrictions placed on its values. All validation features .B require the limitations to be specified by the calling program ahead of time. .IP \(bu 4 Similarly, you cannot change any of the enforcement options from within a configuration file. These features are only available under program control, presumably by the application program that is calling \*(TC. 
.IP \(bu 4 There is no way to know what the limitations are on a particular variable from within the configuration file. Programmers who use these features should document the variable restrictions they've employed as a part of the documentation for the application in question. .SS Some Further Notes On Boolean Variables One last note here concerns Boolean variables. Booleans are actually stored in the symbol table as the Python boolean values, \fCTrue\fP or \fCFalse\fP. However, \*(TC accepts user statements that set the value of the boolean in a number of formats: .ft C \" Courier .nf Boolean True Boolean False ------------ ------------- foo = 1 foo = 0 foo = True foo = False foo = Yes foo = No foo = On foo = Off .fi .ft \" revert This is the one case where \*(TC is insensitive to case - \fCtRUE\fP, \fCTRUE\fP, and \fCtrue\fP are all accepted, for example. .B NOTE HOWEVER: If the user wants to do a conditional test on the value of a boolean they .B must observe case and test for either \fCTrue\fP or \fCFalse\fP: .ft C \" Courier .nf boolvar = No .if [boolvar] == False # This works fine .if [boolvar] == FALSE # This does not work - Case is not being observed .if [boolvar] == Off # Neither does this - Only True/False can be tested .fi .ft \" revert .SS The \fC.include\fP Directive At any point in a configuration file, you can "include" another configuration file like this: .ft C \" Courier .nf .include filename .fi .ft \" revert In fact, you can use all the variable substitution and string concatenation features we've already discussed to do this symbolically: .ft C \" Courier .nf Base = MyConfig Ver = 1.01 .include [Base]-[Ver].cfg .fi .ft \" revert The whitespace after the \fC.include\fP directive is mandatory to separate it from the file name. You can have as many \fC.include\fP statements in your configuration file as you wish, and they may appear anywhere. The only restriction is that they must appear on a line by themselves (with an optional comment). 
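The symbolic form of \fC.include\fP shown above depends on every \fC[name]\fP reference being replaced by the string form of that variable's value before the directive is acted upon. The following Python fragment is a minimal sketch of that substitution step only. It is an illustration, not \*(TC's own code: the names \fCREFERENCE\fP and \fCsubstitute\fP are hypothetical, and environment variable references and namespaces are deliberately left out.

```python
import re

# A variable reference is a name in square brackets with no nested brackets.
REFERENCE = re.compile(r"\[([^\[\]]+)\]")


def substitute(text, symtable):
    """Replace each [name] in text with the string form of its value."""
    def lookup(match):
        name = match.group(1)
        if name not in symtable:
            # References to undefined variables are treated as errors.
            raise ValueError(f"reference to undefined variable: {name}")
        return str(symtable[name])
    return REFERENCE.sub(lookup, text)


symbols = {"Base": "MyConfig", "Ver": "1.01"}
include_name = substitute("[Base]-[Ver].cfg", symbols)
print(include_name)   # MyConfig-1.01.cfg
```

With the two assignments from the example above, the argument to \fC.include\fP resolves to \fCMyConfig-1.01.cfg\fP.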

Why bother?  There are several reasons:
.IP \(bu 4
This makes it easy to break up large, complex configurations into
simpler (smaller) pieces.  This generally makes things easier to
maintain.
.IP \(bu 4
This makes it easy to "factor" common configuration information into
separate files which can then be used by different programs as
needed.
.IP \(bu 4
The most common use for \fC.include\fP is to load a "standard"
configuration for your program.  Recall that the last assignment of
a variable's value "wins".  Suppose you want all the standard
settings for a program, but you just want to change one or two
options.  Instead of requiring each user to have the whole set of
standard settings in their own configuration file, the system
administrator can make them available as a common configuration.
You then \fC.include\fP that file and override any options you like:

.ft C \" Courier
.nf
    # Get the standard options
    .include /usr/local/etc/MyAppStandardConfig.cfg

    # Override the ones you like
    ScreenColor = Blue
    Currency    = Euros
.fi
.ft \" revert

This makes maintenance of complex configuration files much simpler.
There is only one master copy of the configuration that needs to be
edited when system-wide changes are required.
.P
One last thing needs to be noted here.  \*(TC does not detect
so-called "circular" inclusions.  If file \fCa\fP \fC.include\fPs
file \fCb\fP and file \fCb\fP \fC.include\fPs file \fCa\fP, you will
have an infinite loop of inclusion, which, uh ..., is a Bad Thing...
.SS Conditional Directives
One of the most powerful features of \*(TC is its "conditional
processing" capabilities.  The general idea is to test some condition
and
.B include or exclude
configuration information based on the outcome of the test.

What's the point?  You can build large, complex configurations that
test things like environment variables, one of the Predefined
Variables, or even a variable you've set previously in the
configuration file.
In other words, the resulting configuration is then produced in a way
that is appropriate for that particular system, on that particular
day, for that particular user, ...

By using conditional directives, you can create a single
configuration file that works for every user regardless of operating
system, location, and so on.

There are two kinds of conditional directives.  "Existential
Conditionals" test to see if a configuration or environment variable
.B exists.
Existential Conditionals pay no attention to the
.B value
of the variables in question, merely whether or not those variables
have been defined.

"Comparison Conditionals" actually
.B compare
two strings.  Typically, one or more variable references appear in
the compared strings.  In this case, the
.B value
of the variable is important.

The general structure of any conditional looks like this:

.ft C \" Courier
.nf
    ConditionalDirective Argument(s)

         This is included if the conditional was True

    .else   # Optional

         This is included if the conditional was False

    .endif  # Required
.fi
.ft \" revert

Except for the whitespace after the conditional directive itself,
whitespace is not significant.  You may indent as you wish.

Conditionals may also be "nested".  You can have a conditional
within another conditional or \fC.else\fP block:

.ft C \" Courier
.nf
    ConditionalDirective Argument(s)
         stuff
         ConditionalDirective Argument(s)
              more stuff
         .endif

         interesting stuff
    .else
         yet more stuff
         ConditionalDirective Argument(s)
              other stuff
         .endif
         ending stuff
    .endif
.fi
.ft \" revert

There are no explicit limits to how deeply you can nest
conditionals.  However, you must have an \fC.endif\fP that
terminates each conditional test.  Bear in mind that \*(TC pays no
attention to your indentation.  It associates an \fC.endif\fP
.B with the last conditional it encountered.
That's why it's a really good idea to use some consistent
indentation style so
.B you
can understand the logical structure of the conditions.
It's also a good idea to put comments throughout such conditional
blocks so it's clear what is going on.

There are a few general rules to keep in mind when dealing with
conditionals:
.IP \(bu 4
There must be whitespace between the conditional directive and its
arguments (which may or may not have whitespace in them).
.IP \(bu 4
As with any other kind of \*(TC statement, you may place comments
anywhere at the end of a conditional directive line or within the
conditional blocks.
.IP \(bu 4
Each conditional directive must have a corresponding \fC.endif\fP.
If you have more conditionals than \fC.endif\fPs or vice-versa,
\*(TC will produce an error message to that effect.  It can get
complicated to keep track of this, especially with deeply nested
conditionals.  It is therefore recommended that you always begin and
end conditional blocks within the same file.  That is, don't start a
conditional in one file and then \fC.include\fP another file that
has the terminating \fC.endif\fP in it.
.IP \(bu 4
The \fC.else\fP clause is optional.  However, it can only appear
after some preceding conditional directive.
.IP \(bu 4
As in other parts of the \*(TC language, variable names and
references in conditional directives are always relative to the
currently active namespace unless they are escaped with a leading
period.  Similarly, in this context, Environment Variables,
Predefined Variables, and the NAMESPACE Variable are always relative
to the root namespace, no matter what namespace is currently active.
.SS Existential Conditional Directives
There are three Existential Conditionals: \fC.ifall\fP,
\fC.ifany\fP, and \fC.ifnone\fP.  Each has the same syntax:

.ft C \" Courier
.nf
    ExistentialDirective varname ...
        included if test was True
    .else   # optional
        included if test was False
    .endif
.fi
.ft \" revert

In other words, existential conditionals require one or more
.B variable names.
In each case, the actual content of that variable is ignored.
The test merely checks to see if a variable by that name
.B exists.
Nothing else may appear on an existential conditional line, except,
perhaps, a comment.

The three forms of existential conditional tests implement three
different kinds of logic:

.ft C \" Courier
.nf
    .ifall var1 var2 ...

        This is a logical "AND" operation.  ALL of the variables,
        var1, var2 ... must exist for this test to be True.

    .ifany var1 var2 ...

        This is a logical "OR" operation.  It is True if ANY of the
        variables, var1, var2 ... exist.

    .ifnone var1 var2 ...

        This is a logical "NOR" operation.  It is True only if NONE
        of the variables, var1, var2 ... exist.
.fi
.ft \" revert

Here is an example:

.ft C \" Courier
.nf
    FOO = 1
    BAR = 2
    z   = 0

    .ifall FOO BAR
        x = 1
    .endif

    .ifany FOO foo fOo
        y = 2
    .endif

    .ifnone BAR bar Bar SOmething
        z=3
    .endif
.fi
.ft \" revert

When \*(TC finishes processing this, x=1, y=2, and z=0.

You can also use references to environment variables in an
existential conditional test:

.ft C \" Courier
.nf
    .ifany $MYPROGOPTIONS
        options = [$MYPROGOPTIONS]
    .else
        options = -b20 -c23 -z -r
    .endif
.fi
.ft \" revert

Finally, you can use variable references here to get the name of a
variable to test by "indirection" (as we saw in the previous section
on accessing/setting variables indirectly).  This should be used
sparingly since it can be kind of obscure to understand, but it is
possible to do this:

.ft C \" Courier
.nf
    foo = MyVarName

    .ifany [foo]
        ...
    .endif
.fi
.ft \" revert

This will test to see if the variable \fCMyVarName\fP exists.

You can also do indirection through an environment variable.  Use
this construct with restraint - it can introduce serious obscurity
into your configuration file.  Still, it has a place.  Say the
\fCTERM\fP environment variable is set to \fCvt100\fP:

.ft C \" Courier
.nf
    .ifany [$TERM]
        ...
    .endif
.fi
.ft \" revert

This will test to see if a variable called \fCvt100\fP exists in the
symbol table.
This is a handy way to see if you have a local variable defined
appropriate for the currently defined terminal, for instance.
.SS Comparison Conditional Directives
There are two Comparison Conditionals:

.ft C \" Courier
.nf
    .if string1 == string2   # True if string1 and string2 are identical
    .if string1 != string2   # True if string1 and string2 are different
.fi
.ft \" revert

As a general matter, you can put literal strings on both sides of
such a test, but the real value of these tests comes when you use
variable references within the tested strings.  In this case, the
value of the variable
.B does
matter.  It is the variable's value that is replaced in the string
to test for equality or inequality:

.ft C \" Courier
.nf
    MyName = Tconfpy

    .if [MyName] == Tconfpy
        MyAge = 100.1
    .else
        MyAge = Unknown
    .endif
.fi
.ft \" revert

These are particularly useful when used in combination with the
\*(TC Predefined Variables or environment variables.  You can build
configurations that "sense" what system is currently running and
"adapt" accordingly:

.ft C \" Courier
.nf
    AppFiles = MyAppFiles

    .if [.OSNAME] == FreeBSD
        files = [$HOME]/[AppFiles]
    .endif

    .if [.OSNAME] == Windows
        files = [$USERPROFILE]\\[AppFiles]
    .endif

    .ifnone files
        ErrorMessage = I don't know what kind of system I am running!
    .endif
.fi
.ft \" revert
.SS The \fC.literal\fP Directive
By default, \*(TC only permits statements it "recognizes" in the
configuration file.  Anything else is flagged as an unrecognized
statement or "syntax error".  However, it is possible to "embed"
arbitrary text in a configuration file and have \*(TC pass it back to
the calling program without comment by using the \fC.literal\fP
directive.  It works like this:

.ft C \" Courier
.nf
    .literal
        This is literal text that will be passed back.
    .endliteral
.fi
.ft \" revert

This tells \*(TC to ignore everything between \fC.literal\fP and
\fC.endliteral\fP and just pass it back to the calling program (in
\fCretval.Literals\fP - see previous section on the \*(TC API).
Literal text is returned in the order it is found in the
configuration file.

What good is this?  It is a nifty way to embed plain text or even
programs written in other languages within a configuration file and
pass them back to the calling program.  This is especially handy
when used in combination with \*(TC conditional features:

.ft C \" Courier
.nf
    .if [.PLATFORM] == posix
        .literal
            We're Running On A Unix-Like System
        .endliteral
    .else
        .literal
            We're Not Running On A Unix-Like System
        .endliteral
    .endif
.fi
.ft \" revert

In other words, we can use \*(TC as a "preprocessor" for other text
or computer languages.  Obviously, the program has to be written to
understand whatever is returned as literal text.

By default, \*(TC leaves text within the literal block completely
untouched.  It simply returns it as it finds it in the literal block.
However, the programmer can invoke \*(TC with an option
(\fCLiteralVars=True\fP) that allows
.B variable substitution within literal blocks.
This allows you to combine the results of your configuration into
the literal text that is returned to the calling program.  Here is
how it works:

.ft C \" Courier
.nf
    .ifall $USER
        Greeting = Hello [$USER].  Welcome to [.MACHINENAME]!
    .else
        Greeting = Hello Random User.  Welcome To Random Machine!
    .endif

    # Now embed the greeting in a C program

    .literal
        #include <stdio.h>

        int main(void)
        {
            printf("[Greeting]");
            return 0;
        }
    .endliteral
.fi
.ft \" revert

If the calling program sets \fCLiteralVars=True\fP, the literal
block will return a C program that prints the greeting defined at
the top of this example.  If they use the default
\fCLiteralVars=False\fP, the C program would print \fC[Greeting]\fP.
In other words, it is possible to have your literal blocks make
reference to other configuration variables (and Predefined or
Environment Variables).  This makes it convenient to combine both
configuration information for the program,
.B and
other, arbitrary textual information that the program may need, all
in a single configuration file.

Notice too that the \fC#\fP character can be freely included within
a literal block.  You don't have to use a Reserved Variable
reference like \fC[HASH]\fP here because
.B everything
(including whitespace) inside a literal block is left untouched.

If you fail to provide a terminating \fC.endliteral\fP, the program
will treat everything as literal until it reaches the end of the
configuration file.  This will generate an appropriate warning, but
will work as you might expect.  Everything from the \fC.literal\fP
directive forward will be treated literally.  As a matter of good
style, you should always insert an explicit \fC.endliteral\fP, even
if it is at the end of file.

Placing an \fC.endliteral\fP in the configuration file without a
preceding \fC.literal\fP will also generate a warning message, and
the statement will be ignored.
.SS GOTCHAS
\*(TC is a "little language".  It is purpose-built to do one and
only one thing well: process configuration options.  Even so, it is
complex enough that there are a few things that can "bite" you when
writing these configuration files:
.IP \(bu 4
Probably the most common problem is attempting to do this:

.ft C \" Courier
.nf
    foo = bar

    .if foo == bar
        ...
    .endif
.fi
.ft \" revert

But this will not work.  \*(TC is very strict about requiring you to
explicitly distinguish between
.B variable names
and
.B variable references.
The example above checks to see if the string \fCfoo\fP equals the
string \fCbar\fP - which, of course, it never does.  What you
probably want is to compare the value of variable \fCfoo\fP with
some string:

.ft C \" Courier
.nf
    foo = bar

    .if [foo] == bar
        ...
    .endif
.fi
.ft \" revert

Now you're comparing the
.B value
of the variable \fCfoo\fP with the string \fCbar\fP.

This was done for a very good reason.  Because you have to
explicitly note whether you want the name or value of a variable
(instead of having \*(TC infer it from context), you can mix both
literal text and variable values on either side of a comparison or
assignment:

.ft C \" Courier
.nf
    foo = bar

    foo[foo]foo = bar         # Means: foobarfoo = bar
    .if foo[foo] == foobar    # Means: .if foobar == foobar
.fi
.ft \" revert
.IP \(bu 4
Namespaces are a handy way to keep configuration options organized,
especially in large or complex configurations.  However, you need to
keep track of the current namespace when doing things:

.ft C \" Courier
.nf
    foo = bar
    ....
    [NS-NEW]

    .if [foo] == something   # Checks value of NS-NEW.foo - will cause error
                             # since no such variable exists
.fi
.ft \" revert
.IP \(bu 4
Remember that "last assignment wins" when setting variable values:

.ft C \" Courier
.nf
    myvar = 100

    ... a long configuration file

    myvar = 200
.fi
.ft \" revert

At the end of all this, \fCmyvar\fP will be set to 200.  This can be
especially annoying if you \fC.include\fP a configuration file after
you've set a value and the included file resets it.  As a matter of
style, it's best to do all the \fC.include\fPs at the top of the
master configuration file so you won't get bitten by this one.
.IP \(bu 4
Remember that case matters.  \fCFoo\fP, \fCfoo\fP, and \fCfoO\fP are
all different variable names.
.IP \(bu 4
Remember that all variable references are
.B string replacements
no matter what the type of the variable actually is.  \*(TC type and
value enforcement is used to return the proper value and type to the
calling program.  But within the actual processing of a
configuration file, variable references (i.e., the values of
variables) are always treated as
.B strings.
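.P
One further hazard, noted under the \fC.include\fP directive, is
that circular inclusions are not detected.  A calling program could
look for them itself before handing a file to the parser.  What
follows is only a minimal sketch in present-day Python, not part of
\*(TC: the function names are made up for illustration, and it
assumes every \fC.include\fP line names its file literally (no
variable references) and that each named file can be opened as
written.

.ft C \" Courier
.nf
def included_files(lines):
    """Yield the file names given on .include lines (hypothetical helper)."""
    for line in lines:
        # Drop any trailing comment, then surrounding whitespace
        text = line.split("#", 1)[0].strip()
        parts = text.split(None, 1)
        if len(parts) == 2 and parts[0] == ".include":
            yield parts[1].strip()


def read_config_lines(name):
    """Default reader: return the lines of a file on disk."""
    with open(name) as handle:
        return handle.readlines()


def find_include_cycle(start, read_lines=read_config_lines):
    """Return the chain of names that forms a cycle, or None."""
    chain = []      # files on the current inclusion path

    def visit(name):
        if name in chain:
            # Seen on this path already: report the loop
            return chain[chain.index(name):] + [name]
        chain.append(name)
        for child in included_files(read_lines(name)):
            cycle = visit(child)
            if cycle:
                return cycle
        chain.pop()
        return None

    return visit(start)
.fi
.ft \" revert

If \fCfind_include_cycle()\fP returns a list rather than
\fCNone\fP, the program can report the offending chain of files
instead of starting a parse that would never finish.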
.SH ADVANCED TOPICS FOR PROGRAMMERS
Here are some ideas on how you might combine \*(TC features to
enhance your own applications.
.SS Guaranteeing A Correct Base Configuration
While it is always nice to give users lots of "knobs" to turn, the
problem is that the more options you give them, the more they can
misconfigure a program.  This is especially a problem when you are
doing technical support.  You'd really like to get them to a
"standard" configuration and then work from there to help solve
their problem.  If you write your program with this in mind, \*(TC
gives you several ways to easily do this:
.IP \(bu 4
Provide a "standard" system-, or even, enterprise-wide configuration
file for your application.  This file presumably has all the program
options set to "sane" values.  All the user has to do is create a
configuration file with one line in it:

.ft C \" Courier
.nf
    .include /wherever/the/standard/config/file/is
.fi
.ft \" revert
.IP \(bu 4
Predefine every option variable the program will support.  Populate
the initial symbol table passed to \fCParseConfig()\fP with these
definitions.  By properly setting the \fCType\fP, \fCLegalVals\fP,
and \fCMin/Max\fP for each of these variables ahead of time, you can
prevent the user from ever entering option values that make no sense
or are dangerous.
.IP \(bu 4
Make sure every program option has a reasonable \fCDefault\fP value
in its variable descriptor.  Recall that this attribute is provided
for the programmer's convenience.  (When a variable descriptor is
first instantiated, it defaults to a string type and sets the
default attribute to an empty string.  However, you can change both
type and default value under program control.)  If you predefine a
variable in the initial symbol table passed to the parser, \*(TC
will leave this attribute alone.  However, variables that are
created for the first time in the configuration file will have this
attribute set to the first value assigned to the variable.
Now provide a "reset" feature in your application.  All it has to do
is scan through the symbol table and set each option to its default
value.
.SS Enforcing Mandatory Configurations
The \*(TC type and value validation features give you a handy way to
enforce what the legal values for a particular option may be.
However, you may want to go further than this.  For instance, you
may only want to give certain classes of users the ability to change
certain options.  This is easily done.  First, predefine all the
options of interest in the symbol table prior to calling the \*(TC
parser.  Next, have your program decide which options the current
user is permitted to change.  Finally, mark all the options they may
not change as "Read Only", by setting the "Writeable" attribute for
those options to \fCFalse\fP.  Now call the parser.

This general approach allows you to write programs that support a
wide range of options which are enabled/disabled on a per-user,
per-machine, per-domain, per-ip, per-company... basis.
.SS Iterative Parsing
There may be situations where one "pass" through a configuration
file may not be enough.  For example, your program may need to read
an initial configuration to decide how to further process the
remainder of a configuration file.  Although it sounds complicated,
it is actually pretty easy to do.  The idea is to have the program
set some variable that selects which part of the configuration file
to process, and then call the parser.  When the parser returns the
symbol table, the program examines the results, makes whatever
adjustments to the symbol table it needs to, and passes it back to
the parser for another "go".  You can keep doing this as often as
needed.
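.P
On the program side, the loop just described might be sketched as
follows.  This is an illustration under stated assumptions, not
\*(TC code: \fCparse_pass\fP is a hypothetical stand-in for one call
to the parser, the symbol table is simplified to a plain dictionary
of strings (a real one holds variable descriptors), and the
\fCPASS\fP variable name and the pass limit are arbitrary choices.

.ft C \" Courier
.nf
def parse_until(parse_pass, symtable, goal, max_passes=10):
    """Call parse_pass repeatedly until goal(symtable) is true.

    parse_pass(symtable) is a hypothetical wrapper around one parser
    call; it must return a (symtable, errors) pair.
    """
    for count in range(1, max_passes + 1):
        # The configuration file can test this to pick a section
        symtable["PASS"] = str(count)
        symtable, errors = parse_pass(symtable)
        if errors:
            # Stop at once rather than hoping a later pass fixes it
            raise ValueError("pass %d failed: %s" % (count, errors))
        if goal(symtable):
            return symtable, count
    raise RuntimeError("goal not met after %d passes" % max_passes)
.fi
.ft \" revert

The pass limit is there so that a configuration file which never
reaches the goal cannot keep the program looping forever.
.P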
For instance:

.ft C \" Courier
.nf
    # Program calls the parser with PASS set to 1

    .if [PASS] == 1
        # Do 1st Pass Stuff
    .endif

    # Program examines the results of the first pass, does
    # what it has to, and sets PASS to 2

    .if [PASS] == 2
        # Do 2nd Pass Stuff
    .endif

    # And so on
.fi
.ft \" revert

In fact, you can even make this iterative parsing "goal driven".
The program can keep calling the parser, modifying the results, and
calling the parser again until some "goal" is met.  The goal could
be that a particular variable gets defined (like \fCCONFIGDONE\fP).
The goal might be that a variable is set to a particular value
(like, \fCSYSTEMS=3\fP).

It might even be tempting to keep parsing iteratively until \*(TC no
longer returns any errors.  This is not recommended, though.  A
well-formed configuration file should have no errors on any pass.
Iterating until \*(TC no longer detects errors makes it hard to
debug complex configuration files.  It is tough to distinguish
actual configuration errors from errors that would be resolved in a
future parsing pass.
.SH INSTALLATION
There are three ways to install \*(TC depending on your preferences
and type of system.  In each of these installation methods you must
be logged in with root authority on Unix-like systems or as the
Administrator on Win32 systems.
.SS Preparation - Getting And Extracting The Package
For the first two installation methods, you must first download the
latest release from:

.ft C \" Courier
.nf
    http://www.tundraware.com/Software/tconfpy/
.fi
.ft \" revert

Then unpack the contents by issuing the following command:

.ft C \" Courier
.nf
    tar -xzvf py-tconfpy-X.XXX.tar.gz
    (where X.XXX is the version number)
.fi
.ft \" revert

Win32 users who do not have tar installed on their system can find a
Windows version of the program at:

.ft C \" Courier
.nf
    http://unxutils.sourceforge.net/
.fi
.ft \" revert
.SS Install Method #1 - All Systems (Semi-Automated)
Enter the directory created in the unpacking step above.
Then issue the following command:

.ft C \" Courier
.nf
    python setup.py install
.fi
.ft \" revert

This will install the \*(TC module and compile it.

You will have to manually copy the 'test-tc.py' program to a
directory somewhere in your executable path.  Similarly, copy the
documentation files to locations appropriate for your system.
.SS Install Method #2 - All Systems (Manual)
Enter the directory created in the unpacking step above.  Then,
manually copy the tconfpy.py file to a directory somewhere in your
PYTHONPATH.  The recommended location for Unix-like systems is:

.ft C \" Courier
.nf
    .../pythonX.Y/site-packages
.fi
.ft \" revert

For Win32 systems, the recommended location is:

.ft C \" Courier
.nf
    ...\\PythonX.Y\\lib\\site-packages
.fi
.ft \" revert

Where X.Y is the Python release number.

You can precompile the \*(TC module by starting Python interactively
and then issuing the command:

.ft C \" Courier
.nf
    import tconfpy
.fi
.ft \" revert

Manually copy the 'test-tc.py' program to a directory somewhere in
your executable path.  Copy the documentation files to locations
appropriate for your system.
.SS Install Method #3 - FreeBSD Only (Fully-Automated)
Make sure you are logged in as root, then:

.ft C \" Courier
.nf
    cd /usr/ports/devel/py-tconfpy
    make install
.fi
.ft \" revert

This is a fully-automated install that puts both code and
documentation where it belongs.  After this command has completed
you'll find the license agreement and all the documentation (in the
various formats) in:

.ft C \" Courier
.nf
    /usr/local/share/doc/py-tconfpy
.fi
.ft \" revert

The 'man' pages will have been properly installed so either of these
commands will work:

.ft C \" Courier
.nf
    man tconfpy
    man test-tc
.fi
.ft \" revert
.SS Bundling \*(TC With Your Own Programs
If you write a program that depends on \*(TC you'll need to ensure
that the end-users have it installed on their systems.
There are two ways to do this:
.IP \(bu 4
Tell them to download and install the package as described above.
This is not recommended since you cannot rely on the technical
ability of end users to do this correctly.
.IP \(bu 4
Just include 'tconfpy.py' in your program distribution directory.
This ensures that the module is available to your program regardless
of what the end-user system has installed.
.SH THE \*(TC MAILING LIST
TundraWare Inc. maintains a mailing list to help you with your \*(TC
questions and bug reports.  To join the list, send email to
.B majordomo@tundraware.com
with a single line of text in the body (not the Subject line) of the
message:

.ft C \" Courier
.nf
    subscribe tconfpy-users your-email-address-goes-here
.fi
.ft \" revert

You will be notified when your subscription has been approved.  You
will also receive detailed information about how to use the list,
access archives of previous messages, unsubscribe, and so on.
.SH OTHER
\*(TC requires Python 2.3 or later.
.SH BUGS AND MISFEATURES
None known as of this release.
.SH COPYRIGHT AND LICENSING
\*(TC is Copyright (c) \*(CP TundraWare Inc.  For terms of use, see
the tconfpy-license.txt file in the program distribution.  If you
install \*(TC on a FreeBSD system using the 'ports' mechanism, you
will also find this file in /usr/local/share/doc/py-tconfpy.
.SH AUTHOR
.nf
Tim Daneliuk
tconfpy@tundraware.com