This section defines the semantics of the ClassAd language by explaining how to evaluate an expression. In this section, ``expression'' means an internal-form expression tree. In general, a composite expression is evaluated by recursively evaluating its component sub-expressions and then using its top-level operator to combine the results. However, there are situations in which evaluation of an expression E depends on parts of a context, which is an expression containing E as a sub-expression. For example, in the expression
[ a = 3; b = [ c = a ] ],
the second occurrence of a (an attribute reference) is evaluated by searching the two containing Record expressions for a definition of a, yielding the constant 3.
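The search through containing Record expressions can be sketched in Python. This is only an illustration under stated assumptions: Records are modelled as dicts, an attribute reference as a ("ref", name) tuple, and the helper name resolve is hypothetical rather than part of any ClassAd implementation.

```python
# Hypothetical model: a Record is a dict from attribute names to expressions,
# and ("ref", name) stands for an attribute reference.
UNDEFINED = "undefined"

def resolve(name, scopes):
    """Search the enclosing Records, innermost first, for a definition of name."""
    for record in reversed(scopes):
        if name in record:
            return record[name]
    return UNDEFINED          # no containing scope defines the name

inner = {"c": ("ref", "a")}
outer = {"a": 3, "b": inner}

# The reference to `a` occurs inside `inner`, which occurs inside `outer`.
result = resolve("a", [outer, inner])
print(result)
```

The inner Record has no definition of a, so the search continues outward and finds the definition a = 3 in the outer Record.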
More formally, an expression in context (EIC) is a pair (E, C) consisting of an expression C (the context) and a designated occurrence of a sub-expression E of C. The semantics of the ClassAd language is defined by a recursive function eval from EICs to EICs. A top-level EIC is an EIC of the form (E, E). For brevity, we will occasionally abbreviate the top-level EIC (E, E) as E, particularly when E is a literal constant. For example, the EIC (error, error) may be written as error. An expression E is evaluated by computing eval(E, E) and extracting the sub-expression from the resulting EIC.
The set of EICs with context C is partially ordered by the relation ≤, defined by (E, C) ≤ (E', C) iff E is a sub-expression of E'. When we speak of the ``minimal'' EIC with a given property, we mean the one that is minimal with respect to ≤. An EIC (E, C) is called a scope if the top-level operator of E is RECORD.
Define lookup(s, (E, C)), where s is a string and (E, C) is an EIC, to be the EIC (E', C'), where
"parent"
and
there is a scope (Ep, C) such that
,
then
(E', C') = (Ep, C), where (Ep, C) is the minimal such scope.
For example, let C be the expression
[ a = x; b = [ a = y; c = a ]; d = a ],
and let R denote the inner Record expression. C contains two occurrences of the attribute-reference expression a. Let E1 denote the occurrence inside R and E2 the other occurrence.
Then
,
,
,
,
, and
.
Each expression has a type, which is one of Integer, Real, String, Boolean, AbsTime, RelTime, Undefined, Error, List, or Record. The types Integer and Real are collectively called numeric types. The types AbsTime and RelTime are collectively called timestamp types. Each operator imposes constraints on the types of its operands. If these constraints are not met, the value returned by the operator is error.
An attribute reference with attribute name N evaluates to undefined if the reference is not contained in any scope that defines N. It may also evaluate to undefined in the presence of loops, as in
[ a = b; b = a ].
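One way to obtain undefined for such loops is to remember which attributes are currently being evaluated. The following Python sketch assumes a single flat Record modelled as a dict, with ("ref", name) tuples as attribute references; the function name eval_attr is hypothetical.

```python
UNDEFINED = "undefined"

def eval_attr(name, record, in_progress=None):
    """Evaluate attribute `name` of a flat record; references are ("ref", n)."""
    in_progress = set() if in_progress is None else in_progress
    if name not in record or name in in_progress:
        return UNDEFINED          # no definition, or a loop back to `name`
    expr = record[name]
    if isinstance(expr, tuple) and expr[0] == "ref":
        return eval_attr(expr[1], record, in_progress | {name})
    return expr                   # a literal constant evaluates to itself

loop = {"a": ("ref", "b"), "b": ("ref", "a")}
chain = {"a": ("ref", "b"), "b": 7}
print(eval_attr("a", loop))
print(eval_attr("a", chain))
```

In the looping Record, evaluating a leads to b and then back to a, which is already in progress, so the result is undefined. In the second Record the chain ends at the constant 7.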
Most operators are ``strict'' with respect to undefined and error. The only exceptions are the Boolean operators described in Section 4.3.1, the operators is and isnt described in Section 4.3.2, and the LIST and RECORD constructors described in Section 4.3.8. Strict evaluation obeys the following ordered sequence of rules.
A literal constant evaluates to itself. More precisely, if c is an occurrence of a literal constant, then eval(c, C) = (c, C).
If x is an attribute reference with attribute name N, then eval(x, C) = eval(lookup(N, (x, C))). In particular, (x, C) evaluates to undefined if there is no scope (R, C) containing the indicated occurrence of x such that R defines N. If this recursive definition leads directly or indirectly to a call eval(x, C), the result is undefined.
List and Record expressions evaluate to themselves.
More precisely, if E is an expression whose root operator is LIST or
RECORD, eval(E, C) = (E, C).
The operators SELECT and SUBSCRIPT are discussed below.
For all other operators, evaluation is ``bottom-up'' and the result is
a ``pure value''.
More precisely, if is a binary operator other than
SELECT, or SUBSCRIPT, then
The operators found in C, C++, or Java are generally evaluated according to the rules of those languages. In cases where the specifications of those languages differ, the ClassAd language follows the Java semantics because it is more precise (the C and C++ specifications occasionally say the results are ``undefined'' or ``implementation defined'' in unusual situations). The only deviations from Java semantics involve exceptions. In cases where Java specifies that evaluation throws an exception, the ClassAd language returns the constant error. The constants error and undefined also require special treatment when supplied as arguments to operators.
The operators && and || and the ternary operator _?_:_ are evaluated ``left to right'' with respect to error, and ``optimistically'' with respect to undefined. For example,
true || x = true
false && x = false
undefined || true = true
true ? val : x = val
false ? x : val = val
even if x evaluates to error or undefined.
The Boolean operators treat Boolean true, false, and undefined as a three-element lattice with false < undefined < true. With respect to this lattice, && returns the minimum of its operands, || returns the maximum, and ! interchanges true and false.
The complete definition of the operators &&, ||, !, and _?_:_ is given by the tables

&& | F U T O      || | F U T O      ! |        ?: |
---+--------      ---+--------      --+--      ---+------
F  | F F F F      F  | F U T E      F | T      F  | expr3
U  | F U U E      U  | U U T E      U | U      U  | U
T  | F U T E      T  | T T T T      T | F      T  | expr2
O  | E E E E      O  | E E E E      O | E      O  | E

In these tables, the letters T, F, U, and E stand for the constants true, false, undefined, and error, respectively; O stands for any expression other than true, false, or undefined (including error); and expr2 and expr3 represent the second and third operands of the expression expr1 ? expr2 : expr3. In the tables for && and ||, rows are indexed by the left operand and columns by the right operand.
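The tables can be transcribed into a small Python sketch. It is an illustration only: values are modelled as the strings "true", "false", "undefined", and "error", the function names are hypothetical, and both operands are passed as already-computed values, whereas an evaluator would not need the right operand when the left one decides the result.

```python
T, F, U, E = "true", "false", "undefined", "error"
RANK = {F: 0, U: 1, T: 2}           # the lattice false < undefined < true

def _norm(v):
    """Anything other than true, false, or undefined behaves like error."""
    return v if v in RANK else E

def and_(left, right):
    left, right = _norm(left), _norm(right)
    if left == E:
        return E                    # error on the left always wins
    if left == F:
        return F                    # false && x = false for every x
    if right == E:
        return E
    return min(left, right, key=RANK.get)

def or_(left, right):
    left, right = _norm(left), _norm(right)
    if left == E:
        return E
    if left == T:
        return T                    # true || x = true for every x
    if right == E:
        return E
    return max(left, right, key=RANK.get)

def not_(v):
    return {T: F, F: T, U: U, E: E}[_norm(v)]

def cond(test, expr2, expr3):
    test = _norm(test)
    if test == T:
        return expr2
    if test == F:
        return expr3
    return test                     # undefined stays undefined, error stays error
```

For instance, or_(U, T) yields "true" and and_(F, E) yields "false", matching the examples given for the left-to-right, optimistic evaluation.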
result = (expr is undefined) ? 0 : (expr + 1);
but they can be used to compare arbitrary values.
For the purposes of this section, the relationship ``identical'' is defined as follows.
If R is the expression
[ a = { 1, 2 }; b = { 1, 2 }; c = a is b; d = a is a ],
then R.c evaluates to false, while R.d evaluates to true.
For the comparison operators <, <=, ==, !=, >=, and >, both operands must be numeric (Integer or Real), both String, both AbsTime, or both RelTime. Otherwise, the result is error. If one operand is Integer and the other is Real, the Integer argument is first converted to Real. The results are calculated as in Java [6].
If the operands are Strings, they are converted to lower case and compared lexicographically.
If the operands are AbsTimes, they are equal if they correspond to the same instant (according to UTC). Otherwise, the earlier time is less than the later one. If the operands are RelTimes, they are compared as signed integers.
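The type rules for comparisons can be sketched as follows for the numeric and String cases. This is a simplified illustration: Python int, float, and str stand in for Integer, Real, and String, the timestamp types are left out, and the function name compare_lt is hypothetical.

```python
ERROR = "error"

def compare_lt(a, b):
    """Sketch of `<` for numeric and String operands only."""
    if isinstance(a, bool) or isinstance(b, bool):
        return ERROR                        # Booleans are not valid operands
    numeric = (int, float)
    if isinstance(a, numeric) and isinstance(b, numeric):
        if type(a) is not type(b):
            a, b = float(a), float(b)       # Integer is converted to Real
        return a < b
    if isinstance(a, str) and isinstance(b, str):
        return a.lower() < b.lower()        # lower-cased, then lexicographic
    return ERROR                            # mismatched operand types

print(compare_lt(1, 1.5))
print(compare_lt("apple", "Banana"))
print(compare_lt(1, "1"))
```

Note that "apple" compares less than "Banana" only because both strings are lower-cased first; a plain code-point comparison would give the opposite answer.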
The unary operators + and - and the binary operators +, -, *, /, and % take numeric operands. The results are calculated as in Java [6], with one exception: Integer division or remainder when the second operand is zero throws an ArithmeticException in Java, but returns error in the ClassAd language. In particular, if both operands are Integers, the result is an Integer, and if one operand of a binary operation is an Integer and the other is a Real, the Integer operand is converted to a Real and the result is computed using 64-bit floating point arithmetic. The integral / operation truncates the result towards zero, and the integral % operation generally returns a result with the same sign as the dividend (the left operand). See the Java language specification [6] for details.
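The rules for integral / and % differ from the built-in behaviour of some languages. Python, for example, rounds integer division towards negative infinity. The sketch below shows the truncating behaviour described above; the function names are hypothetical, and fixed-width overflow is ignored.

```python
ERROR = "error"

def int_div(a, b):
    """Integer `/`: truncate toward zero; a zero divisor yields error."""
    if b == 0:
        return ERROR
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def int_rem(a, b):
    """Integer `%`: the result takes the sign of the dividend."""
    if b == 0:
        return ERROR
    return a - b * int_div(a, b)

print(int_div(-7, 2), int_rem(-7, 2))   # compare Python's -7 // 2 and -7 % 2
print(int_div(7, 0))
```

With these definitions -7 / 2 is -3 and -7 % 2 is -1, whereas Python's own operators give -4 and 1.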
The unary and binary operators +
and -
are also defined for
certain timestamp operands.
The unary +
operator is applicable to both AbsTime and RelTime operands
and returns the value of its operand unchanged.
The unary -
operator is applicable only to RelTime operands and returns
the RelTime value with the same magnitude and opposite sign.
The rules for binary operators are summarized in Table 6.
If the result of an expression is an AbsTime, its time zone is the same as
the time zone of the AbsTime argument.
The unary operator ~ and the binary operators |, ^, and & are defined only for Integer and Boolean operands. They are defined to return the same results as the corresponding operators in Java [6].
The operators << (left shift), >> (right shift with sign extension), and >>> (right shift with zero fill) are defined only for Integer operands. They are defined to return the same results as the corresponding operators in Java [6].
In the native syntax, the SELECT operator is written base.selector. It is semantically equivalent to base["selector"], that is, an instance of the SUBSCRIPT operator where the subscript is the string value corresponding to the attribute name. For example,
[ rec = [ One = 1; Two = 2 ]; val = rec.one ].val
and
[ rec = [ One = 1; Two = 2 ]; val = rec["one"] ].val
both evaluate to 1. The SELECT syntax is more concise, but the SUBSCRIPT syntax is more flexible, because it allows the selector to be computed rather than requiring a literal string.
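The equivalence of the two syntaxes can be illustrated with a short Python sketch. It makes several assumptions: a Record is modelled as a dict, attribute names are matched without regard to letter case (as the example with One and one suggests), containing scopes are not searched, and a missing attribute yields undefined. The function names are hypothetical.

```python
UNDEFINED = "undefined"

def subscript(record, name):
    """base["name"]: look up an attribute, ignoring letter case."""
    for key, value in record.items():
        if key.lower() == name.lower():
            return value
    return UNDEFINED            # assumption: no search of containing scopes

def select(record, selector):
    """base.selector is treated exactly like base["selector"]."""
    return subscript(record, selector)

rec = {"One": 1, "Two": 2}
print(select(rec, "one"))
print(subscript(rec, "t" + "wo"))   # the subscript may be computed
```

The second call shows the extra flexibility of SUBSCRIPT: the attribute name is built at run time instead of being written literally.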
The SUBSCRIPT operator has two operands, the base and the
subscript.
In the native syntax, it is written base[subscript].
The subscript expression must have type Integer or String.
If the subscript is an Integer i, the base expression must have type List and
the result is the ith element of the list, counting from zero.
If the subscript is a String s, the base expression must be a
Record or List. If the base expression has type Record,
the result is computed by searching the base and its containing scopes for an
attribute definition matching the attribute name s.
If the base expression is a List, the SUBSCRIPT operator is applied to
each member of the list and the result is a new ``top-level'' list of the
results.
In all other cases, the result is error.
More precisely,
[
name0 = value0; ... ; namen-1 = valuen-1]
List and Record expressions evaluate to themselves. That is, eval(E, C) = (E, C) if E is of type List or Record.
Currently, all functions are strict with respect to error and undefined, unless otherwise specified. In other words, all arguments are evaluated, and if any argument evaluates to error or undefined, the result is error or undefined, respectively. If arguments of both types are present, the result is error.
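Strictness can be pictured as a wrapper applied to every function. The Python sketch below is illustrative only: error and undefined are modelled as two sentinel objects, and the names strict and concat are hypothetical.

```python
class Special:
    """Sentinel for the constants error and undefined."""
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name

ERROR, UNDEFINED = Special("error"), Special("undefined")

def strict(fn):
    """Wrap fn so that error or undefined arguments decide the result."""
    def wrapper(*args):
        if any(a is ERROR for a in args):
            return ERROR            # error wins when both kinds are present
        if any(a is UNDEFINED for a in args):
            return UNDEFINED
        return fn(*args)
    return wrapper

@strict
def concat(*parts):
    return "".join(str(p) for p in parts)

print(concat("a", "b"))
print(concat("a", UNDEFINED))
print(concat("a", UNDEFINED, ERROR))
```

The wrapper inspects all arguments before calling the function body, which mirrors the rule that all arguments are evaluated.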
Currently, all functions return ``top-level'' values that are independent of the context of the call. That is, eval(f(E1, ..., En), C) = (V, V), where (Ei', Ci') = eval(Ei, C) for i = 1,...,n and V is a value computed from E1', ..., En' as described in the following table.
The following table lists all functions required by the current version of this specification; others may be added in future versions. The description of each function is preceded by a prototype indicating restrictions on the number and types of arguments and indicating the type of the result returned. If the restrictions are violated, the result is error. In the prototypes, ``const'' stands for any literal constant of type Integer, Real, String, Boolean, AbsTime, or RelTime (but not Undefined, Error, List, or Record), and ``any'' means any expression. A type followed by an asterisk indicates any number of arguments of the indicated type, including none. Square brackets are used to indicate optional arguments.
The result is x converted to an Integer. If x is an Integer, the result is x. If x is a Real, it is truncated (towards zero) to an Integer. If x is true the result is 1. If x is false the result is 0. If x is an AbsTime, it is converted to the number of seconds since the epoch, UTC. If x is a RelTime, it is converted to a number of seconds. If x is a String, it is parsed according to the native syntax for integer_literal or floating_point_literal as in Table 4 and then converted to an Integer as above. If x is a String that does not represent a valid Integer or floating-point literal, the result is error.
The result is x converted to a Real. If x is a Real, the result is x. If x is an Integer, it is converted to Real. If x is true the result is 1.0. If x is false the result is 0.0. If x is an AbsTime, it is converted to the number of seconds since the epoch, UTC. If x is a RelTime, it is converted to a number of seconds. If x is a String, it is parsed according to the native syntax for integer_literal or floating_point_literal as in Table 4 and then converted to a Real as above. In addition, the strings INF, -INF and NaN (in any combination of upper and lower case) are recognized as representing the IEEE754 values for positive and negative infinity and not-a-number, respectively. If x is a String that does not represent a valid Integer or floating-point literal, the result is error. For any other type, x is converted to an Integer as if by ``int'', and the result is converted to a Real (or error if the conversion to Integer fails).
If x is a String, the result is x. Otherwise, the result is the canonical unparsing of x (see Section 3.3.3).
If x is an Integer, the result is x. Otherwise, x is converted to a Real by the function ``real'' above, and the result is the largest Integer not greater than that value (or error if the conversion fails).
If x is an Integer, the result is x. Otherwise, x is converted to a Real by the function ``real'' above, and the result is the smallest Integer not less than that value (or error if the conversion fails).
If x is an Integer, the result is x. Otherwise, x is converted to a Real y by the function ``real'' above, and the result is the nearest Integer to y. If y is midway between two Integers, the even Integer is returned. The result is error if the conversion fails or the resulting integral value does not fit in 32 bits.
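The rounding rule (nearest Integer, ties to the even Integer, error outside 32 bits) can be sketched in Python, whose built-in round() also rounds ties to even. The sketch assumes the argument is already an Integer or a Real, so the conversion through ``real'' is omitted; treating NaN as error is an assumption, and the function name classad_round is hypothetical.

```python
import math

ERROR = "error"

def classad_round(y):
    """Nearest Integer to y, ties to even; error outside the 32-bit range."""
    if isinstance(y, int):
        return y                        # an Integer is returned unchanged
    if math.isnan(y) or math.isinf(y):
        return ERROR                    # no Integer value can represent these
    r = round(y)                        # Python's round() rounds ties to even
    return r if -2**31 <= r <= 2**31 - 1 else ERROR

print(classad_round(2.5), classad_round(3.5))
print(classad_round(2.0**31))
```

Both 2.5 and 3.5 lie midway between two Integers, and in each case the even neighbour is chosen.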
If x is a positive Integer, the result is a random Integer r uniformly chosen from the range 0 ≤ r < x. If x is a positive Real number, the result is a random Real uniformly chosen from the same range. If x is omitted, the result is the same as random(1.0). If x is anything else, the result is an error.
Each argument is converted to a String by the function ``string'' above. The result is the concatenation of the strings.
The same as strcmp except that upper and lower case letters are considered equivalent.
The operand is converted to a String by the ``string'' function above. The result is a String that is identical to s except that all lowercase letters in s are converted to uppercase.
The operand is converted to a String by the ``string'' function above. The result is a String that is identical to s except that all uppercase letters in s are converted to lowercase.
If x is not a constant or l is not a list, then the
result is an error. Otherwise, the elements of l are evaluated and if any of
the values are equal to x in the sense of the ==
operator, then
the result is true, otherwise it is false.
If x is not a constant or l is not a list, then the
result is an error. Otherwise, the elements of l are evaluated and if any of
the values are equal to x in the sense of the is
operator, then
the result is true, otherwise it is false.
If any of the arguments is not of type String or if pattern is not a valid regular expression, the result is an error. Otherwise, if pattern matches target, the result is true, otherwise it is false.
The details of the syntax and semantics of the regular expressions supported currently depend on the implementation. The Java implementation as of Version 2.2 supports perl-compatible regular expressions with certain minor differences as documented by the Java 1.4 documentation at http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html. The C++ release as of Version 0.9.7 and later supports, as a build-time option, either perl-compatible regular expressions as supported by the pcre library (see http://www.pcre.org/) or POSIX regular expressions (http://www.opengroup.org/onlinepubs/007908799/xbd/re.html) as implemented by the GNU regex library (http://www.gnu.org/software/libc/manual/html_node/Regular-Expressions.html).
The options argument, if present, may contain the following characters to alter the exact details. Unrecognized options are silently ignored. Only i and I are supported by the POSIX version.
i or I    Ignore case.
m or M    Multi-line: A caret (^) matches not only the start of the subject string, but also after each newline. Similarly, dollar ($) matches before a newline.
s or S    Single-line: Dot (.) matches any character, including newline.
x or X    Extended: Whitespace and comments (from # to the next newline) in the pattern are ignored.
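As an illustration, the four options correspond closely to flags of Python's re module. The sketch below is not the ClassAd implementation: Python's regular expressions are similar to, but not identical with, perl-compatible ones; the option letters i, m, s, and x (in either case) follow the usual Perl convention and are an assumption here; and ``matches'' is interpreted as an unanchored search.

```python
import re

# Assumed letter-to-flag mapping (Perl-style option letters, either case).
FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL, "x": re.VERBOSE}

def regexp(pattern, target, options=""):
    """Return True or False, or "error" if the pattern is not valid."""
    flags = 0
    for ch in options.lower():
        flags |= FLAGS.get(ch, 0)       # unrecognized options are ignored
    try:
        return re.search(pattern, target, flags) is not None
    except re.error:
        return "error"

print(regexp("^b", "a\nb"))         # caret only matches at the very start
print(regexp("^b", "a\nb", "m"))    # multi-line: caret also matches after newline
```

The two calls differ only in the m option, which lets the caret match after the newline.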
If x is not a constant or l is not a list, then the result is an error. Otherwise, the elements of l are evaluated and if any of them evaluates to anything other than a String, the result is an error. Otherwise, if any of the values in the list matches the pattern according to the regexp function, the result is true. If there is no match, then the result is false.
If s is not one of ``<'', ``<='', ``=='', ``>'', ``>='', ``!='', ``is'', or ``isnt'', or l is not a list, the result is an error. Otherwise, the elements of l are evaluated and compared to t using the ClassAd operator corresponding to s. If any of the comparisons evaluate to true in the case of anycompare, or all of the comparisons evaluate to true in the case of allcompare, the result is true. Otherwise, the result is false.
Returns the current Coordinated Universal Time, in seconds since midnight January 1, 1970.
The operand t is treated as a number of seconds. The result is a String of the form days+hh:mm:ss. Leading components are omitted if they are zero. For example, if the operand is 1472523 = 17*24*60*60 + 1*60*60 + 2*60 + 3 (seventeen days, one hour, two minutes, and three seconds), the result is "17+1:02:03"; if the operand is 67, the result is "1:07".
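The conversion can be sketched in Python as follows. The padding rules are inferred from the two examples above (the leading component is not zero-padded, the later ones are padded to two digits); the treatment of an operand smaller than 60 and the restriction to non-negative values are assumptions, and the function name seconds_to_string is hypothetical.

```python
def seconds_to_string(t):
    """Render a non-negative number of seconds as days+hh:mm:ss."""
    days, rem = divmod(int(t), 24 * 60 * 60)
    hours, rem = divmod(rem, 60 * 60)
    minutes, seconds = divmod(rem, 60)
    if days:
        return f"{days}+{hours}:{minutes:02d}:{seconds:02d}"
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    if minutes:
        return f"{minutes}:{seconds:02d}"
    return str(seconds)             # assumption: seconds alone, unpadded

print(seconds_to_string(1472523))
print(seconds_to_string(67))
```

The two calls reproduce the examples from the text, "17+1:02:03" and "1:07".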
The operand s is parsed as a specification of an instant in time (date and time). This function accepts the canonical native representation of AbsTime values, but minor variations in format are allowed.
The default format is yyyy-mm-ddThh:mm:sszzzzz where zzzzz is a time zone in the format +hh:mm or -hh:mm, but variations are allowed.
Any of the separators -, :, or T may be omitted or replaced by any sequence of non-digits. Note, however, that the - in a time zone of the form -hh:mm may not be omitted.
The colon between hh and mm in the time zone may be omitted.
zzzzz
or yyyy
.
The time zone may be written z or Z, which is equivalent to -00:00.
If the string ends with +dddd, -dddd, z, or Z, where each d is a digit, this suffix is considered to be the time zone indication. For example, in 2003+1030, the suffix 1030 is interpreted as a time zone 10 hours and 30 minutes east, rather than as October 30.
The fields ss, mm, hh, etc. may be omitted (from right to left), in which case the omitted fields are assumed to be zero.
In summary, the accepted format is
D* dddd [D* dd [D* dd [D* dd [D* dd [D* dd D*]]]]] [-dd[:]dd|+dd[:]dd|z|Z]
where d stands for a digit and D stands for a non-digit.
For example, in the United States central time zone, an AbsTime corresponding to ``9 am Jan 25, 2003 CST'' may be created by any of the function calls
2003-01-25T09:00:00-06:00     // canonical
2003-01-25 09:00:00 -0600     // different separators
20030125090000-0600           // compact format
2003-01-25 16:00:00 +01:00    // different time zone
2003-01-25 15:00Z             // omitted seconds, UTC time zone
2003-01-25 09:00:00           // default time zone (local)
2003-01-25 09                 // omitted minutes and seconds
and AbsTimes corresponding to ``Jan 25, 2003'' (implicitly midnight, UTC) may be written
2003-01-24T18:00:00-06:00     // canonical
2003-01-25T00:00:00           // default time zone: UTC
2003-01-25                    // omitted time of day
2003/01/25                    // different separators
20030125                      // compact format
The strings 2003-01-25T09:00:00-06:00
and 2003-01-25 15:00Z
represent the same instant in time, but measured in different time zones.
The following strings are invalid.
2003-01-25T09:00:00-06        // incomplete time zone
2003-01-25T09:00:00- 0600     // space in time zone
2003-1-25                     // missing digit in dd field
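A rough Python sketch of such a parser follows. It is a simplified illustration under several assumptions: a trailing time zone suffix is removed first and the remainder is matched against the digit pattern; a string with no time zone is taken to be UTC; and an omitted month or day defaults to 1 rather than 0. The function name parse_abstime and both regular expressions are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Trailing time zone: +hhmm, -hhmm, +hh:mm, -hh:mm, z, or Z.
ZONE = re.compile(r"([+-])(\d\d):?(\d\d)$|[zZ]$")
# D* dddd [D* dd [D* dd [D* dd [D* dd [D* dd D*]]]]]
BODY = re.compile(
    r"^\D*(\d{4})(?:\D*(\d\d)(?:\D*(\d\d)(?:\D*(\d\d)"
    r"(?:\D*(\d\d)(?:\D*(\d\d)\D*)?)?)?)?)?$"
)
DEFAULTS = (0, 1, 1, 0, 0, 0)       # assumption: month and day default to 1

def parse_abstime(s):
    """Return an aware datetime, or None if s does not fit the pattern."""
    offset = timedelta(0)           # assumption: no time zone means UTC
    zone = ZONE.search(s)
    if zone:
        if zone.group(1):
            minutes = int(zone.group(2)) * 60 + int(zone.group(3))
            offset = timedelta(minutes=minutes if zone.group(1) == "+" else -minutes)
        s = s[: zone.start()]
    body = BODY.match(s)
    if not body:
        return None
    fields = [int(g) if g is not None else d
              for g, d in zip(body.groups(), DEFAULTS)]
    try:
        return datetime(*fields, tzinfo=timezone(offset))
    except ValueError:              # e.g. month 13 or an out-of-range offset
        return None

a = parse_abstime("2003-01-25T09:00:00-06:00")
b = parse_abstime("2003-01-25 15:00Z")
print(a == b)
```

Two aware datetime values compare equal when they denote the same instant, so the comparison reflects that both strings describe the same point in time in different time zones.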
Creates an AbsTime value corresponding to time t and a time-zone offset z. If t is a String, then z must be omitted, and t is parsed as a specification as described above. If t and z are both omitted, the result is an AbsTime value representing the time and place where the function call is evaluated. Otherwise, t is converted to a Real by the function ``real'' above, and treated as a number of seconds from the epoch, Midnight January 1, 1970 UTC. If z is specified, it is treated as a number of seconds east of Greenwich. Otherwise, the offset is calculated from t according to the local rules for the place where the function is evaluated.
If the operand t is a String, it is parsed as a specification of a time interval. This function accepts the canonical native representation of RelTime values, but minor variations in format are allowed.
Otherwise, t is converted to a Real by the function ``real'' above, and treated as a number of seconds.
The default string format is [-]days+hh:mm:ss.fff, where leading components and the fraction .fff are omitted if they are zero. In the default syntax, days is a sequence of digits starting with a non-zero digit, hh, mm, and ss are strings of exactly two digits (padded on the left with zeros if necessary) with values less than 24, 60, and 60, respectively, and fff is a string of exactly three digits.
In the relaxed syntax,
days
, hh
, etc.
+ may be replaced by d or D.
The first : may be replaced by h or H.
The second : may be replaced by m or M.
s or S may follow the last numeric field.
dDhHmMsS
and
the value of field i-1 is zero, field i-1, together with its
terminating field name (+
, :
, h
, etc.) may be omitted even
if field i-2 is not omitted.
The fraction .fff may have any number of digits. If it has no digits, the preceding decimal point may be omitted.
For example, one day, two minutes and three milliseconds may have any of the forms
1+00:02:00.003       // the result of relTimeToString
1d0h2m0.003s         // similar to ISO 8601
1d 2m 0.003s         // add spaces, omit hours field
1d 00:02:00.003      // mixed representations
1d 00:00:120.003     // number of seconds greater than 59
86520.002991         // seconds, excess precision in fraction
Creates a ClassAd with each component of the time as an element of the ClassAd. The ClassAd has five attributes:
Type       // ``RelativeTime''
Days       // the number of days
Hours      // the number of hours
Minutes    // the number of minutes
Seconds    // the number of seconds
Creates a ClassAd with each component of the time as an element of the ClassAd. The ClassAd has eight attributes:
Type       // ``AbsoluteTime''
Year       // the year
Month      // the month, from 1 (January) through 12 (December)
Day        // the day, from 1 through 31
Hours      // the number of hours
Minutes    // the number of minutes
Seconds    // the number of seconds
Offset     // the timezone offset in seconds
This function creates a formatted String that is a representation of the absolute time t.
The argument s is interpreted similarly to the format argument of the ANSI C strftime function. It consists of arbitrary text plus placeholders for elements of the time. These placeholders are percent signs (%) followed by a single letter. To have a percent sign in your output, you must use a double percent sign (%%).
Because an implementation may use strftime() to implement this, and some versions implement extra, non-ANSI C options, the exact options available to an implementation may vary. An implementation is only required to implement the ANSI C options, which are:
%a    // abbreviated weekday name
%A    // full weekday name
%b    // abbreviated month name
%B    // full month name
%c    // local date and time representation
%d    // day of the month (01-31)
%H    // hour in the 24-hour clock (0-23)
%I    // hour in the 12-hour clock (01-12)
%j    // day of the year (001-366)
%m    // month (01-12)
%M    // minute (00-59)
%p    // local equivalent of AM or PM
%S    // second (00-59)
%U    // week number of the year (Sunday as first day of week) (00-53)
%w    // weekday (0-6, Sunday is 0)
%W    // week number of the year (Monday as first day of week) (00-53)
%x    // local date representation
%X    // local time representation
%y    // year without century (00-99)
%Y    // year with century
%Z    // time zone name, if any
%%    // %
Note that names may be locale-dependent, if the underlying operating system supports locales. Also note that some ClassAd implementations may have difficulty with time zone names for non-local time zones, since the names may vary.
This version of formatTime converts i to an absolute time, then behaves identically to the other version of formatTime.
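Python's datetime.strftime accepts the same style of placeholders, so both versions of the function can be illustrated with it. The sketch is an assumption-laden illustration: an absolute time is modelled as an aware datetime, the integer version interprets its argument as seconds since the epoch in UTC, and the function names are hypothetical. Only locale-independent placeholders are used in the example.

```python
from datetime import datetime, timezone

def format_time(t, fmt):
    """Format an aware datetime using strftime-style placeholders."""
    return t.strftime(fmt)

def format_time_int(i, fmt):
    """Convert seconds since the epoch (UTC assumed) and then format."""
    return format_time(datetime.fromtimestamp(i, tz=timezone.utc), fmt)

t = datetime(2003, 1, 25, 9, 0, 0, tzinfo=timezone.utc)
print(format_time(t, "%Y-%m-%d %H:%M:%S"))
print(format_time(t, "day %j, 100%%"))
print(format_time_int(0, "%Y-%m-%d"))
```

The second call shows the doubled percent sign producing a single literal % in the output.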