Pascal Implementation by Steven Pemberton and Martin Daniels

Chapter 8: Compiling Declarations

Lines: [1025-1806]

Routines: typ, simpletype, fieldlist, labeldeclaration, constdeclaration, typedeclaration, vardeclaration, procdeclaration, parameterlist.

There are five kinds of declaration: label, constant, type, var, and routine.

Routine labeldeclaration, lines [1387-1415]

[1390] This repeat gets each of the labels being declared, [1392-7] Check that this label has not already been declared, by working down the list of labels declared at this level [1398-405]. If not, add it to the list.

Note

  1. Notice [1401-2] the code

    genlabel(lbname); labname:=lbname.

    in some places (e.g. [1722-3]). Code like this occurs because packed fields of records cannot be passed as variable parameters. However labname is not packed, so here can be written genlabel(labname).

Routine constdeclaration, lines [1417-38]

[1422] This while loop gets each constant declaration in the list.

[1424-5] Fill in the identifier details.

[1427] Check the presence of the equals sign.

[1428] Get the constant value.

[1429] Enter the identifier in the local tree.

[1430] Set the type and value fields.

Note

  1. The position of the enterid [1429] is very important, to prevent the wrong interpretation of

    const i=i;

Routine vardeclaration, lines [1483-1526]

[1486-7] The inner loop gets each identifier of a declaration like

i, j, k: integer

The outer loop accepts one of these groups repeatedly.

[1488-94] Get the identifier, fill in its details, and enter it in the tree.

[1495] Because the type of the identifier is not known until the end of the declaration, the identifiers have to be linked together in order to go back and fill in the type when known. They are linked by the next field. This statement is part of that process; nxt points to the last identifier processed.

[1505] Get the type.

[1506-11] Work back through the identifiers filling in the type and allocating addresses to the variables.

[1518] typedels is array record set file [3816]. This is for error recovery.

[1519-25] If any of the types in the variable declarations are pointers, for example

 var p1, p2: ^person;

and the type pointed to (person in this case) is undeclared, no error is reported because of the way typ handles forward declared pointers [1259-60] (for example

type p = ^r;
r = record ... end).

What typ does is insert such an identifier in a list of unresolved forward references, starting at fwptr. If this list is not empty, all these types must be reported as undeclared.

Notes

  1. Because the variables are chained backwards, they are allocated addresses in reverse order (the last identifier has a lower address than the first). This can be inconvenient when trying to interpret the post-mortem dump produced by the interpreter. Better to chain the variables forwards thus:

    initially

    last:= nil; first:= nil

    then repeatedly for each variable

    new(this); ... (*fill in details, etc.*)
    if last=nil
    then first := this
    else last^.next := this;
    this^.next := nil;
    last:=this
  2. Having to check for forward references here is messy because there cannot be forward references in a variable declaration, only in a type declaration. Typ should really have a parameter saying whether it was called from a var declaration or a type declaration, and only suppress error messages in the latter case.

  3. Error 118 would be better as 117 [1520].

Routine typedeclaration, lines [1440-8 1]

[1445] Repeatedly get one type declaration.

[1446-9] Get one identifier and fill in its details.

[1450] Check for the equals sign.

[1451] Get the type.

[1452] Enter the identifier on the tree.

[1453] Fill in the type field.

[1455-66] Work along the list of unresolved forward referenced types and see if this has resolved one.

[1458] If the names are equal, this has resolved one.

[1459] Only pointers cause forward references, so fill in the element type of the pointer type.

[1460-2] Delete it from the forward list.

[1467-72] Check for semicolon.

[1474-80] If the forward list is not empty at the end of the type declaration, then these are undeclared types and must be reported.

Note

  1. The code at [1479] is because the listing may have been interrupted mid-line. The line must be restarted at the right column so that error messages will point to the right characters.

    However there may be no listing being produced so this line should start

    if list and not eol then ....

Routine typ, lines [1246-1385]

This routine compiles a type. You might find it helpful to reread the section on the representation of types in the semantic analysis chapter.

[1249-50] Typebegsys and simptypebegsys are initialised at [3814-5], and are the symbols that may begin a type. A simple type is a type identifier, a subrange, or an enumeration.

[1251] Deal with a simple type.

[1253-78] Pointers.

[1254-6] Create a pointer structure.

[1258-78] Get the pointed to type identifier.

[1259-60] Look the type up. If it is not found, do not report an error, to allow forward declared pointer types (see [609] in searchid).

[1261-8] If lcp is nil, the identifier is undeclared, so insert it in the list of forward declared types.

[1271-4] Otherwise check it is not a file, (pointers to files are not allowed), and fill in the element type.

[1281-7] Ignore a packed symbol if present.

[1288-1327] Arrays.

[1290-1312] Get the indexes.

[1292] This repeat gets each of a list of indexes. Each time round, this loop creates an array structure whose element type is either the array structure created the next time round the loop, or the type following the indexes.

[1296] Get one index type.

[1298-1306] Check the index is scalar and not integer or real, and set the index type.

[1314] Get the element type.

[1315-26] Calculate the run-time size of the type, and the sizes of the arrays created for each index of the array.

[1329-50] Records.

[1331] Save the old value of top, to reset at [1348].

[1332-9] Create a new element on top of the display.

[1341] Set the initial displacement for the field address to zero.

[1342] Get the field list.

[1343-7] Create the structure for the record.

[1348] Reset top.

[1352-72] Sets.

[1355] Get the element type.

[1357-63] Check it is a scalar and not integer or real.

[1365-8] Check the type is small enough to be represented in the set size implemented.

[1369-72] Create the set structure.

[1374-7] Files are not implemented.

[1384] Return the size, or a default, in the formal parameter fsize.

Notes

  1. Consider the following procedure, necessarily using forward declared pointers:

    procedure house;
    
    type room = record h, w, l: real; occupant: ^person
                end;
    
    person = record name: alpha; place: ^room end;

    If this procedure was included in a program which had an outer declaration for a type called person

    program p;
    
    type person = 0..9;
    
    procedure house; (* as above *)
    begin
        ...
    end;

    then the person within the room record will now identify the outer person rather than the inner one as before (see (Welsh, 1981) for a full discussion of this). To solve this, all pointers should be treated as forward declared, and then resolved at the end of the type declaration. In this way local identifiers will be given priority over non-local ones.

    As a small refinement of this, searchsection could be used instead of searchid [1260], to reduce the number of pointers forward declared.

    Either way, this overcomes the problems of the prterror [1259-60], which can be eliminated.

  2. Type structure may as well have a packed field set to true at [1284] so that the rules for packed types can be checked.

  3. The calculation of lsize [1321] could well go out of range and there should be a check and an error message here to prevent this happening.

  4. Surprisingly, the syntax diagrams in the Pascal User Manual allow a record to have no fields, though the BNF in the Report does not.

    The compiler takes the former interpretation, allowing

    type r = record end;

    This is the only case where a type can have a size of zero.

  5. Sets are implemented as a fixed size, so that no matter how small the element size (for example set of boolean) no advantage is taken of it. Similarly, it means there is a maximum size beyond which sets are not accepted (for example set of char is not available on many implementations).

    Unfortunately, there is no easy remedy for this, since in many places it is difficult or impossible to discover what sort of set is required, and therefore how to compile it. For example:

    s :=[10];
    fp([]);

    This is especially true in the last case if fp is a formal procedure, since nothing is known about its parameters. See (Welsh, 1981) for more details.

  6. At least the syntax of file declarations should be checked [1374].

Routine simpletype, lines [1029-1112]

Deals with enumerations, subranges and type identifiers. For example,

 type
 colour = (red, green, blue);
 index = 1..10;
 subr = index;

[1037-63] Enumerations.

[1038] Save the value of top for resetting at [1061].

[1039] In a case like

type r = record
             rl: record f: (a, b, c) end
         end

several record elements will be on the top to the display. This line skips over these to locate the closest block to declare a, b and c in. This is because enterid uses top [557, 1053].

[1040-4] Create the structure for the type.

[1048-53] Create a constant identifier and enter it locally. Lcnt is the internal value of this identifier starting from zero for the first [1045] and increasing in ones. Each constant is linked to the preceding one in the list by the next field [1050].

[1061] The type that has been created points to the last of the identifiers declared. Reset top.

[1065-1106] Identifiers and subranges. These are intermingled because if you have a type that starts

type index = i

it might be the start of

type index = i;

or

type index = i..j;

[1066-7] It starts with an identifier.

[1069] If it is a constant, then it is the start of a subrange.

[1072-6] Create a subrange structure, checking the constant was not a string, and setting the min field.

[1078] Get the upper bound.

[1079] Set the max field of the structure.

[1080] Check lowerbound and upperbound are the same type.

[1083-5] The identifier was a type - copy its type field.

[1088-98] A subrange that does not start with an identifier: very similar to [1070-81].

[1099-1106] If the type was a subrange, check it is not of real, and that the lowerbound is less than or equal to the upperbound.

Notes

  1. There is no need to use semantic attributes to decide which path to take [1069]. This produces bad diagnostics when the identifier is undeclared. The presence of a colon after it can be used.

  2. Since real is checked for at [1103], the two checks for string [1073, 1090] can be moved there too.

  3. A subrange of real is an error, not an implementation restriction.

Routine fieldlist, lines [1114-1244]

A large and unwieldy routine, that compiles the fields of a record.

[1117-58] Deals with the fixed part.

[1159-239] Deals with the variant part.

[1120] This loop gets each group of fields like

a, b, c: integer

[1122] This loop gets each identifier of the group.

[1124-39] Create the identifier and enter it into the record. The fields are linked together with the next field.

[1141] Get the type.

[1142-7] Work down the list of identifiers assigning offsets, allocating space and filling in the type field.

[1155-8] Reverse the pointers of the list. It is unclear why this is done.

[1160-3] There is a variant part. Create a structure for the tag field.

[1165-71] Get the identifier, fill in its details, and enter it.

[1174-5] Get the type of the tag identifier.

[1176-80] Assign an offset and allocate space for the tag.

[1181-6] Check that the type of the tag is scalar and not real. Fill in the type of the tag, and the tagfield of the record.

[1196-1239] Get each variant part.

[1199-1217] Get a list of constants.

[1200-1] Check that the type of the constant matches the tag type.

[1202-6] Create a variant structure for the constant. They are chained together by the nxtvar field.

[1207-13] Check the list of variant parts to make sure this one has not been duplicated.

[1220] Get the field list for this variant part. Lsp2 points to the variant part for that fieldlist. All variants are chained together permanently by the nxtvar field. The variants in this list are temporarily chained together by the subvar field for later use [1222-6]. For example,

case x : z of
    a, b, c:(....);
    d, e, f: (....)
end

[1222-6] Fill in the subvar field of the latest list of variants to point at the subvariant part.

[1236] Each new set of variants starts at the same offset, so reset displ (displacement)

[1240] Finally set the displacement to the size of the whole record.

[1241] Point the tag field to the last variant (despite the name fstvar).

Notes

  1. This routine is too large. It could easily be split into at least two, for the fixed and variant parts, to make it more manageable.

  2. All the names of the variables declared on [1115] are badly chosen and not at all mnemonic. Names should have been chosen to help clarify some of the more difficult parts.

  3. Fldaddr should not be set at [1169], since this value may be wrong, and the correct value is set at [1179].

  4. A lot of trouble is gone to around [1183] to give the 399 error. However a string tag is an error, not a restriction.

Routine procdeclaration, lines [1528-1806]

Compile a procedure or function declaration: parameter fsy [1528] is either procsy or funcsy to indicate which.

[1703] Save the value of lc for resetting later [1805], and restart lc for a new routine. Lcaftermarkstack is the initial allocation for the system values.

[1704] Get the identifier.

[1705-15] If the identifier has already been declared at this level, check that it was a forward declaration for the same class of routine.

[1716-28] If it was not previously declared, create an identifier for it and enter it. Pfname will be the label of the first instruction. The code generated for a procedure or function is

 ENT 1 varsize
 ENT 2 stacksize
 [Code for the body]
 RETP or RETI etc.

[1729-41] If it was forward declared, allocate space for the parameters: set lc to the right initial value by moving it over the parameter area. This is done by finding the parameter with the highest address and allocating space for it.

[1746] Save the values of level and top for resetting later [1805].

[1747] This is the only place in the compiler where level is incremented.

[1748-58] Create an element on top of the display,

[1752-3] If it was forward declared, the display is pointed to the parameters, otherwise it is initialised to nil.

[1759-62] For procedures get the parameters.

[1764-82] For functions get the parameters and result type. In both cases a parameter list is accepted even if it was forward declared. Parameterlist prints the error message.

[1765] The next field of the identifier for the routine points to the first parameter.

[1766-81] Get the function type.

[1769] If it was forward declared the result type must not be redeclared.

[1770-2] Identify the type, and store it in the idtype field of the function identifier.

[1773-5] Check that the type of the result is permissible.

[1784-92] This declaration is a forward declaration.

[1786-7] Check it has not already been forward declared, and mark the routine identifier as forward declared.

[1794-1804] Get the block of the routine: declarations and body.

[1794] Flag the routine as not forward declared, and mark the current level of the heap. Everything allocated with a new statement in the call of block will be disposed in one go by the release statement [1803].

[1795] This repeat statement is error recovery. Normally there should only be one block in this position.

[1805] Reset level, top, and lc.

Notes

  1. The name should be procorfuncdeclaration.

  2. Variables lsy and parcnt [1529-1530] are not used.

  3. The parameter with the highest address is the last one - so instead of [1731-40], the earlier parameters could just be skipped, and the value of lc calculated.

  4. Presumably the values of top and level are equal at [1746], so only one need be saved.

  5. [1759-62] and [1764-5] could be combined and [1767] changed to

    begin if fsy = procsy then error(14); insymbol

    Then the type would still be analysed.

  6. Forward is not a reserved word, and should be treated as an ordinary identifier. The test at [1784] should then be

    if (sy = ident) and (id = 'forward') then

Routine parameterlist, lines [1533-1700]

Compile the parameters of a procedure or function declaration. At run-time the parameters of a routine are handled in three ways. Variable parameters have their addresses passed over, and are accessed indirectly via these addresses. Value parameters that are not arrays or records have their values passed over and these values are accessed directly. Value parameters that are arrays or records, and therefore potentially large, have their addresses passed across only, and then within the routine the values are copied into local cells and accessed directly. When addresses are allocated to such parameters, it is the address of the local cells that get used.

[1539-40] If there are parameters, this routine must not have been previously forward declared.

[1544] There are four classes of parameter: value, variable, procedure and function. This while gets each group of parameters (up to each semicolon).

[1546-66] Procedure parameters.

[1547] Procedure parameters are not implemented.

[1548] This repeat gets each of a list of identifiers.

[1550-6] Create an identifier for the parameter.

[1552] All parameters are linked together by their next field.

[1558-9] Allocate space for the parameter. 'Some size' would be whatever size a formal procedure would take.

[1569-1611] Function parameters.

[1569-89] Similar to procedure parameters.

[1591-1609] Get the function type.

[1592-4] Get the type identifier.

[1595-7] Check its type.

[1598-1602] Work down the list of function identifiers just created filling in the type.

[1603] Add the list of function identifiers to the list of parameters.

[1613-68] Variable and value parameters.

[1614-16] The only difference in treatment between these two is that variable parameters have kind formal and value parameters are actual

[1618] Count will count the number of parameters declared in the next group (i.e. of the same type) for the purposes of address assignment [1649].

[1619] The repeat gets each identifier in the group.

[1620-26] Create the identifier. As usual, they are linked by the next field.

[1636-66] Get the type.

[1638-40] Get the identifier.

[1641] The size of variable parameters, value records and value arrays is ptrsize [43], the size to hold an address.

[1643-4] The size of simple value parameters depends on their type.

[1645] Value files are not permitted.

[1646-61] Assign addresses to the parameters. Since the identifiers are chained together backwards, addresses are assigned backwards.

[1646] Make lsize a multiple of parmal.

[1648] Align lc.

[1649] Allocate space for all the variables declared in this group (count says how many).

[1651-9] Work back down the chain allocating addresses: llc decreases by lsize for each parameter.

[1660] Link this group into the list of parameters.

[1678] Fsy is either [semicolon] or [semicolon, colon] depending on whether this is a parameter list for a procedure or function [1760, 1764].

[1682-96] Since the parameters are chained backwards, the list must be reversed. While doing this, allocate the local space for the value arrays and records.

[1697] Return a pointer to the first parameter.

Notes

  1. This procedure is too long.

  2. [1547-65] and [1570-89] are so similar they should be a procedure.

  3. If the type identifier is missing for a formal function, no error is reported.

    else error(2)

    should appear after [1585]. Similarly, if there is no identifier in a variable or value parameter, there is no error reported.

    else error(2)

    should appear after [1629].

    Pascal Implementation by Steven Pemberton and Martin Daniels