Pascal Implementation by Steven Pemberton and Martin Daniels

Chapter 9: Compiling the Program

A block is the declaration and body of a routine or of the main program. A body is the statements between begin and end.

The main program is treated exactly as if it were a parameterless procedure, so that the code to execute the program is

 MST 0
 CUP label        Call the program
 STP

Every body starts with two instructions to create the stack frame:

 ENT 1 segsize    The size of everything up to the stack area
 ENT 2 stacksize  The size of the stack area

Similarly, every body ends with a RET instruction: RETP for procedures and the main program, and RETI, RETR, RETC, RETB, and RETA for integer (and enumeration), real, character, boolean, and pointer functions respectively.
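The choice of return instruction can be pictured as a simple case selection on the kind of result. The following is only a sketch: the type resultkind and the procedure writeret are invented for illustration and do not appear in the compiler, which makes the decision from the routine's result type.

```pascal
program retdemo(output);

{ Hypothetical classification of routine results; these names are
  illustrative only and are not taken from the compiler. }
type
  resultkind = (noresult, intresult, realresult, charresult,
                boolresult, ptrresult);

var
  k: resultkind;

{ Write the return mnemonic for a body with the given kind of result. }
procedure writeret(kind: resultkind);
begin
  case kind of
    noresult:   write('RETP');  { procedures and the main program }
    intresult:  write('RETI');  { integer and enumeration functions }
    realresult: write('RETR');
    charresult: write('RETC');
    boolresult: write('RETB');
    ptrresult:  write('RETA')
  end
end;

begin
  { List the mnemonic for every kind of result, one per line. }
  for k := noresult to ptrresult do
  begin
    write(' ');
    writeret(k);
    writeln
  end
end.
```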

Routine block, lines [852-3, 3566-88]

[3567] Set dp (declaration part) to true to indicate that declarations are being compiled. This only affects the listing [332].

[3569-78] Compile the declarations.

[3583-7] Compile the body. The repeat loop is for error recovery. Normally only one block would appear here.

Notes

  1. Variable test [853] is used very heavily throughout the enclosed procedures, as pointed out earlier, and should be replaced by a number of variables local to each routine that uses it.

  2. Fsys does not contain casesy (see [3998]), because it is ambiguous, also occurring in record variants. Therefore it is not included in fsys until the declarations have been compiled [3584].

Routine body, lines [1808-22, 3469-564]

[3470] If this is the body of a routine, fprocp contains the information about it, including its entry name which was generated during its declaration [1722-3]. If fprocp is nil, this is the body for the program, and the entry name has not yet been generated.

[3472] Cstptrix is the pointer into the array of constants for this body, used for code generation.

Topnew and topmax are used for calculating the size of the local stack area (see procedure mes [1825-8]). Initialising these two to lcaftermarkstack assumes that routines will be called in this body, and leaves space on the stack for this to be done. If routines are not called, all that happens is that the stack frame allocation is slightly too large.
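The idea behind topnew and topmax can be sketched as follows. This is a simplified model rather than the compiler's own code: the procedure note, the array effect, and the value given to lcaftermarkstack are assumptions made for the example, and each stack element is taken to occupy one unit.

```pascal
program stackdemo(output);

const
  ninstr = 5;
  lcaftermarkstack = 5;  { assumed value, for illustration only }

var
  effect: array [1..ninstr] of integer;
  topnew, topmax, i: integer;

{ Record the effect of one instruction on the stack:
  topnew follows the current depth, topmax remembers the deepest point. }
procedure note(delta: integer);
begin
  topnew := topnew + delta;
  if topnew > topmax then topmax := topnew
end;

begin
  { A hypothetical sequence: push, push, combine two operands
    into one, push, store. }
  effect[1] := 1;  effect[2] := 1;  effect[3] := -1;
  effect[4] := 1;  effect[5] := -1;

  { Both counters start above the space reserved for a routine call. }
  topnew := lcaftermarkstack;
  topmax := lcaftermarkstack;

  for i := 1 to ninstr do note(effect[i]);

  writeln('topnew = ', topnew:1, ', topmax = ', topmax:1)
end.
```

With these values the program prints "topnew = 6, topmax = 7". Only the maximum matters for the frame size; the final value of topnew is irrelevant.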

[3473-4] Output the label that starts this body. The two ENT instructions specify the size of the stack frame: segsize will be the size of the result, plus the system area, plus the parameters, plus the local variables.

Stacktop will be the size of the local stack for this body.

[3475-97] If this is a routine, then generate the copy instructions for any value record or array parameters. Llc1 works along the parameter addresses.

[3488] Load the address of the local area, the destination of the copy.

[3489] Load the address of the source.

[3490] Generate the code to copy the value.

[3498] Lcmax will be the maximum value of lc, as its name implies. 'Will be', because compiling for statements and with statements may allocate further variables.

[3499-504] Compile the statements of the body. The inner loop is error recovery.

[3517-26] Generate the return instruction for a routine.

[3521-5] Define the values for segsize and stacktop.

[3528-34] Do the same for the program. The final 'q' indicates the end of the body of the program.

[3535-40] Generate the code to call the program.

[3541-59] Check that the files mentioned in the program heading were all declared. Since file types are not implemented, this effectively checks only that input, output, prr, and prd were declared.

Notes

  1. This routine is a bit messy, and would benefit from some reorganisation. The major source of messiness is the treatment of the program.

    One solution would be to create an identifier for the program, of class proc, and with no parameters, so that the if statements at [3470], [3475], and [3517] would not be needed.

    As for [3535-63], this does not belong here; it should be part of programme [3590-621].

  2. It is unclear why there is a need for two code records. There is nothing to prevent the instructions of the second record being output as the first three instructions of the first, where they should be. This should be done by procedure programme.

  3. The messages at [3551-5] will only be output if the filename was declared as something other than a file. If it was undeclared, llcp will be uvarptr [616] which has an idtype of nil [3776], and so nothing will be printed.

    There should also be an extra space after the word 'file' on [3553] to separate it from the filename.

  4. Variable i [1819] is never used.

Routine programme, lines [3590-621]

Compile the program.

[3598-608] Get the parameters of the program and link them together in a list in fextfilep.

[3615-7] Compile the block. The loop is for error recovery.

[3619-20] Make sure that all the errors have been printed for the last line.

Note

  1. "else error(3)" should appear after line [3614].

Initialisation and the Main Program [3624-4000]

Most of this has been discussed in passing earlier. To gather it together, here is a reminder about these parts.

stdnames [3624-39]

The names of the Pascal standard identifiers. Questions to ask include why there is no round, page, or dispose. Maxint, ordminchar, and ordmaxchar would also be easy to include. This procedure should really be part of inittables [3823 et seq.].

enterstdtypes [3641-67]

Create the structures for the standard types. Variable sp is never used. Nilptr should really be called addressptr to more clearly reflect its use. Parmptr is only used as a parameter to align, and is dubious as a type. Text is the type of the standard files.

entstdnames [3669-762]

Enterstdids would be a better name (except that it doesn't differ in the first 8 characters from enterstdtypes so entstdids would have to do). Creates the identifiers for the standard Pascal identifiers. As pointed out before, nil should be a reserved word, not an identifier.

The section [3724-32], apart from what is mentioned in the comment, is also for new, release, readln, and writeln; [3733-8] is also for mark; [3739-47] is also for eoln.

enterundecl [3764-96]

These are the identifiers returned by searchid when the required identifier is not found. It is hardly worth calling genlabel for undeclared routines [3787, 3793], "pfname:=0" would do just as well. Anyway, pfname is a packed field and so may not be passed as a variable parameter like this.

initscalars [3798-809]

A real hotch-potch. The allocation of space for the files should be done in procedure programme. Anyway, the comment is wrong: four files are being allocated (see [68]).

The instruction counter ic is initialised to 3 because the first three instructions are not generated until the end.

initsets [3811-21]

Some commonly used syntax synchronising sets. If Pascal allowed set constant declarations, these could all be constants.
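As a reminder of how such sets are used in error recovery, here is a minimal sketch of skipping to a synchronising symbol. It is not the compiler's code: the token array, the variable cursor, and the small symbol type are invented so that the example is self-contained, whereas the compiler reads its symbols from the source text.

```pascal
program skipdemo(output);

const
  ntokens = 6;

type
  symbol = (ident, becomes, intconst, semicolon, beginsy, endsy, othersy);
  setofsys = set of symbol;

var
  tokens: array [1..ntokens] of symbol;  { hypothetical input }
  cursor: integer;                       { index of the current symbol }
  statbegsys: setofsys;

{ Skip symbols until one in fsys is found, or the input runs out. }
procedure skip(fsys: setofsys);
var
  done: boolean;
begin
  done := false;
  while not done do
    if cursor > ntokens then done := true
    else if tokens[cursor] in fsys then done := true
    else cursor := cursor + 1
end;

begin
  { Three symbols of rubbish, then a semicolon and a compound statement. }
  tokens[1] := othersy;   tokens[2] := intconst;
  tokens[3] := othersy;   tokens[4] := semicolon;
  tokens[5] := beginsy;   tokens[6] := endsy;

  statbegsys := [ident, beginsy];  { an assumed, much reduced set }

  cursor := 1;
  skip([semicolon] + statbegsys);
  writeln('resumed at token ', cursor:1)
end.
```

Here the program prints "resumed at token 4": the rubbish is discarded and compilation can resume at the semicolon.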

inittables: reswords [3824-41]

The table of reserved words in increasing length order. Forward should not be a reserved word. Nil should be.

symbols [3843-65]

The symbols to match the reserved words, and single character operators. If implemented as an array of records, the relationship would be clearer, for example

with sym[1] do
begin name:= 'if'; symbol:=ifsy end;

etc.
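A fuller sketch of the array of records suggested above, written as a complete program in standard Pascal. The type names symkind and symentry, the procedure initsym, the function lookup, and the three-entry table are all assumptions made for the example; the real tables are much larger and are organised by word length.

```pascal
program symdemo(output);

const
  nwords = 3;

type
  alpha = packed array [1..8] of char;
  symkind = (ident, ifsy, dosy, ofsy);
  symentry = record
               name: alpha;       { the reserved word, space filled }
               symbol: symkind    { the symbol it stands for }
             end;

var
  sym: array [1..nwords] of symentry;

{ Fill the table; each word sits next to its symbol. }
procedure initsym;
begin
  with sym[1] do begin name := 'if      '; symbol := ifsy end;
  with sym[2] do begin name := 'do      '; symbol := dosy end;
  with sym[3] do begin name := 'of      '; symbol := ofsy end
end;

{ Return the symbol for id, or ident if id is not a reserved word. }
function lookup(id: alpha): symkind;
var
  i: integer;
  found: symkind;
begin
  found := ident;
  for i := 1 to nwords do
    if sym[i].name = id then found := sym[i].symbol;
  lookup := found
end;

begin
  initsym;
  if lookup('do      ') = dosy then writeln('do is a reserved word');
  if lookup('count   ') = ident then writeln('count is an identifier')
end.
```

Keeping the word and its symbol in one record means that an entry cannot be added to one table and forgotten in the other.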

rators [3867-76]

The tables for operators. Variable ch is not used.

procmnemonics [3878-86]

The code names for the standard procedures and functions, as used in the CSP instruction. WRO and PAK are never used.

Why is ELN (for eoln) a CSP, and EOF an instruction [3892]?

instrmnemonics [3888-906]

The mnemonics for the instructions. Possibly wasteful using four characters.

chartypes [3908-44]

Classification of characters, not all used. No need for ordint.

initdx [3946-70]

The effects on the stack of the instructions and standard routines. These could be part of a record with their mnemonics:

 
with instr[1] do
begin name:='abi'; effect:=0 end;

Main Program [3978-4000]

[3986-9] Enter the standard identifiers.

[3990-2] Create a display element for the program identifiers.

[3997-8] Compile the program!
