Pascal Implementation by Steven Pemberton and Martin Daniels
Function calls are a part of expressions, and procedure calls are a part of statements. Both procedure and function calls generate the same code.
The code for a procedure call like p(a, b) is something like
MST level Mark stack -- prepare for a new routine [Code for a] [Code for b] CUP m label Call user procedure
In the CUP
, m
is the size of the parameter space, and label
the label of the first instruction
of the procedure.
When compiling a call to a procedure or function, each actual parameter is checked to see that it matches in kind and type with the corresponding formal parameter of the procedure's declaration. Problems arise when calling routines that were themselves passed over as parameters. For instance the call of fp in the following:
procedure ap(procedure fp); begin fp(2) end;
In the call of fp, nothing is known about its parameters, so any number are accepted, and their types are not checked, although they must be value parameters. This compiler does not implement procedures passed as parameters, but it does do all checking on them, just failing to generate code.
It is important that you should be able to distinguish between the formal parameters of a procedure (as declared in its declaration) and the actual parameters of a call to the procedure.
There are four kinds of parameter: value, variable, procedure, and function. For instance, in order:
procedure p(a:integer; var b: integer; procedure fp; function ff: integer);
Make sure that you understand what is meant for instance by 'an actual value parameter to a formal procedure'. Fp is a formal procedure in the above example.
The four formal parameter kinds are distinguished as follows: formal procedures and functions have klass = proc or klass = func. Variable and value parameters have klass = vars, with value parameters having vkind = actual and variable parameters having vkind = formal.
This compiles procedure and functions calls to both standard and user-declared routines.
[2695-714] Standard routines. Fcp, passed as a parameter [2227], is the pointer to the identifier of the routine being called. Each standard routine has a compiler routine to deal with it. If pfdeckind is standard this section decides which procedure should deal with it.
[2699-700] Read, readln, write, and writeln may have no parameters. So procedure read [2247-2302], which deals with read, and readln, and procedure write [2304-96], which deals with write and writeln, take care of the brackets for those cases. All other standard procedures have at least one parameter, and these two lines and [2712-3] deal with the brackets.
[2701-11] Select the required standard routine handler. Their names indicate their uses.
[2716-34] Standard functions.
[2717-21] Eof and eoln may have no parameters.
[2720] All others have one expression parameter.
[2722-31] Select the standard function handler. Their names reflect their use, except that eof also compiles eoln.
[2736] Call the procedure to deal with declared procedures and functions. Remember that the arithmetic functions sqrt, sin, cos, exp, ln and arctan are declared rather than standard [3753-61].
As mentioned before, the standard routines should not be numbered, but should use an enumeration, such as
pfkey := (stdput, stdget, stdreset...);
which is more explicit and easier to read.
To suit the naming conventions used, eof should be called eofeoln, read should be called readreadln, and write should be called writewriteln.
Compile a call to a declared routine.
[2598] Locpar counts the number
of locations used for parameters for this call, used in the CUP
instruction [2688].
[2600] The next field points to the identifier for the first parameter; nxt will step along the formal parameters each linked together by the next field. Lkind records whether the routine being called is formal or actual.
[2601] Generate the mark-stack instruction MST
, with the
difference in level between where it is being called from and where it was
declared.
For example in the following
program, the comments indicate the values of MST
that would be
generated for the calls.
The MST
instruction is explained in detail in the
chapter on the interpreter.
program a; procedure b; begin end; procedure c; procedure d; begin b; (*2*) c; (*2*) d (*1*) end; begin (*c*) b;(*1*) c; (*1*) d (*0*) end; begin (*main *) b; (*0*) c (*0*) end.
[2603-82] Deal with parameters.
[2604] Save the current value of lc; copied back [2680]. It is unclear why this is done.
[2605] This repeat statement deals with each parameter (ending at [2679]). Lb indicates whether the next parameter is a formal routine or not. It is true if it is a formal procedure or function, and false if it is a variable or value parameter.
[2608] If nxt is nil, the formal parameters of the routine declaration have been exhausted, so there are too many actual parameters.
[2610] Error 399 reports an implementation restriction that formal routines are not implemented.
[2618-34] The formal parameter is a procedure or function. Since the compiler does not deal with these, only syntax and semantics are checked; no code is generated for them.
[2620-1] The actual parameter must be an identifier.
[2624-9] Make sure the identifier is a procedure or function as required, and if a function, that its type matches that required.
[2631-2] Make sure that the parameter was just an identifier.
[2636-77] The formal parameter is a variable or value parameter. Get the expression.
[2638] If the routine being called is an actual routine, then the parameter can be checked...
[2640] ... as long as the formal parameters have not been exhausted.
[2644-61] Deal with value parameters.
[2645-55] 'Small' value parameters.
[2646] Load the actual parameter.
[2647] Generate checking code if required.
[2648-52] If the formal parameter is real, and the actual integer, float it.
[2653-4] Increment locpar by the required amount. Every parameter must be aligned on a parmal boundary [65], since each parameter is loaded on the stack and uses an integral number of stack elements.
[2658-60] 'Large' value parameters (arrays and records) have only their address loaded. The called procedure contains code to copy the value in to local cells (see [3475 et seq.]).
[2663-8] Variable parameters.
[2663] The actual parameter must be a variable.
[2669] Check the formal-actual type correspondence.
[2675] The comment here should read 'pass value parameter to formal procedure'
[2678] Get the next formal parameter.
[2684] At the end of the actual parameter list check that all the formal parameters were used up -- that is, that there were enough actual parameters.
[2687-8] Generate the call instruction.
[2691] Get the result type for functions; this will be nil for procedures.
The Pascal report states that formal procedures may only have value parameters. Therefore the comment starting at [2611] is redundant.
If comptypes(realptr, lsp) [2648] is unnecessary, since if lsp=realptr serves just as well
Despite external procedures being treated differently from standard procedures, the same calling code is generated for both kinds.
The Pascal standard procedures and functions can be split into two kinds: those that are essential to Pascal, in the sense that they have special syntax and special rules for the types that are allowed, and therefore cannot be written in Pascal, and convenience routines with no special syntax and type rules, that could be written in Pascal
The routines with special syntax and type rules are write and writeln, which can have any number of parameters, and which can have width specifications, read and readln which can have any number of parameters, and eof and eoln which can have zero or one parameter; routines with special type rules are get, put, reset, and rewrite which work on files of any type; pack and unpack which work on packed arrays of any type; new, mark and release which take a pointer of any type (mark and release are special to this compiler only); and pred and succ which work on any scalar type.
The routines that could be written in Pascal are abs, sqr, trunc, odd, ord, and chr. These could have been called in a different way by using the structures created in entstdnames [3669-762] instead of having a separate compiling routine for each of them.
Many standard procedures have variable parameters. This compiles one actual variable parameter.
[2242] The parameter to get, put, reset, and rewrite must be a file.
[2244] Reset and rewrite are not implemented.
Deals with compiling calls to read and readln.
[2251] The default for read is input. This creates the address attributes of file input.
[2252] Readln may have no parameters.
[2254] Get the first parameter, which may be a file (for example readln(input)).
[2257] Test to see if it is a file.
[2261] Get the address of the file.
[2262] Only files of char are implemented.
[2263-6] Read(input) has no meaning, and is not allowed.
[2271] Get the next parameter after the file, if there is one.
[2275] At this point either test = true, meaning there are no more parameters, or the first non-file parameter has been compiled.
[2276] This repeat statement deals with each parameter in turn. A call to read is split into several calls. For example,
var i: integer; r: real; c: char; read(i, r, c)
is coded as
[code to load address of i] [code to load address of input] CSP RDI (read integer) [address of r] [address of input] CSP RDR [address of c] [address of input] CSP RDC
Additionally, a readln is followed by
[Address of input] CSP RLN
[2276] Load the address of the variable.
[2277] Load the address of the file.
[2279-87] Generate the call to the routine.
Variable lcp [2248] is not used.
The calculation of the address of input is poor because the value is hard-wired in. If it were required to change the address of input this bit of code could easily be overlooked. The address of input (and output) is calculated at [3711], and this value should be saved somehow, and the saved value used here. The same remarks apply to line [2308] where the address of output is used.
Whenever the address of a file is loaded
([2277, 2299] and many places following), an LDA
instruction is
used to load the address, even if the file is global (always the case
as things stand) when an LAO
could be used (load address
in outermost block). The code to fix this would be
if llev = 1 then gen1(32(*ldo*), laddr) else gen2(50(*lda*), level-llev, laddr)
This would occur so often that it would be worth making into a procedure.
The error 399 at [2288] is for an attempt to read an enumeration. This is not a restriction, but an error -- Pascal does not allow it. Error 116 would be better.
Similar in many ways to read, compiles write and writeln calls. Only the differences to read will be mentioned here.
[2337] Cope with a width specification (for example, write(a:2)).
[2340] The width specification must be integer.
[2343] The absence of a width specification means use a default.
[2344-50] Reals may have a fixed point specification (for example write(r:10:5)).
[2347] The width must be integer.
[2348] The expression must be real.
[2349] But fixed point specifications are not implemented.
[2352] Output of an integer.
[2353] If a default width is needed, load the default of 10.
[2354] Load the address of the file.
[2355] Call WRI
, write integer.
[2358-62] Reals, default width is 20.
[2364-8] Characters, default width is l.
[2372] Output of enumerations is not allowed.
[2374-81] Strings. Default is the length of the string.
It would be easy enough to allow fixed-point
formats, by having another standard procedure say WRF
. Replace
error(399) on [2349] by
gen2(50 (*lda*), level-llev, laddr) genl(30 (*csp*) .... (*wrf*));
The default widths for integers and reals should not be hard-wired into the program like this [2353, 2359], but should be constant at the start of the compiles (or even better, since the sizes of intergers and reals are properties of the target machine, be part of P-code).
Pascal does not allow the output of enumerations
[2372], but it does allow the output of booleans. This would be easy
enough to implement, with a WRB
standard
procedure. Before [2369] should be inserted
else if lsp=boolptr then begin if default then gen2 (51(*ldc*), 1, boolwidth); gen2(50 (*lda*), level-llev, laddr); genl(30(*csp*), ... (*wrb*)) end
Neither pack nor unpack are implemented. The format of a pack statement is
pack(a, i, p)
where a is an unpacked array, p is a packed array, and i is an index into a.
The format of an unpack is
unpack(p, a, i)
[2400] This line gets the first parameter.
[2403-6] Check it is an array.
[2408-12] Get the second parameter; check it is a scalar and that it matches the index type.
[2414-23] Get the third parameter, and check that it is an array of the right type.
Even with constructs the compiler does not implement, it usually attempts to fully analyse them. Even though packed types are not implemented, they should at least be checked; for instance that the last parameter of pack is packed.
In the absence of packed data types, a pack or unpack is just a copying operation, which would be relatively easy to implement. In other words, the compiler should just treat packed data types as unpacked, and compile them suitably.
The form of a call to new is
new(p)
where p is a pointer, or
new(p, c1, c2, ..., cn)
when p is a pointer to a record with variant parts, and the c's select a particular variant part.
The code generated is
[Code to load the address of p] LDCI size Load the size of area to be created CSP NEW
[2458-69] Get the first parameter, check that it is a pointer, and get the variant part if it is a pointer to a record. If it is not a record, or it has no variants, lsp will be nil.
[2470-94] Get the subsequent parameters.
[2474-81] Check that the constant matches in type to the next variant tag.
[2482-92] Search the variant part for the value required and save its size. If it is not found, the variant tag holds the default size.
The variables lmin, lmax, lsz, and varts are not used.
The comment at [2473] refers to a case like
type index=1..10; r = record case i: index of . . end var p: ^r; new(p, 11);
The 11 here should not be permitted.
These two routines are not standard, but replacements for dispose, to make heap management easier in the compiler. Both take one pointer parameter.
Dispose should be implemented as well, even if the interpreter should choose to ignore calls to it.
SAV
and RST
are a poor
choice of names. Better would be MRK
and
RLS
.
These are all fairly self-evident. The only point to notice is [2554] where the test
if gattr.typtr^.form >= power
equivalent to
if not (gattr.typtr^.form in [scalar, subrange, pointer])
erroneously allows ord to be applied to both pointers and reals. (It might be incorrect Pascal, but in fact the compiler itself uses this in printtables, e.g. at [780].)
Compiles calls to eof and eoln.
[2577-80] Gets the parameter.
[2582-5] If there is no parameter, input is default.
[2587-8] Check the parameter is a file.
[2589] Generate the code
EOF
or
CSP ELN.
Again the address of input should not be hard-wired.
It is not clear why EOF
should be an
instruction, and ELN
a standard procedure. Better that
they should both be standard procedures.
Pascal Implementation by Steven Pemberton and Martin Daniels
Copyright © 1982, 2002 Steven Pemberton and Martin Daniels, all rights reserved.