ERROR DETECTION AND RECOVERY
CONTENTS
ERROR
• Programs submitted to a compiler often contain errors of various kinds.
• So a good compiler should be able to detect as many errors as possible, in various ways, and also recover from them.
• That is, even in the presence of errors, the compiler should scan the whole program and try to compile all of it (error recovery).
• When the scanner or parser finds an error and cannot proceed, the compiler must modify the input so that the correct portions of the program can be pieced together and successfully processed in the syntax analysis phase.
• Therefore, considerable forethought is needed from the very beginning of the compiler's design.
• After detecting an error, a compiler can respond in various ways:
• * A simple compiler may stop all activities other than lexical and syntactic analysis.
• * A more complex compiler may transform the erroneous input into a similar legal input on which normal processing can be resumed (repair).
• * An even more sophisticated compiler may correct the erroneous input by guessing what the user intended.
• However, no compiler can do true correction, because the compiler cannot know the programmer's intent behind erroneous code. Completely accurate error correction can be done only by the programmer.
DIAGNOSTIC COMPILER
• A compiler that gives due importance to all three aspects is called a “diagnostic compiler”. It performs complex analysis, which may require extra time or memory space.
• Eg: WATFOR, IITFORT
CORRECTING COMPILER
• These compilers perform error recovery not only from the compiler's point of view but also from the programmer's point of view, i.e., they generate code that can be executed, which eases the programmer's work. Eg: PL/C
• At the same time, error recovery should not lead to misleading or spurious error messages elsewhere (error propagation).
• Indication of run-time errors is another neglected area in compiler design, because the code generated to monitor these violations increases the target program size and slows execution.
• So these checks are included as “debugging
options”(also includes intermediate display
of values, trace of procedure calls) at the
• Good error reporting plays an important role in the construction of reliable programs.
PROPERTIES OF GOOD ERROR DIAGNOSIS
• * The messages should pinpoint errors in terms of the original source program rather than in some internal representation that is unknown to the user.
• * The error messages should be tasteful and understandable by the user.
• Eg: Showing the error as “missing right parenthesis in line 5” rather than as a cryptic code “OH17”.
• * The messages should be specific and should localize the problem.
• Eg: Showing the error as “Z not declared in procedure add” rather than “missing declaration”.
• * The messages should not be redundant.
• Eg: If z is not declared, this should be reported once, not every time z appears in the program.
SOURCES OF ERROR
• * ALGORITHMIC ERRORS:
• The algorithm used to meet the design may be inadequate or incorrect.
• * CODING ERRORS:
• The programmer may introduce errors in implementing the algorithms, either by introducing logical errors or using the
• * The program may exceed a compiler or machine limit not implied by the definition of the programming language.
• Eg:
• An array may be declared with too many dimensions to fit in the symbol table.
• An array may be declared too large to be allocated at run time.
• * COMPILER ERRORS:
• The compiler itself can introduce errors as it translates the source program into an object program.
• CRITERIA FOR THE CLASSIFICATION OF ERRORS:
• *Compile time,
• *Link / Load time
• * Run time errors.
• CLASSIFICATION OF COMPILE TIME
ERRORS:
• *lexical errors
• * Syntactic errors
• * Semantic errors.
• Lexical and syntactic errors are found during the compilation of the program.
• Most of the run time errors are semantic in
nature.
• In compile-link-go systems, compile and link errors are trapped separately by the compiler and the linkage editor / loader.
• In compile-and-go systems, the compile and
link errors will be trapped by the compiler
itself.
• Execution-time errors are detected by the run-time environment, which includes the run-time control routine, the machine hardware and the standard OS interfaces through which the status of the hardware can be accessed or monitored as and when required.
PLAN OF ERROR DETECTION IN
PORTION OF COMPILER
• The plan consists of a routine to recover from lexical and syntactic errors, a routine to detect semantic errors, and a routine to print the diagnostics.
• The diagnostic routine communicates with the symbol table to avoid printing redundant messages.
• The message printer must defer its diagnostics until a complete line of source text has been read and listed.
ERRORS SEEN BY EACH PHASE
Each phase of the compiler expects its input to conform to a certain specification. When the input does not, the phase has detected an inconsistency or error, which it should report to the user.
Moreover, in order to continue processing its input, the phase has to recover from each error. Errors are classified as lexical-phase, syntactic-phase or semantic-phase errors, depending on which compiler phase detects them.
• After detecting and reporting an error, a phase can either repair it or pass it along to the subsequent modules.
• If a phase attempts a repair, it should take precautions that the repair does not introduce a flurry of other errors.
• If a phase passes an error on, the subsequent phase should be able to deal with the erroneous input it receives.
PLAN OF ERROR DETECTOR / CORRECTOR
[Figure: block diagram of the error detector / corrector. Boxes: lexical analyzer, lexical corrector, parser, syntactic corrector, semantic checker, symbol table, diagnostic message printer. Flow labels: source code, tokens, intermediate code.]
LEXICAL AND SYNTAX ERRORS
• Two frequent sources of these errors are:
1. Spelling errors
2. Missing operators and keywords
• These errors can result from genuine oversight or from typing mistakes.
• They are common even among professional programmers.
SPELLING ERRORS - WHEN DO THEY OCCUR?
• If a program uses variable names that differ in only one or two characters, there is great scope for spelling errors.
• There is little chance of detecting and correcting such errors by automatic procedures.
• Only the programmer can tackle the problem.
MAJORITY SPELLING ERRORS
1. One character is wrong,
2. One character is missing,
3. One character is extra,
4. Two adjacent characters are transposed.
• Testing for these four types of error will not catch all spelling mistakes, but practical considerations limit the search to these four.
• Implementing these four checks is quite expensive, because an associative search has to be performed over all names in the symbol table to locate a resembling name.
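The four checks can be sketched as a single comparison between a misspelled name and one candidate name. This is only an illustration; the function name `spelling_error_kind` and the returned strings are assumptions, not part of any particular compiler.

```python
def spelling_error_kind(wrong, right):
    """Return which of the four common spelling errors turns `right` into `wrong`, or None."""
    if wrong == right:
        return None
    lw, lr = len(wrong), len(right)
    if lw == lr:
        diffs = [i for i in range(lw) if wrong[i] != right[i]]
        if len(diffs) == 1:
            return "one character wrong"
        if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and wrong[diffs[0]] == right[diffs[1]]
                and wrong[diffs[1]] == right[diffs[0]]):
            return "adjacent characters transposed"
        return None
    if lw + 1 == lr:
        # Deleting one character of the correct name gives the wrong one.
        if any(right[:i] + right[i + 1:] == wrong for i in range(lr)):
            return "one character missing"
    if lw == lr + 1:
        # Deleting one character of the wrong name gives the correct one.
        if any(wrong[:i] + wrong[i + 1:] == right for i in range(lw)):
            return "one character extra"
    return None
```

Running this test against every name in the symbol table is what makes the search expensive: the cost grows with the number of declared names.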
CORRECTION ALGORITHM
• The search masks off one or more adjacent characters of the symbol and tries to locate a matching symbol in the symbol table.
• If only one character was masked off, the located symbol can be used in place of the erroneous symbol.
• If two characters were masked off, transposition should be checked.
• If the associative search for an erroneous name matches more than one symbol in the symbol table, their attributes are used to decide the final choice.
• If more than one symbol with matching attributes results, correction is not safe and should not be attempted.
• This algorithm may fail if unusual usage of names results from valid usage of language facilities.
• So it is necessary to inform the user whenever a correction is made.
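A minimal sketch of the decision logic, assuming the symbol table is a plain dictionary from name to a single attribute (here a type string). The helper names `near` and `correct_name` are hypothetical, and the similarity test is a direct comparison rather than a literal character-masking search.

```python
def near(a, b):
    """True if a and b differ by one wrong, missing or extra character, or one adjacent swap."""
    if a == b:
        return False
    if len(a) == len(b):
        d = [i for i in range(len(a)) if a[i] != b[i]]
        return len(d) == 1 or (len(d) == 2 and d[1] == d[0] + 1
                               and a[d[0]] == b[d[1]] and a[d[1]] == b[d[0]])
    short, long_ = sorted((a, b), key=len)
    return (len(long_) - len(short) == 1
            and any(long_[:i] + long_[i + 1:] == short for i in range(len(long_))))

def correct_name(name, expected_kind, symtab):
    """Return (replacement or None, note for the user)."""
    candidates = [s for s in symtab if near(name, s)]
    if len(candidates) > 1:
        # Several resembling names: use the attributes to decide.
        candidates = [s for s in candidates if symtab[s] == expected_kind]
    if len(candidates) == 1:
        return candidates[0], f"'{name}' replaced by '{candidates[0]}'"
    # No match, or still ambiguous: correction is not safe.
    return None, f"'{name}' is not declared"
```

The note is always returned alongside the result so that the user can be told about every correction the compiler makes.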
MISSING OPERATORS AND KEYWORDS
• They can be detected from their context.
• Detection is not perfect, because certain contexts tend to hide the absence of an operator.
• Ex: G=H(A+B) typed instead of G=H*(A+B).
• In this case an error is reported only if H exists in the symbol table.
• Otherwise the compiler cannot report any error, since H could
DUPLICATE MESSAGES
• It is common to find that many messages appear owing to the same error.
• Ex: If a is used as a simple variable and the program later declares and uses it as an array a[10,10], then all references to the array will be flagged as erroneous use of the variable name.
• Duplicate messages can be avoided by setting a flag in the symbol table entry of a.
• The flag makes it possible to detect and indicate all illegal uses of the identifier.
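The flag idea can be sketched as follows. The symbol table layout (a dictionary with a `misuse_reported` field) and the function `report_misuse` are assumptions made for illustration.

```python
def report_misuse(symtab, name, line, diagnostics):
    """Record a misuse of `name`; emit a message only the first time."""
    entry = symtab.setdefault(name, {"kind": "variable", "misuse_reported": False})
    entry.setdefault("misuse_lines", []).append(line)   # every illegal use is still recorded
    if entry["misuse_reported"]:
        return False                                    # duplicate message suppressed
    entry["misuse_reported"] = True
    diagnostics.append(
        f"line {line}: '{name}' used as an array but declared as a simple variable")
    return True
```

All offending lines remain available in the symbol table entry, so they can still be listed together in one message if desired.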
RECOVERING FROM SYNTAX ERROR
• The chief concern while recovering from a syntax error is to attain a parser state from which the parser can safely resume parsing the input string.
• Most parsers detect an error when they have no legal move from their current configuration, which is determined by the state, stack contents and current input symbol.
• To recover from an error, a parser should ideally locate it, correct it and resume parsing.
TIME OF DETECTION – VALID PREFIX
PROPERTY
• LL(1) and LR(1) parsers announce an error as soon as they have seen a prefix of the input for which there is no valid continuation.
• This is the earliest time at which a parser that reads its input from left to right can announce an error.
• Adv – reports errors as soon as possible
• Limits the amount of erroneous output
Panic mode recovery
• The parser discards input symbols until a synchronizing token, usually a statement delimiter such as a semicolon, is found.
• The parser then deletes stack entries until it finds an entry from which it can continue parsing, given the synchronizing token on the input.
• I.e., skip until we encounter a symbol that tells us what the parser state should be in order to recognize it.
• Adv – simple to implement
• Never goes into an infinite loop
• On discovering an error, the parser discards input symbols one at a time until one of a designated set of synchronizing tokens is found:
◦ delimiters such as ; or }
◦ tokens with a clear and unambiguous role
◦ the set must be selected by the compiler designer
• Skips a considerable amount of input without checking it for additional errors
• Simple
• Guaranteed not to go into an infinite loop
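Panic-mode recovery can be sketched on a toy language whose only statement form is `id = num ;`. The token names, the `SYNC` set and the function `parse_statements` are assumptions made for this illustration; a real parser would also unwind its stack.

```python
SYNC = {";"}                      # synchronizing tokens chosen by the designer
EXPECTED = ["id", "=", "num", ";"]

def parse_statements(tokens):
    """Parse a sequence of `id = num ;` statements; return (statements parsed, errors)."""
    ok, errors, i = 0, [], 0
    while i < len(tokens):
        for want in EXPECTED:
            if i < len(tokens) and tokens[i] == want:
                i += 1
                continue
            found = tokens[i] if i < len(tokens) else "end of input"
            errors.append(f"token {i}: expected {want!r}, found {found!r}")
            # Panic mode: discard input until a synchronizing token is seen.
            while i < len(tokens) and tokens[i] not in SYNC:
                i += 1
            i += 1                # consume the synchronizing token itself
            break
        else:
            ok += 1               # all four tokens matched
    return ok, errors
```

Every pass through the outer loop consumes at least one token, which is why this scheme cannot loop forever. The price is that everything skipped goes unchecked.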
• Three basic policies for recovering from a syntax error:
• 1. Deletion of a source symbol
• 2. Insertion of a synthetic symbol
• 3. Replacement of a source symbol
• The motive behind all these actions is to present a new string to the parser that bypasses the error situation and lets parsing continue.
• Multiple recovery possibilities may exist.
• We should choose the one that requires the smallest number of changes: minimum distance recovery.
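One way to make "smallest number of changes" concrete is the edit distance between token sequences, counting deletions, insertions and replacements. In the sketch below the list of legal candidate repairs is assumed to be given; how a parser generates such candidates is outside the scope of the illustration.

```python
def edit_distance(a, b):
    """Minimum number of token deletions, insertions and replacements turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # replace x by y (free if equal)
        prev = cur
    return prev[-1]

def min_distance_repair(erroneous, legal_candidates):
    """Pick the legal token string closest to the erroneous one."""
    return min(legal_candidates, key=lambda c: edit_distance(erroneous, c))
```

For the erroneous input `id = id + ;`, inserting one `id` costs a single change, whereas cutting the statement down to `id ;` costs three, so the insertion is preferred.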
RECOVERY IN TOP DOWN PARSING
• There are two methods.
• The first is to try to successfully complete the predictions existing in the stack at the error point.
• Ex: Input string: Aα
• Last prediction was W: …ABν
• If no other rules exist with A on the right-hand side, recovery can be effected by inserting B and deleting parts of α until a ν is recognized in the source string.
• The other is to unstack symbols from the parser stack until the top-of-stack (TOS) symbol is one that can produce one of the synchronizing symbols.
RECOVERY IN BOTTOM UP PARSING
• In bottom-up parsing, insertion of symbols is preferable to deletion, because it is easy to determine which symbol is to be inserted. Routines may be devised to carry out the specific recovery action.
• Replacing or deleting the next few source symbols is also done.
OPERATOR PRECEDENCE PARSING
• An operator precedence parser uses a set of production rules and an operator precedence table to parse an arithmetic expression.
E → E + E | E – E | E * E | E / E | E ^ E | ( E ) | - E | id
ERROR RECOVERY IN OPERATOR
PRECEDENCE PARSING
• There are two types of operator precedence parsing errors: character pair errors and reducibility errors.
• A character pair error occurs when there is no operator precedence relation between a pair of symbols in the grammar.
• A reducibility error occurs when the handle cannot be reduced to the left-hand side of any production.
CHARACTER PAIR ERROR RECOVERY
• Fill each empty entry of the precedence table with a pointer to an error routine. Example:
E1 – ‘missing operand’ – whole expression is missing
E2 – ‘unbalanced right parenthesis’
E3 – ‘missing operator’
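A sketch of a precedence table whose empty entries have been filled with error codes. Only the terminals id, (, ) and the end marker $ are shown; the operators are left out for brevity. The layout follows the usual textbook arrangement and is an assumption here, as is the extra entry E4, which is not in the list above.

```python
# (symbol on top of stack, incoming symbol) -> relation or error code
PRECEDENCE = {
    ("id", "id"): "E3", ("id", "("): "E3", ("id", ")"): ">",  ("id", "$"): ">",
    ("(", "id"): "<",   ("(", "("): "<",   ("(", ")"): "=",   ("(", "$"): "E4",
    (")", "id"): "E3",  (")", "("): "E3",  (")", ")"): ">",   (")", "$"): ">",
    ("$", "id"): "<",   ("$", "("): "<",   ("$", ")"): "E2",  ("$", "$"): "E1",
}

ERROR_ROUTINES = {
    "E1": "missing operand",
    "E2": "unbalanced right parenthesis",
    "E3": "missing operator",
    "E4": "missing right parenthesis",   # assumed addition, not listed in the slides
}

def check_pair(top, incoming):
    """Return the precedence relation, or the diagnostic for a character pair error."""
    entry = PRECEDENCE[(top, incoming)]
    if entry in ERROR_ROUTINES:
        return f"{entry}: {ERROR_ROUTINES[entry]}"
    return entry
```

For example, two adjacent identifiers such as `id id` hit the E3 entry, because an operator must have been left out between them.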
REDUCIBILITY ERROR RECOVERY
• The parser decides which right-hand side the popped handle “looks like” and tries to recover from that situation.
• Same as shift-reduce errors
HANDLING SHIFT-REDUCE ERRORS
• Generic shift-reduce strategy:
– If there is a handle on top of the stack, reduce
– Otherwise, shift
• But what if there is a choice?
– If it is legal to shift or reduce, there is a shift-reduce conflict
– If it is legal to reduce by two different productions, there is a reduce-reduce conflict
• Ambiguous grammars always cause conflicts
• But beware, so do many non-ambiguous grammars
• To resolve this, we should modify the grammar.
SEMANTIC ERRORS
• Can be both local and global in scope.
• Types
– Immediate errors
• Can be detected while processing the erroneous statement itself.
– Delayed errors
• Cannot be detected while processing the statement.
• But can be detected at a later stage when its effect is
EXAMPLES FOR SEMANTIC ERRORS
• Illegal Operator or Operand (immediate)
• Control Structure Violation (both)
• Missing Labels (delayed at the end of the program)
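The missing-label case shows why some semantic errors are delayed: a goto may legally precede its label, so the check can only be completed at the end of the program. A minimal sketch follows, with a hypothetical statement representation of (line, kind, name) triples.

```python
def check_labels(statements):
    """statements: list of (line, kind, name), kind being 'label' or 'goto'."""
    defined = set()
    first_use = {}                            # label name -> line of first goto
    for line, kind, name in statements:
        if kind == "label":
            defined.add(name)
        elif kind == "goto":
            first_use.setdefault(name, line)  # a forward reference is legal so far
    # Only at the end of the program can missing labels be reported.
    return [f"line {line}: label '{name}' is never defined"
            for name, line in first_use.items() if name not in defined]
```

Each missing label is reported once, against the first statement that refers to it.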
THE ERROR PRINT ROUTINE
• Messages have to be displayed for all errors that are detected, or detected and corrected, in the source program.
• The error print routine is the common agency used by all individual compiler routines for this purpose.
• The text of each error message is normally stored in a table local to this routine.
• Associated with each message is a numerical value indicating its error severity.
• This value is mainly used for purposes internal to the compiler's operation (e.g., deciding whether or not to allow the program to reach the execution stage).
1 Warning and correction. Compilation continues and the compiled program will execute.
2 Warning only. Compilation continues and the compiled program will execute.
3 Fatal error. Compilation continues but the compiled program will not execute.
• For each individual error, two items of information need to be passed to this routine: the error number and the statement number.
• The structure and logic of the routine depend largely on the decisions regarding the place where the message is to be printed.
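A minimal sketch of such a routine, using the three severity levels listed above. The error numbers, message texts and the class name `ErrorPrinter` are hypothetical; messages are collected in a list here rather than printed, since the place of printing is a separate design decision.

```python
# Message table local to the routine: error number -> (severity, text)
MESSAGES = {
    17: (1, "missing right parenthesis inserted"),
    23: (2, "variable declared but never used"),
    41: (3, "undeclared identifier"),
}
SEVERITY = {1: "warning, corrected", 2: "warning", 3: "fatal"}

class ErrorPrinter:
    def __init__(self):
        self.lines = []
        self.max_severity = 0

    def report(self, error_number, statement_number):
        """Called by any compiler routine with the two items of information."""
        severity, text = MESSAGES[error_number]
        self.max_severity = max(self.max_severity, severity)
        self.lines.append(
            f"stmt {statement_number}: [{SEVERITY[severity]}] E{error_number} {text}")

    def may_execute(self):
        """Severity 3 lets compilation continue but blocks execution."""
        return self.max_severity < 3
```

Callers pass only numbers; the message text and the severity stay inside the routine, which keeps the wording of diagnostics in one place.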
Desirable place for printing error
messages
• The messages are best printed against the erroneous statement itself.
• Single-pass compilers find it difficult to indicate all errors against the offending statement.
• Multipass compilers can provide such error indication.
• Many Fortran compilers indicate errors on a line-by-line basis as far as possible, since syntax analysis and output listing are both performed in the same pass, normally the first.
• Some compilers group all error messages at the end of the program.
• This has the advantage that the problem of duplicate messages for similar misuse of an identifier can be satisfactorily solved.
• The compiler error table will be in the form
Error number (Message identifier) | Erroneous statement | Auxiliary information | Message text
Runtime errors
• Run-time errors are detected by
1. The run-time control routine, which is interfaced with the generated code in a standard manner
2. The machine hardware
• The agency required to detect a particular type of error depends on the nature of the error and in general varies from machine to machine and compiler to compiler.
Detection of runtime errors
• Arithmetic exceptions
These arise from violations of the semantics of machine computations.
They include frequently occurring error conditions such as overflow, underflow and loss of precision.
Present-day architectures detect most of these conditions at the machine hardware level and indicate their presence through interrupts or traps.
• Input/output errors
Device error conditions and end-of-file conditions on an input file are usually detected by the IOCS, which sets an appropriate flag to indicate their occurrence.
The run-time control routine should make appropriate provisions to obtain control when such conditions arise.
Ex: Fortran statement
read(5,100,err=110,end=120) A,B
The generated code transfers control to statement 120 on end of file and to statement 110 on error.
• Dimension overruns
Overall array bound check
Individual subscript bound check
WATFOR compilers do these
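The difference between the two dimension checks can be sketched as follows. For simplicity the sketch assumes 0-based subscripts and row-major storage (Fortran itself uses 1-based subscripts and column-major order); both function names are hypothetical.

```python
def overall_check(dims, subs):
    """Overall check: is the linearized offset inside the array's storage?"""
    offset, total = 0, 1
    for size, s in zip(dims, subs):
        offset = offset * size + s     # row-major address computation
        total *= size
    return 0 <= offset < total

def subscript_check(dims, subs):
    """Individual check: is every subscript inside its own dimension?"""
    return all(0 <= s < size for size, s in zip(dims, subs))
```

For a 10 x 10 array, the reference (0, 12) maps to offset 12, which lies inside the 100 elements, so the overall check passes although the second subscript is out of range. The individual check catches it, at the cost of one comparison pair per subscript.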
Programmer Recovery Options
• The difference between compile-time and run-time errors lies in the type of recovery possible and its implications for the programmer.
• Syntax errors can be patched up in a standard manner in order to extend the life of the program and push it to execution.
• The same holds for run-time errors, but here the difference is that the programmer can foresee the run-time errors and correct them.
• The standard recovery action may not suit a programmer.
• Some languages provide such options, e.g. PL/I.
• Whenever an exception occurs, the run-time control routine has to decide what action to take.
• It maintains a run-time exception table.
• Ex.
    ON SUBSCRIPTRANGE I = 5;
    ON OVERFLOW I = 25;
    ………
    I = A(J,K-4)/X;
    ………
Type of Exception | Program Action | System Action
Overflow | I = 25 | Make the standard assumption regarding the resulting value. Return.
Subscript Range | I = 5 | Cancel the program.
• The compiler generates code for inserting and deleting entries in the program action fields, depending on the scope of the program-indicated recovery actions.
Type of Exception | Scope | Recovery action stack pointer
Overflow | --- | ---
Subscript range | --- | ---
• The Scope column indicates where the scope of the programmer-indicated recovery action ends.
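A sketch of a run-time exception table with a stack of program actions per exception type. The class and method names are hypothetical, the system actions are reduced to descriptive strings, and the rule that a program action takes priority over the system action is an assumption of this sketch.

```python
class ExceptionTable:
    SYSTEM_ACTION = {
        "OVERFLOW": "use standard result and return",
        "SUBSCRIPTRANGE": "cancel the program",
    }

    def __init__(self):
        self.program_action = {}       # exception type -> stack of handlers

    def on(self, exc_type, handler):
        """Entering the scope of an ON statement: push the program action."""
        self.program_action.setdefault(exc_type, []).append(handler)

    def revert(self, exc_type):
        """Leaving that scope: pop the program action."""
        self.program_action[exc_type].pop()

    def signal(self, exc_type, env):
        """Called by the run-time control routine when an exception occurs."""
        handlers = self.program_action.get(exc_type)
        if handlers:
            handlers[-1](env)          # innermost programmer-specified action
            return "program action"
        return self.SYSTEM_ACTION[exc_type]
```

The calls to `on` and `revert` correspond to the code the compiler generates for inserting and deleting entries as control enters and leaves the scope of a recovery action.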
Debugging aids and options
• Run-time checks are costly in terms of code space and execution time.
• These checks are therefore offered as debugging options.
Trace and Sub traces.
• Procedure calls are printed out at the user's option to indicate the flow of control.
• The trace is written into special debugging files.
• The debugging file consists of output for the statements and variables.
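A procedure-call trace can be imitated with a small wrapper. Here an in-memory `io.StringIO` object stands in for the debugging file, and `TRACE_ENABLED` plays the role of the user option; all names are assumptions for illustration.

```python
import functools
import io

trace_file = io.StringIO()     # stand-in for a special debugging file
TRACE_ENABLED = True           # the user-selected debugging option

def traced(func):
    """Wrap a procedure so that each call and return is written to the trace."""
    @functools.wraps(func)
    def wrapper(*args):
        if TRACE_ENABLED:
            trace_file.write(f"call {func.__name__}{args}\n")
        result = func(*args)
        if TRACE_ENABLED:
            trace_file.write(f"return {func.__name__} -> {result!r}\n")
        return result
    return wrapper

@traced
def fact(n):
    return 1 if n <= 1 else n * fact(n - 1)
```

With the option switched off, the only overhead left is the flag test, which reflects why such checks are offered as options rather than always compiled in.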
Assignment Checks
• Assignments to a variable are monitored by the system.
Intermediate and error Dumps
• Intermediate dumps can be produced during execution.
• A dump may also be produced on abnormal program termination.
Conversational debugging
• Facilities are provided through which the programmer can set breakpoints in the program.
• When the program reaches a breakpoint, a conversation is initiated with the programmer.