Error Detection Recovery

(1)

ERROR DETECTION AND RECOVERY

(2)

ERROR

• Programs submitted to a compiler often contain errors of various kinds.

• A good compiler should therefore detect as many errors as possible and also recover from them, i.e. even in the presence of errors the compiler should scan the whole program and try to compile all of it (error recovery).

(4)

• When the scanner or parser finds an error and cannot proceed, the compiler must modify the input so that the correct portions of the program can be pieced together and successfully processed in the syntax analysis phase.

• Therefore, considerable forethought is needed from the very beginning of the compiler's design.

(5)

• After detecting an error, a compiler can respond in various ways:

• * A simple compiler may stop all activities other than lexical and syntactic analysis.

• * A more complex compiler may transform the erroneous input into a similar legal input on which normal processing can be resumed (repair).

(6)

• * An even more sophisticated compiler may correct the erroneous input by guessing what the user intended.

• However, no compiler can do true correction, because the errors themselves hide the programmer's intent from the compiler. Completely accurate error correction can be done only by the programmer.

(7)

DIAGNOSTIC COMPILER

• Compilers that give due importance to all three aspects are called “diagnostic compilers”. They carry out complex analysis, which may require extra time or memory space.

• E.g. WATFOR, IITFORT

(8)

CORRECTING COMPILER

• These compilers perform error recovery not only from the compiler's point of view but also from the programmer's point of view, i.e. they generate code that can be executed, which makes the programmer's work easier. E.g. PL/C

• At the same time, error recovery should not lead to misleading or spurious error messages elsewhere (error propagation).

(9)

• Indication of run-time errors is another neglected area in compiler design, because the code generated to monitor these violations increases the size of the target program and slows its execution.

• So these checks are included as “debugging options” (which also include intermediate display of values and traces of procedure calls) at the

(10)

• Good error reporting plays an important role in the construction of reliable programs.

(11)

PROPERTIES OF GOOD

ERROR DIAGNOSIS

• * The messages should pinpoint errors in terms of the original source program rather than in terms of some internal representation that is unknown to the user.

• * The error messages should be tasteful and understandable by the user.

(12)

• E.g. showing the error as “missing right parenthesis in line 5” rather than as a cryptic code “OH17”.

• * The messages should be specific and should localize the problem.

• E.g. showing the error as “Z not declared in procedure add” rather than “missing declaration”.

• * The messages should not be redundant.

• E.g. if Z is not declared, this should be reported once, not every time Z appears in the program.

(13)

SOURCES OF ERROR

• * ALGORITHMIC ERRORS:

The algorithm used to meet the design may be inadequate or incorrect.

• * CODING ERRORS:

The programmer may introduce errors in implementing the algorithms, either by introducing logical errors or using the

(14)

• * The program may exceed a compiler or machine limit not implied by the definition of the programming language.

• E.g.:

• An array may be declared with too many dimensions to fit in the symbol table.

• An array may be declared too large to be allocated at run time.

• * COMPILER ERRORS:

• The compiler itself can introduce errors as it translates the source program into an object program.

(15)

• CRITERIA FOR THE CLASSIFICATION OF ERRORS:

• * Compile-time errors

• * Link/load-time errors

• * Run-time errors

(16)

• CLASSIFICATION OF COMPILE-TIME ERRORS:

• * Lexical errors

• * Syntactic errors

• * Semantic errors

(17)

• Lexical and syntactic errors are found at compile time, while the compiler is processing the program.

• Most run-time errors are semantic in nature.

• In compile-link-go systems, compile and link errors are trapped separately by the compiler and the linkage editor/loader.

• In compile-and-go systems, compile and link errors are trapped by the compiler itself.

(18)

• Execution-time errors are detected by the run-time environment, which includes the run-time control routine, the machine hardware and the standard OS interfaces through which the status of the hardware can be accessed or monitored as and when required.

(19)

PLAN OF THE ERROR DETECTION PORTION OF A COMPILER

• It consists of a routine to recover from lexical and syntactic errors, a routine to detect semantic errors and a routine to print the diagnostics.

• The diagnostic routine communicates with the symbol table to avoid printing redundant messages.

• The message printer must defer its diagnostics until a complete line of source text has been read and listed.

(20)

ERRORS SEEN BY EACH PHASE

Each phase of the compiler expects its input to conform to a certain specification. When the input does not, the phase has detected an inconsistency or error, which it should report to the user. Moreover, in order to continue processing its input, the phase has to recover from the error. Errors are classified as lexical, syntactic or semantic phase errors, depending on which compiler phase detects them.

(21)

• After detecting and reporting an error, a phase can either repair it or pass it along to the subsequent modules.

• If a phase attempts a repair, it should take care that the repair does not introduce a flurry of other errors.

• If a phase passes an error on, the subsequent phase should be able to deal with the erroneous input it receives.

(22)

PLAN OF ERROR DETECTOR / CORRECTOR

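[Figure: block diagram of the error detector/corrector (slide dated 8/31/2012). Components: lexical analyzer, lexical corrector, parser, syntactic corrector, semantic checker, symbol table, diagnostic message printer. Data flow labels: source code, tokens, intermediate code.]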

(23)

LEXICAL AND SYNTAX ERRORS

• Two frequent sources of these errors are:

1. Spelling errors

2. Missing operators and keywords

• These errors can happen through genuine oversight or through typing mistakes.

• They are common even among professional programmers.

(24)

SPELLING ERRORS: WHEN DO THEY OCCUR?

• If a program uses variable names that differ in only one or two characters, there is great scope for spelling errors.

• There is less chance of using automatic procedures to detect and correct these errors.

• Only the programmer is able to tackle the problem.

(25)

THE MAJORITY OF SPELLING ERRORS

1. One character is wrong

2. One character is missing

3. One character is extra

4. Two adjacent characters are transposed

• Testing for these four types of error will not catch all spelling mistakes, but practical considerations limit the search to these four.

• Implementing these four checks is quite expensive, because an associative search has to be performed over all names in the symbol table to locate a resembling name.

(26)

CORRECTION ALGORITHM

• The search masks off one or more adjacent characters of the symbol and tries to locate a matching symbol in the symbol table.

• If only one character was masked off, the located symbol can be used in place of the erroneous symbol.

• If two characters were masked off, a transposition should be checked for.

• If the associative search for an erroneous name matches more than one symbol in the symbol table, their attributes are used to decide the final choice.

• If more than one symbol with matching attributes results, correction is not safe and should not be attempted.

• The algorithm may fail if an unusual use of names results from valid use of language facilities.

• So it is necessary to inform the user whenever a correction is made.
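The four spelling checks and the attribute-based tie-break can be sketched as follows. This is only an illustration under stated assumptions: the `Symbol` record, the `kind` attribute and the function names are hypothetical, and a direct comparison against every name stands in for the masked associative search described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    kind: str  # hypothetical attribute, e.g. "variable" or "procedure"


def is_single_error(wrong: str, right: str) -> bool:
    """True if `wrong` differs from `right` by exactly one of the four errors."""
    if wrong == right:
        return False
    if len(wrong) == len(right):
        diffs = [i for i in range(len(wrong)) if wrong[i] != right[i]]
        if len(diffs) == 1:
            return True  # one character is wrong
        # two adjacent characters are transposed
        return (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and wrong[diffs[0]] == right[diffs[1]]
                and wrong[diffs[1]] == right[diffs[0]])
    if abs(len(wrong) - len(right)) != 1:
        return False
    longer, shorter = (wrong, right) if len(wrong) > len(right) else (right, wrong)
    # one character is missing or extra: dropping one character must give a match
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def correct(name, expected_kind, table):
    """Return the replacement name, or None when correction is not safe."""
    candidates = [s for s in table if is_single_error(name, s.name)]
    if len(candidates) > 1:
        # several resembling names: use the attributes to decide
        candidates = [s for s in candidates if s.kind == expected_kind]
    if len(candidates) == 1:
        return candidates[0].name
    return None  # no match, or still ambiguous: do not attempt a correction
```

For example, with `count` (variable) and `court` (procedure) in the table, `cout` resembles both names, and only the expected kind decides between them. A caller would still report every correction to the user.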

(27)

MISSING OPERATORS AND KEYWORDS

• They can be detected from their context.

• This is not perfect, because certain contexts tend to hide the absence of an operator.

• Ex: G=H(A+B) typed instead of G=H*(A+B).

• In this case an error is reported only if H exists in the symbol table.

• Otherwise the compiler cannot produce any error, since H could

(28)

DUPLICATE MESSAGES

• It is common to find that many messages appear owing to the same error.

• Ex: if a is used as a simple variable and the program later declares and uses it as an array a[10,10], then all references to the array will be flagged as erroneous uses of the variable name.

• Duplicates can be avoided by setting a flag in the symbol table entry of a.

• This makes it possible to detect and indicate all possible illegal uses of the identifier.
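A minimal sketch of the flag idea, using the simpler case of an undeclared identifier rather than the array example. The `SymbolEntry` class, its `error_reported` field and the `report_use` function are assumptions made for illustration only.

```python
class SymbolEntry:
    def __init__(self, name, declared=False):
        self.name = name
        self.declared = declared
        self.error_reported = False  # the flag kept in the symbol table entry


def report_use(table, name, line, messages):
    """Record a use of `name`; emit at most one message per identifier."""
    entry = table.setdefault(name, SymbolEntry(name))
    if entry.declared:
        return
    if not entry.error_reported:
        messages.append(f"line {line}: {name} not declared")
        entry.error_reported = True  # later uses of the same name stay silent
```

Every illegal use is still detected, because the check runs on each reference; only the printing is suppressed after the first message.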

(29)

RECOVERING FROM SYNTAX ERROR

• The chief concern while recovering from a syntax error is to reach a parser state from which the parser can safely resume parsing the input string.

• Many parsers detect an error when they have no legal move from their current configuration, which is determined by the state, the stack contents and the current input symbol.

• To recover from an error, a parser should ideally locate it, correct it and resume parsing.

(30)

TIME OF DETECTION – VALID PREFIX

PROPERTY

• LL(1) and LR(1) parsers announce an error as soon as they have seen a prefix of the input for which there is no valid continuation.

• This is the earliest point at which a parser that reads its input from left to right can announce an error.

• Advantages: reports errors as soon as possible.

• Limits the amount of erroneous output.

(31)

Panic mode recovery

• The parser discards input symbols until a synchronizing token, usually a statement delimiter such as a semicolon, is found.

• The parser then deletes stack entries until it finds an entry from which it can continue parsing, given the synchronizing token on the input.

• I.e. skip until we encounter a symbol that tells us what the parser state should be in order to recognize it.

• Advantages: simple to implement.

• Never goes into an infinite loop.

(32)

• On discovering an error, the parser discards input symbols one at a time until one of a designated set of synchronizing tokens is found:

◦ delimiters such as ; or }

◦ tokens that have a clear and unambiguous role

◦ the set must be selected by the compiler designer

• Skips a considerable amount of input.

• No checking for additional errors in the skipped input.

• Simple.

• Guaranteed not to go into an infinite loop.
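Panic mode can be sketched with a small hand-written parser. The toy grammar (statements of the form `id = operand + operand ... ;`), the token format and all names below are assumptions chosen for illustration. In a recursive-descent parser, unwinding to the statement loop plays the role of deleting stack entries.

```python
class ParseError(Exception):
    pass


def parse_statements(tokens, sync=frozenset({";"})):
    """Parse a token list and return the list of error messages found."""
    errors = []
    pos = 0

    def expect(pred, what):
        nonlocal pos
        tok = tokens[pos] if pos < len(tokens) else "<eof>"
        if pos >= len(tokens) or not pred(tok):
            raise ParseError(f"token {pos}: expected {what}, found {tok!r}")
        pos += 1

    def statement():
        nonlocal pos
        expect(str.isidentifier, "identifier")
        expect(lambda t: t == "=", "'='")
        expect(str.isalnum, "operand")
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            expect(str.isalnum, "operand")
        expect(lambda t: t == ";", "';'")

    while pos < len(tokens):
        try:
            statement()
        except ParseError as exc:
            errors.append(str(exc))
            # Panic mode: discard input until a synchronizing token is found.
            while pos < len(tokens) and tokens[pos] not in sync:
                pos += 1
            pos += 1  # consume the synchronizing token, then resume parsing
    return errors
```

Each pass through the outer loop consumes at least one token, which is why the method cannot loop forever. The cost is that everything between the error and the next `;` goes unchecked.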

(33)

• Three basic policies for recovering from a syntax error:

• 1. Deletion of a source symbol

• 2. Insertion of a synthetic symbol

• 3. Replacement

(34)

• The motive behind all these actions is to present a new string to the parser that bypasses the error situation and lets parsing continue.

• Multiple recovery possibilities may exist.

• We should choose the one with the smallest number of changes: minimum distance recovery.
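One brute-force way to picture minimum distance recovery is to try every string reachable by one deletion, insertion or replacement, then by two, and stop at the first string the parser accepts. The sketch below assumes a toy grammar (E -> T ('+' T)*, T -> 'id' | '(' E ')') and a four-token vocabulary. Practical parsers use far cheaper, local repair strategies, so this illustrates the criterion and is not a recommended implementation.

```python
VOCAB = ["id", "+", "(", ")"]  # assumed token vocabulary of the toy grammar


def accepts(tokens):
    """Recognizer for E -> T ('+' T)*, T -> 'id' | '(' E ')'."""
    pos = 0

    def term():
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "id":
            pos += 1
            return True
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1
            if not expr():
                return False
            if pos < len(tokens) and tokens[pos] == ")":
                pos += 1
                return True
        return False

    def expr():
        nonlocal pos
        if not term():
            return False
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            if not term():
                return False
        return True

    return expr() and pos == len(tokens)


def single_edits(tokens):
    """Yield every token list that is one edit away from `tokens`."""
    for i in range(len(tokens)):
        yield tokens[:i] + tokens[i + 1:]                   # deletion
    for i in range(len(tokens) + 1):
        for t in VOCAB:
            yield tokens[:i] + [t] + tokens[i:]             # insertion
    for i in range(len(tokens)):
        for t in VOCAB:
            if t != tokens[i]:
                yield tokens[:i] + [t] + tokens[i + 1:]     # replacement


def minimum_distance_repair(tokens, max_edits=2):
    """Return (number of edits, repaired tokens), or None if none is found."""
    frontier = [list(tokens)]
    for distance in range(max_edits + 1):
        for candidate in frontier:
            if accepts(candidate):
                return distance, candidate
        if distance < max_edits:
            frontier = [e for c in frontier for e in single_edits(c)]
    return None
```

Because candidates are examined in order of increasing edit count, the first accepted string is one with the smallest number of changes. Which of several equally small repairs is returned depends only on the enumeration order.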

(35)

RECOVERY IN TOP DOWN PARSING

• There are two methods.

• The first is to try to complete successfully the predictions existing in the stack at the error point.

Ex: Input string: Aα

• Last prediction was W:…ABν

• If no other rules exist with A on the right-hand side, recovery can be effected by inserting B and deleting parts of α until a ν is recognized in the source string.

• The other is to unstack symbols from the parser stack until the top-of-stack (TOS) symbol is one that can produce one of the synchronizing symbols.

(36)

RECOVERY IN BOTTOM UP PARSING

• In bottom-up parsing, insertion of symbols is better than deletion, because it is easy to determine which symbol is to be inserted.

• Routines may be devised to carry out the specific recovery actions.

• Replacing or deleting the next few source symbols is also done.

(37)

OPERATOR PRECEDENCE PARSING

An operator precedence parser uses a set of production rules and an operator precedence table to parse an arithmetic expression.

E → E + E | E - E | E * E | E / E | E ^ E | ( E ) | - E | id

(38)

ERROR RECOVERY IN OPERATOR

PRECEDENCE PARSING

• There are two types of operator precedence parsing errors:

• character pair errors

• reducibility errors

• A character pair error occurs when there is no operator precedence relation between a pair of symbols in the grammar.

• A reducibility error occurs when the handle cannot be reduced to the left-hand side of any production.

(39)

CHARACTER PAIR ERROR RECOVERY

Fill each empty entry of the precedence table with a pointer to an error routine.

Example:

E1: ‘missing operand’ (whole expression is missing)

E2: ‘unbalanced right parenthesis’

E3: ‘missing operator’
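The filled-in table can be pictured as below for the symbols id, (, ) and $ only. The table contents follow the usual textbook layout and should be read as an assumption, not as the table from these slides. In particular E4 ('missing right parenthesis') is added here to fill the remaining empty entry and does not appear in the list above. Each "routine" is reduced to its message; a real routine would also repair the stack or the input.

```python
ERROR_ROUTINES = {
    "E1": "missing operand",
    "E2": "unbalanced right parenthesis",
    "E3": "missing operator",
    "E4": "missing right parenthesis",  # assumed extra routine, not in the slides
}

# PRECEDENCE[stack_top][next_input] holds a relation (<, =, >) or an error code.
PRECEDENCE = {
    "id": {"id": "E3", "(": "E3", ")": ">",  "$": ">"},
    "(":  {"id": "<",  "(": "<",  ")": "=",  "$": "E4"},
    ")":  {"id": "E3", "(": "E3", ")": ">",  "$": ">"},
    "$":  {"id": "<",  "(": "<",  ")": "E2", "$": "E1"},
}


def check_pair(stack_top, next_input):
    """Return the precedence relation, or the diagnostic for a character pair error."""
    entry = PRECEDENCE[stack_top][next_input]
    if entry in ERROR_ROUTINES:
        return f"{entry}: {ERROR_ROUTINES[entry]}"
    return entry
```

With this layout no table entry is empty, so the parser always has either a move to make or a specific diagnostic to issue.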

(40)

REDUCIBILITY ERROR RECOVERY

• Decide which right-hand side the popped handle “looks like” and try to recover from that situation.

• Handled in the same way as shift-reduce errors.

(41)

HANDLING SHIFT-REDUCE ERRORS

• Generic shift-reduce strategy:

– If there is a handle on top of the stack, reduce

– Otherwise, shift

• But what if there is a choice?

– If it is legal to shift or reduce, there is a shift-reduce conflict

– If it is legal to reduce by two different productions, there is a reduce-reduce conflict

(42)

HANDLING SHIFT-REDUCE ERRORS

• Ambiguous grammars always cause conflicts.

• But beware, so do many non-ambiguous grammars.

• To resolve this, we should modify the grammar.

(43)

SEMANTIC ERRORS

• Can be both local and global in scope.

• Types

– Immediate errors

• Can be detected while processing the erroneous statement itself.

– Delayed errors

• Cannot be detected while processing the statement.

• But can be detected at a later stage when its effect is

(44)

EXAMPLES FOR SEMANTIC ERRORS

• Illegal Operator or Operand (immediate)

• Control Structure Violation (both)

• Missing Labels (delayed at the end of the program)
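The difference between immediate and delayed detection can be sketched with labels. The statement format, a list of `(line, kind, label)` tuples, and the function name are assumptions for illustration. A duplicate label is reported as soon as it is seen, while a missing label can only be reported once the whole program has been read.

```python
def check_labels(statements):
    """statements: list of (line, kind, label), where kind is 'label' or 'goto'."""
    defined = set()
    first_use = {}  # label -> line of the first goto that refers to it
    errors = []
    for line, kind, label in statements:
        if kind == "label":
            if label in defined:
                # immediate error: visible while processing this statement
                errors.append(f"line {line}: label {label} defined twice")
            defined.add(label)
        elif kind == "goto":
            first_use.setdefault(label, line)
    # delayed errors: only decidable at the end of the program
    for label, line in sorted(first_use.items(), key=lambda item: item[1]):
        if label not in defined:
            errors.append(f"line {line}: label {label} is never defined")
    return errors
```

A forward `goto` is legal at the point where it appears, which is why the check on it has to wait.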

(45)

THE ERROR PRINT ROUTINE

• Messages have to be displayed for all errors that are detected, or detected and corrected, in the source program.

• The error print routine is the common agency used by all individual compiler routines for this purpose.

• The text of each error message is normally stored in a table local to this routine.

• Associated with each message is a numerical value indicating the severity of the error.

• This value is mainly used for purposes internal to the compiler's operation (such as deciding whether or not to allow the program to reach the execution stage).

(46)

Severity codes:

1 Warning and correction. Compilation continues and the compiled program will execute.

2 Warning only. Compilation continues and the compiled program will execute.

3 Fatal error. Compilation continues but the compiled program will not execute.
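A minimal sketch of an error print routine built around such a table. The error numbers, message texts and the `ErrorPrinter` class are hypothetical; only the three severity codes come from the list above.

```python
# error number -> (severity, message text); hypothetical entries
MESSAGES = {
    101: (1, "missing right parenthesis inserted"),
    205: (2, "variable assigned but never used"),
    310: (3, "identifier not declared"),
}


class ErrorPrinter:
    def __init__(self):
        self.max_severity = 0
        self.lines = []

    def report(self, error_number, statement_number):
        """Look up the message text and remember the worst severity seen."""
        severity, text = MESSAGES[error_number]
        self.max_severity = max(self.max_severity, severity)
        self.lines.append(f"statement {statement_number}: [{error_number}] {text}")

    def may_execute(self):
        # Severity 3 (fatal) keeps the program from reaching the execution stage.
        return self.max_severity < 3
```

Keeping the texts in one table means the other compiler routines only pass numbers around, and the decision about execution reduces to a single comparison.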

(47)

• For each individual error, two items of information need to be passed to this routine: the error number and the statement number.

• The structure and logic of the routine depend largely on the decisions regarding the place where the message is to be

(48)

Desirable place for printing error messages

• The messages are best printed against the erroneous statement itself.

• Single-pass compilers find it difficult to indicate all errors against the offending statement.

• Multipass compilers can provide such error indication.

(49)

• Many Fortran compilers indicate errors on a line-by-line basis as far as possible, since syntax analysis and output listing are both performed in the same pass, normally the first.

• Some compilers group all error messages at the end of the program.

• This has the advantage that the problem of duplicate messages for similar misuse of an identifier can be satisfactorily solved.

(50)

• Each entry of the compiler error table has the form:

Error number (message identifier) | Erroneous statement | Auxiliary information | Message text

(51)

Runtime errors

• Run-time errors are detected by:

1. The run-time control routine, which is interfaced with the generated code in a standard manner

2. The machine hardware

(52)

• The agency required to detect a particular type of error depends on the nature of the error and in general varies from machine to machine and from compiler to compiler.

(53)

Detection of runtime errors

• Arithmetic exceptions

These arise from violations of the semantics of machine computations.

They include frequently occurring error conditions such as overflow, underflow and loss of precision.

Present-day architectures detect most of these conditions at the machine hardware level and indicate their presence through interrupts or traps.

(54)

• Input/output errors

Device error conditions and end-of-file conditions on an input file are usually detected by the input/output control system (IOCS), which sets appropriate flags to indicate their occurrence.

The run-time control routine should make appropriate provisions to obtain control when such conditions arise.

Ex: Fortran statement

read(5,100,err=110,end=120) A,B

The generated code transfers control to statement 120 on end of file and to statement 110 on error.

(55)

• Dimension overruns

Overall array bound check

Individual subscript bound check

WATFOR compilers perform these checks.
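The two kinds of check differ in what they can catch. The sketch below assumes 1-based subscripts and Fortran-style column-major storage. The function names are hypothetical, and nothing here is taken from how WATFOR implements its checks.

```python
def overall_bound_check(dims, subscripts):
    """Check only that the computed element offset lies inside the array's storage."""
    offset = 0
    stride = 1
    for size, sub in zip(dims, subscripts):
        offset += (sub - 1) * stride
        stride *= size  # after the loop, stride is the total number of elements
    return 0 <= offset < stride


def individual_bound_check(dims, subscripts):
    """Check every subscript against the bounds of its own dimension."""
    return all(1 <= sub <= size for size, sub in zip(dims, subscripts))
```

For an array declared A(10,10), the reference A(15,1) passes the overall check, because offset 14 still falls inside the 100 elements, but it fails the individual check. The overall check is cheaper (one comparison per reference); the individual check is stricter.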

(56)

Programmer Recovery Options

• The difference between compile-time and run-time errors lies in the type of recovery possible and its implications for the programmer.

• Syntax errors can be patched up in a standard manner in order to extend the life of the program and push it to execution.

• The same holds for run-time errors, but here the difference is that the programmer can foresee the run-time errors and correct them.

• A standard recovery action may not suit the programmer.

• Some languages, e.g. PL/I, let the programmer specify the recovery action.

• Whenever an exception occurs, the run-time control routine has to decide what action to take.

• For this purpose it maintains a run-time exception table.

(57)

• Ex.

ON SUBSCRIPTRANGE I = 5;
ON OVERFLOW I = 25;
………
I = A(J,K-4)/X;
………

Type of Exception | Program Action | System Action
Overflow | I = 25 | Make the standard assumption regarding the resulting value. Return.
Subscript Range | I = 5 | Cancel the program.

(58)

• The compiler generates code for inserting and deleting entries in the program action fields, depending on the scope of the program-indicated recovery actions.

Type of Exception | Scope | Recovery action stack pointer
Overflow | --- | ---
Subscript range | --- | ---

• The Scope column indicates where the scope of the programmer-indicated recovery action ends.
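A rough sketch of such a run-time exception table. It assumes that a program action, when one is in scope, is run in place of the system action, and that nested scopes are kept on a per-exception stack. The class and method names are hypothetical and do not describe how any PL/I run-time system is organized.

```python
class ExceptionTable:
    # System actions, taken from the example table above.
    SYSTEM_ACTIONS = {
        "OVERFLOW": "assume standard result and return",
        "SUBSCRIPTRANGE": "cancel the program",
    }

    def __init__(self):
        # One stack of program actions per exception type.
        self.program_actions = {name: [] for name in self.SYSTEM_ACTIONS}

    def enter_scope(self, exception, action):
        """Code generated at an ON statement: insert a program action."""
        self.program_actions[exception].append(action)

    def leave_scope(self, exception):
        """Code generated at the end of the scope: delete the program action."""
        self.program_actions[exception].pop()

    def signal(self, exception, variables):
        """Called by the run-time control routine when an exception occurs."""
        stack = self.program_actions[exception]
        if stack:
            stack[-1](variables)  # innermost program-indicated action
            return "program action"
        return self.SYSTEM_ACTIONS[exception]
```

With `ON OVERFLOW I = 25;` in effect, the generated code would have pushed an action that sets I to 25. Once the scope ends and the entry is popped, an overflow falls back to the system action.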

(59)

Debugging aids and options

• Run-time checks are costly in terms of code space and execution time.

• These checks are therefore provided as debugging options.

Trace and Sub traces.

• Procedure calls are printed out at the user's option to indicate the flow of control.

• The trace is written into special debugging files.

• A debugging file consists of output for the statements and variables.
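A procedure-call trace that can be switched on at the user's option might look like the following sketch. An in-memory list stands in for the debugging file, and the names `TRACE_ENABLED`, `trace_log` and `traced` are assumptions for illustration.

```python
import functools

TRACE_ENABLED = True  # the user-selected debugging option
trace_log = []        # stands in for the special debugging file


def traced(func):
    """Wrap a procedure so that its calls and returns are recorded."""
    @functools.wraps(func)
    def wrapper(*args):
        if TRACE_ENABLED:
            trace_log.append(f"call {func.__name__}{args}")
        result = func(*args)
        if TRACE_ENABLED:
            trace_log.append(f"return {func.__name__} -> {result!r}")
        return result
    return wrapper


@traced
def add(a, b):
    return a + b
```

When the option is off, the only cost left is the test of the flag, which mirrors the reason such checks are offered as options and not compiled in unconditionally.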

Assignment Checks

• Assignments to a variable are monitored by the system.

(60)

Intermediate and error Dumps

• Intermediate dumps can be produced during execution.

• A dump may also be produced when the program terminates abnormally.

Conversational debugging

• Facilities are provided through which the programmer can set breakpoints in the program.

• When the program reaches a breakpoint, a conversation is initiated with the programmer.
