
The Problem: Your program has detected an internal problem that needs debugging, but the program is not running under a debugger. How do you let the programmer get at the problem?

The Hack: Define a function which starts the debugger on a running program.

Note: The following code is Linux-specific. If you are running on a UNIX-like system, it should be easy to port the code to that system. If you are running

The basic idea is that when a debug_me function call occurs, the program should start the debugger (gdb) and attach it to the running process. Sounds simple, but there are a few details to work out.

First let's see what we need to tell gdb to start the debugging process. The initial gdb commands are:

1. attach <pid> - Attach the debugger to the program being debugged.
2. echo "Debugger gdb started\n" - Let the user know what has happened.
3. symbol /proc/<pid>/exe - Tell gdb where to find the symbol table for the process.
4. break gdb_stop - Stop at a nice stopping point.
5. shell touch <flag-file> - Tell the program that gdb is attached. (More on this later.)
6. continue - Continue execution and stop at the correct location.

The first gdb command:

attach <pid>

attaches the debugger to the program. (<pid> is replaced by the process ID of the program to be debugged.) The debugger is now in control of the program. If we were in a minimalist frame of mind, we would stop here.

But the debug session is in a sorry state. The symbol table has not been loaded, and we don't know whether we stopped in the correct thread or at a known location. So we execute a few more commands to make things a little nicer.

The next command:

echo "Debugger gdb started\n"

outputs a greeting message. That way the user knows that we've started the debugging process.

Next we load the symbol table. For that we need the name of the program file. One way of finding it is to look at the program name and do a PATH search for the executable file. But Linux is nice enough to provide a symbolic link from /proc/<pid>/exe to the executable, so we just exploit this feature to load our symbol table:

symbol /proc/<pid>/exe
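To see what the /proc/<pid>/exe link points at, the following sketch resolves it with readlink. The helper name executable_path is made up for this illustration and is not part of debug_me.c, and the code assumes a Linux-style /proc file system is mounted.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of debug_me.c): copy the path of the
 * executable behind process `pid` into buf.
 * Returns 0 on success, -1 on failure or if the path might not fit. */
static int executable_path(pid_t pid, char *buf, size_t size)
{
    char link_name[64];
    ssize_t len;

    assert(buf != NULL && size > 0);

    /* Linux exposes the running executable as a symbolic link. */
    snprintf(link_name, sizeof(link_name), "/proc/%ld/exe", (long)pid);

    /* readlink() does not add a terminating '\0', so leave room for it. */
    len = readlink(link_name, buf, size - 1);
    if (len < 0 || (size_t)len >= size - 1)
        return -1;  /* missing link, or result possibly truncated */

    buf[len] = '\0';
    return 0;
}
```

Calling executable_path(getpid(), path, sizeof(path)) fills path with the absolute name of the running program. gdb does not need this step, since it follows the link itself; the sketch is only a way to inspect the link.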

Now we tell gdb to stop at a good place. In fact, we've defined a good place to stop called gdb_stop, so we'll set a breakpoint there:

break gdb_stop

When the debugger is attached to a program, the program stops. The problem is that we don't know where the program is stopped. It could be 80 levels deep in some function called by debug_me. What's worse, we could be dealing with a threaded program. In that case we may not even be stopped in the thread that caused the error.

The solution to this problem is to set a breakpoint at a known location (gdb_stop) and tell the debugger to continue. When we stop at gdb_stop, we know where we are, and we are sure to be in the correct thread.

After debug_me starts gdb, it waits around for the debugger to start. This is done using the loop:

    while (access(flag_file, F_OK) != 0)
    {
        sleep(1);
    }

This loop waits around for a flag file to be created. As soon as the file appears, the program knows that the debugger is running and it can continue.
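The loop as written waits forever if gdb never starts, for example when xterm cannot open a display. A possible variation, sketched here under that assumption, gives up after a timeout. The function name wait_for_flag and its return convention are hypothetical and do not appear in debug_me.c.

```c
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical variant of the wait loop: poll for flag_file, but give up
 * after roughly timeout_seconds.
 * Returns 0 once the file exists, -1 if the timeout expires first. */
static int wait_for_flag(const char *flag_file, int timeout_seconds)
{
    time_t deadline;

    assert(flag_file != NULL);
    deadline = time(NULL) + timeout_seconds;

    while (access(flag_file, F_OK) != 0) {
        if (time(NULL) >= deadline)
            return -1;      /* the debugger never announced itself */
        sleep(1);           /* same one-second poll as the original loop */
    }
    return 0;
}
```

A caller could treat a -1 result as "no debugger available" and fall back to abort() or to logging. Whether that is better than waiting indefinitely depends on the program.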

To create the flag file, we issue the following command to gdb:

shell touch <flag-file>

Finally we tell gdb to continue. At this point the execution of the program continues for a short while until the program reaches gdb_stop.

The code to do all this work is listed in the full debug_me.c module at the end of this hack. Mostly it's a matter of string processing to get the commands into the command file.
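As a rough illustration of that string processing, the six commands can be written by one small function. This is a sketch only: the name write_gdb_commands and its parameters are invented here, and the full module writes the same commands inline instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the six gdb startup commands to `out`.
 * `pid` is the process to attach to; `flag_file` is the file that gdb
 * touches to signal that it is attached.
 * Returns 0 on success, -1 on an empty flag file name or a write error. */
static int write_gdb_commands(FILE *out, long pid, const char *flag_file)
{
    assert(out != NULL && flag_file != NULL);

    if (strlen(flag_file) == 0)
        return -1;

    if (fprintf(out, "attach %ld\n", pid) < 0 ||
        fprintf(out, "echo \"Debugger gdb started\\n\"\n") < 0 ||
        fprintf(out, "symbol /proc/%ld/exe\n", pid) < 0 ||
        fprintf(out, "break gdb_stop\n") < 0 ||
        fprintf(out, "shell touch %s\n", flag_file) < 0 ||
        fprintf(out, "continue\n") < 0)
        return -1;

    return 0;
}
```

Keeping the commands in one function makes it easy to dump them to stdout for inspection before handing the file to gdb.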

Now let's talk about the actual invocation of the gdb command. Ideally we should be able to just call system() to execute the command:

gdb --command=<command-file>

In this example <command-file> is replaced by a temporary file containing the commands we listed above. But it's not as simple as that. It never is.

What if our program is a daemon running in the background? It has no standard input or standard output. If we started gdb, there would be no terminal in which to type commands.

A solution to this problem is to start our own terminal window. This is done with the command:

xterm -bg red -e gdb --command=<command-file>

This starts a new xterm program in which our debugger will run. We set the background to red using the option -bg red. Red is used because it gets our attention. Besides, the red screen of death sounds better than the blue screen of death.

Finally we tell xterm to execute the gdb command through the use of the -e option.
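The xterm command line has to be assembled as a string before it can be passed to system(). The sketch below shows one way to do that while checking that the result fits in the buffer. The helper name build_xterm_command is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format the xterm command line into cmd.
 * Returns 0 on success, -1 if command_file is empty or the command
 * would not fit in `size` bytes. */
static int build_xterm_command(char *cmd, size_t size,
                               const char *command_file)
{
    int needed;

    assert(cmd != NULL && command_file != NULL);

    if (strlen(command_file) == 0)
        return -1;

    /* The trailing '&' runs xterm in the background so that system()
     * returns immediately instead of waiting for the window to close. */
    needed = snprintf(cmd, size,
                      "xterm -bg red -e gdb --command=%s &", command_file);

    /* snprintf reports the length it wanted; compare it to the buffer. */
    if (needed < 0 || (size_t)needed >= size)
        return -1;
    return 0;
}
```

On success the buffer can be handed to system(cmd). On failure the caller should not run the truncated command.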

Now that we've done all this, let's see how this function might be used in a program. Here's some code that handles a variable that is either black or white (at least under normal, sane circumstances):

    #include "debug_me.h"
    // ....
    switch (black_or_white) {
        case BLACK:
            do_black();
            break;
        case WHITE:
            do_white();
            break;
        default:
            std::cerr << "INTERNAL ERROR: Impossible color" << std::endl;
            debug_me();
            break;
    }

In this case the variable black_or_white undergoes a sanity test. If things are insane, we start the debugger.
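If a program has many such sanity tests, the report-and-debug pattern could be wrapped in a macro. The following is only a sketch: SANITY_CHECK and check_color are invented names, and debug_me is replaced by a counting stand-in so that the fragment compiles on its own.

```c
#include <stdio.h>

/* Stand-in for the real debug_me() so this sketch compiles on its own;
 * it only counts calls. A real program would include "debug_me.h". */
static int debug_me_calls = 0;
static void debug_me(void)
{
    debug_me_calls++;
}

/* Hypothetical macro: report a failed sanity check, then start the
 * debugger. The program keeps running afterwards. */
#define SANITY_CHECK(cond)                                          \
    do {                                                            \
        if (!(cond)) {                                              \
            fprintf(stderr, "INTERNAL ERROR: %s failed (%s:%d)\n",  \
                    #cond, __FILE__, __LINE__);                     \
            debug_me();                                             \
        }                                                           \
    } while (0)

enum color { BLACK, WHITE };

/* Returns 1 for a valid color, 0 otherwise (after invoking debug_me). */
static int check_color(int black_or_white)
{
    SANITY_CHECK(black_or_white == BLACK || black_or_white == WHITE);
    return black_or_white == BLACK || black_or_white == WHITE;
}
```

The macro records the failing expression and its source location before the debugger window appears, which helps when several checks share one code path.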

Note: This only works for programs which are used internally. If you are giving a program to a customer without source code, this system is not that useful.

/************************************************
 * debug_me -- A module to start the debugger   *
 *      from a running program.                 *
 *                                              *
 * Warning: This code is Linux specific.        *
 ************************************************/
#include <stdio.h>
#include <unistd.h>
#include <sys/param.h>
#include <stdlib.h>

#include "debug_me.h"

static int in_gdb = 0;  // True if gdb started

/************************************************
 * gdb_stop -- A place to stop the debugger     *
 *                                              *
 * Note: This is not static so that the         *
 *      debugger can easily find it.            *
 ************************************************/
void gdb_stop(void)
{
    printf("Gdb stop\n");
    fflush(stdout);
}

/************************************************
 * start_debugger -- Actually start             *
 *      the debugger                            *
 ************************************************/
static void start_debugger(void)
{
    int pid = getpid();     // Our PID

    // The name of the gdb command file
    char gdb_file_name[MAXPATHLEN];

    // File that's used as a flag
    // to signal that gdb is running
    char flag_file[MAXPATHLEN];

    // The file with the gdb commands in it
    FILE *gdb_file;

    // The command used to start the debugger
    char cmd[MAXPATHLEN+100];

    if (in_gdb)
        return;     /* Prevent double debugs */

    /*
     * Create a command file that contains
     *   attach <pid>           # Attaches to the process
     *   echo ....              # Echoes a welcome message
     *   symbol /proc/<pid>/exe
     *                          # Loads the symbol table
     *   break gdb_stop         # Sets a breakpoint in gdb_stop
     *   shell touch /tmp/gdb.flag.<pid>
     *                          # Creates a file that tells us
     *                          # that the debugger is running
     *   continue               # Continues the program
     */
    snprintf(gdb_file_name, sizeof(gdb_file_name),
             "/tmp/gdb.%d", pid);
    gdb_file = fopen(gdb_file_name, "w");
    if (gdb_file == NULL)
    {
        fprintf(stderr,
                "ERROR: Unable to open %s\n",
                gdb_file_name);
        abort();
    }
    snprintf(flag_file, sizeof(flag_file),
             "/tmp/gdb.flag.%d", pid);

    fprintf(gdb_file, "attach %d\n", pid);
    fprintf(gdb_file, "echo "
            "\"Debugger gdb started\\n\"\n");
    fprintf(gdb_file, "symbol /proc/%d/exe\n",
            pid);
    fprintf(gdb_file, "break gdb_stop\n");
    fprintf(gdb_file, "shell touch %s\n",
            flag_file);
    fprintf(gdb_file, "continue\n");
    fclose(gdb_file);

    /* Start an xterm window with the
     * debugger in it */
    snprintf(cmd, sizeof(cmd), "xterm -bg red "
             "-e gdb --command=%s &",
             gdb_file_name);
    system(cmd);

    /* Now sleep until the debugger starts and
     * creates the flag file */
    while (access(flag_file, F_OK) != 0)
    {
        sleep(1);
    }
    in_gdb = 1;
}

/************************************************
 * debug_me -- Start the debugger               *
 ************************************************/
void debug_me(void)
{
    start_debugger();
    gdb_stop();     // Breakpoint location for gdb
}