Building simple generators
4.3 THE MIXED-CODE GENERATOR MODEL
The mixed-code generator is a more practical implementation of the inline-code expansion model. The generator reads the input file, makes some modifications to the file, and then saves it back into the input file after backing up the original.
The potential uses are similar. Using special markup, the generator builds implementation code to match the requirements specified in the markup.
The key difference between the two models is the I/O flow. In the mixed-code generation model, the input file is the output file. Mixed-code generation thus avoids the debugging problems inherent with inline-code expansion.
To demonstrate the difference between the two models, we’ll show the same example from the inline-code expansion introduction implemented as mixed-code generation, starting with the input:
void main( int argc, char *argv[] ) {
  // sql-select: SELECT first, last FROM names
  // sql-select-end
  return;
}
Note that the <sql-select: …> syntax has been replaced with specially formatted comments.
The output of the generator is:
void main( int argc, char *argv[] ) {
  // sql-select: SELECT first, last FROM names
  struct {
    char *first;
    char *last;
  } *sql_output_1;
  {
    db_connection *db = get_connection();
    sql_statement *sth = db->prepare( "SELECT first, last FROM names" );
    sth->execute();
    sql_output_1 = malloc( sizeof( *sql_output_1 ) * sth->count() );
    for( long index = 0; index < sth->count(); index++ ) {
      // ... marshal data
    }
  }
  // sql-select-end
  return;
}
Notice how the comments are maintained but that now the interior is populated with code. The code that implements the requirements is specified in the comments. The next time the generator is run, the interior will be removed and updated with newly generated code.
Mixed-code generation has advantages over inline-code expansion:
• The use of comments avoids any syntax corruption with the surrounding code.
• By using comments in the original file, you can take advantage of special features of any IDE, such as syntax coloring or code hints.
• You can use a debugger because the input file is the same as the output file.
• The output code is located right next to the specification, so there is a clear visual correspondence between what you want and how it is implemented.
4.3.1 Uses and examples
The potential uses for mixed-code generation are similar to those for inline-code expansion. However, because the markup sits right next to the generated code, you may also consider using it for other types of utility coding, such as:
• Building rudimentary get/set methods
• Building marshalling code for user interfaces or dialog boxes
• Building redundant infrastructure code, such as C++ copy constructors or operator= methods.
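As a rough illustration of the first item, the following Ruby sketch builds get/set methods between a pair of marker comments. The // accessors: markup, the generate_accessors method, and the shape of the generated C++ are all assumptions made for this example; none of them comes from a listing in this chapter.

```ruby
# Hypothetical markup:  // accessors: <type> <name>  ...  // accessors-end
ACCESSOR_REGION =
  /(\/\/\s*accessors:\s*)(\w+)\s+(\w+)[ \t]*\n(.*?)(\/\/\s*accessors-end\n)/m

# Returns a copy of text with every accessor region regenerated.
def generate_accessors(text)
  text.gsub(ACCESSOR_REGION) do
    # Copy the match groups before doing anything else with them.
    start, type, name, finish = $1, $2, $3, $5
    cap = name[0].upcase + name[1..-1]
    code  = "#{type} get#{cap}() const { return _#{name}; }\n"
    code += "void set#{cap}( #{type} value ) { _#{name} = value; }\n"
    "#{start}#{type} #{name}\n#{code}#{finish}"
  end
end

puts generate_accessors("// accessors: int count\n// accessors-end\n")
```

Whatever previously sat between the two markers is discarded, so running the method again on its own output leaves the text unchanged.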
As we mentioned earlier, the major difference between inline-code expansion and mixed-code generation is the flow between the input and the output. Figure 4.15 shows the flow for mixed-code generation. As you can see, the generation cycle uses the source code as both the input and the output. This is the same code that is sent to the compiler and used as production code.
It is the responsibility of the generator to retain a backup of the original code before replacing it with the newly generated code. When you use this model, make sure that you manage the backups and have a reliable source code control system.
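One way to handle the backup is sketched below. The replace_with_backup helper and the .bak suffix are assumptions for illustration; a real generator might keep timestamped copies or rely entirely on source code control.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: copy path to path.bak, then overwrite path.
# Returns the name of the backup file.
def replace_with_backup(path, new_text)
  backup = "#{path}.bak"
  FileUtils.cp(path, backup)   # preserve the original before touching it
  File.write(path, new_text)   # now replace the input file
  backup
end

# Demonstration in a throwaway directory.
Dir.mktmpdir do |dir|
  path = File.join(dir, "example.c")
  File.write(path, "// original\n")
  backup = replace_with_backup(path, "// generated\n")
  puts "backup:  #{File.read(backup)}"
  puts "current: #{File.read(path)}"
end
```

Copying before writing means a failure during the write still leaves the original contents available in the backup file.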
Our first example of the mixed-code generation type will build print statements.
Here is the input file for our example:
int main( int argc, char *argv[] ) {
// print Hello World
// print-end
return 0;
}
We’ve changed the <…> syntax into comments. There are two reasons for this change. First, the code is compilable both before and after generation. Second, the comments are maintained between generation cycles so that the generator knows which parts of the code to maintain. The output of the generator, which is in the same file as the input, is shown here:
int main( int argc, char *argv[] ) {
// print Hello World
printf("Hello World");
// print-end
return 0;
}
The original comments are retained and the implementation code has been placed between the start and end comments.
Do you need start and end comments? Yes. You need a predictable ending marker for the regular expression. Otherwise, you would not know which code belonged to the generator and therefore could be replaced. You could end up replacing the contents of the file from the starting marker to the end of the file.
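The role of the end marker can be seen by running the replacement twice. The sketch below applies the same regular expression used by the generator in this section to a string rather than a file; the regenerate_prints method name and the in-memory source text are assumptions made for the demonstration.

```ruby
# Same pattern as the generator in this section, applied to a string.
PRINT_REGION = /(\/\/\s*print\s+)(.*?)\n(.*?)(\/\/\s*print-end\n)/m

# Hypothetical helper: rebuild the interior of every print region.
def regenerate_prints(text)
  text.gsub(PRINT_REGION) do
    start, spec, finish = $1, $2, $4   # $3 (the old interior) is dropped
    "#{start}#{spec}\nprintf(\"#{spec}\");\n#{finish}"
  end
end

source = <<~C
  int main( int argc, char *argv[] ) {
  // print Hello World
  // print-end
  return 0;
  }
C

first_pass  = regenerate_prints(source)
second_pass = regenerate_prints(first_pass)

puts first_pass
puts "unchanged on second run: #{first_pass == second_pass}"
```

Because the pattern stops at // print-end, the second run replaces only the printf line it generated the first time and leaves return 0; alone.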
Figure 4.15 The inputs and output flow for a mixed-code generator (Source Code -> Mixed-Code Generator -> Source Code -> Compiler -> Executable)
Listing 4.13 contains the code that implements our simple mixed-code generator.

Listing 4.13 Mixed-code generator 1: building printfs

require "fileutils"

# Require the name of the file to process
unless ARGV[0]
  print "mc1 usage: mc1 file.c\n"
  exit
end

# Read the entire input file
fh = File.open( ARGV[0] )
text = fh.read()
fh.close

# Regenerate the interior of every print region
text.gsub!( /(\/\/\s*print\s+)(.*?)\n(.*?)(\/\/\s*print-end\n)/m ) {   # (1)
  code = "printf(\"#{$2}\");\n"                                        # (2)
  $1 + $2 + "\n" + code + $4                                           # (3)
}

# Back up the original, then write the result over the input file
FileUtils.cp( ARGV[0], "#{ARGV[0]}.bak" )
File.open( ARGV[0], "w" ) { |out| out.write( text ) }

(1) This regular expression finds the // print … and // print-end markers and all of the content between the two. The // print text goes into $1; the print specification goes into $2. The generated code in the middle, if it is there, goes into $3, and the // print-end goes into $4. The regular expression is shown in exploded form in figure 4.16.
(2) This creates a printf call from the string that was specified in the comment.
(3) This puts the expression back together by adding the code text to the $1, $2, and $4 groups that we preserved from the regular expression.
Figure 4.16 The regular expression that finds the special markup comments
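The same expression can also be written out piece by piece in Ruby's extended (x) mode, which ignores whitespace and allows comments inside the pattern. The version below is a sketch: the group names start, spec, old, and finish, the constant name, and the method name are assumptions added for readability, and they correspond to $1 through $4 in the listing.

```ruby
EXPLODED_PRINT_REGION = %r{
  (?<start>  //\s*print\s+ )      # group 1: the opening marker
  (?<spec>   .*? ) \n             # group 2: the text to print, up to end of line
  (?<old>    .*? )                # group 3: previously generated code, if any
  (?<finish> //\s*print-end\n )   # group 4: the closing marker
}mx

# Hypothetical helper that uses the named groups instead of numbered ones.
def regenerate_named(text)
  text.gsub(EXPLODED_PRINT_REGION) do
    m = Regexp.last_match
    "#{m[:start]}#{m[:spec]}\nprintf(\"#{m[:spec]}\");\n#{m[:finish]}"
  end
end

sample = "// print Hi\nold_code();\n// print-end\n"
m = EXPLODED_PRINT_REGION.match(sample)
p m[:start], m[:spec], m[:old], m[:finish]
puts regenerate_named(sample)
```

The m flag lets . match newlines, so the old group can span several lines of generated code, while the non-greedy .*? keeps each group from running past the nearest marker.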
4.3.2 Developing the generator
Figure 4.17 shows a simple development process for building a mixed-code generator.
As you can see, this is very similar to the process for developing an inline-code expander:
• Build the test code—First, build the code you want to see come out of the generator. That means writing some special markup comments and also identifying the code to be generated.
• Design the generator—Sketch out the code flow for the generator.
• Develop the input parser—If you want to support options that can be specified in the input file, this is the time to implement the parsing for them. You need to develop the code that reads the input file and scans for any options that will be used to modify the behavior of the generator. Our earlier example doesn’t have any options, but you could imagine that there might be an option for specifying a custom procedure instead of printf.
• Develop the code replacer—Next, build the regular expression that will read the replacement sections. This expression should find the starting and ending blocks, as well as the arguments and the code in the interior. The example code shows a typical regular expression for this purpose.
• Develop the templates from the test code—Now that you have identified replacement regions, you need to develop the code that will populate them with the generated code. The example code is so simple that all you need to do is some string formatting to build the printf statement. If the requirements of your generator are more complex, you may want to use some of the ERb templating techniques shown in chapter 3, “Code generation tools.”
• Develop the output code builder—The final step is to merge the code replacer with the output code builder to create the final output code. Then you need to back up the original file and replace it with the newly generated code.
Figure 4.17 The design and implementation steps for developing a mixed-code generator (Build Test Code, Design Generator, Develop Input Parser, Develop Code Replacer, Develop Templates From Test Code, Develop Output Code Builder)
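The steps above can be sketched in a few lines of Ruby. This is one possible arrangement, not the book's implementation: the // print-options: function=NAME comment is an invented syntax for the custom-procedure option mentioned earlier, and parse_options, render_print, and build_output are hypothetical names. The template uses ERB from Ruby's standard library.

```ruby
require "erb"

# Input parser: look for a hypothetical "// print-options: function=NAME"
# comment; fall back to printf when it is absent.
def parse_options(text)
  match = text.match(/\/\/\s*print-options:\s*function=(\w+)/)
  { function: match ? match[1] : "printf" }
end

# Template: one generated statement per print region.
PRINT_TEMPLATE = ERB.new(%q{<%= function %>("<%= message %>");} + "\n")

def render_print(function, message)
  PRINT_TEMPLATE.result_with_hash(function: function, message: message)
end

# Code replacer: the same pattern used by the generator in this section.
OPTION_AWARE_REGION = /(\/\/\s*print\s+)(.*?)\n(.*?)(\/\/\s*print-end\n)/m

# Output code builder: merge the replacer with the template output.
def build_output(text)
  options = parse_options(text)
  text.gsub(OPTION_AWARE_REGION) do
    start, message, finish = $1, $2, $4   # copy before calling other methods
    start + message + "\n" + render_print(options[:function], message) + finish
  end
end

input = "// print-options: function=log_message\n" \
        "// print Hello World\n" \
        "// print-end\n"
puts build_output(input)
```

Backing up the original file and writing the result over it, as the last step describes, would wrap around build_output. A message containing a double quote would produce invalid C in this sketch, so a fuller generator would need to escape it.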