The previous section described how to create a flat file schema by hand. This section takes the same flat file instance and uses the Flat File Schema Wizard to create the same schema. The first subsection takes you through the wizard, and the second discusses changes that need to be made after the wizard has finished. Finally, a short description of how to test the generated schema follows.
To revisit the flat file instance, see Figure 2.26, which is a copy of the flat file instance shown in Figure 2.18.
FIGURE 2.26 Flat file example with the same delimiter for several records.
The Wizard
This section takes you through using the wizard step by step to generate a schema for the flat file instance shown in Figure 2.26.
1. Right-click your project and choose Add, New Item.
2. Choose Flat File Schema Wizard and provide a filename for the schema. The wizard starts with a Welcome screen, which you can disable so that it no longer appears. After you move on from the Welcome screen, you see the screen shown in Figure 2.27.
3. After the Welcome screen, the wizard needs an instance of a flat file on which to base the rest of the wizard. Browse to the appropriate flat file and enter values for Record Name and Target Namespace. The default values for the root node and target namespace are the exact same values as if you had just added a flat file schema to your project to build it manually. Change them appropriately and choose a code page for the input. Often, UTF-8 is fine, but when integrating with older legacy systems in countries that have special characters, some instances will be using old code pages. If you check the Count Positions in Bytes check box, this will cause both the wizard and the BizTalk runtime to count Positional Offset, Positional Length, and Tag Offset using bytes rather than characters. If selected, the wizard sets the property Count Positions in Bytes to Yes in the properties of the generated schema.
NOTE
You normally want to count characters rather than bytes, because an encoding might use 1 byte for some characters and 2 or more bytes for others. Sometimes, however, you need to count in bytes: when integrating with SAP or mainframe systems that count in bytes, or when dealing with multibyte character set (MBCS or DBCS) data.
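The difference between the two ways of counting can be illustrated with a small sketch. Python is used here purely for illustration and is not part of BizTalk; the sample value is hypothetical, and the code pages (UTF-8 and Windows-1252) are assumptions chosen for the example.

```python
# Hypothetical field value containing three non-ASCII characters.
value = "Søren Ærø"

# Counting positions in characters: every character counts as 1.
char_count = len(value)

# Counting positions in bytes: in UTF-8, each of ø, Æ, ø takes 2 bytes.
utf8_byte_count = len(value.encode("utf-8"))

# In a single-byte legacy code page, bytes and characters coincide.
cp1252_byte_count = len(value.encode("cp1252"))

print(char_count, utf8_byte_count, cp1252_byte_count)  # 9 12 9
```

A positional field that is 9 characters wide would therefore be 12 bytes wide in UTF-8, which is why the two counting modes can place field boundaries differently.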
4. When these fields are filled out, click Next, which brings you to the screen shown in Figure 2.28.
FIGURE 2.27 Flat File Schema Wizard, choosing instance and main properties.
FIGURE 2.28 Flat File Schema Wizard, choosing relevant data.
5. In this screen, you need to choose the data that is relevant for the schema you are going to define. By default, all data is selected, because the wizard assumes you want to create a schema that describes the entire flat file. This is usually the case, but sometimes you want to generate a schema for just part of the flat file. For instance, if a flat file has a header record and many detail records, you can create a schema that parses just one detail record and then use this schema in a receive pipeline to split incoming messages into many messages that each conform to this schema.
Click Next.
6. Now you get the screen shown in Figure 2.29. In this screen, you need to specify whether the root node is delimited or positional. This instance has a delimited root node, so choose By Delimiter Symbol and click Next, which brings you to the screen shown in Figure 2.30.
7. Here you need to specify what delimiter to use for the root node. CR/LF has been preselected for you, but you can change it if you want to. Also, if the record you are specifying has a tag identifier, you can add it here. For this scenario, do not choose a tag identifier because it will be put on the Header record instead, as with the manually created flat file schema. Click Next to go to the screen shown in Figure 2.31.
8. In this screen, you must specify the children of the record. The columns in this screen are Element Name, Element Type, Data Type, and Contents.
FIGURE 2.30 Flat File Schema Wizard, choosing the delimiter and tag identifier.
FIGURE 2.29 Flat File Schema Wizard, choosing the format of the root node.
The wizard has one row in this table for each record it has found using the delimiter chosen in Figure 2.30.
The Element Name column shows the name of the element as it will appear in the tree view once the schema is done. These are also the names of the corresponding XML records, elements, and attributes in the instances after they are converted into XML. It is preferable to provide descriptive names here, but they can also be changed in the tree view at any time after the wizard has finished creating the schema.
The Element Type column can have five different values:
. Field Element: Used to create an element in the schema, which will then contain values.
. Field Attribute: Used to create an attribute in the schema, which will then contain values.
. Record: Used to create a record. If you choose this, the wizard later asks you to define this record, meaning that you have to specify delimiters and go through this exact same screen defining the children of this new record.
. Repeating Record: Used to define a record that can occur multiple times. As with Record, you are later asked to define this record.
. Ignore: Used to tell the wizard to ignore a specific field that comes after a repeating record because the field is of the same type as the record that was repeating.
The Data Type column defines the data type of the row, provided the row is either a field element or a field attribute.
FIGURE 2.31 Flat File Schema Wizard, specifying child elements.
The Contents column shows the contents of the child, which can help you remember what the element should be called and defined as.
1. For this example, choose values as shown in Figure 2.32. This is where the tricky part comes in, which is also the reason you are encouraged to have experience in defining flat file schemas manually before attempting to use the Flat File Schema Wizard. Setting the first record to Header was expected, but then setting the second to Orders and ignoring the rest of the flat file seems odd. This is because of the limitations of the wizard. Later, when defining the Orders record, you will reevaluate what data belongs to it and then carry on defining it. Also, the Orders record isn't supposed to be recurring, but the wizard will not let you ignore the rest of the flat file if you choose Record instead. This will have to be fixed after the wizard has finished. Click Next to get to the screen shown in Figure 2.33.
FIGURE 2.32 Flat File Schema Wizard, child elements specified.
FIGURE 2.33 Flat File Schema Wizard, choosing the next record.
2. In this screen, you need to choose which of the records just defined in Figure 2.32 you want to define now. Choose Header, click Next, and you will get the screen shown in Figure 2.34.
FIGURE 2.35 Flat File Schema Wizard, specifying positional fields.
FIGURE 2.34 Flat File Schema Wizard, selecting data for Header record.
3. In this screen, you need to choose what part of the data from the input instance constitutes the Header record. The wizard has already chosen the first line of data for you, excluding the CR/LF because this is not part of the Header record but rather the delimiter needed by the root node. Accept the proposal from the wizard, and click Next to start defining the content of the Header record.
4. In the next screen, choose By Relative Positions and click Next. The screen shown in Figure 2.35 will then appear.
5. This is where you define which fields exist in the positional record. The first thing to do is to define the tag identifier, if any. For the Header record, the tag identifier is H.
6. Then you define what fields exist by clicking inside the white box that shows the content of the record. When you click, you add a vertical line that divides the content into fields. For this scenario, click just right of the H, which is the tag identifier, and just right of the date, resulting in the screen shown in Figure 2.35. If you accidentally set a vertical line in the wrong place, you can click the line to make it disappear. After setting the field delimiters, click Next to get the screen shown in Figure 2.36.
FIGURE 2.36 Flat File Schema Wizard, determining data types of Header record.
7. In this screen, you should set the data types and names of the fields that are in the Header record. Specify the fields as shown in Figure 2.36 and ignore the warning on the OrderDate field. The warning just lets you know that the data content in the instance does not match the xs:date data type, and therefore you need to change the Custom Date/Time Format property after the wizard has finished. Click Next and you get to the same screen as shown in Figure 2.33; only now the Header is grayed out, and you can select only Orders to define.
8. Choose Orders and click Next. You will see a screen similar to the screen in Figure 2.34, where only the second line is selected.
9. We want the root node to contain the Header and the Orders record, so the Orders record should cover all data in the instance that is not covered by the Header. Therefore, select the data as shown in Figure 2.37, and click Next.
10. As shown in Figure 2.36, there is a red exclamation mark on the date data type of the OrderDate field. The reason is that the value the wizard has determined for the field does not match the date data type. If you hover over the exclamation mark, you get a message describing the error. In this case, it tells you that the data does not match the chosen data type and that you need to set the Custom Date/Time Format property of the OrderDate field after the wizard finishes.
11. On the next screen, choose By Delimiter Symbol and click Next.
12. Because the Orders record does not have a tag identifier (we are keeping that for the OrderHeader record) and the proposed delimiter is the right one, click Next again to get the screen shown in Figure 2.38.
FIGURE 2.38 Flat File Schema Wizard, defining the records beneath Orders.
FIGURE 2.37 Flat File Schema Wizard, selecting data for Orders record.
13. Because you need to define an Order record that will actually span multiple records, you need to define this one record and then ignore the rest. Set the values as shown in Figure 2.38, and click Next.
14. You can now select that you want to define the Order record. Select this and click Next to get to a screen where you need to select what data belongs to an Order record.
15. Choose as shown in Figure 2.39, and click Next.
16. On the next screen, choose By Delimiter Symbol and click Next.
17. Because the Order record does not have a tag identifier, accept the suggested delimiter and click Next.
18. In the next screen, define the records as shown in Figure 2.40, and click Next.
FIGURE 2.39 Flat File Schema Wizard, choosing data for the Order record.
FIGURE 2.40 Flat File Schema Wizard, defining records beneath the Order record.
19. Choose to define the OrderHeader record and click Next.
20. The wizard will have selected the data it believes belongs to the OrderHeader record, and because the wizard is right, click Next again.
21. Then choose By Delimiter Symbol and click Next.
22. In the next screen, choose a comma as the delimiter, and let the tag identifier be O. Then click Next.
23. Define the two fields as OrderNumber and SalesPerson, provide relevant data types, and click Next.
24. On the next screen, you need to choose OrderLines and click Next to define what is inside the OrderLines record.
25. On the next screen, choose what the OrderLines record contains. The OrderLines record contains all the order lines, so choose them as shown in Figure 2.41, and click Next.
FIGURE 2.41 Flat File Schema Wizard, defining the content of the OrderLines record.
FIGURE 2.42 Flat File Schema Wizard, defining the records beneath the OrderLines record.
26. Then, choose By Delimiter Symbol and click Next.
27. Choose {CR}{LF} as the delimiter and click Next.
28. Choose values as shown in Figure 2.42, and click Next.
29. Click Next to start defining the OrderLine record, choose the data as shown in Figure 2.43, and then click Next.
FIGURE 2.43 Flat File Schema Wizard, choosing data for the OrderLine record.
FIGURE 2.44 Flat File Schema Wizard, naming the children of the OrderLine record.
30. Choose By Delimiter Symbol and click Next.
31. Now choose the comma as delimiter and OL as tag identifier, and click Next.
32. Name the fields as shown in Figure 2.44, and click Next. Because you have now defined all the records in the flat file, the only option left is to click Finish.
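To see what the finished schema is meant to describe, the record structure defined in the steps above can be approximated in a few lines of code. This is only a rough sketch in Python, not how BizTalk parses flat files. The sample instance is hypothetical (the real one is in Figure 2.26), and the field names Customer, Item, and Quantity are invented for illustration; only the tag identifiers H, O, and OL, the delimiters, and the names OrderDate, OrderNumber, and SalesPerson come from the walkthrough.

```python
# Hypothetical instance: one Header line, then orders with their order lines.
SAMPLE = (
    "H20110927ACME\r\n"
    "O,1001,Anna\r\n"
    "OL,A-100,2\r\n"
    "OL,B-200,1\r\n"
    "O,1002,Ben\r\n"
    "OL,C-300,5"
)


def parse(flat_file: str) -> dict:
    """Group the lines into Header, Orders, Order, and OrderLines."""
    lines = flat_file.split("\r\n")  # {CR}{LF} is the delimiter shared by several records
    header_line, remaining = lines[0], lines[1:]

    # Header: positional record identified by the tag "H".
    if not header_line.startswith("H"):
        raise ValueError("expected a Header record tagged 'H'")
    header = {
        "OrderDate": header_line[1:9],  # the 8 characters right after the tag
        "Customer": header_line[9:],    # hypothetical name for the remaining field
    }

    orders = []
    for line in remaining:
        tag, *fields = line.split(",")  # comma-delimited records with a tag identifier
        if tag == "O":
            # An OrderHeader starts a new Order.
            orders.append({
                "OrderHeader": {"OrderNumber": fields[0], "SalesPerson": fields[1]},
                "OrderLines": [],
            })
        elif tag == "OL":
            if not orders:
                raise ValueError("OrderLine found before any OrderHeader")
            # Item and Quantity are hypothetical field names.
            orders[-1]["OrderLines"].append({"Item": fields[0], "Quantity": fields[1]})
        else:
            raise ValueError(f"unknown tag identifier: {tag!r}")

    return {"Header": header, "Orders": orders}


result = parse(SAMPLE)
```

The sketch shows why the tag identifiers matter: because Header, Order, and OrderLine records all end with the same delimiter, the tags are the only way to tell where one record type stops and the next begins.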
If you make a mistake that you cannot undo because the wizard does not let you go back enough steps, you can still fix it using the wizard later on. You can usually go back through the steps within the record you are defining. But once you have clicked Next and reached the screen shown in Figure 2.33, you have accepted the record you have just defined and cannot get back to it. If you have made an error, the way to fix it is to redefine that record. This is done in the tree view of the generated schema, where you can right-click a record and choose Define Record from Flat File Instance. If you do this, the wizard starts up again, asking you to specify an instance file, which is prefilled for you because the wizard wrote the instance filename into the <projectname>.btproj.user file in the EditorInputInstanceFilename element.
Changes After the Wizard
After completing the wizard, you will almost always need to make changes to the generated schema. The changes that are the most common include the following:
. Cardinality: When creating recurring records, the wizard sometimes inserts the actual number of records found in the instance as MinOccurs and MaxOccurs, which most of the time is not what you want and therefore quite silly.
. Custom date and time formats: As shown in Figure 2.36, you get a warning when you set a field to the type date if the value in the field does not match the xs:date data type usable in XSDs. In this case, you need to remember to set the value of the Custom Date/Time Format property.
. Switching record to elements: Because you cannot use the wizard to create recurring elements, you need to create a recurring record and have an element inside it. You might want to change that afterward.
Other than these, there might be other changes you want to implement. As with all wizards, check the output.
For the generated schema to work, the following things need to be changed:
. The Child Order property on the root node should be changed to Infix.
. The Max Occurs property on the Orders record should be reset to the default value.
. The Child Order property on the Order record should be changed to Infix.
. The Child Order property on the OrderLines record should be changed to Infix.
. The Max Occurs property on the OrderLines record should be reset to the default value.
. The Custom Date/Time Format property of the OrderDate attribute should be set to yyyyMMdd.
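The last change in the list deals with the gap between the compact date in the flat file and the xs:date lexical form, which is YYYY-MM-DD. The following Python sketch illustrates the conversion that the yyyyMMdd format makes possible. It is an illustration only, not BizTalk code, and the sample value is hypothetical.

```python
from datetime import datetime

# Hypothetical OrderDate value as it might appear in the flat file (yyyyMMdd).
raw = "20110927"

# "%Y%m%d" is the Python counterpart of the yyyyMMdd pattern.
parsed = datetime.strptime(raw, "%Y%m%d").date()

# xs:date uses the ISO 8601 form YYYY-MM-DD.
xs_date = parsed.isoformat()

print(xs_date)  # 2011-09-27
```

Without a custom format, a value such as 20110927 is not a valid xs:date, which is what the wizard's warning in Figure 2.36 points out.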
Test the Generated Schema
After generating the schema and manually changing the needed properties, you should enable unit testing of the schema with as many flat file instances as you have, to make sure the schema is valid. Unit testing of schemas is described later in this chapter.