
Globalization Considerations when Mapping Data

In document Goldengate 12c (Page 162-165)

12 Mapping and Manipulating Data

12.5 Globalization Considerations when Mapping Data

When planning to map and convert data between databases, take into consideration what is supported or not supported by Oracle GoldenGate in terms of globalization.

These considerations encompass the following topics:

Section 12.5.1, "Conversion between Character Sets"

Section 12.5.2, "Preservation of Locale"

Section 12.5.3, "Support for Escape Sequences"

12.5.1 Conversion between Character Sets

Oracle GoldenGate converts between source and target character sets if they are different, so that object names and column data are compared, mapped, and manipulated properly from one database to another. See Appendix A, "Supported Character Sets," for a list of supported character sets.

To ensure accurate character representation from one database to another, the following must be true:

The character set of the target database must be a superset or equivalent of the character set of the source database. Equivalent means not equal, but having the same set of characters. For example, Shift-JIS and EUC-JP technically are not completely equal, but have the same characters in most cases.

If your client applications use different character sets, the database character set must also be a superset or equivalent of the character sets of the client applications.

In this configuration, every character is represented when converting from a client or source character set to the local database character set.
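The superset requirement above can be checked informally before replication is configured. The following is a minimal sketch, assuming Python's built-in codecs as stand-ins for the database character sets. The helper name can_represent is hypothetical, and the codec names are Python's, not Oracle GoldenGate's.

```python
def can_represent(text: str, target_charset: str) -> bool:
    """Return True if every character in text can be encoded in target_charset."""
    try:
        text.encode(target_charset)
        return True
    except UnicodeEncodeError:
        # At least one character has no representation in the target set.
        return False


# Sample data containing an accented letter and the Euro sign.
sample = "caf\u00e9 \u20ac"

print(can_represent(sample, "latin-1"))  # False: ISO 8859-1 has no Euro sign
print(can_represent(sample, "cp1252"))   # True
print(can_represent(sample, "utf-8"))    # True
```

A check of this kind only samples the data it is given. It does not prove that one character set is a superset of another.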

A Replicat process can support conversion from one source character set to one target character set.

12.5.1.1 Database Object Names

Oracle GoldenGate processes catalog, schema, table, and column names in their native language as determined by the character set encoding of the source and target databases. This support preserves single-byte and multibyte names, symbols, accent characters, and case-sensitivity with locale taken into account where available, at all levels of the database hierarchy.

12.5.1.2 Column Data

Oracle GoldenGate supports the conversion of column data between character sets when the data is contained in the following column types:

Character-type columns: CHAR/VARCHAR/CLOB to CHAR/VARCHAR/CLOB of another character set; and CHAR/VARCHAR/CLOB to and from NCHAR/NVARCHAR/NCLOB.

Columns that contain string-based numbers and date-time data. Conversion of these columns is performed between z/OS EBCDIC and non-z/OS ASCII data.

Conversion is not performed between ASCII and ASCII versions of this data, nor between EBCDIC and EBCDIC versions, because the data are compatible in these cases.

Character-set conversion for column data is limited to a direct mapping of a source column and a target column in the COLMAP or USEDEFAULTS clauses of the Replicat MAP parameter. A direct mapping is a name-to-name mapping without the use of a stored procedure or column-conversion function. Replicat performs the character-set conversion. No conversion is performed by Extract or a data pump.

Note: Oracle GoldenGate supports timestamp data from 0001-01-03 00:00:00 to 9999-12-31 23:59:59. If a timestamp is converted from GMT to local time, these limits also apply to the resulting timestamp.

Depending on the timezone, conversion may add or subtract hours, which can cause the timestamp to exceed the lower or upper supported limit.
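The effect of a time zone shift on these limits can be illustrated with a small range check. This is a sketch only, assuming the limits quoted in the note and a fixed hour offset. The function name local_time_in_range is hypothetical and is not part of Oracle GoldenGate.

```python
from datetime import datetime, timedelta

# Limits quoted in the note above.
LOWER = datetime(1, 1, 3, 0, 0, 0)
UPPER = datetime(9999, 12, 31, 23, 59, 59)


def local_time_in_range(gmt: datetime, offset_hours: float) -> bool:
    """Check whether gmt shifted by offset_hours stays within the supported limits."""
    try:
        local = gmt + timedelta(hours=offset_hours)
    except OverflowError:
        # The shift left Python's own datetime range, so it is out of range here too.
        return False
    return LOWER <= local <= UPPER


print(local_time_in_range(datetime(2024, 6, 1, 12, 0, 0), -5))    # True
print(local_time_in_range(datetime(1, 1, 3, 2, 0, 0), -5))        # False: below the lower limit
print(local_time_in_range(datetime(9999, 12, 31, 22, 0, 0), 3))   # False: above the upper limit
```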

If the trail is written by a version of Extract that is prior to version 11.2.1, the character set for character-type columns must be supplied to Replicat with the SOURCECHARSET parameter. This parameter also supplies a PASSTHRU option for preventing the conversion of character sets. For more information, see Reference for Oracle GoldenGate for Windows and UNIX.

12.5.2 Preservation of Locale

Oracle GoldenGate takes the locale of the database into account when comparing case-insensitive object names. See Appendix B, "Supported Locales" for a list of supported locales.

12.5.3 Support for Escape Sequences

Oracle GoldenGate supports the use of an escape sequence to represent a string column, literal text, or object name in the parameter file. You can use an escape sequence if the operating system does not support the required character, such as a control character, or for any other purpose that requires a character that cannot be used in a parameter file.

An escape sequence can be used anywhere in the parameter file, but is particularly useful in the following elements within a TABLE or MAP statement:

An object name

WHERE clause

COLMAP clause to assign a Unicode character to a Unicode column, or to assign a native-encoded character to a column.

Oracle GoldenGate column conversion functions within a COLMAP clause.

Oracle GoldenGate supports the following types of escape sequence:

\uFFFF Unicode escape sequence. Any Unicode code point can be used except surrogate pairs.

\377 Octal escape sequence

\xFF Hexadecimal escape sequence

The following rules apply:

If used for mapping of an object name in TABLE or MAP, no restrictions apply. For example, the following TABLE specification is valid:

TABLE schema."\u3000ABC";

If used with a column-mapping function, any code point can be used, but only for an NCHAR/NVARCHAR column. For a CHAR/VARCHAR column, the code point is limited to the equivalent of 7-bit ASCII.

The source and target data types must be identical (for example, NCHAR to NCHAR).

Begin each escape sequence with a reverse solidus (code point U+005C), followed by the character code point. (A reverse solidus is more commonly known as the backslash symbol.) Use the escape sequence, instead of the actual character, within your input string in the parameter statement or column-conversion function.

To use the \uFFFF Unicode escape sequence

The \uFFFF Unicode escape sequence must begin with a lowercase u, followed by exactly four hexadecimal digits.

Supported ranges are as follows:

0 to 9 (U+0030 to U+0039)

A to F (U+0041 to U+0046)

a to f (U+0061 to U+0066)

\u20ac is the Unicode escape sequence for the Euro currency sign.
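The rules for the Unicode escape sequence can be expressed as a short validator. This is an illustrative sketch, not the parser Oracle GoldenGate uses. The name parse_unicode_escape is hypothetical, and rejecting any single code point in the surrogate range is an assumed reading of the surrogate-pair restriction.

```python
import re

# A backslash, a lowercase u, then exactly four hexadecimal digits.
UNICODE_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def parse_unicode_escape(seq: str) -> str:
    """Return the character named by a \\uFFFF-style escape, or raise ValueError."""
    match = UNICODE_ESCAPE.fullmatch(seq)
    if match is None:
        raise ValueError("not a valid Unicode escape: %r" % seq)
    code_point = int(match.group(1), 16)
    # Assumption: code points U+D800 to U+DFFF are rejected as surrogates.
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not allowed: %r" % seq)
    return chr(code_point)


# Raw strings keep the backslash literal, as it would appear in a parameter file.
print(parse_unicode_escape(r"\u20ac") == "\u20ac")  # True: the Euro currency sign
```

Under these rules an uppercase U, fewer or more than four digits, or a surrogate value is rejected.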

To use the \377 octal escape sequence

Must contain exactly three octal digits.

Supported ranges:

Range for first digit is 0 to 3 (U+0030 to U+0033)

Range for second and third digits is 0 to 7 (U+0030 to U+0037)

\200 is the octal escape sequence for the Euro currency sign on Microsoft Windows 1252 Latin1 code page.

To use the \xFF hexadecimal escape sequence

Must begin with a lowercase x followed by exactly two hexadecimal digits.

Supported ranges:

0 to 9 (U+0030 to U+0039)

A to F (U+0041 to U+0046)

a to f (U+0061 to U+0066)

\x80 is the hexadecimal escape sequence for the Euro currency sign on Microsoft Windows 1252 Latin1 code page.
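The octal and hexadecimal forms name the same byte value, which the following sketch makes concrete. It assumes Python's cp1252 codec as a stand-in for the Windows 1252 code page. The helper escape_to_byte is hypothetical and is not an Oracle GoldenGate function.

```python
import re

# Octal: exactly three digits, first digit 0 to 3, the others 0 to 7.
OCTAL_ESCAPE = re.compile(r"\\([0-3][0-7]{2})")
# Hexadecimal: a lowercase x followed by exactly two hexadecimal digits.
HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def escape_to_byte(seq: str) -> int:
    """Return the byte value (0 to 255) named by an octal or hexadecimal escape."""
    match = OCTAL_ESCAPE.fullmatch(seq)
    if match:
        return int(match.group(1), 8)
    match = HEX_ESCAPE.fullmatch(seq)
    if match:
        return int(match.group(1), 16)
    raise ValueError("not a valid octal or hexadecimal escape: %r" % seq)


octal_value = escape_to_byte(r"\200")
hex_value = escape_to_byte(r"\x80")
print(octal_value, hex_value)  # 128 128

# In the Windows 1252 code page, byte 0x80 is the Euro currency sign.
print(bytes([hex_value]).decode("cp1252") == "\u20ac")  # True
```

The first-digit limit of 0 to 3 keeps an octal escape within a single byte, since \377 is 255.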
