GDC Wrapper

(1)

DDW

(2)

Process Highlights

• Run any executable file including Ab Initio deployed scripts

• Restart without touching/deleting flags manually

• Run processes from different servers without impacting inter-process communication

• Check/Set Object status on Oracle

• Update ASLAM on Teradata and Oracle

• Collect Statistics and prepare tie-out

• Archive log files

• Archive data files

• Communicate completion/failure/time-out thru email to different mailing lists and pager

(3)

Process Architecture

• Main Process - submitted thru crontab

• Sub-processes - submitted by Main Process

• Executable files - submitted by Sub-process

Main Process

(4)

Main Process

• Submitted thru crontab

• Sets environment for the entire process

• Validates existence/executability of sub-process files

• Submits one or more Sub-Processes

• Waits for Sub-Process completion

• Updates ASLAM on Teradata

(5)

Sub-Process

• Submitted by the Main process

• All sub-processes submitted simultaneously and not sequentially

• Has capability to wait for a variety of dependencies, including other Sub-Processes

• Can perform various functions depending on RUN_TYPE definition
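A minimal sketch of simultaneous submission, assuming each Sub-Process is an executable run file in one directory. The function name `submit_all` is hypothetical, and the plain `wait` is a simplification: the slides describe the Main Process as checking for completion in cycles rather than blocking on its children.

```shell
# Hypothetical helper: start every listed run file in the background at once.
# Usage: submit_all <run_dir> <sub_process>...
submit_all() {
    run_dir=$1
    shift
    for sub in "$@"; do
        # '&' returns immediately, so the loop does not wait for each job.
        "$run_dir/$sub" &
    done
    # Simplification: block until every background job started above has exited.
    wait
}
```

Because each job is started with `&`, the order of the arguments says nothing about the order in which the jobs finish.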

(6)

Sub Process - Functionality

Can perform any of the following functions, as determined by RUN_TYPE

P : Process (submit any executable file such as an Ab Initio deployed script)

F : Set Flag on local and remote directory location

O : Set Object Status on Oracle

(7)

Sub Process - Dependencies

Can wait for one or more or all of the dependencies

D : Data file

F : Flag set by another process

O : Object Status on Oracle

(8)

File System - Overview

• Common Files

  - Sourced by every process

  - Ease of code maintenance

  - Extend new features to all processes

  - Developers cannot alter code – maintains integrity

• Local Files

(9)

File System – Common Files

• Located in /usr/local/abinitio/common (DDW_COMMON_DIR ) on every server

• Files include

ddw_main_process.ksh

ddw_sub_process.ksh

archive.ksh

get_ora_cnt.ksh

chk_object_status.sql

set_object_status.sql

fill_job_detail_nolsn.sql

fill_job_detail.sql

(10)

File System – Local Files

• Main process file

• Sub-Process file(s)

• Ab Initio deployed scripts

• List files

• Mail files

(11)

File System – hosts.env

• Required for defining host name of each process

• Located in /usr/local/abinitio/common

• Used for checking/setting flags in inter-process communication

• Useful for fail-over protection and inter-process communication

(12)

File System - Directories

• Home directory (also known as Sandbox) is project specific

e.g. /usr/dell/us_fin/fin/orders/us/load

• Following sub-directories required by the Wrapper

bin      main process file

db       dbc files

dml      dml files

env      list files

flags    setting/checking flags

logs     log files

mail     mail-related files

paging   pager-related files
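The directory layout above can be created in one step. This is a sketch only; `make_wrapper_dirs` is a hypothetical helper name, and it creates just the eight sub-directories listed on this slide.

```shell
# Create the sub-directories the Wrapper expects under a project home (sandbox).
# Usage: make_wrapper_dirs <sandbox_path>
make_wrapper_dirs() {
    sandbox=$1
    for d in bin db dml env flags logs mail paging; do
        # -p creates missing parents and is harmless if the directory exists.
        mkdir -p "$sandbox/$d" || return 1
    done
}
```

For example, `make_wrapper_dirs /usr/dell/us_fin/fin/orders/us/load` would prepare the sandbox shown above.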

(13)

File System – Main Process file

• Copy /usr/local/abinitio/template/template_main.ksh to bin/ directory ($AI_BIN)

• Rename as desired

• Make sure it’s executable

• Modify just one line of the file (directory path)

. $HOME/<directory path>/ab_project_setup.ksh $HOME/<directory path>

(14)

File System – Sub-Process file - 1

• Copy /usr/local/abinitio/template/template_sub.run to run/ directory ($AI_RUN)

• Rename as desired

• Make sure it’s executable

• Define RUN_TYPE and related parameters

(15)

File System – Sub-Process file - 2

RUN_TYPE=P

• Runs a process by submitting an executable file such as Ab Initio deployed graph

• Parameter required is

(16)

File System – Sub-Process file - 3

RUN_TYPE=O

• Sets Object Status on Oracle

• Sources /usr/local/abinitio/common/set_object_status.sql

• Parameters required are

OS_REGION: region code

OS_SUBJECT_AREA: subject area code

OS_OBJECT_NAME: object name

OS_ACTION: START or FINISH

OS_LOAD_SEQ_NUM: $LOAD_SEQ_NUM or 0

OS_COMMENTS: Any non-null value

OS_SID_NAME: Oracle SID where process is running

OS_SCHEMA: Oracle Schema where process is running
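As an illustration, a run file for RUN_TYPE=O might define the parameters like this. All values are placeholders borrowed from examples elsewhere in these slides, and the use of plain shell assignments is an assumption; the actual template_sub.run may differ.

```shell
# Illustrative values only; none of these come from a real run file.
RUN_TYPE=O
OS_REGION=AMER                      # region code
OS_SUBJECT_AREA=FINANCE             # subject area code
OS_OBJECT_NAME=PROD_ORDER_DETAIL    # object name
OS_ACTION=FINISH                    # START or FINISH
OS_LOAD_SEQ_NUM=${LOAD_SEQ_NUM:-0}  # $LOAD_SEQ_NUM, or 0 when it is not set
OS_COMMENTS="Set by wrapper"        # any non-null value
OS_SID_NAME=proc                    # Oracle SID (placeholder)
OS_SCHEMA=am_fl_extract             # Oracle schema (placeholder)
```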

(17)

File System – Sub-Process file - 4

RUN_TYPE=F

• Sets flags on local and target directories

• Parameters required are

FLAG_NAME: flag name

REMOTE_HOST: host server name of downstream process as in hosts.env

REMOTE_USER: userid for logging in to REMOTE_HOST

(18)

File System – Sub-Process file - 5

RUN_TYPE=OA

• Updates ASLAM on Oracle

• Parameters required are

OA_JOB_CODE: Job code for ASLAM on Oracle

OA_ACTION: ‘START’ or ‘FINISH’

OA_LOAD_SEQ_NUM: $LOAD_SEQ_NUM

OA_FINISH_PROCESS_NAME: run file name where OA_ACTION is ‘FINISH’

• OA_FINISH_PROCESS_NAME is required only when

(19)

File System – List Files

• Used by Main Process or Sub-processes

• Located in env/ directory ($AI_ENV)

• Can be any of the following types

Job List

Dependency List

Stats List

(20)

File System – Job List File

• Lists the sub-processes to be submitted by Main Process

• SUB_JOB_LIST parameter defines the file name

• ON/OFF flag determines which process to run

• Order in the job list file does not indicate order of their run

• Main Process reads this file several times - hence a copy of this file is stored in $AI_ENV/.ENV and accessed to preserve integrity from changes to the file till process completion
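The snapshot-then-read behaviour can be sketched as follows. The slides do not show the job list layout, so the line format `<ON|OFF> <sub_process>` is an assumption (modelled on the archive list format), and `list_enabled_jobs` is a hypothetical name.

```shell
# Print the sub-processes switched ON in a job list file.
# Assumed line format (not confirmed by the slides): <ON|OFF> <sub_process>
# Usage: list_enabled_jobs <job_list_file> <snapshot_dir>
list_enabled_jobs() {
    job_list=$1
    snapshot_dir=$2
    mkdir -p "$snapshot_dir" || return 1
    # Work from a private copy so later edits to the original are not seen.
    cp "$job_list" "$snapshot_dir/" || return 1
    awk '$1 == "ON" { print $2 }' "$snapshot_dir/$(basename "$job_list")"
}
```

Reading from the copy means that switching an entry between ON and OFF in the original file while a run is in progress only takes effect on the next run.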

(21)

File System – Dependency List File - 1

• Lists the dependencies for which a Sub-Process may wait

• SUB_DEPEND_LIST parameter defines the file name

• Four types of dependencies

D: Data File

F: Flag set by another process

O: Object Status

S: Another Sub-Process submitted by the same Main Process

(22)

File System – Dependency List File - 2

• Main Process reads this file several times - hence a copy of this file is stored in $AI_ENV/.ENV and accessed to preserve integrity from changes to the file till process completion

(23)

File System – Dependency List File - 3

Dependency Type = D

• Waits for Data files

• Format for entry

<subprocess> D <dependency file> <DEFAULT/directory location>

e.g.

build_svc_tags.run D sthsflat.sql DEFAULT

(24)

File System – Dependency List File - 4

Dependency Type = F

• Waits for a flag set by another process

• Format for entry

<subprocess> F <dependency flag> <directory location> <remote server> <remote userid>

e.g.

build_svc_tags.run F customer_can_${LOAD_SEQ_NUM}.1_moved.flg $CUST_MOVED_DIR $CAN_FIN_CUST_LOAD_HOST can_svc

(25)

File System – Dependency List File - 5

Dependency Type = O

• Waits for Object Status on Oracle

• Format for entry

<subprocess> O <object_name> <subject name> <region>

e.g.

copy_girp_oh_od_all_us.run O PROD_ORDER_DETAIL FINANCE AMER

(26)

File System – Dependency List File - 6

Dependency Type = S

• Waits for another Sub-Process submitted by the same Main Process

• Format for entry

<sub-process> S <dependency file>

e.g.

(27)

Dependencies – Important Points

• A Sub-Process can have any kind and number of dependencies

• If a Sub-Process has more than one kind of dependency, waiting is in the alphabetical order of the kind (D → F → O → S)

• A Sub-Process can wait for any number of other Sub-Processes

• Any number of Sub-Processes can wait for a Sub-Process

• Setting OFF of dependent Sub-Process ignores the dependency – if Sub-Process B waits for Sub-Process A and Sub-Process A is set to OFF, the dependency is ignored

• Ignoring the Process dependency does not cascade – if Sub-Process C waits for Sub-Process B and Sub-Process B in turn waits for Sub-Process A, setting Sub-Process B to OFF does not make
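The alphabetical waiting order can be illustrated with a short sketch that pulls one Sub-Process's entries from a dependency list and sorts them by type. It assumes whitespace-separated entries in the layout shown on the Dependency List File slides; `deps_in_wait_order` is a hypothetical helper, not part of the Wrapper.

```shell
# Print the dependency entries of one sub-process in waiting order (D, F, O, S).
# Entry layout follows the slides: <sub_process> <type> <details...>
# Usage: deps_in_wait_order <dependency_list_file> <sub_process>
deps_in_wait_order() {
    dep_list=$1
    sub=$2
    # Keep only this sub-process's lines, then sort on the type column.
    awk -v s="$sub" '$1 == s' "$dep_list" | sort -k2,2
}
```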

(28)

File System – Stats List File - 1

• Required for collecting the record count for the tie-out report

• STAT_LIST_FILE parameter defines the file name

• collect_stats.ksh is used to collect record count

• File contents are not read till collect_stats Sub-Process submits collect_stats.ksh

• Can collect record count of any of the following sources

Data Files used/generated by Ab Initio

(29)

File System – Stats List File - 2

• Source Type determines how to get the record count

• Source Types

AF : Anomaly File downloaded from source

AT : Anomaly Table on Teradata along with database name, with optional where condition

BT : Base Table on Teradata along with database name, with optional where condition

DF : Records discarded with D flag - file created by Ab Initio graph

DR : Delete Resent - file created by Ab Initio graph

EF : Extract File

IF : Incremental file

IT : Incremental Table on Teradata along with database name, with optional where condition

MB : Datamart base table on Oracle, with optional where condition

(30)

File System – Stats List File - 3

Data Files

• Source Types of AF, DF, DR, EF, IF

• Format for definition

<Table Name> <Source Type> <File Name With Location> <DML File With Location>

e.g. SVC_TAG AF $SVC_TAG_ANOM_IN_DAT $SVC_TAG_ANOM_DML

• Both multi-file and single-file systems are handled

• DML file required in definition for multi-file system, not for single-file systems

(31)

File System – Stats List File - 4

Teradata Tables

• Source Types of AT, BT, IT, MT

• Format for definition

<Table Name> <Source Type> <database.tablename> [<"Where Condition">]

e.g. SVC_TAG BT $SVC_TAG "where svc_business_unit_id = 707 and load_seq_num=${LOAD_SEQ_NUM}.1"

• When no where condition is defined, whole table count is returned

• Sources /usr/local/abinitio/common/get_td_cnt.ksh

(32)

File System – Stats List File - 5

Oracle Tables

• Source Types of OE, MI, ME

• Format for definition

<Table Name> <Source Type> <Source Name> <Oracle Schema> <Oracle Sid> [<"Where Condition">]

e.g. ORDER_DETAIL OE RAW_RAW_STAT_ORDER_DETAIL_AMER am_fl_extract proc

• When no where condition is defined, whole table count is returned

• Sources /usr/local/abinitio/common/get_ora_cnt.ksh

(33)

File System – Archive List File

• Used for archiving data files

• ARCHIVE_FILES_LIST parameter defines the file name

• Format for definition

ON <file name with directory path>

e.g.

(34)

File System – Mail files - 1

• Files used for sending mails

• Files located in mail/ ($AI_MAIL)

• Three types of files required

List file

Subject File

Body text File

(35)

File System – Mail files - 2

• List file sends mail to the listed email addresses

e.g. Format for entry

mail -s "`cat $1` `date`" ravi_pothukuchy@dell.com < "$2"

• Add additional email addresses delimited by a comma

• Subject file contains text that forms the subject part of a mail

e.g. Format for entry

ERROR - <region> <process name> <task> Aborted

(36)

File System – Pager files

• File used for sending pager messages

• File located in paging/ ($AI_PAGING)

• File contains the logic and page-id for sending the code

e.g. Format for entry

echo $1 $2 | /usr/bin/Mail -s "ravi" pager@pagerhost.us.dell.com

(37)

ASLAM on Teradata

• Each process has a Process_id and makes an entry to ASLAM tables

• Each Process works on one or more Objects and ASLAM tables maintain the relation between a Process and the Objects

• A Process can end with any of the four statuses

S : Successful

E : Errored

T : Timed-out

U : Unknown

(38)

Log Files - 1

• Four levels of log files are created in $AI_LOG

Generated by Main Process

Generated by Sub-Process

Generated by the executable submitted by a Sub-Process

Log files defined inside an Ab Initio graph

• Main Process log file defined as parameter LOG_FILE and each time Main Process is submitted, a separate log file is created

e.g.

(39)

Log Files - 2

• Sub-Process log file is created by suffixing date (MMMDD_YYYY:HH:MI:SS) to the run file and each time Sub-Process is submitted, a new log file is created.

e.g.

generate_dml_May22_2002:21:03:02.log

• The log file generated by the executable submitted by the Sub-Process (such as Ab Initio deployed script) takes its name from the script name appended with YYYYMMMDD and the extension is ‘out’ instead of ‘log’. Each time the process is submitted, output is appended to this file (i.e. only one file per day)
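The two naming schemes can be sketched with `date` format strings. The underscore separators and the stripping of the file extension are assumptions inferred from the single example above; both function names are hypothetical.

```shell
# Sub-Process log name: run file name + MonDD_YYYY:HH:MI:SS + ".log"
# (a new name, and so a new file, on every submission).
sub_log_name() {
    base=${1##*/}      # drop any directory part
    base=${base%.*}    # drop the extension (assumed, e.g. ".run")
    # LC_ALL=C keeps the month abbreviation in English (e.g. May).
    echo "${base}_$(LC_ALL=C date '+%b%d_%Y:%H:%M:%S').log"
}

# Executable's log name: script name + YYYYMonDD + ".out"
# (the same name all day, so output can be appended with '>>').
exec_log_name() {
    base=${1##*/}
    base=${base%.*}
    echo "${base}_$(LC_ALL=C date '+%Y%b%d').out"
}
```

Because the second name has no time component, repeated runs on the same day resolve to the same file, which matches the one-file-per-day behaviour described above.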

(40)

Log Files - 3

• Log files defined inside the Ab Initio graphs have constant names and are always replaced when the graph is re-run.

• Log files older than a certain number of days (LOG_FILE_KEEP_DAYS) are archived and compressed by the Main Process and copied to $AI_LOGS/archive directory.

• Archived log files older than a certain number of days (LOG_ARCH_KEEP_DAYS) are removed by the Main Process.
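A rough sketch of this two-stage clean-up using `find`, `gzip` and `rm`. It is not the code in archive.ksh, which the slides do not show; `archive_logs` is a hypothetical function and the `*.log` pattern is an assumption.

```shell
# Compress logs older than keep_days into <log_dir>/archive, then delete
# archived files older than arch_keep_days.
# Usage: archive_logs <log_dir> <keep_days> <arch_keep_days>
archive_logs() {
    log_dir=$1
    keep_days=$2
    arch_keep_days=$3
    mkdir -p "$log_dir/archive" || return 1
    # "! -name . -prune" limits find to the top level, so archive/ is skipped.
    find "$log_dir/." ! -name . -prune -type f -name '*.log' -mtime +"$keep_days" |
    while IFS= read -r f; do
        gzip "$f" && mv "$f.gz" "$log_dir/archive/"
    done
    # Second stage: purge old archives.
    find "$log_dir/archive" -type f -name '*.gz' -mtime +"$arch_keep_days" -exec rm -f {} +
}
```

With the default values listed later in the deck this would be called as `archive_logs "$AI_LOG" 3 14`.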

(41)

Automating a Process - 1

1. Create the directory structure

2. Set up Project environment

3. Create Main Process File

(copy /usr/local/abinitio/template/template_main.ksh to $AI_BIN and modify)

4. Create Sub-process files

(copy /usr/local/abinitio/template/template_sub.run to $AI_RUN and modify)

5. Define Job list file

(42)

Automating a Process - 2

1. Define Dependency list file

(refer to /usr/local/abinitio/template/template_dependency.lst for sample)

2. Define Stats list file

(refer to /usr/local/abinitio/template/template_stats.lst for sample)

3. Define Archive list file

(refer to /usr/local/abinitio/template/template_archive.lst for sample)

4. Define Mail files

(refer to mail*.lst and mail*.txt in /usr/local/abinitio/template for sample)

(43)

Automating a Process - 3

1. Define Pager file

(Copy /usr/local/abinitio/template/page_oncall to $AI_PAGING and modify as required)

2. Copy /usr/local/abinitio/template/collect_stats.ksh to $AI_RUN – customize if custom source types are defined or tie-out calculation needs to be modified

3. Define wrapper related Parameters in the project setup and make sure they are exported

(44)

Parameters that change runtime behavior - 1

LSN_REQUIRED (Y/N) Whether to check that the file providing the Load Sequence Number exists

IGNORE_RUNNING_FLAGS (Y/N) Whether to resubmit the running sub-processes again when the main-process is restarted

IGNORE_ASLAM (Y/N) Whether to update ASLAM tables or not

PAGE_SUCCESSFUL_RUN (Y/N) Whether to send a pager message when main process completes successfully

PAGE_SUBPROCESS_FAIL (Y/N) Whether the sub-process pages when it fails

PAGE_SUB_DEPENDENCY (Y/N) Whether the sub-process pages while waiting for

(45)

Parameters that change runtime behavior - 2

ARCHIVE_LOG_FILES (Y/N) Whether to archive log files or not.

LOG_FILE_KEEP_DAYS (3) Log files older than this many days are archived

LOG_ARCH_KEEP_DAYS (14) Archived log files older than this many days are deleted

ARCHIVE_FILES (Y/N) Whether to archive data files or not

TIEOUT_FAIL_EXIT (Y/N) Whether the process should terminate if the tie-out fails

PRINT_BASE_TIEOUT (Y/N) Whether to print base tie-out in the report

PRINT_MART_TIEOUT (Y/N) Whether to print mart tie-out in the report

(46)

Parameters that change runtime behavior - 3

MAIN_SLEEP_TIME Time in seconds the main-process waits for sub-process completion between each cycle

MAIN_PAGE_CNT No. of cycles after which main-process sends pager message

MAIN_EXIT_CNT No. of cycles after which main-process times out

SUB_SLEEP_TIME Time in seconds sub-process sleeps between cycles while checking for completion of a dependency

SUB_PAGE_CNT No. of cycles after which sub-process sends a pager message

SUB_EXIT_CNT No. of cycles after which sub-process times out
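How the three SUB_* parameters could interact is sketched below for a data-file dependency. The exact cycle accounting in ddw_sub_process.ksh is not shown in the slides, so treat the counting here as an assumption; `wait_for_file` and `send_page` are hypothetical names, with `send_page` standing in for the pager script.

```shell
# Hypothetical stand-in for the pager script in $AI_PAGING.
send_page() { echo "PAGE: $*" >&2; }

# Poll for a data file. Returns 0 when the file appears, 1 on time-out.
# Reads SUB_SLEEP_TIME, SUB_PAGE_CNT and SUB_EXIT_CNT from the environment.
wait_for_file() {
    file=$1
    cycle=0
    while [ ! -f "$file" ]; do
        cycle=$((cycle + 1))
        if [ "$cycle" -ge "$SUB_EXIT_CNT" ]; then
            return 1    # timed out
        fi
        if [ "$cycle" -eq "$SUB_PAGE_CNT" ]; then
            send_page "still waiting for $file after $cycle cycles"
        fi
        sleep "$SUB_SLEEP_TIME"
    done
    return 0
}
```

With SUB_SLEEP_TIME=60, SUB_PAGE_CNT=30 and SUB_EXIT_CNT=120, this sketch would page after roughly half an hour and give up after roughly two hours.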

(47)

Flags and Process Control - 1

• Main Process sets $RUNNING_FLAG to prevent another concurrent session

• Main Process writes Load Sequence Number and time of completion to $DONE_FLAG, to prevent another run for the same day

• Each Sub-Process sets a flag depending on the status

Running: <sub_process_name>_running.flg

Done: <sub_process_name>_done.flg

Failed: <sub_process_name>_error.flg

• Sub-Process on failure or time-out sets $ABORT_FLAG
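The flag naming convention can be sketched as a small helper. Removing the running flag once a job is done or has failed is an assumption, not something the slides state, and `set_status_flag` is a hypothetical name.

```shell
# Record a Sub-Process status by creating the matching flag file.
# Usage: set_status_flag <flag_dir> <sub_process_name> <running|done|error>
set_status_flag() {
    flag_dir=$1
    name=$2
    state=$3
    case $state in
        running|done|error) ;;
        *) echo "unknown status: $state" >&2; return 2 ;;
    esac
    # Assumption: a finished or failed job no longer keeps its running flag.
    [ "$state" = running ] || rm -f "$flag_dir/${name}_running.flg"
    # Create (or truncate) the empty flag file.
    : > "$flag_dir/${name}_${state}.flg"
}
```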

(48)

Flags and Process Control - 2

• To resubmit processes whose running flags exist (occurs when the process is killed / server has failed), remove flags manually or set IGNORE_RUNNING_FLAGS to Y and restart

• Flags set by upstream processes are deleted at the end of successful completion of process

• Flags set for downstream processes are deleted when Main Process is started afresh (when $ABORT_FLAG does not exist)
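The effect of IGNORE_RUNNING_FLAGS on a restart can be summarised in a few lines: a restarted Main Process skips Sub-Processes whose done or running flags exist, unless running flags are being ignored. This is a sketch of that decision only; `restart_action` is a hypothetical function.

```shell
# Decide whether a restarted Main Process should submit a Sub-Process.
# Prints "skip" or "submit".
# Usage: restart_action <flag_dir> <sub_process_name>
restart_action() {
    flag_dir=$1
    name=$2
    if [ -f "$flag_dir/${name}_done.flg" ]; then
        echo skip        # already completed in an earlier attempt
    elif [ -f "$flag_dir/${name}_running.flg" ] &&
         [ "${IGNORE_RUNNING_FLAGS:-N}" != Y ]; then
        echo skip        # believed to be still running
    else
        echo submit
    fi
}
```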

(49)

Flags and Restartability - 1

• Restarting a process is required under one of these situations

• Main-Process failed or timed-out

• One or more Sub-Processes failed or timed-out

• Main-Process/Sub-Processes killed manually or due to server failure

• Restarted Main Process ignores Sub-Processes whose done flags or running flags exist.

• Restarting Main Process cleans up abort flags and error flags –

(50)

Flags and Restartability - 2

• Deleting done flags before restart will resubmit the Sub-Processes that have already completed

• Deleting running flags before restart can lead to concurrent sessions of the Sub-Processes whose outcome may be unpredictable

• Deleting ABORT_FLAG will remove any flags set for downstream processes (since absence of ABORT_FLAG is taken as a fresh

(51)

Important Considerations

• Main Process time-out does not kill any sub-process it has submitted – they are still running – so just restart the Main Process

• Sub-Process time-out indicates it has timed out even before finishing its job

• Sub-Process never times-out waiting for the process it has submitted (such as an Ab Initio deployed script)

• Failure of one Sub-Process in no way influences the outcome of another Sub-Process (except that a Sub-Process depending on it may time out)

• Deleting running flags before restart can lead to concurrent

(52)

The Wrapper Advantage

• Easy to maintain code because it’s centralized

• Easy to extend new features to every process with few changes in individual processes

• Easy to set-up a process which improves productivity

• Easy to support because of uniformity in code/processing across regions/subject/processes

• Easy to move processes across servers without impacting inter-process communication

(53)

The Wrapper Advantage

Easy to Set-up. Easy to Support.
