UNIX Comes to the Rescue: A Comparison between UNIX SAS® and PC SAS®

Chii-Dean Lin, San Diego State University, San Diego, CA
Ming Ji, San Diego State University, San Diego, CA

ABSTRACT

Running SAS in a PC environment and in a UNIX environment is very similar in general. However, some differences do exist between PC SAS and UNIX SAS: a SAS code may run smoothly under PC SAS but not under UNIX SAS. In this paper, we compare PC SAS and UNIX SAS and summarize the features that differ between them. In addition, we show a step-by-step procedure for running a PC-created SAS code on a UNIX server. This paper is intended for beginners with basic SAS knowledge.

INTRODUCTION

“I really need the SAS results by Friday, but my PC and my colleagues’ PCs can only get half of them. What should I do?” If you have a UNIX account, one suggestion is to upload the program(s) to UNIX and run them there. Even if you follow all the necessary rules and procedures for moving SAS programs and data sets to UNIX, sometimes the program will not run as smoothly as you wish. Generally speaking, PC SAS and UNIX SAS are so similar that people are usually unaware of any problems until they try to execute a PC SAS program on UNIX (or vice versa) and run into trouble. In this paper, we compare PC SAS and UNIX SAS and discuss potential problems when running a PC-created SAS program on a UNIX platform.

Kotler (2003) discussed the Linux/UNIX and X Window systems for SAS system administrators who would like to implement an open environment within their SAS installation. Zhang (2003) showed how one can use ODS, FTP, DDE, etc. to integrate UNIX and PC SAS. The focus of this paper, however, is not on integrating PC and UNIX using SAS/CONNECT or on accessing client/server database systems through SAS/ACCESS. Instead, we emphasize running a SAS code on either PC or UNIX in the more traditional way: write a program, submit it, and get results.

PCs are more powerful and faster than before. Even so, we still receive requests for help from students asking how to run programs such as SAS or S-plus on UNIX because of time limits or memory problems on a PC. The UNIX platform clearly still holds an edge over the PC platform for running large programs. In this paper, we first compare the major differences between PC SAS and UNIX SAS, which gives readers a quick snapshot of how SAS functions under the two systems. We then illustrate some major issues in detail and provide simple programs for comparison. A step-by-step procedure shows how to run a PC-created SAS program in UNIX batch mode and how to edit the program on UNIX; it serves as a guideline for moving a SAS code from PC SAS to UNIX SAS. Finally, we conclude with a brief discussion.

A COMPARISON BETWEEN PC SAS AND UNIX SAS

We start with a simple comparison of general issues between PC SAS and UNIX SAS. A table summarizing the comparison is given below, and sample SAS codes that show the differences are given in the next section. Note that some of the comparisons are based on the windowing environment in PC SAS and on batch mode in UNIX SAS. These two modes are our major concern because the windowing environment is the mode used most often on a PC, while background batch mode on UNIX is a useful alternative for running a long SAS program. Some of the differences listed below may therefore be due to the running mode (windowing environment versus batch mode) rather than to the PC SAS and UNIX SAS environments themselves.

Each item below is compared for PC-SAS and UNIX-SAS.

Invoking a SAS session
  PC-SAS:
    -Windowing environment mode.
    -Batch mode.
  UNIX-SAS:
    -Display manager mode (when connecting from a PC, one needs an X Window manager such as XWIN32 to run the Display manager mode).
    -Non-interactive mode (Batch mode).
    -Line command mode.

Terminating a SAS process
  PC-SAS:
    -Windowing environment mode: use the BREAK icon (circled exclamation point (!)) or CTRL+BREAK.
    -Batch mode: click on the cancel icon.
  UNIX-SAS:
    -All modes: use kill PID at the UNIX prompt (see the next section for a detailed description).

Case sensitive?
  PC-SAS: No.
  UNIX-SAS: The UNIX environment is case sensitive but the SAS session is not. File directories and external file names called within SAS are case sensitive.

Output & Log file
  PC-SAS:
    -Windowing environment mode: output and log accumulate in the Output Window and the Log Window.
    -Batch mode: the existing .lst and .log files are overwritten when the SAS code is rerun.
  UNIX-SAS:
    -Display manager mode: same as PC-SAS.
    -Batch mode: the existing .lst and .log files are overwritten when the SAS code is rerun, unless the output is redirected to different files.

Executing system commands within SAS?
  PC-SAS: Yes.
  UNIX-SAS: Yes.

Running multiple jobs?
  PC-SAS: Yes. Not efficient.
  UNIX-SAS: Yes. Batch mode submission allows us to submit many jobs simultaneously.

Line width restriction when creating a SAS code?
  PC-SAS: No.
  UNIX-SAS: Yes. Lines that are longer than 135 columns are automatically moved to the next line. (See the example below.)

Does the submitted code remain in the Program Editor?
  PC-SAS: Windowing environment mode: the SAS code remains in the Program Editor window after a submission.
  UNIX-SAS: Display manager mode: the SAS code disappears from the Program Editor window after a submission. Use RUN --> RECALL LAST SUBMIT to recall the submitted SAS code.

Methods for submitting a batch mode job
  PC-SAS: Right click on the SAS program file and select the batch submit icon, or drag the SAS program file to the SAS shortcut icon.
  UNIX-SAS: At the UNIX prompt, type sas filename.sas &, where filename.sas is the SAS code you would like to run.

Page break effect when exporting output files to other text editing software such as Microsoft Word
  PC-SAS:
    -Windowing environment mode: no effect (save from Output Window).
    -Batch mode: the page break effect exists (export from .lst file).
  UNIX-SAS:
    -Windowing environment mode: no effect (save from Output Window).
    -Batch mode: the page break effect exists (export from .lst file).

The above table provides a quick comparison between PC SAS and UNIX SAS. In this paper, we focus our discussion on running programs in the windowing environment on a PC and in background batch mode on UNIX. The following simple SAS code assigns a new variable based on two existing variables, first in a DATA step and then in PROC IML. Note that the assignment of z was written as one long line in the SAS Program Editor window; that is, the statement contains no line breaks. The same SAS code was uploaded and run in a UNIX environment. Partial log files of the submitted SAS code under PC SAS and under UNIX SAS are shown below. There is no error when the SAS code runs under PC SAS, but a syntax error is issued when the same code runs under UNIX SAS. The same phenomenon occurs in both the DATA step and PROC IML.

data test;

x = 5;

y = 10;

z = x + x + y + x * y + x - y + x + x + x + x + y + y/x + x*y + 3 + x + x + y + x +y +x + x + y + x * y + x - y + x + x + x + x + y + y/x + x*y + 3 + x + x + y + x +y +x + x + y + x * y + x - y + x + x + x + x + y + y/x + x*y + 3 + x + x + y + x +y +x + x + y + x * y + x - y + x + x + x + x +y;

run;

proc print;

run;

proc iml;

x = 5;

y = 10;

z = x + x + y + x * y + x - y + x + x + x + x + y + y/x + x*y + 3 + x + x + y + x +y +x + x + y + x * y + x - y + x + x + x + x + y + y/x + x*y + 3 + x + x + y + x +y +x + x + y + x * y + x - y + x + x + x + x + y + y/x + x*y + 3 + x + x + y + x +y +x + x + y + x * y + x - y + x + x + x + x +y;

print x y z;

quit;

run;


A partial log file using PC SAS shows no error.

NOTE: SAS initialization used:
      real time           1.90 seconds
      cpu time            1.85 seconds

1    data test;
2    x = 5;
3    y = 10;
4
5    z = x + x + y + x * y + x - y + x + x + x + x + y + y/x + x*y + 3 + x + x + y + x +y +x + x + y
5  ! + x * y + x - y + x + x + x + x + y + y/x + x*y + 3 + x + x + y + x +y +x + x + y + x * y + x
5  ! - y + x + x + x + x + y + y/x + x*y + 3 + x + x + y + x +y +x + x
6    + y + x * y + x - y + x + x + x + x +y;
7    run;

NOTE: The data set WORK.TEST has 1 observation and 3 variables.
NOTE: DATA statement used:
      real time           0.84 seconds
      cpu time            0.09 seconds

If we upload the same SAS code and run it on a UNIX platform, a warning message and an error are issued. A partial log file is listed below:

1    data test;
2    x = 5;
3    y = 10;
4
WARNING: Truncated record.
5    z = x + x + y + x * y + x - y + x + x + x + x + y + y/x + x*y + 3 + x + x + y + x +y +x + x + y + x * y + x - y + x + x +
5  ! x + x + y + y/x + x*y + 3 + x + x + y + x +y +x + x + y + x * y + x - y + x + x + x + x + y + y/x + x*y + 3 + x + x + y
5  ! + x +y +x + x
6    run;;
     ___  ___  ___
     22   22   22
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, -, /, <, <=, <>, =, >, ><, >=, AND, EQ, GE, GT,
              LE, LT, MAX, MIN, NE, NG, NL, OR, ^=, |, ||, ~=.
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, -, /, <, <=, <>, =, >, ><, >=, AND, EQ, GE, GT,
              LE, LT, MAX, MIN, NE, NG, NL, OR, ^=, |, ||, ~=.
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, -, /, <, <=, <>, =, >, ><, >=, AND, EQ, GE, GT,
              LE, LT, MAX, MIN, NE, NG, NL, OR, ^=, |, ||, ~=.
7    run;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: SAS set option OBS=0 and will continue to check statements. This may cause NOTE: No observations in data set.
WARNING: The data set WORK.TEST may be incomplete. When this step was stopped there were 0 observations and 4 variables.
NOTE: DATA statement used:
      real time           0.46 seconds
      cpu time            0.03 seconds
7  !

The difference arises because the PC SAS Program Editor has no column limit, whereas the UNIX SAS Program Editor automatically wraps an entry onto the next line once it reaches 135 columns. A line typed in PC SAS can therefore be longer than UNIX SAS accepts. When such a SAS code is uploaded to a UNIX platform and submitted in batch mode, the text beyond column 256 is truncated, and the truncated statement generates the syntax error shown above. One way to avoid the problem is to break each long line into several lines, each shorter than 135 columns, when creating the SAS code.
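Before uploading, a program can be screened for over-long lines. The following is a minimal sketch rather than part of the original procedure; the file name demo.sas and its contents are made up for illustration, and the 135-column limit is the one quoted above.

```shell
# Work in a scratch directory; demo.sas is a hypothetical program name.
workdir=$(mktemp -d)
prog="$workdir/demo.sas"

# Build a demo program whose second line is 142 columns long.
{
  echo 'data test;'
  awk 'BEGIN { printf "z = "; for (i = 0; i < 68; i++) printf "x+"; print "x;" }'
  echo 'run;'
} > "$prog"

# Report every line longer than 135 columns.
long_lines=$(awk -v max=135 'length($0) > max { printf "line %d: %d columns\n", NR, length($0) }' "$prog")
echo "$long_lines"
```

Any line reported this way is a candidate for splitting into several shorter lines before the program is submitted on UNIX.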

RUNNING A PC CREATED SAS CODE ON UNIX

Assume we have created a SAS code in a PC environment but think that UNIX batch mode is more appropriate. To run this PC-created SAS code on UNIX, we need to upload the SAS code along with any associated raw data sets to UNIX. One easy way is to use FTP.

FTP FILES TO A UNIX PLATFORM

To upload the SAS code and raw data sets to UNIX, one can use FTP. Under the Windows operating system, select Start, then Run, and type ftp sciences.sdsu.edu.

Replace sciences.sdsu.edu with the name of the UNIX platform to which you wish to upload. After typing the user ID and password of your UNIX account, use put file.sas to upload file.sas to the UNIX platform. Before transferring the file, change your local directory to where file.sas is located: the FTP command lcd changes the local directory, and cd changes the remote host directory. Because you cannot see your current location, FTP is not easy to use the first time. Typing help at the FTP prompt shows the set of commands available under FTP. A better alternative for transferring files to a UNIX platform is the freeware WS_FTP LE, which is available online. After installing the application, click the WS_FTP LE icon and a dialog box will show up. The dialog box is shown below. You can create new profiles that store the host name, startup directories, etc. In this dialog box, you can change the local or remote directory after a connection, select files to upload or download, and change the transfer format between ascii and binary. One worthwhile note is that if you edit some files after opening the WS_FTP LE application, you have to press the refresh button to update the listed files.
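As an illustration, the command-line FTP steps described above can be written out as the sequence typed at the ftp> prompt after logging in. The directory names and the data file name rawdata.txt are hypothetical; lcd, cd, ascii, put, and bye are standard FTP client commands. The snippet only prints the sequence and does not open a connection.

```shell
# Commands to type at the ftp> prompt after:  ftp sciences.sdsu.edu
# Directory and data file names below are hypothetical.
ftp_commands=$(cat <<'EOF'
lcd C:\sasjobs
cd sasjobs
ascii
put file.sas
put rawdata.txt
bye
EOF
)
printf '%s\n' "$ftp_commands"
```

Issuing ascii before the put commands asks the client to transfer the files as text rather than as binary.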

CONVERT PC TEXT FILES FOR UNIX

The DOS (including Microsoft Windows) and UNIX operating systems store text files in different formats. DOS places a carriage return followed by a line feed at the end of each line, while UNIX places only a line feed at the end of each line. Some UNIX applications do not recognize the carriage return character and display it as ^M.

This causes a problem when SAS tries to read raw data. For example, the simple SAS code below was uploaded to a UNIX platform in binary format (the default format under WS_FTP LE). Note that the values of variable y are all missing although the original values are not, presumably because the carriage return is read as part of the last field on each line, so that field is no longer a valid number. The character variable z, on the other hand, reads its values correctly.

Several easy ways can be used to avoid this uploading problem between PC and UNIX. When uploading the SAS code and raw data with FTP, select ascii format. An alternative is to run the dos2unix command at a UNIX prompt by entering dos2unix filename1.sas filename2.sas, where filename1.sas is the original PC-created SAS file and filename2.sas is the newly converted SAS file. Note that you can use the same file name twice to overwrite the existing file.

options ls = 70 ps = 70;

data a;
input x y;
cards;
1 5
2 6
3 7
;

proc print; run;

data b;
input z $;
cards;
5
6
7
;

proc print; run;

If we upload and run this code under UNIX SAS, the output file is shown below:

The SAS System     1

Obs    x    y
  1    1    .
  2    2    .
  3    3    .

The SAS System     2

Obs    z
  1    5
  2    6
  3    7
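The carriage return problem can be checked and repaired from the UNIX prompt. The sketch below is an illustration under stated assumptions: the file names are made up, and tr -d '\r' is used as a stand-in for dos2unix so that the example runs on systems where dos2unix is not installed.

```shell
workdir=$(mktemp -d)

# Simulate a PC-created raw data file: every line ends in CR LF.
printf '1 5\r\n2 6\r\n3 7\r\n' > "$workdir/pc.dat"

# Count the lines that carry a carriage return.
cr=$(printf '\r')
before=$(grep -c "$cr" "$workdir/pc.dat")

# Strip the carriage returns (stand-in for dos2unix).
tr -d '\r' < "$workdir/pc.dat" > "$workdir/unix.dat"
after=$(grep -c "$cr" "$workdir/unix.dat" || true)

echo "lines with CR before: $before, after: $after"
```

A count of zero after conversion indicates that the file is in UNIX text format.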

RUNNING SAS IN BATCH MODE

After converting the SAS code and raw data sets to UNIX format, we can run the SAS code in background batch mode. An advantage of doing so is that it frees up the X terminal for other jobs. You can even log off the terminal while the SAS code keeps running in the background. To view the status of the running SAS job, use the top command to see how much CPU the job consumes and how long it has been running: type top at a UNIX prompt, and press q when you are done viewing to get back to the UNIX prompt. To see the process ID (PID) assigned to the running program, type ps at the UNIX prompt. In our example, ps shows that the PID of the running SAS job is 6224 and the cumulative running time is 4:06. Use kill -9 6224 if you want to terminate the running process.

The top command provides more information than the ps command. In our example, PID 6224 consumed about 24.98% of the CPU at the time we looked at it, the cumulative running time was 2:31, and the NICE setting was 0. Depending on the UNIX server regulations, you sometimes need to change the NICE setting to a lower priority. You can use /usr/bin/nice -20 sas file.sas & to change the nice setting to 20. Note that some UNIX systems automatically kill a job that runs longer than a specific time under the regular priority (NICE = 0). Consult your system administrator for more information. If few other processes are running when you submit your job, the NICE setting will not affect performance, since the system will allocate all available resources to the job you submitted.
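The submit, inspect, and terminate cycle can be tried without SAS. In this sketch sleep stands in for a long-running SAS job, and the portable form nice -n 19 is used where the text uses /usr/bin/nice -20; both are substitutions made for illustration. The ps column names follow the POSIX -o option and may differ on some systems.

```shell
# sleep stands in for a long-running SAS job; on a real system the
# line would be something like:  nice -n 19 sas file.sas &
nice -n 19 sleep 60 &
pid=$!    # PID of the background job

# Show PID, nice value, elapsed time, and command name.
# (ps options vary between systems, so a failure here is ignored.)
ps -o pid,nice,etime,comm -p "$pid" || true

# Terminate the job; plain kill sends SIGTERM, and kill -9 is the
# forceful variant used in the text.
kill "$pid"
wait "$pid" 2>/dev/null || true

# kill -0 sends no signal; it only tests whether the process exists.
if kill -0 "$pid" 2>/dev/null; then
  status=running
else
  status=stopped
fi
echo "job $pid is $status"
```

Trying a plain kill before kill -9 gives the process a chance to clean up its temporary files.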


VIEWING THE OUTPUT AND LOG FILE AND EDITING THE SAS CODE

When the SAS batch mode process is done, we can use several UNIX commands to view the output file or log file. One easy way is the UNIX more command: more test.log shows the content of test.log. Press the space bar to scroll to the next page, and press q to end the display. Text editors such as pico or emacs are alternative ways of viewing the output file or log file. If you need to modify the SAS code, you can edit it with one of these text editors and then resubmit it. Recall that the output file and log file are replaced when you resubmit the code. Use cp test.lst test_old.lst to keep the old output file if you wish.
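For a long log, searching for the problem lines can be quicker than paging through the file. The following sketch uses grep on a shortened stand-in for a SAS log built from messages quoted in this paper; the file names are hypothetical.

```shell
workdir=$(mktemp -d)
log="$workdir/test.log"

# A shortened stand-in for a SAS log (not a complete log).
cat > "$log" <<'EOF'
1    data test;
WARNING: Truncated record.
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *.
NOTE: The SAS System stopped processing this step because of errors.
EOF

# Keep a copy of the old log before a resubmission overwrites it.
cp "$log" "$workdir/test_old.log"

# List only the ERROR and WARNING lines, with their line numbers.
grep -n -E '^(ERROR|WARNING)' "$log"

# Count them, for a quick pass/fail check.
count=$(grep -c -E '^(ERROR|WARNING)' "$log")
echo "$count problem line(s) found"
```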

TIPS FOR RUNNING A LONG PROGRAM

Some UNIX systems restrict the running time of a process under the normal priority, so you may need to lower the priority to be allowed a longer processing time. As mentioned above, you can use nice -20 sas file.sas & to change the priority. Consult your system administrator about any restrictions. Most UNIX accounts also have a quota limit, which you can check using quota -v. If the output generated by your SAS code exceeds the quota, the process may be terminated without notice. If you anticipate a large output file that you only need to save temporarily, you can store it in the /tmp directory.

File permission is another UNIX feature that the PC does not have. Since a UNIX server allows multiple users to work on it, the permissions set on each file prevent other users from making unwanted modifications to a file that belongs to you. To check the permission status of a file, use ls -l to list the files in the current directory.
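As a small illustration of file permissions, the sketch below creates a file, restricts it to its owner with chmod, and reads the permission string back from ls -l. The chmod command is not mentioned in the text and is introduced here only as an example; the file name is hypothetical.

```shell
workdir=$(mktemp -d)
file="$workdir/file.sas"
echo 'data test; run;' > "$file"

# Owner may read and write; group and others get no access.
chmod 600 "$file"

# The first ten characters of ls -l give the file type and permission bits.
perms=$(ls -l "$file" | cut -c1-10)
echo "$perms"
```

In the permission string, the first character is the file type, followed by three read/write/execute triplets for the owner, the group, and all other users.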

As mentioned, the ability to run a long program is a key advantage of a UNIX server over a PC client. However, you still need to estimate how long a process will run on the UNIX server. An estimated running time lets you anticipate approximately when you will get your output. If a program would run forever or unreasonably long, you can modify your code so that it finishes within a reasonable time.

Another reason for moving a SAS job from PC SAS to UNIX SAS is an out-of-memory problem. If the out-of-memory error still appears under UNIX, you can use the -memsize 0M option to let UNIX SAS use all available memory when it processes your job. You may want to run the process at a lower priority using nice, since the -memsize 0M option may slow down other jobs dramatically.

A STEP-BY-STEP PROCEDURE

In this section, we provide a step-by-step procedure that summarizes the above features for readers to follow. The procedure uploads a PC-created SAS code to a UNIX platform and runs the code in UNIX SAS batch mode.

1. FTP the files (including programs and data sets) to the UNIX system where you want to run your SAS program. Check the conversion status from PC to UNIX. We recommend using WS_FTP LE for transferring files.

2. From your PC, log on to the UNIX machine using telnet. (Select Start, then Run, then type telnet sciences.sdsu.edu, where sciences.sdsu.edu is the UNIX server you wish to log in to.)

3. Remove the Windows carriage returns (^M) using dos2unix sas1.sas sas1.sas at the UNIX prompt, where sas1.sas is the SAS program you uploaded to UNIX. Repeat the same procedure for all programs and data sets.

4. Submit the SAS program in background batch mode using sas sas1.sas &. If the UNIX system you are using requires a lower priority for a long-running process, use nice to change the priority (consult your system administrator about any restrictions). For a lower priority submission, use /usr/bin/nice -20 sas sas1.sas &.

5. Check the status of your running program using either top or ps under UNIX.


6. Check the log file (sas1.log) using pico, emacs, or simply the UNIX more command to see if there is any error message.

7. Use a text editor, such as pico or emacs, to edit the SAS code, and then resubmit it.

8. If you are uncomfortable editing your SAS code on UNIX, you can download the program to your PC, edit it there, and FTP it back to UNIX.

9. If you have a very large output file or log file that you want to save temporarily, you can use the temporary directory /tmp.

10. If there is an out-of-memory message, you can use /usr/bin/nice -20 sas -memsize 0 sas1.sas & to increase the memory size to all available memory.
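Steps 3, 4, and 6 of the procedure above can be sketched as one shell function. This is an illustration under several assumptions, not the authors' script: the function name run_sas_job and the variable SAS_CMD are made up (SAS_CMD defaults to sas and can point to a stub on a machine without SAS), tr -d '\r' stands in for dos2unix, nice -n 19 is the portable form of the priority change, and the program is assumed to sit in the current directory so that its log is written there.

```shell
# run_sas_job PROGRAM.sas
# Hypothetical helper: converts the program, runs it at low priority,
# and reports whether the log contains ERROR lines.
run_sas_job() {
  prog=$1
  base=${prog%.sas}

  # Step 3: strip carriage returns (stand-in for dos2unix).
  tr -d '\r' < "$prog" > "$prog.tmp" && mv "$prog.tmp" "$prog"

  # Step 4: submit in the background at low priority, then wait for it.
  # SAS_CMD is an assumption of this sketch; it defaults to sas.
  nice -n 19 ${SAS_CMD:-sas} "$prog" &
  wait $!

  # Step 6: check the log for error messages.
  if grep -q '^ERROR' "$base.log"; then
    echo "errors found in $base.log"
    return 1
  fi
  echo "no errors in $base.log"
}
```

A call such as run_sas_job sas1.sas runs the three steps in order. Waiting for the job keeps the sketch simple; for a job that should outlive the login session, the wait and the log check would be left out and done by hand later.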

CONCLUSION

Personal computers are more powerful nowadays, but some limitations still exist. In this paper, we compare PC SAS and UNIX SAS and summarize the actions needed to avoid errors when running a PC-created SAS code on a UNIX platform. We also describe problems that arise when uploading a PC SAS code to a UNIX environment. A step-by-step procedure is provided for users who want to see at a quick glance how to run a PC-created SAS code on UNIX. Note that many online documents are available on basic UNIX commands, on using pico and emacs, on running ftp, etc. Also, UNIX servers differ in their settings and regulations, so you should consult your UNIX system administrator for more information.

REFERENCES

Gady Kotler, SAS, Linux/UNIX and X-WINDOWS systems, Paper 283-28. SUGI 28 Proceedings. SAS Institute Inc., 2003, Cary, NC.

SAS Institute Inc., SAS Companion for the Microsoft Windows Environment, Version 8. SAS Institute Inc., 2000, Cary, NC.

SAS Institute Inc., SAS Companion for UNIX Environments, Version 8. SAS Institute Inc., 2000, Cary, NC.

Yadong Zhang, UNIX Meet PC: Version 8 to The Rescue, Paper 39-28. SUGI 28 Proceedings. SAS Institute Inc., 2003, Cary, NC.

ACKNOWLEDGMENTS

The first author’s work was supported in part by the Biological and Environmental Research Program (BER), U.S. Department of Energy, through the Great Plains Regional Center of the National Institute for Global Environmental Change (NIGEC) under Cooperative Agreement No. DE-FC02-03ER63616.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Chii-Dean Lin

San Diego State University

5500 Campanile Dr.

San Diego, CA 91913

(619)594-6186

Email: cdlin@sciences.sdsu.edu

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
