1
Logging in unix, linux, OS-X
Many unix and linux operating systems include versions of the syslog framework. Syslog is composed of several parts:
• A standard library interface that makes it easier for programs to produce log data in a common way, with the data having a common format.
• A daemon which serves as the command and control center for logging and log data. The daemon typically gets kernel log data, listens on a unix domain socket (/dev/log) for log data from local processes, and listens on a udp socket for log data from remote hosts and then processes the log data according to its configuration.
• A specified set of actions that the logging daemon can take, such as append to files, write to terminals and send log data to other syslog aware hosts.
• An external, but commonly used, set of programs which can post-process log data in order to increase its signal to noise ratio.
• Programs which rotate, archive, etc. log files so that they don’t take over finite disk space.
(fac:kern) kernel emerg alert crit err warn notice debug info auth cron local mail user daemon ftp kern @loghost.sdsu.edu | /some/program/or/fifo /var/log/messages /dev/log udp:514
inputs patch panel outputs
/dev/kmsg
klogd sshd
syslog: the big picture
cron sendmail (fac:mail)
(fac:daemon)
(fac:cron)
1.1
The big picture
Messages meant for syslog come from sources in both userland (programs, pro-cesses, and the network) and the kernel itself. The kernel messages (debug-ging messages, exceptions from device drivers, etc) are available through the
/dev/kmsg character device. Typically the kernel log daemon (/dev/klogd) reads /dev/kmsg and translates the kernel messages into a format compatible with syslog before writing the translated messages to the /dev/log socket using the kern facility. The userland programs and processes which have been linked to the syslog library interfaces write to the /dev/log socket automatically. In the example above, sendmail is writing log messages using facility mail, sshd is using facility daemon, and cron is using facility cron. Finally, messages from the network come in as UDP packets on port 514 and are also processed by syslogd. Syslogd reads it’s configuration file (/etc/syslog.conf) upon startup and ev-ery time it receives the HUP signal. The syslog configuration is like a patch panel which tells syslogd that the messages with certain priorities written to groupings of the available channels are to be combined to a particular log ac-tion.
1.2
Sources of log information
• The kernel
Modern versions of the linux OS use klogd to acquire kernel log messages from /dev/kmsg, convert these to syslog format and then write them on /dev/log.
• Any process or program
Many executables on a computer running linux produce log output while they are running. Examples include most daemons (ftpd, sshd, send-mail,cupsd), many shell scripts via the logger command.
• Listening on the network
Syslogd can optionally listen on udp port 514 for log messages from other hosts. This is how a central loghost works: it listens for log messages from other hosts and processes those messages in a consistent fashion.
1.3
The syslog library interface
• void openlog(const char *ident, int logopt, int facility);
During the initialization part of the program that will log, the openlog library routine is called with the ident (the name of the program), some log options, and the name of the facility (channel) to log to. Example:
/*
* Log things using the "cups-lpd" name... */
openlog("cups-lpd", LOG_PID, LOG_LPR);
Here, the CUPS lpd backend sets ident to "cups-lpd", sets the option to log the process ID, and sets the facility to LOG_LPR;
• void syslog(int priority, const char *message, .../* arguments */); When there is a message to write, the message is written via the sys-log library call. Remembering that a channel has already been selected, the priority for this message is set via the priority argument. The message body is generated from the message and following arguments in the same manner as if these were arguments to printf(), except that the additional conversion specification %m shall be recognized; it shall convert no argu-ments, shall cause the output of the error message string associated with the value of errno on entry to syslog(). Example:
if (getpeername(0, (struct sockaddr *)&hostaddr, &hostlen)) {
syslog(LOG_WARNING, "Unable to get client address - %s", strerror(errno)); strcpy(hostname, "unknown");
}
Here cups-lpd detects an error from the getpeername socket library call and so logs a message at priority warning. Note that both the generic message "Unable to get client address" and the specific error text from the string attached to the errno return value from getpeername are included in the message.
• void closelog(void);
The closelog call takes no arguments and closes the previously opened logging channel. Example:
syslog(LOG_INFO, "Closing connection"); closelog();
Here cups-lpd logs a message that it is closing the connection and then closes the logging channel. I included the syslog() here because I think it is illustrative to note that many codes include what amounts to debugging info logged at a low priority (here LOG_INFO instead of LOG_DEBUG).
1.4
The format of a log message
Feb 6 06:17:02 hostname cups-lpd[pid of cups-lpd]: [ID xxxxxx lpr.warning] \ Unable to get client address - Interrupted system call
Date Time Hostname Ident[pid of logging process]: [ID xxxxxx facilty.level] Text of logged message
1.5
syslogd: the logging switchboard
Syslog is flexible. It allows logging in two dimensions, with both a “facility” (a logging channel) and a severity. When programs use the library interface they will open a particular channel to write their log messages to. Some of these channels (EG cron, ftp, kern, lpr, mail) are actually named after the pro-grams that typically write log messages to them. Once a program has opened a particular channel, log messages can then be written to that channel with an attached priority. This priority scheme is important because it allows coder of the program to include syslog messages with very low priority (say debugging messages) all of the way to a very high priority (emerg, or emergency messages) with multiple levels in between.
auth authpriv cron daemon ftp kern local0-7 lpr mail syslog user emerg alert crit err warning notice info debug Configuration
The syslog daemon typically starts at system boot time, and is typically config-ured by /etc/syslog.conf. The configuration is typically re-read upon the receipt of the HANGUP signal (kill -HUP pid). Syslog.conf contains entries of the for-mat: facility.level action facility1,facility2.level action facility1.level1,facility2.level2 action *.level action *.level,omitfacility.none action Action Meaning
/filename Write the messages into a file named filename on the local machine @hostname Forwards the message to syslogd running on hostname. @ipaddress Forwards the message to syslogd running on ip address
|fifoname Write the message into named pipe fifoname
user1,user2 Write the message to the terminals of these users, if logged in * Write the message to all logged in users
An example:
# /etc/syslog.conf - Configuration file for syslogd(8) #
# For info about the format of this file, see "man syslog.conf". #
# #
# save most to /var/log/message1 #
kern.warning;*.err;authpriv.none [tab]/var/log/message1 #
# log all emergency messages to all logged in users #
*.emerg [tab]*
#
# send everything to a loghost #
*.* [tab]@loghost.domain
#
# dump everything except kernel messages into one file #
*.*,kern.none [tab]/var/log/bigfile
#
# examples from the "big picture" # auth.emerg [tab]|/some/program/or/fifo # ftp.debug [tab]/var/log/messages # *.crit [tab]@loghost.sdsu.edu
1.6
OS-X differences
• OS-X 10.5 syslog seems to use /var/run/syslog as the input socket instead of /dev/log. syslogd under OS-X seems to get the kernel messages from /dev/klog itself instead of using a separate klogd.
• OS-X syslog messages don’t seem to have the facility.level part of the message. Actually, this is true of many linux implementations, too. The only OS that I know has the facility.level part to the message is Solaris.
• Here’s the OS-X 10.5 configuration: /etc/syslog.conf [D:~] morris% cat /etc/syslog.conf
*.notice;authpriv,remoteauth,ftp,install.none;kern.debug;mail.crit /var/log/system.log # Send messages normally sent to the console also to the serial port.
# To stop messages from being sent out the serial port, comment out this line. #*.err;kern.*;auth.notice;authpriv,remoteauth.none;mail.crit /dev/tty.serial # The authpriv log file should be restricted access; these
# messages shouldn’t go to terminals or publically-readable # files. auth.info;authpriv.*;remoteauth.crit /var/log/secure.log lpr.info /var/log/lpr.log mail.* /var/log/mail.log ftp.* /var/log/ftp.log install.* /var/log/install.log install.* @127.0.0.1:32376 local0.* /var/log/appfirewall.log local1.* /var/log/ipfw.log *.emerg *
1.7
post-processing: creating reports
One standard log analyzer and report generator is logwatch (I’m including this because I use it and it comes standard with RHEL). From the man page: LogWatch is a customizable, pluggable log-monitoring system. It will go through your logs for a given period of time and make a report in the areas that you wish with the detail that you wish. Easy to use - works right out of the package on almost all systems.
LogWatch configuration looks like:
[morris@bushline log.d]$ cat logwatch.conf
######################################################## # This was written and is maintained by:
# Kirk Bauer <[email protected]> #
# Please send all comments, suggestions, bug reports, # etc, to [email protected].
#
######################################################## # Yes = True = On = 1
# No = False = Off = 0 # Default Log Directory
LogDir = /var/log
# You can override the default temp directory (/tmp) here #TmpDir = /tmp
TmpDir = /var/log/tmp
# Default person to mail reports to. Can be a local account or a # complete email address.
MailTo = root
# If set to ’Yes’, the report will be sent to stdout instead of being # mailed to above person.
Print = No
# Use archives? If set to ’Yes’, the archives of logfiles # (i.e. /var/log/messages.1 or /var/log/messages.1.gz) will # be searched in addition to the /var/log/messages file. # This usually will not do much if your range is set to just # ’Yesterday’ or ’Today’... it is probably best used with # Archives = Yes
# Range = All
# The default time range for the report... # The current choices are All, Today, Yesterday Range = yesterday
# The default detail level for the report. # This can either be Low, Med, High or a number. # Low = 0
# Med = 5 # High = 10 Detail = 0
# The ’Service’ option expects either the name of a filter # (in /etc/log.d/scripts/services/*) or ’All’.
# The default service(s) to report on. This should be left as All for # most people.
Service = All
#Service = -sendmail
# You can also disable certain services (when specifying all) #Service = -zz-fortune
# If you only cared about FTP messages, you could use these 2 lines # instead of the above:
#Service = ftpd-messages # Processes ftpd messages in /var/log/messages #Service = ftpd-xferlog # Processes ftpd messages in /var/log/xferlog
# Maybe you only wanted reports on PAM messages, then you would use: #Service = pam_pwdb # PAM_pwdb messages - usually quite a bit #Service = pam # General PAM messages... usually not many
LogWatch uses a flexible, but complicated, configuration scheme. In the directory /etc/log.d/conf/logfiles live the configuration files that specify how to process individual input logfiles. For example, messages.conf contains:
# What actual file? Defaults to LogPath if not absolute path.... LogFile = messages
# If the archives are searched, here is one or more line
# (optionally containing wildcards) that tell where they are... # Note: if these are gzipped, you need to end with a .gz even if # you use wildcards...
Archive = messages.* Archive = messages.*.gz Archive = archiv/messages.* Archive = archiv/messages.*.gz
# Expand the repeats (actually just removes them now) *ExpandRepeats
# Now, lets remove the services we don’t care about at all... *RemoveService = talkd
*RemoveService = telnetd *RemoveService = inetd *RemoveService = nfsd
*RemoveService = /sbin/mingetty
# Keep only the lines in the proper date range... *OnlyHost
*ApplyStdDate
Similarly, in the directory /etc/log.d/conf/services lives the configuration files for the various service scripts. For example, clamav.conf:
Title = "Clamav" LogFile = maillog
*OnlyService = MailScanner *RemoveHeaders
# Set this to 1 if you want to ignore unmatched clamav messages... $clamav_ignore_unmatched = 1
Finally, in the directory /etc/conf.d/scripts/services lives the actual per ser-vice perl files that do the data massaging.
examples of services are sendmail, sshd, clamav. LogWatch will process the specified date range of log data and produce a report with sections, where each section is a particular service. The particular format of the report produced de-pends upon the service. It is relatively easy to create new services for LogWatch if the existing ones are insufficient. Here is an excerpt from a LogWatch report: ################### LogWatch 5.2.2 (06/23/04) ####################
Processing Initiated: Wed Feb 3 04:02:35 2010 Date Range Processed: yesterday
Detail Level of Output: 0
Logfiles for Host: bushline
################################################################ --- Clamav Begin ---Viruses detected: ClamAV: 4878 Time(s) McAfee: 4878 Time(s) Sanesecurity.Junk.24907.UNOFFICIAL: 1848 Time(s) ... Worm.Mydoom.M: 16 Time(s) --- Clamav End --- IMAP Begin -
---[IMAPd] Logout stats: ====================
User | Logouts | Downloaded | Mbox Size --- | --- | | ---usera | 188 | | userb | 200 | | luser | 202 | | ---590 | 0 | 0 --- IMAP End --- MailScanner Begin - ---MailScanner Status:
47058 messages Scanned by MailScanner 3704922904 Total Bytes
6336 Spam messages detected by MailScanner 9308 Viruses found by MailScanner
4 Banned attachments found by MailScanner 40929 Messages delivered by MailScanner Virus Sender Report: (Total Seen = 6078)
109.166.135.157: 2 Times(s) Content Report: (Total Seen = )
and have disarmed phishing tags in HTML message: 2216 Times(s) and have disarmed script tags in HTML message: 164 Times(s)
and have disarmed script, phishing tags in HTML message: 36 Times(s) and have disarmed web bug tags in HTML message: 6338 Times(s)
and have disarmed web bug, phishing tags in HTML message: 2446 Times(s) and have disarmed web bug, script tags in HTML message: 74 Times(s)
and have disarmed web bug, script, phishing tags in HTML message: 52 Times(s) Filename Report: (Total Seen = 4)
Possible virus hidden in a screensaver (wjfbx.scr): 2 Times(s) Windows/DOS Executable (text.exe): 2 Times(s)
--- MailScanner End --- sendmail Begin -
---Bytes Transferred: 3701099397 Messages Sent: 49151 Total recipients: 61167
Aliases database out of date 13 Time(s) Unknown local users:
Total: 2018
Top relays (recipients/connections - min 10 rcpts, max 50 lines): 7434/7134: sdsu.edu [130.191.229.14]
6354/1633: localhost [127.0.0.1]
1742/1064: IDENT:[email protected] Summary:
Total Mail Rejected: 3778
---1.8
rotating files
Some logfiles grow slowly, while others grow quickly, but the point is that none will ever shrink without some sort of intervention. Without some sort of peri-odic log rotation or compression procedure, logfiles will eventually fill all of the available disk space.
newsyslog is a facility that allows for a specification of rules for the rotation of log files. newsyslog is run periodically from cron and "rotates" log files. ( logfile -> logfile.0 -> logfile.1 -> etc).
OS-X uses newsyslog, and here is the OS-X 10.5 configuration, /etc/newsyslog.conf: #
# owner:group defaults to root:admin. #
# 100 = 100k bytes #
# * = don’t rotate based on time #
# @T00 = midnight? #
# @01T05 = 05:00 on the 1st of the month. #
# J = use bzip2 to compress # B = binary file
#
# Entries which do not specify the ’/pid_file’ field will cause the # syslogd process to be signalled when that log file is rotated. This # action is only appropriate for log files which are written to by the # syslogd process (ie, files listed in /etc/syslog.conf). If there # is no process which needs to be signalled when a given log file is # rotated, then the entry for that file should include the ’N’ flag. #
# The ’flags’ field is one or more of the letters: BCGJNUWZ or a ’-’. #
# Note: some sites will want to select more restrictive protections than the # defaults. In particular, it may be desirable to switch many of the 644 # entries to 640 or 600. For example, some sites will consider the # contents of maillog, messages, and lpd-errs to be confidential. In the # future, these defaults may change to more conservative ones.
#
# logfilename [owner:group] mode count size when flags [/pid_file] [sig_num]
/var/log/appfirewall.log 640 5 100 * J
/var/log/hwmond.log 640 5 100 * J /var/log/install.log 640 5 100 * J /var/log/ipfw.log 640 5 100 * J /var/log/lookupd.log 640 5 100 * J /var/log/lpr.log 640 5 100 * J /var/log/mail.log 640 5 100 * J /var/log/ppp.log 640 5 100 * J /var/log/secure.log 640 5 100 * J /var/log/system.log 640 7 * @T00 J /var/log/wtmp 644 3 * @01T05 B
1.9
syslog-ng and centralized logging
Maybe it’s just me, but there have always been things that I’ve wanted to do with syslog that are not in its bag of tricks. In the old days, I wrote some filter programs that would read in regular expressions from a configuration file and syslog data from a fifo and would split logs into host specific and/or service specific logs. This was cumbersome, though, as the filter process was a daemon that communicated with syslog through a fifo and one or the other was always dying. About five years ago when I began setting up a central loghost, I discov-ered syslog-ng. Syslog-ng uses an entirely different scheme to process log data. It is flexible enough to emulate syslog classic at the same time it does things that syslog can never do.
Here are my goals for a central loghost: 1. Emulate syslog classic
Since LogWatch expects to find its source log data in conventional places (/var/log/messages, etc) it is necessary to have some very large files that almost all of the log data gets written to.
2. Create a per host log hierarchy
My two most frequent tasks when I need to look at log data are: 1. To look at the logs for a particular host on a particular day 2. To look at logs for a particular host for a particular month.
3. Create a per day or per month log hierarchy for all hosts
A less frequent task that I perform using log data is to correlate between hosts the data on a particular day or over a particular month.
How can I use syslog-ng to meet these goals? 1.9.1 syslog-ng paradigm
syslog-ng has a different paradigm than syslog. In syslog-ng there are sources of data, filters to pass the data through, and destinations for the results. Without going into too much detail, let’s look at the syslog-ng.conf from my loghost:
1. Setting up the sources source s_sys {
file ("/proc/kmsg" log_prefix("kernel: ")); unix-stream ("/dev/log");
internal();
udp(ip(0.0.0.0) port(514)); };
2. Setting up the filters, destinations, and hooking them together # /var/log/messages
filter f_filter2 { level(info..emerg) and
not facility(mail,authpriv,cron); }; destination d_mesg { file("/var/log/messages"); };
log { source(s_sys); filter(f_filter2); destination(d_mesg); }; # /var/log/secure
filter f_filter3 { facility(authpriv); }; destination d_auth { file("/var/log/secure"); };
log { source(s_sys); filter(f_filter3); destination(d_auth); }; #/var/log/maillog
filter f_filter4 { facility(mail); };
destination d_mail { file("/var/log/maillog" sync(10)); }; log { source(s_sys); filter(f_filter4); destination(d_mail); }; #write to all logged in ttys
filter f_filter5 { level(emerg); }; destination d_mlal { usertty("*"); };
log { source(s_sys); filter(f_filter5); destination(d_mlal); }; #/var/log/spooler
filter f_filter6 { facility(uucp) or
(facility(news) and level(crit..emerg)); }; destination d_spol { file("/var/log/spooler"); };
log { source(s_sys); filter(f_filter6); destination(d_spol); }; #/var/log/boot.log
filter f_filter7 { facility(local7); };
destination d_boot { file("/var/log/boot.log"); };
log { source(s_sys); filter(f_filter7); destination(d_boot); }; #/var/log/cron
filter f_filter8 { facility(cron); };
destination d_cron { file("/var/log/cron"); };
log { source(s_sys); filter(f_filter8); destination(d_cron); }; #
# automatic host sorting (usually used on a loghost) #
# date by host destination std {
file("/var/log/HOSTS/$HOST/$YEAR/$MONTH/$DAY/$FACILITY"
owner(root) group(root) perm(0644) dir_perm(0755) create_dirs(yes) ); }; log { source(s_sys); destination(std); }; # allhosts by day destination std_hbd { file("/var/log/ALL/$YEAR/$MONTH/$DAY/$FACILITY"
owner(root) group(root) perm(0644) dir_perm(0755) create_dirs(yes) ); }; log { source(s_sys); destination(std_hbd); };
# per host per month destination std_phpm {
file("/var/log/HPM/$HOST/$YEAR/$MONTH/$FACILITY"
owner(root) group(root) perm(0644) dir_perm(0755) create_dirs(yes) ); }; log { source(s_sys); destination(std_phpm); };