What happens when your configuration does not behave as you expected? There may be an error in the rule set's logic, and if so, you need to find the error and correct it. Tracking down logic errors in your rule set can be time-consuming and may involve manually evaluating your rule set, both as it is stored in the pf.conf file and as the loaded version after macro expansions and any optimizations.
Before diving into the rule set itself, you can easily determine whether the PF configuration is what is causing the problem. Disabling PF by running the command pfctl -d to see if the problem disappears is a valid test that can save you a lot of trouble.
On the mailing lists, newsgroups, and other forums, we frequently see users initially blaming PF for problems that turn out to be basic network problems. Network interfaces set to the wrong duplex settings, bad netmasks, or even faulty network hardware are common culprits.
If the problem persists when PF is not enabled, it is likely that the problem is not in the PF configuration. You should then turn to debugging other parts of your network configuration instead. However, if you are about to start adjusting your PF configuration, it is worth checking that PF is in fact enabled and that your rule set is loaded, using the following command:
$ sudo pfctl -si | grep Status
Status: Enabled for 20 days 06:28:24 Debug: Loud
Here Status: Enabled tells us that PF is enabled, so we try viewing the loaded rules with a different pfctl command:
$ sudo pfctl -sr
scrub in all fragment reassemble
block return log all
block return log quick from <bruteforce> to any
anchor "ftp-proxy/*" all
Here, pfctl -sr is equivalent to pfctl -s rules. The actual output is likely to be a bit longer than what we show here, but it’s a good example of what you should expect to see when a rule set is definitely loaded. For debugging purposes it is useful to add the -vv flag to the pfctl command line to see rule numbers and some additional debug information, like this:
$ sudo pfctl -vvsr
@0 scrub in all fragment reassemble
  [ Evaluations: 67274995 Packets: 34231784 Bytes: 9800756925 States: 0 ]
  [ Inserted: uid 0 pid 1013 ]
@0 block return log all
  [ Evaluations: 618114 Packets: 15833 Bytes: 1444217 States: 0 ]
  [ Inserted: uid 0 pid 1013 ]
@1 block return log quick from <bruteforce:2> to any
  [ Evaluations: 618114 Packets: 13208 Bytes: 792140 States: 0 ]
  [ Inserted: uid 0 pid 1013 ]
@2 anchor "ftp-proxy/*" all
  [ Evaluations: 604906 Packets: 3498832 Bytes: 2803255822 States: 0 ]
  [ Inserted: uid 0 pid 1013 ]
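When the loaded rule set is long, it can help to condense the verbose listing to one line per rule before reading it. The following is a minimal sketch, not part of pfctl: the function name rule_counters is hypothetical, and it assumes the output layout shown above, where each rule line starts with @ and its number and is followed by a line containing the Evaluations and Packets counters.

```shell
# Hypothetical helper: reads `pfctl -vvsr` output on stdin and prints
# one line per rule: <rule number> <evaluations> <packets> <rule text>
rule_counters() {
    awk '
        /^@[0-9]+ / {                       # rule line, e.g. "@1 block return ..."
            num = substr($1, 2)             # strip the leading @
            text = $0
            sub(/^@[0-9]+ /, "", text)      # keep only the rule text
            next
        }
        /Evaluations:/ {                    # counter line that follows the rule
            for (i = 1; i < NF; i++) {
                if ($i == "Evaluations:") ev = $(i + 1)
                if ($i == "Packets:")     pk = $(i + 1)
            }
            print num, ev, pk, text
        }
    '
}

# On a live system (requires root):
#   sudo pfctl -vvsr | rule_counters
```

A rule whose evaluation count does not increase while you generate test traffic is probably not being reached at all, which is a useful hint before you start the manual walk-through.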
At this point, you should perform a structured walk-through of the loaded rule set. Find the rules that match the packets you are investigating. What is the last matching rule? If more than one rule matches, is one of the matching rules a quick rule? You will need to trace the evaluation until you hit the end of the rule set or until the packet matches a quick rule, which then ends the process. If your rule set walk-through ends somewhere other than with the rule you were expecting to match your packet, you have found your logic error.
Rule set logic errors tend to fall into three categories:
Your rule does not match because it is never evaluated. A quick rule earlier in the rule set matched, and the evaluation stopped.
Your rule is evaluated but does not match the packet after all because of the rule’s criteria.
Your rule is evaluated, the rule matches, but the packet also matches another rule later in the rule set. The last matching rule is the one that determines what happens to your connection.
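To make the three cases concrete, here is a toy model of the evaluation order. It is only a sketch under loud assumptions: the one-line rule format (ACTION, then quick or a dash, then a port number or any) is invented for this illustration and is not pf.conf syntax, the function name evaluate is hypothetical, and real PF matches on many more criteria than a destination port. The model does keep the two properties that matter here: the last matching rule wins, and a matching quick rule ends evaluation.

```shell
# Toy model of PF evaluation order (NOT the real PF parser).
# Each rule line is: ACTION QUICK PORT, where QUICK is "quick" or "-"
# and PORT is a number or "any". Rules are numbered from 0.
evaluate() {
    # $1 = destination port of the packet; rules arrive on stdin
    awk -v port="$1" '
        BEGIN { verdict = "pass"; rule = "none" }   # PF passes packets that match no rule
        {
            n = NR - 1
            if ($3 == "any" || $3 == port) {        # rule criteria match
                verdict = $1; rule = n              # last match so far wins
                if ($2 == "quick") exit             # quick ends evaluation here
            }
        }
        END { print "rule " rule ": " verdict }
    '
}

rules='block - any
pass quick 22
pass - 80
block - 80'

printf '%s\n' "$rules" | evaluate 22    # quick rule 1 matches; rules 2 and 3 are never evaluated
printf '%s\n' "$rules" | evaluate 80    # rule 2 matches, but later rule 3 also matches and wins
printf '%s\n' "$rules" | evaluate 25    # rules 1 to 3 are evaluated but only rule 0 matches
```

The three calls correspond to the three cases in order: a rule that is never evaluated, a rule that matches but is overridden by a later match, and rules that are evaluated without matching.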
In Chapter 8 we introduced tcpdump as a valuable tool for reading and interpreting PF logs. The program is also very well suited for viewing what traffic passes on a specific interface. What you learned about PF’s logs and how to use tcpdump’s filtering features will come in handy when you want to track down exactly which packets reach which interface.
Here we use tcpdump to watch for TCP traffic on the xl0 interface (but not show SSH or SMTP traffic) and print the result in very verbose mode (vvv).
$ sudo tcpdump -nvvvpi xl0 tcp and not port ssh and not port smtp
tcpdump: listening on xl0, link-type EN10MB
21:41:42.395178 194.54.107.19.22418 > 137.217.190.41.80: S [tcp sum ok]
3304153886:3304153886(0) win 16384 <mss 1460,nop,nop,sackOK,nop,wscale 0,nop,nop,timestamp 1308370594 0> (DF) (ttl 63, id 30934, len 64)
21:41:42.424368 137.217.190.41.80 > 194.54.107.19.22418: S [tcp sum ok]
1753576798:1753576798(0) ack 3304153887 win 5792 <mss 1460,sackOK,timestamp 168899231 1308370594,nop,wscale 9> (DF) (ttl 53, id 0, len 60)
The connection shown here is a successful connection to a website. There are more interesting things to look for, though, such as connections that fail when they should not, according to your specification, or connections that succeed when your specification says they clearly should not.
The test in these cases involves tracking the packets' path through your configuration. Once more, it is useful to check whether PF is enabled and whether disabling PF makes a difference. Building on the result from that initial test, you then perform the same kind of analysis of the rule set as we described previously. Once you have a reasonable theory of how the packets should traverse your rule set and your network interfaces, use tcpdump to see the traffic on each of the interfaces in turn. Use tcpdump's filtering features to see only the packets that should match your specific case, such as port smtp and dst 192.0.2.19.
Find the exact place where your assumptions no longer match the reality of your network traffic. Turn on logging for the rules that may be involved, and then turn tcpdump loose on the relevant pflog interface to see which rule the packets actually match.
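The procedure described above can be written down as a short capture plan before you run anything. The sketch below only prints the commands so that you can review them first. The function name capture_plan is hypothetical, xl1 is a placeholder for a second interface, and pflog0 is assumed to be the log interface in use (it is the default name). The pflog command uses the -e flag so that tcpdump prints the pflog header, which includes the number of the matching rule.

```shell
# Dry run: print the tcpdump commands instead of running them.
# $1 is the filter expression; the remaining arguments are interface names.
capture_plan() {
    filter=$1
    shift
    for ifc in "$@"; do
        # same flags as the earlier example: numeric, very verbose, no promiscuous mode
        printf 'sudo tcpdump -nvvvpi %s %s\n' "$ifc" "$filter"
    done
    # last step: watch the PF log interface to see which rule the packets match
    printf 'sudo tcpdump -n -e -ttt -i pflog0 %s\n' "$filter"
}

# xl0 is the interface from the example above; xl1 is a placeholder
capture_plan 'port smtp and dst 192.0.2.19' xl0 xl1
```

Run the printed commands one at a time, and note the first interface where the expected packets fail to show up. Keep in mind that only rules with the log option generate entries on the pflog interface.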
The main outline for the test procedure is fairly fixed. If you have narrowed down the cause to your PF configuration, once more it’s a case of finding out which rules match and which rule ends up determining whether the packet passes or is blocked.