Introduction
In this chapter, we’ll discuss what’s called pre-assessment information-gathering techniques. During this phase of an assessment, the security tester is most interested in obtaining preliminary information about the target. This does not include specific information such as IP addresses and DNS names (which we discuss in the next chapter) but rather information that could be used for social manipulation (talking a help desk operator into a password change), physical compromise of a target (gaining information about building structures or badge layouts), and general reconnaissance.
Throughout this chapter, we focus on methods to locate information about the target that will most likely be used in later phases of the assessment. In a twisted sort of way, pre-assessment work is a bit like preparing for the perfect date. You might do a bit of research about the person, get some information about them and their friends and family, spend quality time with them, and learn as much as you can about their interests. Although the stakes are much higher, courting your target can be like courting your mate. When things get rough, plan to spend some time sleeping in a chair or on a couch instead of in a nice, warm bed where you belong!
Let’s carry that analogy through the chapter and examine how the stages of pre-assessment mirror the stages of courtship.
The Birds and the Bees
One of the first steps you need to take is to try to understand the target company structure and environment. Visiting the company Web site can provide some information, but keep in mind that you’re only seeing what they want you to see. To get behind the scenes, a simple site:somecompany.com search will often reveal information that wasn’t meant to be seen by the public. This search has one major drawback, however: for a large company, it could return thousands of results, many of which are useless and a huge waste of your time.
In this section we look at techniques (grinding techniques, specifically) that you can use to weed through all this data, but for now it might be a better idea to target your searches to find the useful data.
Intranets and Human Resources
Where do you go if you want the inside scoop on a company? What better department to start with than Human Resources! Since just about anything intentionally viewable by the public tends to be watered down, we’ll need to get behind the scenes. Many companies like to make company information available to their employees (and only their employees), and to do so they set up company intranets containing information for employee eyes only. Intranets are supposed to be private, but combining Human Resources and intranet into a search such as intitle:intranet inurl:intranet +intext:"human resources" shows that private sites sometimes aren’t exactly private, as we can see in Figure 4.1.
In addition to providing you with information about the company policies and procedures, most HR intranet sites provide the names of contact people for the department. These names can be very useful for future social engineering attacks.
www.syngress.com
Underground Googling…
A Wealth of Information Lies in the Company Intranet
Don’t limit yourself to the Human Resources department. Companies put all sorts of information on their intranets, since they assume they are safe from public eyes. Replacing the human resources part of the query with computer services, IT department, or simply phone can provide amazing amounts of additional information that you can later use during the social engineering phase. Chapter 7 contains more information about using the company intranet to your advantage.
Help Desks
A simple search listed in Chapter 7’s Top 10 searches is intranet | help.desk, or simply (“help.desk” | helpdesk). Combined with the site operator, this query is designed to locate intranets or help desk pages. Help desk references are extremely valuable because they often refer to documents and procedures an attacker could use to gather information about the target.
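The idea of combining these queries with the site operator can be sketched in a few lines of Perl. This is only an illustration: site_query is a hypothetical helper, somecompany.com is a placeholder domain, and the query fragments are the ones quoted in this section.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: prefix a query with the site: operator so that
# results are limited to a single domain.
sub site_query {
    my ($domain, $query) = @_;
    return "site:$domain $query";
}

# Query fragments quoted in the text.
my @fragments = (
    'intranet | help.desk',
    '("help.desk" | helpdesk)',
    'intitle:intranet inurl:intranet +intext:"human resources"',
);

# somecompany.com is a placeholder; print one scoped query per line.
print site_query('somecompany.com', $_), "\n" for @fragments;
```

Each printed line can be pasted into Google as is. Swapping the domain is the only change needed to reuse the list against a different target.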
Self-Help and “How-To” Guides
These documents are designed to help an end user perform some sort of procedure. Used creatively, they can provide information about the target that could prove useful at some point during an assessment. For example, a kludgey search such as “how to” network setup dhcp ( “help desk” | helpdesk ) can reveal documents that include instructions for connecting to a network, as shown in Figure 4.2.
This page lists a virtual gold mine of information:
■ Network information: DHCP, No client ID’s, AppleTalk, Ethernet.
■ Recommended browsers: The download link lists recommended browsers and version information.
■ Help desk phone number: X1705, an RCC comes to your room.
■ E-mail information: ID can be generated by the IT department.
■ E-mail information: Site uses Novell GroupWise.
■ E-mail information: Web-based (!) e-mail server located online at http://gw5.XXX.edu.
■ E-mail information: E-mail server is available from the Internet.
This is not an uncommon how-to document. Most are overly informative, supplying a great deal of information that an attacker can use.
Job Listings
Job listings can also reveal information about a target, including technologies in use, corporate structure, geography, and more. One of the easiest ways to locate job postings is with a simple query such as resume | employment combined with the site operator. Don’t overlook job listings as an important source of information about an organization.
Underground Googling…
Public Polling Via Google
Google can be used to map the public opinion of a site over time. First, build two lists of Google queries. The first list combines the common name of a company with 100 common “good” phrases such as good experience, wise investment, well-managed, and so on. Next, create a second list that combines the company name with 100 “bad” phrases such as poor customer service, shady management, and beware. Feed these lists into Google every day for an extended period of time, mapping not only the numbers of hits but the page rank of each referring site. This kind of nonobvious statistical information can speak volumes about a company’s image (as well as provide a decent financial investment road map!).
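Building the two query lists is easy to script. The sketch below is a hypothetical illustration: build_queries is an invented helper, "Some Company" is a placeholder name, and only three phrases per list are shown where the text suggests about 100. It stops at generating the queries; submitting them and recording hit counts is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pair a company name with each phrase, quoting both
# so that Google treats each one as an exact phrase.
sub build_queries {
    my ($company, @phrases) = @_;
    return map { qq{"$company" "$_"} } @phrases;
}

# Three phrases per list for brevity; the text suggests about 100 each.
my @good_phrases = ('good experience', 'wise investment', 'well-managed');
my @bad_phrases  = ('poor customer service', 'shady management', 'beware');

# "Some Company" is a placeholder name.
my %queries = (
    good => [ build_queries('Some Company', @good_phrases) ],
    bad  => [ build_queries('Some Company', @bad_phrases) ],
);

# Print each query tagged with the list it belongs to.
for my $list (sort keys %queries) {
    print "$list\t$_\n" for @{ $queries{$list} };
}
```

Tagging each query with its list name keeps the "good" and "bad" tallies separable once results are collected over time.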
Long Walks on the Beach
During the courtship process, a couple often spends time getting to know one another. Similarly, during a penetration test, it’s not a bad idea to get “personal” with your target, or specifically the people working for the organization. Digging up details about the people who make up an organization can pay off in big ways during later assessment phases. Usernames, employee numbers, or Social Security numbers can be used to social engineer a help desk technician. E-mail addresses can be targeted with e-mails containing malware. Information about an individual’s circle of friends can be used to social engineer that individual. Any little tidbit of information can be used by a creative security tester to gain access to more information, causing a snowball effect that often leads to system or network compromise. In this section, we’ll take a look at some ways Google can be used to harvest this type of information.
Names, Names, Names
One way Google excels at helping the researcher dig up additional names and e-mail addresses is through its Google Groups searches. Google Groups (formerly DejaNews) is simply a Usenet archive that keeps copies of all posts made to thousands of Usenet groups over the years. For example, performing a Google Groups search on somecompany.com returns some nice information, as shown in Figure 4.3.
Notice that the returned results list the name of the poster at the bottom of each result listing. In some cases this information is faked, but depending on the number of results, you could end up with legitimate employee names.
Remember that the Google Groups Advanced Search feature (http://groups.google.com/advanced_group_search) allows you to narrow your search by specifying several additional search parameters such as Subject, Author, Date, specific phrases, and more.
Browsing Google Groups results for information can be a daunting task, especially when it comes time to dig through all the pages to find the information you’re after. Chapter 10 contains snippets of code that can be used to extract URLs, e-mail addresses, and more from scraped Google Groups result pages. Chapter 10 also goes into more detail on how to properly search for, locate, and extract e-mail addresses using regular expressions.
Automated E-Mail Trolling
It would be nice to have a utility to help automate the process of searching for e-mail addresses. Ask and you shall receive! The Perl code that follows, written by Roelof Temmingh of SensePost (www.sensepost.com), will search through Google Groups pages and Google Web pages, hunting for e-mail addresses. To use this tool, you must first obtain a Google API key from www.google.com/apis. Download the developer’s kit, copying the GoogleSearch.wsdl file into the same directory as this script. Next, download and install the Expat package from sourceforge.net/projects/expat. This installation requires a ./configure and a make, as is typical with most modern UNIX-based installers. This script also uses SOAP::Lite, which is easiest to install via CPAN. Simply run CPAN from your favorite flavor of UNIX and issue the following commands from the CPAN shell to install SOAP::Lite and various dependencies (some of which might not be absolutely necessary on your platform):
install LWP::UserAgent
install XML::Parser
install MIME::Parser
force install SOAP::Lite
Although this might seem like a lot of work for one script, most Perl-based Google programs will have the same requirements, meaning that you only need to go through this process once to allow you to run this and other Google querying Perl scripts, some of which are included in later chapters of this book. Be sure to insert your Google API key into this script before running it. Now without further ado, here’s the much-anticipated script:
#!/usr/bin/perl
#
# Google Email miner
# SensePost Research 2003
# [email protected]
#
# Assumes the GoogleSearch.wsdl file is in same directory
#
$|=1;
use SOAP::Lite;

if ($#ARGV<0){die "email-mine <domain> [loops]\nfor example: email-mine sensepost.com 5\n\n";}

#-=-=-=-=-=-# EDIT THIS #-=-=-=-=-==-#
my $key = "--==Insert Google API Key Here==--";
my $service = SOAP::Lite->service('file:./GoogleSearch.wsdl');
# -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-#

my $numloops = $ARGV[1];
if ($numloops == 0){$numloops=5;}
my $target = $ARGV[0];
my $query = "\@$target -www.$target";

## Do the Google
for (my $j = 0; $j < $numloops; $j++){
  print STDOUT "$j ";
  my $results = $service
    -> doGoogleSearch($key,$query,(10*$j),10,"true","","true","","latin1","latin1");
  # Number of results on this page; fewer than 10 means no more pages
  $re = scalar(@{$results->{resultElements}});
  foreach my $result (@{$results->{resultElements}}){
    push @allemails,extract_email($result->{snippet},$target);
  }
  if ($re != 10){last;}
}

# Remove duplicates & show results
print STDOUT "\n";
@allemails=dedupe(@allemails);
foreach $email (@allemails){
  print STDOUT "$email\n";
}

## --- SUBS --- ##
sub extract_email {
  my ($passed,$target)=@_;
  # we want multiple addresses in a single line
  my @in = split(/\s/,$passed);
  my @collected;
  foreach my $line2 (@in){
    my $emaila;
    chomp $line2;
    # Remove Google's boldifications..
    $line2 =~ s/<b>//g;
    $line2 =~ s/<\/b>//g;
    # You can run but you can't hide ;)
    $line2 =~ s/ at /\@/g;
    $line2 =~ s/\[at\]/\@/g;
    $line2 =~ s/\<at\>/\@/g;
    $line2 =~ s/_at_/\@/g;
    $line2 =~ s/dot/\./g;
    # Try a four-part domain first, then three parts, then two
    $line2 =~ /[\W\t]*([\w\.\-]{1,15})\@([\w\-]+)\.([\w\-]+)\.([\w\-]+)\.([\w\-]+)[\W\t\.]*/;
    $emaila="$1\@$2.$3.$4.$5";
    if (length($emaila) < 5){
      $line2 =~ /[\W\t]*([\w\.\-]{1,15})\@([\w\-]+)\.([\w\-]+)\.([\w\-]+)[\W\t\.]*/;
      $emaila = "$1\@$2.$3.$4";
    }
    if (length($emaila) < 4){
      $line2 =~ /[\W\t]*([\w\.\-]{1,15})\@([\w\-]+)\.([\w\-]+)[\W\t\.]*/;
      $emaila = "$1\@$2.$3";
    }
    # filter out junk email addresses
    my ($name,undef) = split(/\@/,$emaila);
    if (length($emaila) > 0 && $emaila =~ /$target$/i && length($name) < 15){
      push @collected,$emaila;
    }
  }
  return @collected;
}

sub dedupe {
  (@keywords) = @_;
  my %hash = ();
  foreach (@keywords) {
    # Lowercase so that duplicates differing only in case collapse
    $_ =~ tr/A-Z/a-z/;
    chomp;
    if (length($_)>1){
      $hash{$_} = $_;
    }
  }
  return keys %hash;
}
This code, mentioned cursorily in the SensePost paper Putting the Tea Back into CyberTerrorism (do a Google search for Tea Cyberterrorism), performs a Google search for a domain name prepended with an @ sign, excluding the domain’s main page. This will effectively search for e-mail addresses, even though Google ignores the @ sign. For example, when searching for gmail.com, this script will search for @gmail.com -www.gmail.com. This excludes hits from the gmail site itself. Consider the output of this query, as shown in Figure 4.4.
Within the first few results, you should notice a few legitimate-looking e-mail addresses, specifically [email protected] and [email protected]. You could sift through these results by hand, plucking out e-mail addresses, or you could simply run this Perl script, which does all the heavy lifting for you. We’ll run the Perl script, instructing it to search for gmail.com addresses, only using 1 of our 1000 daily allotted API queries (which translates to a total of 10 Google results). The output of this run is shown in Figure 4.5.
Figure 4.4 Trolling for E-Mail Addresses
Notice that this script also located the e-mail addresses we found when we performed the search manually.This script really begins to shine when we allow it to sift through more results. Allowing the script to process through 50 results (run with ./email-maine.pl gmail.com 5) returns many more e-mail addresses, as shown below: [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] www.syngress.com
[email protected] [email protected] [email protected]
Obviously, the vast majority of these e-mail addresses are invalid, but this script really shines when it’s fed more specific domain names instead of free Web-based domain names.
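The address clean-up step in extract_email can be tried offline, without an API key. The following is a simplified sketch, not the SensePost code itself: demunge and find_address are hypothetical names, the single regular expression stands in for the script's three-stage match, the script's "dot" substitution is left out, and example.com is a placeholder domain.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: undo common address obfuscations in one token,
# mirroring the [at], <at>, and _at_ substitutions used by the script.
sub demunge {
    my ($token) = @_;
    $token =~ s{</?b>}{}g;       # strip Google's bold tags
    $token =~ s/\[at\]/\@/g;
    $token =~ s/<at>/\@/g;
    $token =~ s/_at_/\@/g;
    return $token;
}

# Hypothetical helper: return a lowercased address from the token if it
# ends in the target domain, or nothing at all otherwise.
sub find_address {
    my ($token, $domain) = @_;
    my $clean = demunge($token);
    # Simplified pattern (an assumption): a name of up to 15 characters,
    # an @ sign, then two or more dot-separated domain labels.
    if ($clean =~ /([\w.\-]{1,15})\@([\w\-]+(?:\.[\w\-]+)+)/) {
        my $addr = lc "$1\@$2";
        return $addr if $addr =~ /\Q$domain\E$/;
    }
    return;
}

# Sample tokens, as they might appear in a whitespace-split snippet.
my @tokens = (
    '<b>jdoe</b>[at]example.com',
    'asmith_at_example.com,',
    'nobody@elsewhere.org',
);

for my $token (@tokens) {
    my $addr = find_address($token, 'example.com');
    print "$addr\n" if defined $addr;
}
```

Only the first two tokens yield an address, because the third does not end in the target domain. That domain check is the same kind of filter the script applies with its /$target$/i test.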
Underground Googling…
Patience Pays Off
Searching through thousands of Usenet posts is a tedious and time-consuming process; however, you will find the results well worth the effort. In addition to current employees, you will likely find the names of former employees, who make for great social engineering targets.
Addresses, Addresses, and More Addresses!
E-mail addresses can show up in so many places that it’s nearly impossible to list them all. However, let’s take a look at some great examples. Both Outlook Express and Eudora, two popular e-mail clients, use the .mbx extension for storage of e-mail. A Google search such as <filetype:mbx mbx intext:Subject> finds thousands of e-mails or mailboxes sitting on the Internet, as shown in Figure 4.6.
Obviously, a person’s private e-mails can reveal loads of information about that person, as well as the company that person works for. They also provide names of coworkers, friends, and family members as well as any mailing lists they belong to.
However, more than e-mails can be found using Google. Many organizations use Microsoft Outlook for their e-mail and calendaring purposes, and it seems that Outlook has become the de facto standard in the workplace. With this in mind, the process of finding e-mails, calendars, and address books can be simplified using a search such as <filetype:pst pst ( contacts | address | inbox)>. This search locates Outlook personal mail folders that include the words contacts, address, or inbox in the name. These words can be modified to return many other results. As shown in Figure 4.7, this query returns an ungodly number of files that were most likely never intended for public viewing. These are, after all, personal e-mail folders.
The Windows Registry, the heart and soul of a Windows machine, can also be searched for e-mail addresses. An exported registry (.reg) file is, after all, a text file. But Google scanning a machine’s registry? It can’t happen, right? Rest assured, a search like <filetype:reg reg +intext:"internet account manager"> produces some rather eye-opening results. You wouldn’t think that people would put such sensitive information on the Internet, but as you can see in Figure 4.8, anything is possible.
Figure 4.7 Microsoft Outlook Files on the Internet
The list of potential e-mail address locations could go on and on, but since we’re not in the business of reckless tree killing, we’ll just round out this section with a few examples from the Google Hacking Database. Table 4.1 presents several queries that can be used to dig up e-mail addresses, sometimes in the strangest of places!
Table 4.1 E-Mail Address Queries
Query                                               Description
"Internal Server Error" "server at"                 Apache server error could reveal admin e-mail address
intitle:"Execution of this script not permitted"    Cgiwrap script can reveal lots of information, including e-mail addresses and even phone numbers
e-mail address filetype:csv csv                     CSV files that could contain e-mail addresses
intitle:index.of dead.letter                        dead.letter UNIX file contains the contents of unfinished e-mails that can contain sensitive information
inurl:fcgi-bin/echo                                 fastcgi echo script can reveal lots of information, including e-mail addresses and server information
filetype:pst pst -from -to -date                    Finds Outlook PST files, which can contain e-mails, calendaring, and address information
intitle:index.of inbox                              Generic “inbox” search can locate e-mail caches
intitle:"Index Of" -inurl:maillog maillog size      Maillog files can reveal usernames, e-mail addresses, user login/logout times, IP addresses, directories on the server, and more
inurl:email filetype:mdb                            Microsoft Access databases that could contain e-mail information
filetype:xls inurl:"email.xls"                      Microsoft Excel spreadsheets containing e-mail addresses
filetype:xls username password email                Microsoft Excel spreadsheets containing the words username, password, and email
intitle:index.of inbox dbx                          Outlook Express cleanup.log file can contain locations of e-mail information
filetype:eml eml +intext:"Subject" +intext:"From"   Outlook Express e-mail files contain e-mails with full headers
intitle:index.of inbox dbx                          Outlook Express e-mail folder
filetype:wab wab                                    Outlook Mail address books contain sensitive e-mail information
filetype:pst inurl:"outlook.pst"                    Outlook PST files can contain e-mails, cal-