A Survey on Sponsored Search Advertising in Large Commercial Search Engines


George Trimponias, Dimitris Papadias
Department of Computer Science and Engineering, Hong Kong University of Science and Technology

{trimponias, dimitris}@cse.ust.hk

Technical Report HKUST-CS13-03

Abstract. Large commercial search engines such as Google Search, Microsoft’s Bing, and Yahoo! Search have recently emerged as information gateways for millions of Internet users. Their unique role as an intermediary between Internet users and the vast Web content has created exciting marketing opportunities for many commercial firms that wish to advertise their product or service. As a result, a new multi-billion dollar market for sponsored search has been established, where advertisers pay a fee determined by an auction in order to be displayed in highlighted textual content or alongside the Web search results. Motivated by its unprecedented proliferation and enormous success, we conduct a survey on the newly introduced paradigm of sponsored search advertising. Our contributions are twofold. On the one hand, we provide an extensive and self-contained survey on the sponsored search market, covering topics as diverse as the structure of sponsored search advertising, practical issues that major search engines need to deal with, and even a brief history of the market for paid links. On the other hand, we investigate several auction designs for the sponsored search market, and discuss their properties and mathematical underpinnings. We conclude with future directions and challenges.


1 Introduction

Over the past years, large commercial search engines1 have emerged as information gateways for millions of Internet users. In response to a user’s query, search engines generate a ranked list of results based on sophisticated information retrieval algorithms. These pages might represent commercial entities selling goods or services, recommendation sites, review sites, etc. Google Search alone, the dominant search engine on the World-Wide-Web, is estimated to index more than 40 billion web pages2 in a Web that consists of over a trillion unique URLs3. Moreover, it serves more than one billion search requests a day4, many of which are related to decision-making tasks such as shopping or film reviews. It should then come as no surprise that modern search engines have a critical power in shaping the Web users’ actions.

This unique role as an intermediary between Internet users and the vast Web content has created exciting marketing opportunities for many commercial firms that wish to advertise their product or service. A new market for sponsored search has been established, where advertisers pay a fee to be displayed in highlighted textual content or alongside the Web search results. Usually, when a user enters a query, a limited number of paid (sponsored) links (slots) appear on top or to the right side of the unpaid (organic or algorithmic) search results. Advertisers who have expressed interest in this query compete for the paid positions; as supply and demand vary unpredictably across the user queries, and the number of possible keywords is prohibitively large, the search engine relies on the market to determine the winners and prices by using auctions among the advertisers. Every time a user issues a query, the search engine runs an auction to determine the winners, i.e., the advertisers that will be displayed in the ad slots, and the price that each of them has to pay to the search engine. Payments are based on the pay per click model, i.e., the advertiser only pays the specified price when the user actually clicks on their ad. Figure 1 depicts part of the first page for the query melanoma treatment using Google Search. The prominent highlighted result at the top of the page and the three results on the right side are sponsored ads, while the 5 results below the highlighted ad correspond to the top-5 organic results.

1 The term search engines encompasses pure Web search engines (e.g., Google Search), information portals with search functionality (e.g., Yahoo!), metasearch engines (e.g., Metacrawler), niche search engines (e.g., CiteSeer), and comparison shopping engines (e.g., mySimon, Shopping.com) [31].

2 See http://www.worldwidewebsize.com (accessed March 7, 2013).

3 See http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html (accessed March 7, 2013).

4 See http://www.nytimes.com/2011/03/06/weekinreview/06lohr.html?pagewanted=1&_r=1&hpw (accessed March 7, 2013).


Fig. 1. Top organic and sponsored results for the query melanoma treatment using Google Search.

Sponsored search is a large and rapidly growing source of revenue for all large search engines. Currently, the two most prominent players in the sponsored search market are Google’s AdWords [73], and Microsoft’s Bing Ads, which powers the sponsored links for both Microsoft Bing and Yahoo! Search [72]. Google’s total revenue in fiscal year 2010 was $29.321 billion. Over 96% of it was related to advertising, while less than 4% came from licensing and other sources5. To get a better picture of the sponsored search market, note that, as of September 2011, Google boasted a market capitalization of $170.76 billion; the $31.88 billion of General Motors (the largest US automaker) paled in comparison6. Several other companies, including LookSmart, FindWhat, eSpotting, InterActiveCorp (Ask Jeeves), and eBay (Shopping.com), earn hundreds of millions of dollars in sponsored search revenue annually [48]. This new advertising model has also had a significant economic impact on business activity as a whole; in 2009 alone, Google “generated a total of $54 billion of economic activity for American businesses, website publishers and non-profits”7. Interestingly, the sponsored search advertising revenue has enabled large search companies to finance the very expensive infrastructure that is necessary to make the vast Internet information accessible, and to develop many free services such as spell checking, currency conversion, flight times, and desktop searching applications [41].

5 See http://investor.google.com/financial/2010/tables.html (accessed March 7, 2013).

6 See https://www.google.com/finance?client=ob&q=NASDAQ:GOOG and http://www.google.com/finance?q=NYSE%3AGM (accessed March 7, 2013).


Motivated by its unprecedented proliferation and enormous success, we conduct a survey on the newly introduced paradigm of sponsored search advertising. The field is still undergoing many changes and remains an area of intense research within the academic community. We have thus decided to focus on the major industrial aspects of sponsored search as well as the theoretical fundamentals of the auctions that large commercial search engines employ to sell paid links to advertisers. Our goal is for the survey to be as self-contained as possible, and to serve as a guide to interested readers who wish to familiarize themselves with this exciting field. We emphasize that in some parts we have followed very closely the exposition of some excellent surveys or works, as we found them to be exceptionally well structured and very clearly written; we state this explicitly whenever this is the case, since these parts do not represent original work of ours.

Section 2 investigates the structure of sponsored search advertising. Section 3 discusses practical issues that arise in sponsored search. Section 4 provides an overview of sponsored search auctions. Section 5 surveys several interesting properties of various auction designs for the sponsored search market, and demonstrates how they are structurally connected. Section 6 analyzes the GSP procedure with game-theoretic tools, and explores various types of equilibria. Finally, Section 7 concludes the paper with future directions and challenges.

2 Structure of Sponsored Search Advertising

Three distinct players define the dynamics of sponsored search advertising: the advertisers, the search engine, and the users. In this Section, we investigate the main characteristics and structure of each party, as well as how they interact with one another.

2.1 Advertisers

The advertiser seeks to place properly designed advertisements to promote their product or service. They target interested users by declaring to the search engine a list of keywords that a relevant user may search for. For each keyword, they additionally determine their maximum cost per click (maximum CPC), also known as maximum bid, which corresponds to the maximum amount of money that the advertiser is willing to spend to appear on the results page for a given keyword; it can be as low as $0.10. The actual CPC, on the other hand, refers to the actual amount of money that the advertiser is charged when the user clicks on their ad. On average, actual CPC ranges between $0.50 and $0.90, depending on the position8; for competitive markets, however, it can get much higher9. Note that bidding takes place continuously: advertisers can update their bids as often as they wish, although in reality the majority change their bids on a daily or weekly basis10. To avoid missing a potential user, the advertiser tries to cover all possible keywords. Unavoidably, a user query may match several keywords, so the advertiser essentially ends up competing with itself. Coming up with the right keywords and bids in the presence of complex keyword interactions turns out to be a hard problem [65].

In addition to the bids, an advertiser may also declare a maximum daily budget. In practice, most advertisers have operating budgets or spending targets, and are not willing to spend arbitrarily for their marketing campaign. They report their budget constraints to the search engine, which is responsible for properly allocating the budget. Search engines usually impose limits on the maximum number of times an advertiser can update their budget during a single day; for Google AdWords, for instance, the maximum allowed number of updates per day11 is 10. Efficient budget allocation is actually one of the most extensively studied optimization problems for both the search engine and the advertiser, e.g., [1][12][55][56][59]. A particular challenge is that users arrive online in a largely unpredictable way; consequently, the optimal allocation is out of reach and approximation online algorithms are used instead [59].

In the pay per click model, the advertiser is naturally interested in maximizing the number of clicks that they receive. In reality, however, this assumption is rather restrictive: large and popular commercial firms may have a greater interest in getting an ad slot in a high position, and care less about the total number of clicks they receive. The reason is that the top ad slots are associated with an increased brand awareness effect, regardless of whether the ad is actually being clicked12. As a result, branding advertisers would like to have direct control over the position of their ad. Special position-based auctions [3] have been introduced to address this issue.

Usually, the advertiser is interested in the successful conversions rather than the total number of received clicks. These correspond to desired user actions on the landing page, such as product sales, membership registrations, newsletter subscriptions, software downloads, etc. Given this data, the advertiser can have a clear picture of the keywords that are worth more, and adapt its bidding behavior. Conversion rate maximization has been investigated in [45].

8 See http://www.adgooroo.com/has_google_changed_their_cpc_formula.php#more (accessed March 7, 2013).

9 According to wordstream.com, the most expensive keyword category at the close of 2010 was “Insurance” with a top [...] http://www.wordstream.com/articles/most-expensive-keywords (accessed March 7, 2013).

10 See http://www.seroundtable.com/archives/022226.html (accessed March 7, 2013).

11 See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=8761 (accessed March 7, 2013).

12 Nielsen/NetRatings. Interactive advertising bureau (IAB) search branding study, August 2004. Commissioned by

2.2 Search Engine

The search engine provides a suitable mechanism to enable the interaction of advertisers and users. Not surprisingly, there is an ongoing conflict for a search engine between revenue maximization and high-quality sponsored results. The fact that the web search market is largely dominated by only a few big players led by Google, Microsoft, and Yahoo! is of no help: empirical studies clearly demonstrate that when the quality of search falls below a certain threshold, the user is very likely to defect to another search engine [53]. A myopic strategy would be to increase revenue by increasing the number and prominence of ad slots; in the long run, however, this would take a toll on the search engine’s perceived quality by the users [10] and expected revenue [31]. In practice, the number of ad slots usually ranges anywhere between 0 and 12. In particular, according to the market research company AdGooRoo13, in 2008 Google AdWords displayed 5.5 - 6.0 ads per query, and Bing Ads only 3.85 ads per query (United States market only).

The search engines have at their disposal a variety of ways to measure the quality of an ad, and discard low-quality ads. The most prominent measure of ad quality is the clickthrough rate (CTR), which is the number of times an ad is clicked divided by the number of times the ad is shown. The CTR is computed for each ad and keyword, and is a strong indicator of the relevance of both the ad and the keyword. In general, an average CTR14 is in the neighborhood of 2%, while a CTR of 1% or more is considered a good goal for new advertisers15. Other quality measures include: 1) relevance of the landing page to the declared keyword and the ad creative16 [16]; and 2) historical CTR for a query-ad pair over long time periods [38]. In practice, large search engines utilize a quality score that takes into account a variety of factors to measure how relevant the keyword is to the ad text and to a user’s search query. For instance, each time that a keyword matches a user query, Google AdWords calculates a quality score for the keyword based on the historical CTR of the keyword and the matched ad, the advertiser account history, the quality of the landing page17, the relevance of the keyword and the matched ad to the search query, the account’s performance in the geographical region where the ad will be displayed, as well as other relevance factors18. Note that the exact formula for computing the quality score is one of the best kept secrets of a search engine.

Major search engines enforce reserve prices, dictating the minimum price that an advertiser can pay per click for a particular keyword. The minimum bid is usually bidder-specific and quality-based. For instance, Google AdWords specifies a minimum allowable bid of as low as US$0.01 for high quality keywords19. Reserve prices are necessary to discourage bidders from bidding aggressively on irrelevant keywords, and thus compromising the search engine’s quality as perceived by the users. Besides, according to a well-known result in the theory of optimal auction design [47], setting suitable reserve prices may substantially raise revenues. Ostrovsky et al. [63] perform a large-scale field experiment on reserve prices for sponsored search auctions conducted by Yahoo! to sell advertisements, and show that reserve prices can have a significant positive effect on the search engine’s revenue.

13 See http://succeed.adgooroo.com/Q208_Search_Advertising_Report.html (accessed March 7, 2013).

14 See http://www.google.com/support/forum/p/AdWords/thread?tid=7aeb3290fd8feccb&hl=en (accessed March 7, 2013).

15 See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=107955 (accessed March 7, 2013).

16 To appreciate how important that is, note that it is estimated that more than 30% of all landing pages are only very remotely related to the ads [16].

17 According to the landing page and site quality guidelines, the landing page is expected to 1) feature relevant and original content, 2) be transparent, and 3) be easy to navigate. See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=46675 (accessed March 7, 2013).

18 See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=10215 (accessed March 7, 2013).

2.3 Users

Users constitute through their actions the commodity that the advertisers bid on. An advertiser expresses interest in those user queries that it deems relevant by declaring keywords to the search engine. The assumption that the user expresses interest in the product or service of the advertiser is implicit throughout this process. From the user’s perspective, however, a different story emerges. Users view the search engine as a gateway to the vast information content of the Web. By posing textual search queries, they express to the search engine an intention that the search engine attempts to capture, sometimes without success. For instance, consider a user who enters the search query apple with the intention of getting information on the vitamin content of the apple fruit. Interestingly, as of 30 September 2010, the top-10 organic results by Google Search are all related to Apple Inc., which in this example is totally unrelated to the user’s intention. Unfortunately, even today there is a very limited understanding of user intention and behavior.

However, an aspect of user behavior that has received considerable attention and research in the last years concerns the interplay between organic and sponsored results, as well as the question of how sponsored links are generally perceived. Ghose and Yang, for instance, identify in [33] that retailer-specific and brand-specific information in a sponsored link increases the efficiency of online advertising; the former increases clickthrough rates, whereas the latter raises conversion rates. The same authors show in [71] that organic and sponsored results tend to be positively interdependent. In particular, total clickthrough rates, conversion rates, and revenues in the presence of both paid and organic search listings are significantly higher than those in the absence of paid search advertisements. Reiley et al. investigate in [64] the externalities among the north ads, i.e., the sponsored results that appear above the organic listings, and find that rival north ads impose a positive, rather than negative, externality on existing north listings. In other words, the top north listing receives more clicks when additional sponsored results appear below it. Agarwal et al. show in [2] that although clickthrough rate decreases with position, the conversion rate first increases and then decreases with position for longer keywords. As a result, top positions in sponsored search advertisements are not necessarily the optimal positions for advertisers. Rutz and Bucklin study in [66] the interactions between generic and branded keywords, and find that generic keywords may induce positive spillovers on the clickthrough rate of branded keywords. By contrast, Jeziorski and Segal [42] and Chiou and Tucker [15] show the prevalence of negative externalities across ads: as many as 50% more clicks would occur in a hypothetical world in which each ad faces no competition. Finally, Edelman and Gilchrist [27] investigate how users perceive the labels of the sponsored results. Concretely, they show that relative to users receiving the “sponsored link” or “ad” labels, users receiving the “paid advertisement” label click 23% and 26% fewer advertisements, respectively.

3 Practical Issues in Sponsored Search Advertising

In this Section, we investigate several practical issues that arise in sponsored search advertising. We have tried to place particular emphasis on how major search engines deal with these issues in practice.

3.1 Ranking and Pricing Schemes

Two major market design questions are 1) how to allocate advertisements to slots, and 2) how to price the ads. Interestingly, all large search engines address these two issues in a very similar way. First, ads are sorted in decreasing order of their rank, where the ad rank is determined by both the bid placed by the advertiser on the keyword, and the quality of the ad. The ad with the highest rank appears in the first position, and so on down the page, until all slots have been filled. Note that the exact ad rank formula differs from one search engine to another. Google AdWords20, for instance, has made public that their ad rank is determined by the product CPC bid × Quality Score. Bing Ads21, on the other hand, claim that their ad ranks are determined by various factors such as the bid amount, the ad CTR, and the ad relevance, but have kept the details of the ad rank formula secret.

Pricing takes place after the winning ads and their ranks have been specified, and determines the cost per click that the advertiser will be charged whenever a user clicks on their ad. The natural method would be to make bidders pay what they bid, but that leads to well-known race conditions and instabilities, as we will explore in Section 4.1. Instead, all large search engines22 currently employ a generalized second-price auction (GSP) [26][68]. A GSP auction charges an advertiser the minimum amount required to maintain their ad’s position in search results, plus a tiny increment. Though we discuss sponsored search auctions in Section 5 in detail, let us briefly demonstrate how GSP auctions work in practice by considering the ad ranking of Google AdWords. Suppose $K$ slots are available, and are numbered $1, \dots, K$, starting from the top and going down. Moreover, let the advertiser $A_i$ at position $i$ have a maximum bid $b_i$ and a quality score $QS_i$. In GSP, the price for a click for $A_i$ is determined by the advertiser $A_{i+1}$ below them, and given by $b_{i+1} \times QS_{i+1} / QS_i$, which is the minimum that $A_i$ would have needed to bid to attain their position. Note that in this pricing scheme, a bidder’s payment does not take into consideration their own bid.

20 See https://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=6111 (accessed March 7, 2013).

21 See http://advertise.bingads.microsoft.com/en-ca/product-help/bingads/topic?query=moonshot_conc_whatisadposition.htm (accessed March 7, 2013).

22 For the pricing policy of Google AdWords, see http://support.google.com/adwords/answer/6297?hl=en&ref_topic=24937. For Bing Ads, see http://community.bingads.microsoft.com/ads/en/bingads/b/blog/archive/2009/10/28/sem-beginner-series-what-is-cost-per-click-cpc.aspx. All accessed March 7, 2013.
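The ranking and pricing rule of Section 3.1 can be illustrated with a minimal sketch. It assumes the publicly stated AdWords rank (bid times quality score) and the GSP price given above; the names `Bid` and `run_gsp`, the `increment` and `reserve` parameters, and the cap of the price at the advertiser's own maximum bid are illustrative assumptions rather than details of any search engine's implementation.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    advertiser: str
    max_cpc: float        # maximum bid b_i
    quality_score: float  # quality score QS_i, assumed strictly positive


def run_gsp(bids, num_slots, increment=0.01, reserve=0.01):
    """Rank ads by max_cpc * quality_score and price each click with GSP.

    The winner in position i pays b_{i+1} * QS_{i+1} / QS_i plus a small
    increment. `increment` and `reserve` are hypothetical parameters.
    """
    ranked = sorted(bids, key=lambda b: b.max_cpc * b.quality_score, reverse=True)
    results = []
    for i, bid in enumerate(ranked[:num_slots]):
        if i + 1 < len(ranked):
            below = ranked[i + 1]
            price = below.max_cpc * below.quality_score / bid.quality_score + increment
        else:
            # Assumption: with no competitor below, charge the reserve price.
            price = reserve
        # Assumption: never charge more than the advertiser's maximum bid.
        price = min(price, bid.max_cpc)
        results.append((bid.advertiser, round(price, 2)))
    return results


if __name__ == "__main__":
    bids = [Bid("A", 2.00, 5.0), Bid("B", 3.00, 3.0), Bid("C", 1.00, 4.0)]
    print(run_gsp(bids, num_slots=2))
```

In this example the ad ranks are 10, 9, and 4, so A takes the first slot although B bids more. A pays 1.81 per click and B pays 1.34; neither price depends on the winner's own bid, only on the bid and quality score of the advertiser ranked immediately below.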

3.2 Account Structure and Ad Structure

Advertisers organize their ad campaigns through a specially designed structure, which is the same for the three major search engines. Concretely, the advertiser information is organized into three levels: account, campaign, and ad group. Depending on their marketing goals, an advertiser may have one or more accounts. An account is associated with a unique email address, password, and billing information, and contains one or more campaigns. A campaign has its own budget and targeting options (time-zone, geographical area, etc.) to determine where these ads will appear. Typically, campaigns are created to achieve a clear marketing goal, and are made up of one or more ad groups. An ad group contains a set of ads and a keyword list that will trigger these ads to show. Bids can be applied to all the keywords in an ad group, or custom bids may be set for individual keywords.

(a) Account (b) Ad creative

Fig. 2. Example advertiser account and displayed ad creative.

An ad (also known as creative by Google) is composed of three elements that are visible to the user on the results page. The headline is the first line of the ad, and acts as a link to the advertiser’s website. It is considered a good practice for the advertisers to include at least one of the keywords in the headline. The lines of text are the two lines right after the headline, and describe the advertised product or service. Finally, the display URL is the last line, and is used to show the URL of the promoted website. Besides the three visible elements, search engines require advertisers to set a non-visible destination URL (also known as landing page), which is the exact page within the website that is most relevant to the product or service described in the ad. Figure 2(a) shows an example23 of an account that contains one campaign with one ad group containing four keywords and two ads, while Figure 2(b) demonstrates how the ad might display when a user searches with the term notebook computers. Finally, we note that efficient indexing structures for both the advertiser accounts and the sponsored ads have been investigated in the literature, e.g., [9][46].

23 The example was taken from the website of Yahoo! Search Marketing but is not online anymore, as of March 13, 2013.
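The account, campaign, ad group, and ad hierarchy of Section 3.2 can be modeled as nested records. The following is only a sketch: all class and field names are hypothetical and do not correspond to any search engine's actual API, and billing details and passwords are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Ad:
    headline: str            # first line, links to the advertiser's website
    text_lines: List[str]    # the two descriptive lines below the headline
    display_url: str         # URL shown on the last line of the ad
    destination_url: str     # landing page, not visible in the ad itself


@dataclass
class AdGroup:
    name: str
    default_bid: float                       # applies to all keywords by default
    ads: List[Ad] = field(default_factory=list)
    # keyword -> custom bid, or None to fall back to the ad group default
    keyword_bids: Dict[str, Optional[float]] = field(default_factory=dict)

    def bid_for(self, keyword: str) -> float:
        """Return the custom bid for a keyword, or the ad group default."""
        if keyword not in self.keyword_bids:
            raise KeyError(f"keyword not in ad group: {keyword!r}")
        custom = self.keyword_bids[keyword]
        return self.default_bid if custom is None else custom


@dataclass
class Campaign:
    name: str
    daily_budget: float
    targeting: Dict[str, str] = field(default_factory=dict)  # e.g. region, time zone
    ad_groups: List[AdGroup] = field(default_factory=list)


@dataclass
class Account:
    email: str
    campaigns: List[Campaign] = field(default_factory=list)
```

A keyword stored with the value `None` inherits the ad group bid, which mirrors the choice between one bid for all keywords in an ad group and custom bids for individual keywords.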

3.3 Click Probabilities

Commercial search engines compute the clickthrough rate of an ad without taking into account the ad position. Under this model, a given ad would have the same probability of being clicked, irrespective of whether it appears at the top or at the bottom of the sponsored results. However, one would normally assume that slots at the top receive more clicks. According to Accuracast24, for example, the average CTR clearly varies across position with ads at the top getting more clicks25. For this reason, prior research has focused on clickthrough models that incorporate the ad position. Aggarwal et al. [6] introduce a separable click model, which defines the click probability as the product of two components: an ad-specific clickthrough rate, and a position-specific visibility factor. Unfortunately, subsequent experimental studies [43][18] have failed to validate this model and have emphasized its inadequacies. The most significant one is that it completely discounts the effects of other ads shown on the same page, namely the ad externalities. Intuitively, a high-quality relevant ad placed at the top can detract from other ads; conversely, a very low-quality or offensive ad may entice the user to completely disregard all ads on the page. Recent work [5][44][18] has suggested Cascade models for the ad externalities. In addition to the ad-specific clickthrough rate, the basic Cascade model assumes an ad-specific continuation probability. This latter parameter describes the probability that a user will look at the ads below once they have looked at the current ad.
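To make the difference between the two click models of Section 3.3 concrete, the sketch below computes per-slot click probabilities under each. It is a simplified reading of the descriptions above: the function names are hypothetical, the user is assumed to always look at the top slot, and in the cascade variant the probability of looking at a slot is taken to be the product of the continuation probabilities of the ads above it.

```python
def separable_click_probs(ad_ctrs, position_factors):
    """Separable model: click probability = ad-specific CTR * position factor."""
    return [q * v for q, v in zip(ad_ctrs, position_factors)]


def cascade_click_probs(ad_ctrs, continuation_probs):
    """Simplified cascade model: the user scans the ads from top to bottom.

    ad_ctrs[i] is the ad-specific CTR of the ad in slot i, and
    continuation_probs[i] is the probability that the user keeps scanning
    after looking at that ad.
    """
    probs = []
    reach = 1.0  # probability that the user looks at the current slot
    for ctr, cont in zip(ad_ctrs, continuation_probs):
        probs.append(reach * ctr)
        reach *= cont
    return probs


if __name__ == "__main__":
    ctrs = [0.10, 0.20, 0.30]
    print(separable_click_probs(ctrs, [1.0, 0.5, 0.25]))
    print(cascade_click_probs(ctrs, [0.5, 0.5, 0.5]))
    # Lowering the top ad's continuation probability hurts every ad below it.
    print(cascade_click_probs(ctrs, [0.1, 0.5, 0.5]))
```

In the separable model the click probability of an ad depends only on the ad and its slot, whereas in the cascade sketch it also depends on which ads are placed above it, which is how the model captures externalities.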

3.4 Payment Models

Advertisers make payments to the search engine, as the latter provides the platform which enables the advertisers to deliver advertising material to the users. Interestingly, the two parties have conflicting views on when a payment should occur. From the search engine’s point of view, the advertiser should pay on an impression basis, i.e., every time their ad is shown to the user. Indeed, the search engine provides the advertiser with the opportunity to be displayed, which is in its own right an advertising opportunity. From the advertiser’s point of view, however, it is the actual conversions that matter: even if the ad is displayed, this is of little value if the user does not actually proceed to an action that is valuable to the advertiser. The former perspective leads to a pay per impression model, whereas the latter leads to a pay per conversion (or pay per action) model. To reconcile the two opposing viewpoints, the two parties have agreed on a middle ground, namely, the pay per click model. Under this model, an advertiser makes a payment to the search engine only when a user actually clicks on their ad. Note that Google AdWords introduced a beta test26 for pay per action advertising in March 2007; however, it was subsequently retired27 in June 2008.

24 See http://knowledge.accuracast.com/articles/adwords-clickthrough.php (accessed March 7, 2013).

25 Surprisingly, evidence suggests that conversion rates vary insignificantly with position. See http://adwords.blogspot.com/2009/08/conversion-rates-dont-vary-much-with-ad.html (accessed March 7, 2013).

3.5 Parameter Estimation

Most work on sponsored search auctions takes for granted various parameters such as CTR and position visibility. It turns out, however, that estimating these parameters is a difficult problem [22][54]. Indeed, there is an inherent tradeoff between learning these parameters and applying them: one cannot know a priori that a given ad has low quality unless it is exposed to the user; but then it was a bad idea to display the ad in the first place. This is similar to the exploitation vs. exploration tradeoff in reinforcement learning, and has been discussed in [70].
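The tradeoff between learning and applying CTR estimates can be illustrated with an epsilon-greedy rule, a generic heuristic from the bandit literature. This is a sketch for intuition only and is not claimed to be the method of [70] or of any search engine; the function names and the `stats` layout are hypothetical.

```python
import random


def estimated_ctr(stats, ad):
    """Empirical CTR = clicks / impressions (0.0 for an ad never shown)."""
    clicks, impressions = stats[ad]
    return clicks / impressions if impressions else 0.0


def choose_ad(stats, epsilon, rng):
    """Pick an ad to display; stats maps ad id -> (clicks, impressions).

    With probability epsilon a random ad is shown (exploration), otherwise
    the ad with the highest estimated CTR so far is shown (exploitation).
    """
    ads = sorted(stats)
    if rng.random() < epsilon:
        return rng.choice(ads)
    return max(ads, key=lambda ad: estimated_ctr(stats, ad))


def record(stats, ad, clicked):
    """Update the counters after the chosen ad has been displayed."""
    clicks, impressions = stats[ad]
    stats[ad] = (clicks + int(clicked), impressions + 1)


if __name__ == "__main__":
    rng = random.Random(0)
    stats = {"ad_a": (1, 100), "ad_b": (5, 100), "ad_new": (0, 0)}
    shown = choose_ad(stats, epsilon=0.1, rng=rng)
    record(stats, shown, clicked=False)
    print(shown, stats[shown])
```

Under this rule a new ad with no impressions is displayed only during exploration steps, so every piece of information about its quality is paid for with an impression that might have gone to a better ad.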

3.6 Incomplete Knowledge

Both the search engine and the advertisers have incomplete knowledge at many different levels. The greatest challenge from the search engine’s perspective is that it is not aware of the future query workload. This complicates the allocation of an advertiser’s budget: without knowledge of future queries, budget allocation may be suboptimal and inefficient. Prior research has tackled this problem by modeling incomplete knowledge as online queries, and employing approximate online algorithms, e.g., [59][65][70][55][56][35][62]. Further parameters that are not known include CTR and visibility factors; in this case, parameter estimation techniques can be employed (see Section 3.5).
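To see why the unknown query workload makes budget allocation hard, consider a simple greedy baseline that processes queries one at a time. The sketch is deliberately simplified and is not one of the algorithms of the cited works: it assumes a single ad slot, charges the winner its full bid on every query, and uses the hypothetical name `greedy_allocate`.

```python
def greedy_allocate(queries, bids, budgets):
    """Assign each arriving query to the highest bidder that can still pay.

    queries: iterable of keywords, revealed one at a time
    bids:    advertiser -> {keyword: bid}
    budgets: advertiser -> total budget
    Returns the list of (keyword, winner or None) and the remaining budgets.
    """
    remaining = dict(budgets)
    assignment = []
    for keyword in queries:
        best, best_bid = None, 0.0
        for advertiser, keyword_bids in bids.items():
            bid = keyword_bids.get(keyword, 0.0)
            # Only advertisers with enough budget left may win the query.
            if bid > best_bid and bid <= remaining[advertiser]:
                best, best_bid = advertiser, bid
        if best is not None:
            remaining[best] -= best_bid
        assignment.append((keyword, best))
    return assignment, remaining


if __name__ == "__main__":
    bids = {"A": {"shoes": 1.0, "boots": 1.0}, "B": {"shoes": 0.9}}
    budgets = {"A": 1.0, "B": 1.0}
    print(greedy_allocate(["shoes", "boots"], bids, budgets))
```

In the usage example the greedy rule gives the query shoes to A and exhausts A's budget, so the later query boots, on which only A bids, goes unsold. With knowledge of the full sequence, the engine would have given shoes to B and boots to A.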

Advertisers are also in a hard situation. Not only are they unaware of the future queries, they also do not know the other advertisers’ bids, budgets, and CTRs. As a result, they face a very complex optimization problem, e.g., [48][11][14][65][30]. Furthermore, in this uncertain environment, the advertiser must come up with a profitable keyword choice; this issue is addressed in [65] through an adaptive algorithm that looks at the historical keyword performance. Note that large search engines provide the advertisers with tools that facilitate their keyword choices. Google AdWords, for instance, has developed three tools in this direction: 1) the Keyword Tool28 allows the advertiser to build extensive, relevant keyword lists; 2) the Bidding and Budget Tool29 automatically adjusts all bids of an advertiser given its budget to get the most clicks possible; 3) the Traffic Estimator30 provides the advertiser with keyword search traffic (such as local monthly searches), and various estimates (such as estimated average pay-per-click and average position of an ad).

26 See http://adwords.blogspot.com/2007/03/pay-per-action-beta-test.html (accessed March 7, 2013).

27 See http://adwords.blogspot.com/2008/06/we-are-retiring-pay-per-action-beta.html (accessed March 7, 2013).

28 See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=147602 (accessed March 7, 2013).

29 See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=113234 (accessed March 7, 2013). Note that advertisers may also use external bid management software for the automatic control of bids. In this case, search engines generally restrict the bidding information available to the software, and require a review of any automated bidding code [41].

3.7 Click Fraud

Click fraud is a type of Internet crime that occurs when a sponsored link is intentionally clicked with no intention of generating value. It can be done automatically by computer scripts or directly by humans, and can be attributed to a variety of reasons, including a competitor’s desire to minimize the impact of an ad campaign, simple vandalism, or a desire by a publisher to increase their income [41]. Jansen estimates in [40] that an astonishing 1 percent of all search-engine visits result in an unidentifiable fraudulent click, which translates into hundreds of millions of dollars for both the search industry and the advertisers who end up paying for clicks that are of zero value to them. Unfortunately, identifying click fraud turns out to be surprisingly difficult, since it is hard to know who is behind a computer and what their intentions are. Common tools employed by search engines include aggressive monitoring and improved automated filters that use sophisticated data mining technology. Interestingly, a shift from the pay per click to the pay per action paradigm would partially alleviate this problem, since the advertiser would be charged only for successful conversions rather than clicks.

3.8 Keyword Match

A great challenge faced by the advertisers is to come up with the right set of keywords. Users searching for the keyword may use singular or plural, synonyms and other variations, may misspell, use extensions, or reorder the words. In fact, users may even search using terms that are not in the keyword (for instance, consider the keyword tennis shoes and the search query US Open sneakers). As a result, it is difficult or downright impossible for an advertiser to identify all possible variations of keywords that a user may use in their query.

30 See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=8692 (accessed March 7, 2013).

31 For keyword matching options in Google AdWords, see http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=6100. For Microsoft’s Bing Ads, see http://community.bingads.microsoft.com/ads/en/bingads/b/blog/archive/2008/04/07/keyword-match-types.aspx. All accessed March 7, 2013.

Major search engines address this issue by providing a structured bidding language. In general, 4 matchtypes are supported31: exact, phrase, broad, and negative. In exact matchtype, the ad would be eligible to appear when a user searches for the specific keyword in this order, and without any other terms in the search query. In phrase matchtype, the ad would be eligible to appear when a user searches on the keyword, with the terms in that order; it can also appear for searches that contain other terms as long as it includes the exact keyword. In broad matchtype, the ad would be eligible to appear when a user's search query contains all terms in the keyword in any order, and possibly along with other terms; it could also appear for the variations that we listed earlier. Finally, negative matchtype ensures that the ad will not be shown upon occurrence of certain terms in the query.
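The four matchtypes can be illustrated with a small sketch. It is only an approximation under stated assumptions: the whitespace tokenizer and the function names `tokens`, `matches`, and `eligible` are hypothetical, and broad match is reduced to "all keyword terms present in any order", so the synonyms, misspellings, and other variations that real search engines also accept are deliberately left out.

```python
def tokens(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def matches(keyword, query, match_type):
    """Return True if `query` triggers `keyword` under the given matchtype."""
    kw, q = tokens(keyword), tokens(query)
    if match_type == "exact":
        # Same terms, same order, and nothing else in the query.
        return q == kw
    if match_type == "phrase":
        # The keyword appears as a contiguous, ordered run inside the query.
        n = len(kw)
        return any(q[i:i + n] == kw for i in range(len(q) - n + 1))
    if match_type == "broad":
        # All keyword terms are present, in any order, possibly with extras.
        return set(kw) <= set(q)
    raise ValueError(f"unknown match type: {match_type}")


def eligible(keyword, query, match_type, negatives=()):
    """Apply the negative terms first, then the positive matchtype."""
    if set(tokens(" ".join(negatives))) & set(tokens(query)):
        return False
    return matches(keyword, query, match_type)


print(eligible("tennis shoes", "red tennis shoes", "phrase"))
print(eligible("tennis shoes", "free tennis shoes", "broad", negatives=["free"]))
```

Checking negative terms before the positive rule mirrors the description above, where a negative term blocks the ad regardless of which other matchtype would have fired.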

Broad match is the default option for large search engines. One of its primary benefits is that it helps the advertisers attract more traffic to their website; as a result, it boosts the number of clicks and conversions as well. In addition, broad match saves the advertisers time when constructing their campaigns, lets them take advantage of global search trends, and is cost-effective since they do not have to spend money on keywords that do not work32. Exact match, on the other hand, is the least flexible matchtype, and likely leads to fewer impressions, clicks, or conversions, compared to broad match. However, if the advertiser carefully constructs a comprehensive keyword list, the traffic they receive may be more targeted to their product or service. Phrase match lies somewhere in the middle: it is more targeted than broad match, but more flexible than exact match. Finally, negative keywords are especially useful if the advertiser’s account contains several broad match keywords.

3.9 Bidding Expressivity

Bidding expressivity concerns how to best translate advertiser needs into an appropriate bidding language. For example, a wine producer in Sacramento may want to target its ads only to users located in the state of California. Commercial search engines allow the advertisers to fine-tune their ads by targeting 1) specific locations, 2) days of the week, 3) time of day, 4) demographic (gender and age) groups, and 5) languages. Obviously, a more expressive bidding language may be better tailored to the user needs, but comes at a high complexity cost of the auctions and the middleman software. Recently, more expressive bidding languages have been investigated. Even-Dar et al. introduce in [28] context-based auctions where advertisers can bid on keywords that satisfy specific contexts such as gender, income, likely task, etc. They further show that under certain conditions the overall social welfare increases when moving from standard to context-based mechanisms. Martin et al. [57] propose multi-feature auctions that enable advertisers to express bids on multiple features, namely, clicks, conversions, and slot positions. For instance, an advertiser may express that they only wish to be placed in prominent positions; or, they may prefer their ads to be placed near the top or bottom of the list, but not in the middle; or, they may value purchases (conversions) but have zero valuations for clicks alone. In the multi-feature model, the advertiser declares a bid table that summarizes their valuation over different combinations of the three features; an efficient, scalable, and parallelizable infrastructure on the search engine’s side is responsible for ad ranking and pricing. To account for ad externalities, Ghosh et al. augment in [34] the standard bidding language with an exclusivity feature: an advertiser’s value depends on whether or not other ads are shown along with their ad, i.e., whether they are shown exclusively or not. They further introduce two GSP-like auction mechanisms with two types of outcomes: either a single ad is displayed exclusively, or multiple ads are simultaneously shown. Both mechanisms are usually characterized by high efficiency and revenue. An interesting model, namely, the General Auction Mechanism (GAM), was recently proposed by Aggarwal et al. in [4]. GAM allows the bidders to specify both a valuation and a maximum price for each ad slot. The former is a measure of how much the slot is worth to them, whereas the latter is the maximum amount of money that they are willing to pay for the ad to appear in the slot. It turns out that GAM is a powerful model that includes several novel auction mechanisms at its core. Unfortunately, its expressiveness is rather limited for practical scenarios: 1) it cannot handle both per-click and per-impression valuations simultaneously, 2) it does not allow bidders to express how tight their budget is, and 3) it cannot model risk-averse bidders. Motivated by these shortcomings, Duetting et al. discuss in [23] a new powerful auction mechanism that additionally runs in polynomial time.

3.10 Strategic Bidder Behavior

An advertiser’s strategy includes all actions that the advertiser takes to meet its marketing and campaign goals; for instance, a strategy dictates how the advertiser allocates its funds, and what keywords it declares. The auction mechanisms that were employed by search engines in the early years of sponsored search were particularly susceptible to certain bidding patterns [26] that led to market inefficiencies and instabilities. The most frequent patterns were bidding war cycles33 [7] and gap jamming [13][32]. In gap jamming, advertisers raise their bids to a point just below their immediate competitors; as a result, the competitors pay the maximum CPC, and deplete their budget fast. In turn, advertisers can protect themselves from such spiteful behaviors by gradually shading their bids or even investing in bid management software. The most prominent proposed solutions to vindictive bidding are stochastic auctions and contingent-payment auctions [58]. Stochastic auctions allow all bidders to win with some probability; under this scheme, vindictive bidders will periodically be forced to pay high amounts, so their incentive is reduced. Moreover, stochastic auctions imply that even low bidders can win occasionally, and the search engine can acquire a more accurate clickthrough estimate for all bidders, in effect trading off high immediate revenue for optimizing future decision making. Note that bidders with near equal slot valuations can share a slot [11]. Finally, in contingent-payment auctions, the winning bidder only pays when a bidder-specific contingency occurs; this makes it harder for a spiteful bidder to deplete a competitor’s budget. Modern search engines, on the other hand, shield the auction process from undesirable advertiser behavior by incorporating ad quality scores into the ranking and pricing schemes. Indeed, although the advertiser can still manipulate their declared maximum CPC, they have no control over any ad quality score, which is involved in both the ad allocation and pricing.

4 An Overview of Sponsored Search Auctions

The market for sponsored search advertising had to undergo significant adjustments to address a number of structural shortcomings. The initial unstable mechanisms were replaced by more robust auction designs. This is evident in both the ad allocation and pricing schemes that have evolved in steps over the years. Interestingly, changes in the market for sponsored search took place at a very fast pace. This could be due to the competitive pressures on mechanism designers, the much lower costs of entry and experimentation, advances in the understanding of market mechanisms, and improved technology [26]. In the following, we discuss how sponsored search auctions evolved over the years, based on the seminal work of Edelman, Ostrovsky and Schwarz [26].

4.1 Generalized First-Price Auctions

Beginning in 1994, early Internet advertising followed a pay per impression model, where advertisers paid a fee to show their ads a fixed number of times (typically one thousand impressions). In 1997, Overture (then GoTo; acquired by Yahoo! in 2003 for $1.63 billion34) introduced a radically new model to sell Internet advertising. In the original Overture auction design, each advertiser submitted a maximum cost per click bid for a particular keyword. An auction would play out in an automated fashion every time a visiting user would trigger the ad spot. Advertisers were ranked in decreasing order of their declared bids, making highest bids the most prominent. Every time a user clicked on a sponsored link, the advertiser was charged an amount of money equal to their latest bid. The underlying ranking and pricing schemes were reminiscent of the first-price auction, a well-studied paradigm in the theory of auctions. In a first-price auction, a single item is for sale; all bidders declare bids that correspond to their maximum willingness to pay for the item. The highest bidder wins, and pays an amount equal to their bid. In sponsored search, on the other hand, multiple items (the ad slots) are for sale; the items are inherently different: higher ad positions are worth more than lower ad positions. As before, the interested advertisers communicate their bids to the search engine, and are subsequently ranked in decreasing order of their bid. The highest bidder wins the first slot and pays its bid; the second-highest bidder wins the second position and pays its bid, and so on until all ad slots have been sold35. The analogy to the classic single-item first-price auctions led to the name generalized first-price auction. Obviously, the newly introduced pay per click

34 See http://docs.yahoo.com/docs/pr/release1102.html (accessed March 7, 2013).

model allowed the advertisers to better target their ads: instead of paying for a banner ad that would be displayed to everyone visiting a website, advertisers could now explicitly specify which keywords (and, thus, users) were relevant to their advertising campaign. Moreover, the ease of use, the very low entry costs, and the transparency of the mechanism (see [26]) quickly led to the success of Overture. Indeed, major search engines including Yahoo! and MSN adopted Overture’s platform as their advertising provider.

Unfortunately, the novel auction mechanism induced very unstable dynamics, in the sense that bidders would not state their true valuations, and would keep changing their bids in response to other bidders’ behavior. For instance, in a keyword market with two advertisers and two ad slots, assume that a click is worth $1.00 to the first advertiser and $1.50 to the second. If the first advertiser bids $1.00, then the second will likely bid $1.01 (i.e., the other bid plus a minimal increment), thereby claiming the first slot, and paying as little as possible. But then, the first bidder will try to lower its bid to the minimum bid (or reserve price), say $0.10, reducing its costs while still preserving the second slot, which is the best position it can get given the other advertiser’s high bid. But then the second advertiser will lower its bid, e.g., to $0.11, and the advertisers will raise each other in small increments until the second advertiser outbids the first advertiser’s valuation, and the first advertiser drops to the minimum bid. Under these assumptions, a cyclical pattern will continue indefinitely. This instability is further exacerbated by automated bid management software. Figure 3 (taken from [25]) demonstrates this behavior. It presents top bids, in dollars, for a specific keyword, every 15 minutes from 12:15 AM to 2:15 PM on July 18, 2002. Clearly, a “sawtooth” pattern emerges in the bidding process: bids climb gradually and then suddenly drop to lower values, yielding more and more “teeth” over time.

Fig. 3. “Sawtooth” bidding pattern over (a) 14 hours and (b) 1 week (taken from [25]).
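The two-advertiser example above can be replayed with a short simulation. This is a sketch under simplifying assumptions, not a model of the data in [25]: amounts are in cents, the advertisers move alternately, and each one follows the hypothetical rule "outbid the rival by one cent if that stays within my value, otherwise fall back to the minimum bid". The names `best_response` and `simulate_gfp` are illustrative.

```python
def best_response(value, rival_bid, min_bid=10, step=1):
    """Outbid the rival by one step if that stays within the bidder's value,
    otherwise retreat to the minimum bid. All amounts are in cents."""
    candidate = rival_bid + step
    return candidate if candidate <= value else min_bid


def simulate_gfp(values=(100, 150), start_bids=(100, 10), rounds=200):
    """Alternate best responses, starting with the second advertiser,
    and record the top bid after every move."""
    bids = list(start_bids)
    top_bids = []
    for t in range(rounds):
        mover = 1 - (t % 2)  # the second advertiser (index 1) moves first
        rival = 1 - mover
        bids[mover] = best_response(values[mover], bids[rival])
        top_bids.append(max(bids))
    return top_bids


top = simulate_gfp()
# Count the sudden drops that form the "teeth" of the sawtooth.
drops = sum(1 for a, b in zip(top, top[1:]) if b < a)
print(min(top), max(top), drops)
```

The top bid climbs one cent at a time from $0.11 to $1.01 and then collapses, which is the cyclical pattern described in the text.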

Clearly, the generalized first-price auction is not truthful: the advertisers have no incentive to declare their true valuations to the search engine. Instead, they devote considerable time and resources to the constant manipulation of their bids, potentially paying less attention to ad quality and other campaign goals. Furthermore, it is not efficient: although an advertiser may value the top spot more than its competitors, it gets lower positions half or more of the time. Finally, the volatile prices may take a serious toll on the search engine’s revenues, as empirically demonstrated in [25].

4.2 Generalized Second-Price and Vickrey-Clarke-Groves Auctions

Under the generalized first-price auction, the advertisers have an incentive to game the system in their favor, and engage in inefficient and endless bidding wars. Google was the first search engine to address these shortcomings by introducing its own pay per click system, AdWords Select, in February 2002. Google recognized that the ith highest bidder would never be willing to pay more than the (i+1)th highest bid plus a minimal increment. Its newly introduced generalized second-price auction reflected this principle: advertisers were still ranked in decreasing order of their bids, but an advertiser in position i would now pay a price per click equal to the bid of the advertiser in position i+1 (plus an increment)36. This price corresponds to the minimum amount of money that an advertiser must pay to retain their position, and is independent of its bid. The new auction structure makes the market less susceptible to gaming, and is thus more robust. Recognizing these benefits, Yahoo!/Overture also switched to GSP.

Note that the GSP auction is a generalization of the second-price auction for single items. In the latter, bidders submit their bids for the single object; the one with the highest bid wins, but pays the second-highest bid. The winner’s payoff is its valuation minus the price it has to pay, which is non-zero for distinct bids; the other bidders’ payoff is zero. Under this pricing scheme, bidders have an incentive to bid their true valuations. Indeed, assume that bidder i submits a bid \(b_i\) that is different from its true valuation \(v_i\): either \(b_i > v_i\), or \(b_i < v_i\). Consider the first case. If i is the winner, then by bidding more it is still the highest bidder, and will pay the second-highest bid, as before; thus, i has no incentive to overbid. On the other hand, if i is not the winner, then there must be another winner j with a higher bid \(b_j > b_i\). If i bids less than \(b_j\), then it still loses the auction, and faces the same situation as before; but if i bids more than \(b_j\) then it wins the auction. In this case, it will have to pay \(b_j > b_i > v_i\), effectively ending up paying more than what the object is worth to it. Thus, i has no incentive to overbid. Take now the second case. If i is the winner, two things can happen if it underbids. Either it is still the winner, in which case it pays the second-highest bid as before, or it may end up losing the auction, in which case it does not get the object and receives a zero payoff. In either case, i has no incentive to underbid. If i is not the winner, then by underbidding it has no chance to be the auction winner, so it faces the same situation as before; thus, it has no incentive to underbid.
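The case analysis above can also be checked numerically on a small grid. The sketch below assumes integer bids and resolves ties against the bidder under consideration; both choices, as well as the names `second_price_payoff` and `truthful_is_weakly_best`, are illustrative assumptions.

```python
from itertools import product


def second_price_payoff(bid, value, rival_bids):
    """Payoff in a sealed-bid second-price auction.
    Ties are resolved against the bidder (an assumption of this sketch)."""
    top_rival = max(rival_bids)
    return value - top_rival if bid > top_rival else 0


def truthful_is_weakly_best(value, rival_bids, candidate_bids):
    """Check that bidding one's value does at least as well as any candidate bid."""
    honest = second_price_payoff(value, value, rival_bids)
    return all(honest >= second_price_payoff(b, value, rival_bids)
               for b in candidate_bids)


# Exhaustive check over small integer values and two rival bids.
ok = all(truthful_is_weakly_best(v, rivals, range(0, 12))
         for v in range(0, 11)
         for rivals in product(range(0, 11), repeat=2))
print(ok)
```

An exhaustive grid is not a proof, but it exercises both the overbidding and the underbidding case of the argument.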

36 In fact, this was Overture’s implementation; Google AdWords additionally considers quality scores for every ad, as

Although the GSP auction is a straightforward generalization of the single-item second-price auction, it turns out that it does not maintain the property of truthfulness. We demonstrate this by citing an example from [26]. Consider three bidders, with values per click of $10, $4, and $2, and two positions. The first position has a clickthrough rate of 200 clicks per hour, whereas the second 199 clicks per hour. If the bidders bid truthfully, then bidder 1’s payoff is ($10−$4)*200 = $1200. If, instead, it underbids by bidding only $3 per click, then it will get the second position, but its payoff will be equal to ($10−$2)*199 = $1592 > $1200. Thus, bidder 1 has an incentive to underbid.
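The arithmetic of this example can be reproduced with a few lines of code. The function name `gsp_payoff` is illustrative, bidders are indexed from 0, and the minimal increment is ignored, as in the text.

```python
def gsp_payoff(bidder, values, bids, ctrs):
    """Hourly payoff of `bidder` under GSP, ignoring the minimal increment."""
    # Rank bidders by decreasing bid; the k-th ranked bidder gets slot k.
    order = sorted(range(len(bids)), key=lambda j: bids[j], reverse=True)
    slot = order.index(bidder)
    if slot >= len(ctrs):
        return 0  # the bidder wins no slot
    # The price per click is the bid of the next-ranked bidder (0 if none).
    next_bid = bids[order[slot + 1]] if slot + 1 < len(order) else 0
    return (values[bidder] - next_bid) * ctrs[slot]


values = [10, 4, 2]   # values per click of bidders 1, 2, 3
ctrs = [200, 199]     # clicks per hour of the two positions

truthful = gsp_payoff(0, values, [10, 4, 2], ctrs)
shaded = gsp_payoff(0, values, [3, 4, 2], ctrs)
print(truthful, shaded)
```

Bidder 1 gives up a single click per hour by moving to the second position but pays $2 instead of $4 per click, which is why shading is profitable here.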

It turns out that the GSP auction does not maintain the truthfulness of the second-price auction, because it fails to capture accurately the underlying principle that characterizes the second-price auction. In particular, the GSP auction follows a pricing scheme, where a bidder who wins an object pays an amount of money that equals the bid right below. Indeed, this constitutes a straightforward generalization of the second-price auction which determines a price for the winner equal to the second-highest bid. However, generalizing the second-price in this direction yields an auction that is not truthful. In order to come up with an auction design that satisfies truthfulness, one must first get a better insight into the right interpretation of the second-price mechanism.

Revisiting the second-price auction, we can indeed provide an alternative interpretation: the winner is requested to pay an amount of money equal to the externalities that it imposes on the others, i.e., the decreases in the valuations of other bidders because of its presence. To understand what this means, consider for a moment two bidders who bid for one item. Now, imagine that the highest bidder 1 with valuation \(v_1\) did not participate in the auction; then bidder 2 with valuation \(v_2\) would win the auction. Because of bidder 1’s presence, bidder 2 does not have the chance to acquire the item and increase its valuation by \(v_2\). In other words, bidder 1’s presence imposes a negative externality on bidder 2 equal to \(v_2\); to compensate for this, bidder 1 pays an amount equal to \(v_2\). So, the winner ends up paying the second-highest valuation, which is the standard second price.

In the case of a single item, the two interpretations yield identical mechanisms, namely the second-price auction. However, for multiple items, the resulting auctions are very different. The more straightforward generalization yields the GSP auction as we saw above, while the externality-based interpretation yields the Vickrey-Clarke-Groves (VCG) auction, named after William Vickrey [69], Edward H. Clarke [17], and Theodore Groves [37]. Contrary to GSP, VCG gives bidders an incentive to bid their true value, and is socially optimal, i.e., the bidder with the highest valuation acquires the highest position, the bidder with the second-highest valuation receives the second-highest position, etc.

We illustrate how the two auctions work in sponsored search with an example taken from [26]. Suppose there are two slots on a page and three bidders. An ad in the first slot receives 200 clicks per hour, while the second slot gets 100. Bidders 1, 2, and 3 have values per click of $10, $4, and $2, respectively, and assume that they bid truthfully. Payments per click in GSP are $4 and $2 (plus a tiny increment), so the total payments of bidders 1 and 2 are $800 and $200, respectively. Let us now compute VCG payments for this example. The second bidder’s payment is $200, as in GSP. However, the payment of the first advertiser is now $600: $200 for the externality that it imposes on bidder 3 (by forcing it out of position 2) and $400 for the externality that it imposes on bidder 2 (by moving it from position 1 to position 2 and thus causing it to lose (200−100) = 100 clicks per hour). The total revenues under GSP are $1000, whereas the total payments under VCG are $800. Indeed, it turns out that if advertisers bid truthfully, then revenues under GSP are at least as high as under VCG [26].
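The payments in this example follow a simple pattern that can be sketched in code. The sketch assumes that clickthrough rates decrease with position, that there are at least as many bidders as slots, and that bids are truthful; the VCG payment of the bidder in position k is then the sum, over all lower positions m, of the bid ranked m times the clicks that this bidder loses by being pushed down one position. The function names are illustrative.

```python
def gsp_payments(bids, ctrs):
    """Total hourly payment per position under GSP (increment ignored)."""
    b = sorted(bids, reverse=True) + [0] * len(ctrs)  # pad with zero bids
    return [b[k + 1] * ctrs[k] for k in range(len(ctrs))]


def vcg_payments(bids, ctrs):
    """Total hourly payment per position under VCG for a position auction."""
    b = sorted(bids, reverse=True) + [0] * len(ctrs)
    r = list(ctrs) + [0]  # a bidder pushed past the last slot gets no clicks
    slots = len(ctrs)
    return [
        # Externality on every lower-ranked bidder m: it loses r[m-1] - r[m] clicks.
        sum(b[m] * (r[m - 1] - r[m]) for m in range(k + 1, slots + 1))
        for k in range(slots)
    ]


bids, ctrs = [10, 4, 2], [200, 100]
print(gsp_payments(bids, ctrs), sum(gsp_payments(bids, ctrs)))
print(vcg_payments(bids, ctrs), sum(vcg_payments(bids, ctrs)))
```

Position by position, the GSP payment is never below the VCG payment for the same bids, and the two coincide in the last position, as the $200 payment of bidder 2 shows.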

4.3 Recent Market Development

Two striking features of the recent development in the sponsored search market are consolidation and convergence. In February 2010, Microsoft and Yahoo! announced a partnership agreement under the name Search Alliance, whereby both Yahoo!’s algorithmic and paid search platforms would be powered by Microsoft37. This effectively leaves the US market with only two big players in the paid search advertising field. Moreover, although search engines are quite opaque about their auction protocols, it turns out that all major search engines have followed in Google’s footsteps: they have now incorporated an ad quality score in both the ranking and pricing schemes, and they use GSP auctions to sell keywords. One could say that Google, as the market leader, has paved the way that others have followed. This is also evident in the various practical aspects that we explored in Section 3, such as account and ad structure, keyword match, bidding expressivity, etc. Both consolidation and convergence are observed in all big markets, so it should come as no surprise that this is also the case with the market for sponsored search advertising.

Interestingly, GSP rather than VCG is used in practice, even though the latter would (at least theoretically) diminish incentives for strategizing and facilitate the advertisers’ task. Edelman et al. [26] attribute this to several reasons. First, VCG is hard to communicate to typical advertisers. Second, switching to VCG may entail substantial transition costs. VCG revenues are lower than GSP revenues for the same bids, and bidders might be slow to stop shading their bids. Third, the revenue consequences of switching to VCG are largely unpredictable. We believe that the introduction of the ad quality score has also played a role in the wide adoption of GSP. Indeed, the ad quality scores are now an integral part of both the ranking and pricing protocols; even if advertisers manipulate their bids, it is very difficult to game the system since they have no control over the ad quality scores.

5 Matching Markets, VCG, and the GSP Procedure

37 See http://www.searchalliance.com/apac/en/yahoo-and-microsoft-to-implement-search-alliance (accessed March 7, 2013).

In the previous Section, we discussed how after years of testing and experimentation, the sponsored search industry adopted the GSP protocol. We also contrasted GSP to the truthful VCG mechanism, and studied the underlying principles that characterize the two auction designs. In this Section, we will investigate in more depth the structure and properties of the VCG mechanism, and will explain in what ways GSP and VCG are structurally related. In this direction, we will introduce a very interesting and well-studied class of models, namely, matching markets. This Section is based on Chapters 10 and 15 of [24], which together provide an excellent treatment of markets and auctions.

5.1 Matching Markets

A market refers to any one of a variety of systems, institutions, procedures, social relations and infrastructures whereby parties engage in an exchange. We focus on two-sided markets with exactly two sets of agents, the sellers, each with an item for sale, and the buyers, each of whom wants to acquire an item. We consider that a buyer j has a valuation \(v_{ij}\) for the item held by seller i, with the subscripts i and j indicating that the valuation depends on the identities of both the seller i and the buyer j. We will also assume that each valuation is a nonnegative whole number. We assume that the sellers have a valuation of 0 for each item; they care only about receiving payments from the buyers.

To make the notion of payment clear, consider that each seller i puts its item up for sale for a price \(p_i \ge 0\). If a buyer j buys the item from seller i at this price, then its payoff is equal to its valuation for the item, minus the amount of money it had to pay to acquire the item, i.e., \(v_{ij} - p_i\). Given a set of prices, one for each item, we assume that a buyer j wants to buy from the seller i for which its payoff is maximized. Two remarks are in order. First, if the payoff is maximized in a tie between several sellers, then the buyer is indifferent to the identity of the seller and may choose any of them. Second, if the payoff is negative for every choice of seller i, then the buyer does not transact and thus gets a payoff of 0. We call the seller or sellers that maximize the payoff for buyer j the preferred sellers of buyer j, provided the payoff from these sellers is nonnegative; otherwise, we say that buyer j has no preferred seller. For a set of prices, we define the preferred-seller graph on buyers and sellers by simply constructing an edge between each buyer and their preferred seller(s) in the corresponding bipartite graph.

Panel   Prices (a, b, c)   Buyer x     Buyer y    Buyer z
(a)     none               12, 4, 2    8, 7, 6    7, 5, 2    (valuations)
(b)     5, 2, 0            7, 2, 2     3, 5, 6    2, 3, 2    (payoffs)
(c)     2, 1, 0            10, 3, 2    6, 6, 6    5, 4, 2    (payoffs)
(d)     3, 1, 0            9, 3, 2     5, 6, 6    4, 4, 2    (payoffs)

Fig. 4. Preferred-seller graphs for different sets of prices (adapted from [24]). Each triple lists the entries for sellers a, b, and c, in this order.

Figures 4(b)-4(d) depict the preferred-seller graphs for three different sets of prices when the buyer valuations are as in Figure 4(a). First, observe in Figure 4(b) that if each buyer simply claims its preferred item, then each buyer ends up with a different item; there is thus no contention for items. We call such a set of prices market-clearing, since they cause each item to get bought by a different buyer. Prices in Figure 4(c) are not market-clearing, because buyers x and z both want the item offered by the single seller a. Note that prices in Figure 4(d) are market clearing as well, in the sense that buyers can coordinate their actions so that each of them ends up with a different seller. Formally, a set of prices is market-clearing if the resulting preferred-seller graph has a perfect matching.
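The market-clearing test for the three price vectors of Figure 4 can be sketched as follows. The valuations are those of Figure 4(a); the perfect-matching check simply tries all assignments, which is an assumption of convenience that only suits tiny markets, and the function names are illustrative.

```python
from itertools import permutations

SELLERS = ["a", "b", "c"]
# Valuations of Figure 4(a): one entry per seller a, b, c.
VALUATIONS = {"x": [12, 4, 2], "y": [8, 7, 6], "z": [7, 5, 2]}


def preferred_sellers(valuations, prices):
    """Map each buyer to the set of sellers that maximize its payoff."""
    graph = {}
    for buyer, vals in valuations.items():
        payoffs = [v - p for v, p in zip(vals, prices)]
        best = max(payoffs)
        # A buyer with only negative payoffs has no preferred seller.
        graph[buyer] = ({s for s, pay in zip(SELLERS, payoffs) if pay == best}
                        if best >= 0 else set())
    return graph


def is_market_clearing(valuations, prices):
    """True if the preferred-seller graph contains a perfect matching."""
    graph = preferred_sellers(valuations, prices)
    buyers = list(graph)
    return any(all(s in graph[b] for b, s in zip(buyers, perm))
               for perm in permutations(SELLERS))


for prices in ([5, 2, 0], [2, 1, 0], [3, 1, 0]):
    print(prices, is_market_clearing(VALUATIONS, prices))
```

For larger markets, the brute-force search would be replaced by a standard bipartite matching algorithm.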

A natural question that arises when discussing market-clearing prices is whether they always exist for any set of buyer valuations. This question was answered affirmatively by the economists Demange, Gale, and Sotomayor in [20], who described an auction-like procedure that takes as input an arbitrary set of buyer valuations, and arrives at market-clearing prices. In fact, their method is equivalent to a construction of market-clearing prices discovered by the Hungarian mathematician Egerváry in 1916 [51]. A second natural question concerns the social welfare of the resulting assignment, i.e., the total valuation of the resulting assignment. Interestingly, one can prove that for any set of market-clearing prices, a perfect matching in the resulting preferred-seller graph is socially optimal, i.e., has the maximum total valuation of any assignment of sellers to buyers.
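A minimal sketch of such an auction-like procedure follows, assuming the ascending-price variant presented in Chapter 10 of [24]: start from zero prices, and as long as the preferred-seller graph has no perfect matching, find a constricted set of buyers (a set with fewer preferred sellers than members), raise the price of each of those sellers by one unit, and shift all prices so that the smallest is zero. The brute-force searches and the function names are assumptions made for brevity; valuations are given per buyer, one entry per seller.

```python
from itertools import combinations, permutations


def preferred(valuations, prices):
    """Preferred-seller sets, one per buyer, as sets of seller indices."""
    graph = []
    for vals in valuations:
        payoffs = [v - p for v, p in zip(vals, prices)]
        best = max(payoffs)
        graph.append({i for i, pay in enumerate(payoffs)
                      if pay == best and best >= 0})
    return graph


def has_perfect_matching(graph):
    n = len(graph)
    return any(all(perm[j] in graph[j] for j in range(n))
               for perm in permutations(range(n)))


def constricted_neighbors(graph):
    """Sellers preferred by some set of buyers that outnumbers them."""
    n = len(graph)
    for size in range(1, n + 1):
        for buyers in combinations(range(n), size):
            neighbors = set().union(*(graph[j] for j in buyers))
            if len(neighbors) < size:
                return neighbors
    return None


def clearing_prices(valuations):
    prices = [0] * len(valuations)
    while True:
        graph = preferred(valuations, prices)
        if has_perfect_matching(graph):
            return prices
        for i in constricted_neighbors(graph):
            prices[i] += 1                    # over-demanded sellers raise prices
        low = min(prices)
        prices = [p - low for p in prices]    # keep the smallest price at zero


print(clearing_prices([[12, 4, 2], [8, 7, 6], [7, 5, 2]]))
```

On the valuations of Figure 4(a), this sketch stops at the prices of Figure 4(d); other choices of constricted set could in principle lead to a different market-clearing price vector.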

5.2 Sponsored Search as a Matching Market

It is now straightforward to establish a connection between matching markets and the market for sponsored search. The advertising slots are the inventory that the search engine is trying to sell to the advertisers. One subtlety is that in the sponsored search market there is one seller (the search engine) that puts up several items (ad slots) for sale to potential buyers (advertisers), while the matching market model assumes several sellers, each of them associated with exactly one item. It turns out that this is not important: the market-clearing prices and the perfect matching in the preferred-seller graph are computed as before, while the unique seller (search engine) collects all payments. From the advertiser’s side, we assume that each advertiser j has a revenue per click \(v_j\), i.e., the expected amount of money it receives per user who clicks on the ad. We further assume that this value is intrinsic to the advertiser and does not depend on what was being shown on the page when the user clicked on this ad; for instance, we do not take into account ad externalities. To complete the construction of the matching market, we need to determine the valuation \(v_{ij}\) that advertiser j has for slot i. This corresponds to the benefit that j receives from being shown in slot i, and it depends not only on the advertiser’s revenue per click, but on the actual number of received clicks as well. Usually, we model the advertiser’s valuation for slot i as \(v_{ij} = q_j r_i v_j\), where \(r_i\) is the clickthrough rate of slot i, and \(q_j\) is a quality factor for advertiser j.
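As a small illustration of this valuation model, the sketch below builds the matrix of slot valuations from clickthrough rates, quality factors, and revenues per click, and then finds the welfare-maximizing assignment by brute force. The numbers and function names are hypothetical and chosen only to show how a low quality factor can change which advertiser gets the top slot.

```python
from itertools import permutations


def slot_valuations(ctrs, qualities, revenues):
    """Return v[i][j] = q_j * r_i * v_j for slot i and advertiser j."""
    return [[q * r * v for q, v in zip(qualities, revenues)] for r in ctrs]


def best_assignment(valuations):
    """Brute-force the assignment of advertisers to slots (one advertiser
    index per slot) that maximizes the total valuation."""
    slots = range(len(valuations))
    advertisers = range(len(valuations[0]))
    return max(permutations(advertisers, len(valuations)),
               key=lambda perm: sum(valuations[i][perm[i]] for i in slots))


ctrs = [200, 100]       # clicks per hour of slots 1 and 2
revenues = [10, 4, 2]   # revenue per click of advertisers 1, 2, 3

equal_quality = slot_valuations(ctrs, [1.0, 1.0, 1.0], revenues)
low_quality = slot_valuations(ctrs, [0.25, 1.0, 1.0], revenues)
print(best_assignment(equal_quality))  # advertiser indices for slots 1 and 2
print(best_assignment(low_quality))
```

With equal quality factors the advertisers are ranked by revenue per click; once the first advertiser's quality factor drops to 0.25, its effective value per click (2.5) falls below that of the second advertiser (4), and the two swap slots.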

At this point, we have constructed a matching market with well-defined ingredients. The participants consist of a set of buyers and a seller that puts up several items for sale. Each buyer j has a valuation \(v_{ij}\) for item i, which depends on the identities of both the buyer and the item. The goal is to match up buyers with items, in a way that no buyer purchases two different items and the same item isn’t sold to two different buyers. According to the discussion in Section 5.1, this is always possible by constructing a set of market-clearing prices that produce a perfect matching in the preferred-seller graph. Moreover, the assignment of buyers to sellers in this perfect matching always maximizes the social welfare, i.e., the advertisers’ total valuation for the items they get.

5.3 VCG Prices and the Market-Clearing Property

Unfortunately, the construction of prices that we have just described can be carried out by the search engine, only if it knows the valuations of the advertisers. In practice, the search engine does not have a way to find out these valuations, and has to rely on truthful reporting from the advertisers’ side. The real challenge is then to design a price-setting procedure, where the advertisers have an incentive to report truthfully, in the sense that they cannot receive a higher payoff by misreporting. And this is exactly where the VCG mechanism that we discussed in Section 4.2 comes into play.

Recall that under the VCG principle, we first assign items to buyers so as to maximize the total valuation (social welfare). Then, the price buyer j should pay for item i it receives is equal to the externality that it imposes on the other buyers through its acquisition of this item. To make this clear, let S denote the set of sellers and B denote the set of buyers. Let \(V_B^S\) denote the maximum total valuation over all possible perfect matchings of sellers and buyers. Now, let \(S-i\) denote the set of sellers with the seller i removed, and let \(B-j\) denote the set of buyers with buyer j removed. Then, if we give item i to buyer j, the best total valuation the rest of the buyers could get is \(V_{B-j}^{S-i}\). On the other hand, if buyer j simply did not exist but item i were still an option for everyone else, then the best total valuation the rest of the buyers could get is \(V_{B-j}^{S}\). Thus, the total harm caused by buyer j to the rest of the buyers is the difference between how they would do without j present and how they do with j present, i.e., the difference \(V_{B-j}^{S} - V_{B-j}^{S-i}\). This is the VCG price that we charge buyer j for item i: \(p_{ij} = V_{B-j}^{S} - V_{B-j}^{S-i}\).

Interestingly, if items are assigned and prices computed according to the VCG mechanism, then two very interesting properties hold. First, the resulting assignment of buyers to sellers maximizes the total valuation of any perfect matching of items and buyers; this is easy to justify, since the VCG mechanism is designed to maximize the total valuation. Second, truthfully announcing a valuation is a dominant strategy for each buyer, i.e., each buyer has an incentive to report truthfully irrespective of what all other buyers report. An immediate consequence in sponsored search auctions is that the advertisers do not have an incentive to strategize their bids, and the search engine can get to know the advertisers’ true valuations. To explain the VCG price construction, let’s get back to the matching market of Figure 4(a). First, we determine that the matching that maximizes the total buyers’ valuation assigns seller a to buyer x, b to z, and c to y. The matching of maximum valuation suggests how sellers and buyers match up. The second step is to compute the price every buyer has to pay for its assigned item. For instance, to determine the price that x must pay, first note that once x is assigned to a, the maximum total valuation among all matchings between the remaining sellers and buyers would be 11, by matching y to c and z to b. On the other hand, if x were not present, then the maximum total valuation possible would be 14, by matching y to b and z to a. The difference between these two quantities is the VCG price for buyer x: 14−11=3. Similarly, we can compute that the prices for buyers z and y are 1 and 0, respectively. Interestingly, this set of prices is the same as in Figure 4(d), where we can see that the prices are market-clearing, since they yield a perfect matching in the induced preferred-seller graph. Later, we see that this is not a mere coincidence.
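The VCG computation for the market of Figure 4(a) can be reproduced with a brute-force sketch. It enumerates all matchings, which is only reasonable for very small markets, and the function names `best_total` and `vcg` are illustrative.

```python
from itertools import permutations

# Valuations of Figure 4(a): buyer -> {seller: valuation}.
VALUATIONS = {
    "x": {"a": 12, "b": 4, "c": 2},
    "y": {"a": 8, "b": 7, "c": 6},
    "z": {"a": 7, "b": 5, "c": 2},
}
SELLERS = ["a", "b", "c"]


def best_total(valuations, buyers, sellers):
    """Maximum total valuation when each buyer gets a distinct seller."""
    if not buyers:
        return 0
    return max(sum(valuations[b][s] for b, s in zip(buyers, perm))
               for perm in permutations(sellers, len(buyers)))


def vcg(valuations, sellers):
    """Return the welfare-maximizing assignment and the VCG price per buyer."""
    buyers = list(valuations)
    best = max(permutations(sellers, len(buyers)),
               key=lambda perm: sum(valuations[b][s]
                                    for b, s in zip(buyers, perm)))
    assignment = dict(zip(buyers, best))
    prices = {}
    for j, i in assignment.items():
        others = [b for b in buyers if b != j]
        # How the others would do if buyer j were absent ...
        without_j = best_total(valuations, others, sellers)
        # ... versus how they do once item i has been given to j.
        with_j = best_total(valuations, others, [s for s in sellers if s != i])
        prices[j] = without_j - with_j
    return assignment, prices


assignment, prices = vcg(VALUATIONS, SELLERS)
print(assignment)
print(prices)
```

Buyers x, z, and y receive items a, b, and c at prices 3, 1, and 0, i.e., the price vector (3, 1, 0) over sellers a, b, c.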

Comparing the market-clearing prices that we discussed in Section 5.1 and the VCG prices, we can notice that there is a crucial difference between them. The former are posted prices, in that the seller simply announces a price and is willing to charge it to any buyer who is interested. The VCG prices, on the other hand, are personalized prices: they depend on both the item being sold and the buyer to whom it is being sold. The VCG price \(p_{ij}\) paid by buyer j for item i may well differ from the VCG price \(p_{ik}\) that buyer k would pay for the same item.

It turns out, however, that market-clearing and VCG prices are in fact related. First, despite their definition as personalized prices, VCG prices are always market-clearing. That is, suppose we were to compute the VCG prices for a given matching market, first determining a matching of maximum total valuation, and then computing the corresponding VCG prices. Then, however, assume we go on to post the prices publicly: rather than requiring the buyers to follow the matching used in the VCG construction, we allow any buyer to buy any item at the indicated price. Despite this seemingly greater freedom, each buyer will in fact achieve the highest payoff by selecting the item it was assigned when the VCG prices were determined. In other words, VCG prices are market