
Using probabilistic programming to extend Bayesian networks: predicting product suc-

In document Exploring Data Science (Page 166-168)

Modeling dependencies with Bayesian and


Listing 5.3 Product success prediction model in Figaro

// Imports assumed from Figaro's standard package layout.
import com.cra.figaro.language._
import com.cra.figaro.library.atomic.discrete.{Binomial, Poisson}

// A Network has a single attribute defined by a Poisson element (see text).
class Network(popularity: Double) {
  val numNodes = Poisson(popularity)
}

// The Model class takes the known control parameters as arguments.
class Model(targetPopularity: Double, productQuality: Double, affordability: Double) {
  // Recursive process for generating the number of people who like the product (see text).
  def generateLikes(numFriends: Int, productQuality: Double): Element[Int] = {
    def helper(friendsVisited: Int, totalLikes: Int, unprocessedLikes: Int): Element[Int] = {
      if (unprocessedLikes == 0) Constant(totalLikes)
      else {
        val unvisitedFraction = 1.0 - (friendsVisited.toDouble - 1) / (numFriends - 1)
        val newlyVisited = Binomial(2, unvisitedFraction)
        val newlyLikes = Binomial(newlyVisited, Constant(productQuality))
        Chain(newlyVisited, newlyLikes,
          (visited: Int, likes: Int) =>
            helper(friendsVisited + visited, totalLikes + likes, unprocessedLikes + likes - 1))
      }
    }
    helper(1, 1, 1)
  }

  // The target social network is a random network, based on the target's popularity.
  val targetSocialNetwork = new Network(targetPopularity)
  // Whether the target likes the product is a Boolean element based on the product quality.
  val targetLikes = Flip(productQuality)
  // If the target likes the product, calculate the number of friends using generateLikes.
  // If she does not, she does not tell friends about it, so the number is 0.
  val numberFriendsLike = Chain(targetLikes, targetSocialNetwork.numNodes,
    (l: Boolean, n: Int) => if (l) generateLikes(n, productQuality) else Constant(0))
  // The number of friends who buy the product is a binomial (see text).
  val numberBuy = Binomial(numberFriendsLike, Constant(affordability))
}

Three details of the code need additional explanation: the Poisson element used in the Network class, the generateLikes process, and the definition of numberBuy. Let’s first talk about the Poisson element and the numberBuy logic and then get to the generateLikes process, which is the most interesting part of the model.

■ A Poisson element is an integer element that uses what is known as the Poisson distribution. The Poisson distribution is typically used to model the number of occurrences of an event in a period of time, such as the number of network failures in a month or the number of corner kicks in a game of soccer. With a little creativity, the Poisson distribution can be used to model any situation where you want to know the number of things in a region. Here, you use it to model the number of people in someone’s social network, which is different from the usual usage but still a reasonable choice.

The Poisson element takes as an argument the average number of occurrences you’d expect in that period of time, but allows for the number to be more or less than the average. In this model, the argument is the popularity of the target; the popularity should be an estimate of the average number of people you expect to be in the target’s social network.
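To get a feel for what "more or less than the average" means, here is a minimal plain-Scala sketch that draws Poisson samples without Figaro. It uses Knuth's multiplication method, which is a swapped-in textbook sampler and not how Figaro's Poisson element is implemented; the names PoissonSketch, samplePoisson, and sampleMean are hypothetical and chosen only for this illustration.

```scala
import scala.util.Random

object PoissonSketch {
  // Knuth's multiplication method (an assumption for illustration, not Figaro's code):
  // count how many uniform draws can be multiplied together before the
  // running product falls to exp(-mean) or below.
  // Suitable for moderate means only, because exp(-mean) underflows for very large means.
  def samplePoisson(mean: Double, rng: Random): Int = {
    val limit = math.exp(-mean)
    var count = 0
    var product = rng.nextDouble()
    while (product > limit) {
      count += 1
      product *= rng.nextDouble()
    }
    count
  }

  // Average of many draws; this should land close to `mean`.
  def sampleMean(mean: Double, draws: Int, rng: Random): Double =
    Iterator.fill(draws)(samplePoisson(mean, rng)).sum.toDouble / draws
}
```

For example, PoissonSketch.samplePoisson(10.0, new Random(1)) returns one possible network size for a target with popularity 10, and PoissonSketch.sampleMean(10.0, 20000, new Random(1)) should come out close to 10.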

■ Here’s the logic for the number of people who buy the product. Each person who likes the product will buy it with a probability equal to the value of the affordability parameter. So the total number of people who buy is given by a binomial, in which the number of trials is the number of friends who like the product, and the probability of buying depends on the affordability of the product. Because the number of people who like the product is itself an element, you need to use the compound binomial that takes elements as its arguments. The compound binomial element requires that the probability of success of a trial also be an element, which is why the affordability is wrapped in a Constant. A Constant element takes an ordinary Scala value and produces the Figaro element that always has that value.
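The idea of a binomial whose number of trials is itself random can be sketched in plain Scala, again without Figaro. This is only a forward-sampling illustration under the assumption that the number of likes is supplied by some sampler function; BuyersSketch, sampleBinomial, and sampleNumberBuy are hypothetical names, not part of the Figaro API.

```scala
import scala.util.Random

object BuyersSketch {
  // One binomial draw: run `trials` independent trials,
  // each succeeding with probability p, and count the successes.
  def sampleBinomial(trials: Int, p: Double, rng: Random): Int =
    (1 to trials).count(_ => rng.nextDouble() < p)

  // The number of trials is random, so it is passed in as a sampler
  // (hypothetical stand-in for the numberFriendsLike element).
  // Each person who likes the product buys it with probability `affordability`.
  def sampleNumberBuy(sampleLikes: () => Int, affordability: Double, rng: Random): Int =
    sampleBinomial(sampleLikes(), affordability, rng)
}
```

Calling BuyersSketch.sampleNumberBuy(() => 5, 0.3, new Random(1)) draws the number of buyers when exactly five people like the product. Averaged over many draws, the number of buyers equals the affordability times the average number of people who like the product.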

■ The purpose of the generateLikes function is to determine the number of people who like the product after giving it to a target whose social network contains the given number of people. This function assumes that the target herself likes the product; otherwise, the function wouldn’t be called at all. The function simulates a random process of people promoting the product to their friends if they like the product. The generateLikes function takes two arguments: (1) the number of people in the target’s social network, which is an Int, and (2) the quality of the product, which is a Double between 0 and 1. The precise logic of the generateLikes function isn’t critical, because the main point is that you can use an interesting recursive function like this as a CPD. But I’ll explain the logic, so you can see an example. Most of the work of generateLikes is done by a helper function. This function keeps track of three values:

– friendsVisited holds the number of people in the target’s social network who have already been informed about the product. This starts at 1, because initially the target has been informed about the product.

– totalLikes represents the number of people, out of those who have been visited so far, who like the product. This also starts at 1, because you assume that the target likes the product for generateLikes to be called.

– unprocessedLikes represents the number of people who like the product for whom you’ve not yet simulated promoting the product to their friends.
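To see how these three values evolve, the following plain-Scala sketch runs one random pass of the same recursion with ordinary integers instead of Figaro elements. It is a forward simulation under stated assumptions, not the Figaro model itself: LikesSimulation and sampleBinomial are hypothetical names, and the guard for networks of at most one person and the clamping of the unvisited fraction at 0 are additions made here to keep the arithmetic well defined.

```scala
import scala.annotation.tailrec
import scala.util.Random

object LikesSimulation {
  // Binomial draw: count successes in `trials` independent trials.
  def sampleBinomial(trials: Int, p: Double, rng: Random): Int =
    (1 to trials).count(_ => rng.nextDouble() < p)

  // One random run of the promotion process; returns the total number of likes.
  def generateLikes(numFriends: Int, productQuality: Double, rng: Random): Int = {
    @tailrec
    def helper(friendsVisited: Int, totalLikes: Int, unprocessedLikes: Int): Int =
      if (unprocessedLikes == 0) totalLikes // nobody left to promote the product
      else {
        // Fraction of the network not yet informed.
        // Assumption: treated as 0 for tiny networks and clamped at 0.
        val unvisitedFraction =
          if (numFriends <= 1) 0.0
          else math.max(0.0, 1.0 - (friendsVisited.toDouble - 1) / (numFriends - 1))
        // The current promoter tells two friends; each is new with this probability.
        val visited = sampleBinomial(2, unvisitedFraction, rng)
        // Each newly informed friend likes the product with probability productQuality.
        val likes = sampleBinomial(visited, productQuality, rng)
        // One promoter has been processed; every new like adds a future promoter.
        helper(friendsVisited + visited, totalLikes + likes, unprocessedLikes + likes - 1)
      }
    // Start with the target herself: informed, liking the product, not yet processed.
    helper(1, 1, 1)
  }
}
```

A call such as LikesSimulation.generateLikes(20, 0.5, new Random(1)) returns one sampled total. With a product quality of 0 the result is always 1, because only the target likes the product and the process stops after she has been processed.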

