ASSIGNMENT: MODULE 10
ALGORITHM DESIGN
Input format: the input file has 12 columns with the following headings:
1. event_epoch_time
2. user_id
3. device_id
4. user_agent
5. pizza_name
6. isCheeseBurst
7. Size
8. AddedToppings (colon separated string)
9. Price
10. CouponCode
11. Order_Event
12. isVeg
1.
Map-only algorithm for filtering out all the records which have event_epoch_time, user_id,
device_id or user_agent as NULL, taking the original dataset as input.
/* Here Row[1], Row[2], Row[3]… denote the data in the columns event_epoch_time, user_id,
 * device_id and so on, with the index as shown in the list above.
 */
Map(Key, Value)
    Row = split(Value, ‘\t’)    // Here, ‘\t’ denotes a tab
    IF (Row[1] != NULL AND Row[2] != NULL AND Row[3] != NULL AND Row[4] != NULL)
        Write(Row)
    EXIT Map Function
/* This function outputs each retained Row; together the rows form a 2D array of the data
 * we got from the table.
 * From here onwards, Map(Key, Row) denotes that the output of this function is taken as
 * input for the map function.
 */
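The filter above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the name `filter_map` is hypothetical, a NULL is assumed to appear as an empty field or the literal text `NULL`, and Python indices are 0-based, so Row[1] to Row[4] of the pseudocode become `row[0]` to `row[3]`.

```python
# Assumption: a NULL shows up as an empty field or the literal text "NULL".
NULL_MARKERS = {"", "NULL"}


def filter_map(value):
    """Return the split row, or None if any of the first four fields is NULL."""
    row = value.rstrip("\n").split("\t")
    required = row[:4]  # event_epoch_time, user_id, device_id, user_agent
    if len(required) < 4:
        return None  # line too short to hold the four required fields
    if any(field.strip() in NULL_MARKERS for field in required):
        return None
    return row
```

Returning None for a rejected record plays the role of skipping the Write call in the pseudocode.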
2.
An algorithm to read the user agent and extract OS Version and Platform from it.
Map(Key, Row)    // Taking input from the output of question 1, one row at a time from the
                 // 2D array, making the input a 1D array
    OS_P = split(Row[4], ‘:’)
    OS_version = OS_P[2]    // Assuming the array’s index starts from 1 instead of 0
    Platform = OS_P[1]
    Write(OS_version, 1)
    Write(Platform, 1)
    // This will output the OS Version and Platform from user_agent
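A hedged Python sketch of the same extraction follows. It assumes, as the pseudocode does, that user_agent is a colon-separated string with the platform first and the OS version second; the function name `user_agent_map` and the sample value in the usage note are hypothetical.

```python
def user_agent_map(row):
    """Return (OS version, 1) and (platform, 1) pairs for one filtered row."""
    parts = row[3].split(":")  # user_agent is column 4, index 3 in Python
    if len(parts) < 2:
        return []  # malformed user agent: emit nothing
    platform, os_version = parts[0], parts[1]
    return [(os_version, 1), (platform, 1)]
```

For example, a row whose user_agent is `Android:4.4` would yield the pairs `("4.4", 1)` and `("Android", 1)`.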
3.
getCounter(“Orders”) creates a global variable of the same name if one is not already available.
3.1. To find out the number of veg and non-veg pizzas sold.
Map(Key, Row)    // Taking input from the output of question 1
    getCounter(“Veg”)
    getCounter(“Non-Veg”)
    IF (Row[12] == “Y”)
        getCounter(“Veg”).incrementBy(1)
    ELSE IF (Row[12] == “N”)
        getCounter(“Non-Veg”).incrementBy(1)
    ELSE
        EXIT Map Function
    EXIT Map Function
PRINT Veg
PRINT Non-Veg
/* The PRINT statements display the total number of veg/non-veg pizzas sold, since
 * Veg and Non-Veg are global variables. */
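The counter logic above can be mimicked outside a MapReduce framework. In the sketch below, a `collections.Counter` stands in for the framework's global counters; this is an assumption made for illustration, since getCounter is not a Python API, and the function name `count_veg` is hypothetical.

```python
from collections import Counter


def count_veg(rows):
    """Count veg and non-veg pizzas; rows with any other flag are skipped."""
    counters = Counter()  # stand-in for the global counters
    for row in rows:
        flag = row[11]  # isVeg is column 12, index 11 in Python
        if flag == "Y":
            counters["Veg"] += 1
        elif flag == "N":
            counters["Non-Veg"] += 1
    return counters
```

A Counter returns 0 for a key that was never incremented, which matches the behavior of a counter that is created but never updated.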
3.2 To find out the size wise distribution of pizzas sold
Map(Key, Row)    // Taking input from the output of question 1
    getCounter(“Small”)
    getCounter(“Medium”)
    getCounter(“Large”)
    getCounter(“Total”)
    IF (Row[7] == “R”)
        getCounter(“Small”).incrementBy(1)
        getCounter(“Total”).incrementBy(1)
    ELSE IF (Row[7] == “M”)
        getCounter(“Medium”).incrementBy(1)
        getCounter(“Total”).incrementBy(1)
    ELSE IF (Row[7] == “L”)
        getCounter(“Large”).incrementBy(1)
        getCounter(“Total”).incrementBy(1)
    ELSE
        EXIT Map Function
    EXIT Map Function

// Total, Small, Medium and Large are global variables
Total = Small + Medium + Large
Distribution_small = (Small / Total) * 100
Distribution_medium = (Medium / Total) * 100
Distribution_large = (Large / Total) * 100
PRINT Distribution_small, Distribution_medium, Distribution_large
// Prints the size-wise distribution as a percentage of the total pizzas sold
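The percentage step can be sketched in Python as below. The function name `size_distribution` is hypothetical, and the guard for an empty dataset is an addition that the pseudocode does not include; without it, a total of zero would cause a division by zero.

```python
def size_distribution(small, medium, large):
    """Return the share of each size as a percentage of all pizzas sold."""
    total = small + medium + large
    if total == 0:
        return (0.0, 0.0, 0.0)  # assumption: report zeros when nothing was sold
    return tuple(100 * count / total for count in (small, medium, large))
```

For instance, counts of 1 small, 1 medium and 2 large pizzas give shares of 25, 25 and 50 percent.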
3.3 To find out how many cheese burst pizzas were sold
Map(Key, Row)    // Taking input from the output of question 1
    getCounter(“Cheese_Burst_Total”)
    IF (Row[6] == “Y”)
        getCounter(“Cheese_Burst_Total”).incrementBy(1)
    ELSE
        EXIT Map Function
    EXIT Map Function
PRINT Cheese_Burst_Total
3.4 To find out how many small cheese burst pizzas were sold
Map(Key, Row)    // Taking input from the output of question 1
    getCounter(“Cheese_Burst_Small”)
    IF (Row[6] == “Y” AND Row[7] == “R”)
        getCounter(“Cheese_Burst_Small”).incrementBy(1)
    EXIT Map Function
PRINT Cheese_Burst_Small
// Ideally Cheese_Burst_Small will be 0, as cheese burst is available only for medium and
// large sizes. But if there is an error in data entry, it would be seen in this case.
3.5 To find out the number of cheese burst pizzas whose cost is below Rs 500
Map(Key, Row)    // Taking input from the output of question 1
    getCounter(“Cheese_Burst_Cheap”)
    IF (Row[6] == “Y” AND Row[9] < 500)
        getCounter(“Cheese_Burst_Cheap”).incrementBy(1)
    EXIT Map Function
PRINT Cheese_Burst_Cheap    // Prints the number of cheese burst pizzas sold below Rs. 500
4.
Assume the getCounter(“Orders”) function is not available, and write the algorithms for the
functions in question 3.
4.1 To find out the number of veg and non-veg pizzas sold.
Map(Key, Row)    // Taking input from the output of question 1
    IF (Row[12] == “Y”)
        Pizza_type = “Veg”
    ELSE IF (Row[12] == “N”)
        Pizza_type = “Non-Veg”
    ELSE
        EXIT Map Function    // neither veg nor non-veg: write nothing
    Write(Pizza_type, 1)
    EXIT Map Function

Reduce(Key, ValueList)    // Taking the aggregated output of the Map function as input
    Pizza_count = 0
    FOR i = 1 to ValueList.length
        Pizza_count = Pizza_count + ValueList[i]    // each value is 1
    Write(Key, Pizza_count)
    EXIT Reduce Function
// Output will be the number of veg/non-veg pizzas sold
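To see how the Map and Reduce functions fit together, the job can be simulated in plain Python. This is only a sketch under stated assumptions: `run_job` is a hypothetical in-memory stand-in for the framework's shuffle phase, which groups the mapper's values by key before the reducer sees them, and `make_row` is a hypothetical helper that builds a 12-column test row.

```python
from collections import defaultdict


def veg_map(row):
    """Emit ("Veg", 1) or ("Non-Veg", 1); emit nothing for any other flag."""
    flag = row[11]  # isVeg is column 12, index 11 in Python
    if flag == "Y":
        yield ("Veg", 1)
    elif flag == "N":
        yield ("Non-Veg", 1)


def count_reduce(key, values):
    """Sum the 1s that the mapper emitted for this key."""
    return key, sum(values)


def run_job(rows, mapper, reducer):
    """Group mapper output by key (the shuffle), then reduce each group."""
    grouped = defaultdict(list)
    for row in rows:
        for key, value in mapper(row):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in grouped.items())


def make_row(is_veg):
    """Build a 12-column row with only the isVeg field filled in."""
    row = [""] * 12
    row[11] = is_veg
    return row


rows = [make_row("Y"), make_row("N"), make_row("Y"), make_row("")]
print(run_job(rows, veg_map, count_reduce))
```

The same `run_job` helper would work for the other counting questions by swapping in a different mapper.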
4.2 To find out the size wise distribution of pizzas sold
Map(Key, Row)    // Taking input from the output of question 1
    IF (Row[7] == “R”)    // “R” denotes the regular (small) size, as in question 3
        Pizza_size = “Regular”
    ELSE IF (Row[7] == “M”)
        Pizza_size = “Medium”
    ELSE IF (Row[7] == “L”)
        Pizza_size = “Large”
    ELSE
        EXIT Map Function    // unknown size: write nothing
    Write(Pizza_size, 1)
    EXIT Map Function

Reduce(Key, ValueList)    // Taking the aggregated output of the Map function as input
    Size_count = 0
    FOR i = 1 to ValueList.length
        Size_count = Size_count + ValueList[i]    // each value is 1
    Write(Key, Size_count)
    EXIT Reduce Function

Distribution(Key_list, Size_count_list)    // Taking the output of the Reduce function as input
    FOR (i = 1 to 3) {
        IF (Key_list[i] == “Regular”)    // here Regular, Medium, Large are integer variables
            Regular = Size_count_list[i]
        IF (Key_list[i] == “Medium”)
            Medium = Size_count_list[i]
        IF (Key_list[i] == “Large”)
            Large = Size_count_list[i]
    }

    Total = Regular + Medium + Large
    Distribution_small = (Regular / Total) * 100
    Distribution_medium = (Medium / Total) * 100
    Distribution_large = (Large / Total) * 100

    PRINT Distribution_small, Distribution_medium, Distribution_large
4.3 To find out how many cheese burst pizzas were sold
Map(Key, Row)    // Taking input from the output of question 1
    IF (Row[6] == “Y”)
        Crust = “Cheese_burst”
    ELSE
        Crust = “other”
    Write(Crust, 1)
    EXIT Map Function

Reduce(Key, ValueList)    // Taking the aggregated output of the Map function as input
    CB_count = 0
    FOR i = 1 to ValueList.length
        CB_count = CB_count + ValueList[i]    // each value is 1
    IF (Key == “Cheese_burst”)
        Write(Key, CB_count)    // Output will be the total number of cheese burst
    ELSE                        // pizzas sold, else no output
        Return -1
    EXIT Reduce Function
4.4 To find out how many small cheese burst pizzas were sold.
Map(Key, Row)
    IF (Row[6] == “Y” AND Row[7] == “R”)
        CB_Pizza_Size = “Cheese burst Small”
    ELSE
        CB_Pizza_Size = “Cheese burst other”

    Write(CB_Pizza_Size, 1)
    EXIT Map Function

Reduce(Key, ValueList)    // Taking the aggregated output of the Map function as input
    CB_size_count = 0
    FOR i = 1 to ValueList.length
        CB_size_count = CB_size_count + ValueList[i]    // each value is 1

    IF (Key == “Cheese burst Small”)
        Write(Key, CB_size_count)
    ELSE
        Return -1
    EXIT Reduce Function
// Ideally CB_size_count will be 0, as cheese burst is available only for medium and large sizes.
// In that case every record is written under the “Cheese burst other” key and the Reduce
// function produces no output. But if there is an error in the data set, it would be seen in this case.
4.5 To find out the number of cheese burst pizzas whose cost is below Rs.500
Map(Key, Row)
    IF (Row[6] == “Y” AND Row[9] < 500)
        CB_cheap = “Cheese burst Price < 500”
    ELSE
        CB_cheap = “Other”    // Rs. 500 or more, or not a cheese burst pizza
    Write(CB_cheap, 1)
    EXIT Map Function

Reduce(Key, ValueList)    // Taking the aggregated output of the Map function as input
    CB_cheap_count = 0
    FOR i = 1 to ValueList.length
        CB_cheap_count = CB_cheap_count + ValueList[i]    // each value is 1
    IF (Key == “Cheese burst Price < 500”)
        Write(Key, CB_cheap_count)    // Output will be “Cheese burst Price < 500,
                                      // <CB_cheap_count’s value>”. Else no output
    ELSE
        Return -1
    EXIT Reduce Function
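The Map and Reduce pair above compares Price with 500, which in practice means converting a text field to a number first. The sketch below shows one way this might look in Python. The names `cheap_map` and `cheap_reduce` are hypothetical, the key for all non-matching records is assumed to be `Other`, and treating an unparsable price as non-matching is an assumption rather than something the pseudocode specifies.

```python
TARGET_KEY = "Cheese burst Price < 500"


def cheap_map(row, limit=500.0):
    """Label a row as a cheap cheese burst pizza or as "Other"."""
    try:
        price = float(row[8])  # Price is column 9, index 8 in Python
    except ValueError:
        return ("Other", 1)  # assumption: unparsable prices never match
    if row[5] == "Y" and price < limit:  # isCheeseBurst is column 6
        return (TARGET_KEY, 1)
    return ("Other", 1)


def cheap_reduce(key, values):
    """Return the count for the target key; return None for any other key."""
    if key != TARGET_KEY:
        return None
    return (key, sum(values))
```

Returning None here corresponds to the branch of the pseudocode that writes no output.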