Continuous Delivery for
Linux/Windows/Hadoop
Agenda
•
Background
•
Problem
•
Solution
•
Demo
•
Q & A
Who I am
•
Wisely Chen ( [email protected] )
•
Release manager of Yahoo! Taiwan shopping and data team
•
Enjoys promoting open source tech
•
Hadoop Summit 2013 San Jose•
Ruby and Rails (All in Taiwan): Coscup 2006, Ubisunrise 2007, OSDC 2007•
Release Practice (All in Taiwan): Webconf 2013, Coscup 2012,2013, PHPConf 2012 , RubyConf 2012Team
Sanching Yu
Windows
Neal Lee
Hadoop
Yahoo! Taiwan Shopping
•
Top B2C site in Taiwan
•
Monday.com acquired in 2008
•
8 engineering teams
•
15K files changed monthly
Platform
Backend
(Window)
Frontend
(Linux)
(Hadoop)
Data
Catalog Member Payment Cash Flow Logistic Supplier Machine Learning Business Intelligence Data Mart
Problem
Frontend
Team
Backend
Team
DBData
Team
Hadoop WebService APIWe need a
multi-platform
Solution
•
Create B&R process
•
Build CI infrastructure
Solution
•
Create B&R process
•
Build CI infrastructure
Deploy Check Point Perf Test Integration Test Unit Test
Build and Release Process
Deploy Check Point Perf Test Integration Test Unit Test
Hadoop Build and Release Process
Original Difference
Frontend Team
Backend Team
Data Team
Process
Scrum
Waterfall
Waterfall
OS
Linux
Windows
Linux
Language
PHP/JAVA
.NET
Pig/MapReduce
Issue Tracking
Bugzilla
Custom System
N/A
Repos
Subversion
VSS
N/A
CI
N/A
N/A
N/A
Int Test
N/A
N/A
N/A
Multiplatform Complexity Reduction
Frontend Team
Backend Team
Data Team
Process
Scrum(not finish yet)
Scrum(not finish yet)
Scrum(not finish yet)
OS
Linux
Windows
Linux
Language
PHP/JAVA
.NET
Pig/MapReduce
Issue Tracking
Bugzilla
Bugzilla
Bugzilla
Repos
Subversion(migrate to git)
Subversion(migrate to git)
Subversion(migrate to git)
Communication Cost
Solution
•
Create B&R process
•
Build CI infrastructure
WebSite CI Infrastructure
Jenkins Master Windows Jenkins Slave Farm Selenium Server Farm Linux Jenkins Slave FarmBeta Server Farm Windows
Farm
Linux Farm Staging Server Farm
Windows Farm
Linux Farm
Hadoop CI Infrastructure
Jenkins MasterAlpha
Jenkins SlaveBeta Cluster
Hadoop JobTracker Jenkins Slave Hadoop node Hadoop node Hadoop node HadoopnodeNodeSlave
Prod. Cluster
Gateway PigServer
Deploy Check Point Perf Test Integration Test Unit Test
Build and Release Process
PigUnit
•
A simple xUnit framework
•
No cluster set up is required in local mode
•
We change default PigUnit behavior
•
All PigUnit tests run on single pig server
Deploy Check Point Perf Test Integration Test Unit Test
Build and Release Process
Vaidya
•
Rule based performance diagnosis of M/R jobs
•
Extendable framework
•
You can add your own rules
Vaidya Performance Test
Pig Job Pig Job Conf Sampling DataVaidya Performance Test
Pig Job HistoryPig JobPig Job Conf
Sampling Data
Vaidya Performance Test
Pig Job HistoryPig JobVaidya
Vaidya Rule Pig Job Conf Sampling DataVaidya Performance Test
Pig Job HistoryPig JobVaidya
Vaidya Rule Pig Job Conf Notify User Sampling DataVaidya Performance Test
Pig Job HistoryPig JobVaidya
Vaidya Rule Pig Job Conf Notify User Performance Result Next CI Stage Sampling Data<Diagnos)cTest>
<Title><![CDATA[
Balanaced Reduce Partitioning
]]></Title>
//////<ClassName>
//////////<[CDATA[org.apache...BalancedReducePar))oning]]>
///////</ClassName>
///////<Descrip)on>
///////////<![CDATA[This/rule/tests/...]]>
///////</Descrip)on>
///////<Importance><![CDATA[
High
]]></Importance>
///////<SuccessThreshold><![CDATA[
0.20
]]></SuccessThreshold>
See/if/the/reduce/job/is/ balanced/or/not/
Vaidya Rule Example
Solution
•
Create B&R process
•
Build CI infrastructure
Deploy Check Point Perf Test Integration Test Unit Test
Build and Release Process
Before Deploy
•
Check bug status
•
Documention
•
Create a git tag
•
Auto doc generating on bugzilla
•
Deploy to production cluster
Auto comment in bugzilla
Repos url
Release Note
Auto create git tag
Release Note [Bug xxx] log....
Application Logic Monitor
0 25 50 75 100 2-4 4-6 6-8 8-10 10-12 12-14 14-16 16-18 18-20 20-22 22-24 24-2Today
Same day in last week
Same day in last month
Notify user of something wrong SalesNum
25 50 75 100
11Q4: Original
25 50 75 1000 25 50 75 100 11 Q4 12 Q1 12 Q2 12 Q3 12 Q4 13 Q1 13 Q2 13 Q3
Release Bug Rate
Release Throughput
Deploy Time
11Q4: Original
0 25 50 75 100 11 Q4 12 Q1 12 Q2 12 Q3 12 Q4 13 Q1 13 Q2 13 Q3 High Bug Rate(72%)25 50 75 100
11Q4: Original
25 50 75 1000 25 50 75 100 11 Q4 12 Q1 12 Q2 12 Q3 12 Q4 13 Q1 13 Q2 13 Q3
Release Bug Rate
Release Throughput
Deploy Time
25 50 75 100
12Q1 : Create B&R Process...
0 25 50 75 100 11 Q4 12 Q1 12 Q2 12 Q3 12 Q4 13 Q1 13 Q2 13 Q3
Release Bug Rate
Release Throughput
Deploy Time
25 50 75 100
0 25 50 75 100 11 Q4 12 Q1 12 Q2 12 Q3 12 Q4 13 Q1 13 Q2 13 Q3
Release Bug Rate
Release Throughput
Deploy Time
25 50 75 100
0 25 50 75 100 11 Q4 12 Q1 12 Q2 12 Q3 12 Q4 13 Q1 13 Q2 13 Q3
Release Bug Rate
Release Throughput
Deploy Time
12Q4 : Build CI Infrastructure...
Dramatic throughput increase
25 50 75 100
0 25 50 75 100 11 Q4 12 Q1 12 Q2 12 Q3 12 Q4 13 Q1 13 Q2 13 Q3
Release Bug Rate
Release Throughput
Deploy Time
25 50 75 100
0 25 50 75 100 11 Q4 12 Q1 12 Q2 12 Q3 12 Q4 13 Q1 13 Q2 13 Q3
Release Bug Rate
Release Throughput
Deploy Time
25 50 75 100
13Q3: Automate Everything...
Deploy time decrease0 25 50 75 100 11 Q4 12 Q1 12 Q2 12 Q3 12 Q4 13 Q1 13 Q2 13 Q3
Release Bug Rate
Release Throughput
Deploy Time
Hadoop CI Demo
•
Demo1 : Unit test
•
Demo2 : Performance test
Conclusion
•
This process can reduce bug rate
•
This process can increase productivity
22.5 45 67.5 90
ReleaseBugRate
ReleaseThroughput
DeployCostTime
Web Site CI Process
Backend
3 Team
(Window)
Cash FlowFrontend
2 Team
(Linux)
Catalog Mobile Member Payment Logistic Supplier CMSData
2 Team
(Hadoop)
Machine Learning Business Intelligence Data Mart CRMBuild and release process
PHP Code Commit PHP Unit Test Deploy Linux Beta Beta Auto Test Deploy PHP Staging Staging Auto Test Staging UAT Check Test Status Release Meeting .NET Code Commit .NET Unit Test Deploy Win Beta Deploy .NET Staging Deploy PHP Prod0 25 50 75 100 11 Q4 12 Q1 12 Q2 12 Q3 12 Q4 13 Q1 13 Q2 13 Q3
Release Bug Rate
Release Throughput
Deploy Cost Time
Throughput increased dramaticly
Bug rate down to 22%
25 50 75 100
Significant bug rate drop
0 25 50 75 100 11 Q4 12 Q1 12 Q2 12 Q3 12 Q4 13 Q1 13 Q2 13 Q3
Release Bug Rate
Release Throughput
Deploy Cost Time
Throughput increased dramaticly
Bug rate down to 22%
25 50 75 100
Deploy cost time drop
Two B&R Processes
Frontend
Team
Backend
Team
Team
Data
Website
Automate Everything
Check Point Script Bugzilla Source Repos Log Deploy Script Notify User ScriptBug List Yes
No Check Bug Status
Jenkins
Style/Static/ Similarity