The Next Leap in JavaScript Performance
Mohammad Reza Haghighat
Senior Principal Engineer, Intel Corporation
Agenda
• HTML5 - The New Lingua Franca?
• Exposing the full power of modern hardware to JavaScript*
• Bringing Perceptual Computing to the web platform
• Supporting JavaScript programming in Internet of Things (IoT)
• Summary
HTML5 – The New Lingua Franca?
• 1991, APPS: .exe (native code, PC spiral)
• 2001, WEB: HTML, Flash* (Web: “Write once, run on any browser”)
• 2009, APPS: iOS*, Android*, Windows* App Stores (walled gardens)
• 2015, WEB: HTML5 (“Write Once, Run Everywhere”)
“New open standards created in the mobile era, such as HTML5, will win on mobile devices.” – Steve Jobs
“If you want to do something that is universal, no question, world is going HTML5.” – Steve Ballmer
“It looks to me like HTML5 will eventually become a way almost all applications are built, including those on new phones.” – Eric Schmidt
Web: The Ubiquitous Software Platform
and the Application Model of the Future
Big Data
Rich Capabilities
& Content
Social
Contextual
Crowdsourced
Sensors
“Things”
Astounding JavaScript* Performance With asm.js
asm.js: a highly optimizable low-level subset of JavaScript
• Achieving ~1.5x native running time via targeting asm.js†, a highly optimizable subset of JavaScript defined by Mozilla
• Epic* Games Unreal Engine* 3: over 1M lines of C/C++ code compiled to JavaScript* by Mozilla* and Epic* (http://www.unrealengine.com/html5/)
• Toolchain: LLVM Bitcode, Emscripten, asm.js JavaScript*, web
• Very efficient code generated by Firefox* JIT
† Courtesy of Mozilla Alon Zakai & Luke Wagner (http://people.mozilla.org/~lwagner/gdc-pres/gdc-2014.html#/)
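As a rough illustration of the asm.js style of code, the sketch below hand-writes a tiny module. The module name `AsmMath` and its two functions are hypothetical and do not come from the talk. The `"use asm"` directive and the `|0` and unary `+` coercions are the type annotations an asm.js-aware engine can use to compile ahead of time; an engine without asm.js support simply runs the same code as ordinary JavaScript.

```javascript
// Hypothetical hand-written module in the asm.js style (not from the talk).
function AsmMath(stdlib) {
  "use asm";
  var sqrt = stdlib.Math.sqrt;

  // Integer add: `|0` marks the parameters and the result as 32-bit ints.
  function addInt(a, b) {
    a = a | 0;
    b = b | 0;
    return (a + b) | 0;
  }

  // Hypotenuse: unary `+` marks the values as doubles.
  function hypot(x, y) {
    x = +x;
    y = +y;
    return +sqrt(x * x + y * y);
  }

  return { addInt: addInt, hypot: hypot };
}

// The standard library object is passed in explicitly.
var asmMath = AsmMath({ Math: Math });
```

Because results are coerced back to 32-bit integers, `asmMath.addInt(2147483647, 1)` wraps around to `-2147483648`, which is the C-like behavior that compiled C/C++ code relies on.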
Modern processors utilize parallelism to deliver high performance within a constrained power budget
The March of Parallelism
[Figure: timeline of SIMD vector width across Intel processor generations (32 nm Tock, 22 nm Tick, 22 nm Tock, next gen)]
• 1X = 128-bit, since 2001
• Intel® Advanced Vector Extensions (AVX): 256-bit floating point
• AVX2: FMA and integer support
• Next gen, Intel® Xeon Phi™ AVX-512: 512-bit vectors
• 2X at each of three steps: 8X peak SIMD operations per core over 4 generations
Optimizing Web Runtimes for Parallelism
Web runtimes need to be parallel end-to-end
Pipeline stages: Parse + build DOM, JavaScript*, Layout Engine, Render
GPU: parallel
CPU: mainly single-threaded
Time breakdown: Render 35%, Layout 33%, Other 21%, JS 11%
• HTML5 runtimes of today are not scalable with number of cores
• Need parallelism for both responsiveness and energy efficiency
Parallel Parsing and Compilation
Background JIT compilers now in Chrome*, Firefox, Internet Explorer*, Safari*
PESPMA 2009
Four threads for JavaScript* parsing and compilation
JS and GFX execution: Epic* Citadel* profile on Firefox* (bootstrap and launch; 4 threads vs. 1 thread)
Cycle Breakdown by Categories (%): js::compile 43.6, gfx::compile 16.6, os::others 12.8, js::parse 6.7, js::others 6.4, browser::others 6.2, os::mem 4.6, js::jitted 2.2, gfx::exec 0.9
Layout Engine: a performance bottleneck
Mozilla* Firefox* Page-Load Tests
Zimbra* Collaboration Suite*
Layout Engine: ~42% execution
CSS rule matching, e.g. ul em {color:blue}: ~33% of the layout
HotPar 2010
Browser layout engine is a bottleneck but amenable to parallelism
Parallel JavaScript*
• Started at Intel Labs, now with Mozilla*
• Extends JavaScript* with a data-parallel API
• Designed for multi-core CPUs and GPUs
• Simple, portable, and secure
Array increment example:
Sequential: A.map(function(a) {return a+1;});
Parallel: A.mapPar(function(a) {return a+1;});
Accelerated animation of 3D avatars: more characters and more realism
Parallel JavaScript goal is to enable data-parallelism in web applications
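The array increment example works in parallel only because the function passed in is pure: each element can be processed without looking at any other. The sketch below illustrates that property in plain JavaScript. `mapChunked` is a hypothetical helper written for this illustration and is not part of the Parallel JavaScript API; it splits the array into independent chunks and processes them one after another, where a data-parallel runtime could hand each chunk to a different core.

```javascript
// Hypothetical helper (not the mapPar API): applies a pure function
// chunk by chunk. The chunks are independent of each other, which is
// the property a parallel runtime would exploit; here they simply run
// one after another on a single thread.
function mapChunked(array, fn, chunkCount) {
  var chunkSize = Math.ceil(array.length / chunkCount);
  var partials = [];
  for (var start = 0; start < array.length; start += chunkSize) {
    // A data-parallel runtime could process each slice on its own core.
    var chunk = array.slice(start, start + chunkSize);
    partials.push(chunk.map(function (a) { return fn(a); }));
  }
  // Concatenating in chunk order reproduces the sequential result.
  return Array.prototype.concat.apply([], partials);
}

var A = [1, 2, 3, 4, 5, 6, 7];
var sequential = A.map(function (a) { return a + 1; });
var chunked = mapChunked(A, function (a) { return a + 1; }, 3);
```

Both `sequential` and `chunked` hold the same values. If the callback had side effects, such as updating a shared counter, the chunks would no longer be independent and the parallel form would not be safe.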
SIMD – Single Instruction, Multiple Data
SIMD operations deliver great performance & power efficiency
Scalar Operation (four separate additions):
Cx = Ax + Bx
Cy = Ay + By
Cz = Az + Bz
Cw = Aw + Bw
SIMD Operation of Vector Length 4 (one addition):
(Cx, Cy, Cz, Cw) = (Ax, Ay, Az, Aw) + (Bx, By, Bz, Bw)
Intel® Architecture currently has SIMD operations of vector length 4, 8, 16
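The difference between the scalar and the vector form can be sketched in plain JavaScript over typed arrays. This is an emulation for illustration only: the function names are hypothetical, and plain JavaScript still executes the four lane additions one by one, whereas SIMD hardware would issue them as a single instruction.

```javascript
// Scalar version: one addition per loop iteration.
function addScalar(a, b) {
  var c = new Float32Array(a.length);
  for (var i = 0; i < a.length; i++) {
    c[i] = a[i] + b[i];
  }
  return c;
}

// Emulated vector-length-4 version: the loop advances four elements at
// a time. Each group of four additions corresponds to one SIMD add.
function addLanes4(a, b) {
  var c = new Float32Array(a.length);
  var i = 0;
  for (; i + 4 <= a.length; i += 4) {
    c[i] = a[i] + b[i];             // lane x
    c[i + 1] = a[i + 1] + b[i + 1]; // lane y
    c[i + 2] = a[i + 2] + b[i + 2]; // lane z
    c[i + 3] = a[i + 3] + b[i + 3]; // lane w
  }
  // Scalar tail for lengths that are not a multiple of 4.
  for (; i < a.length; i++) {
    c[i] = a[i] + b[i];
  }
  return c;
}

var laneA = new Float32Array([1, 2, 3, 4, 5, 6]);
var laneB = new Float32Array([10, 20, 30, 40, 50, 60]);
var scalarSum = addScalar(laneA, laneB);
var lanesSum = addLanes4(laneA, laneB);
```

The vector loop runs a quarter as many iterations, which is where the factor-of-4 ceiling for 128-bit vectors of 32-bit floats comes from.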
SIMD - A Gap Between JavaScript* and Native
SIMD in JavaScript further reduces the performance gap
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
A Google*/Intel/Mozilla* ECMA TC39 Joint Project
• Bugzilla*: https://bugzilla.mozilla.org/show_bug.cgi?id=894105
• John McCutchan’s strawman proposal: http://wiki.ecmascript.org/doku.php?id=strawman:simd_number
[Code panels: C++ code for list average; “Proposed” JavaScript* code; SIMD code by ICC]
SIMD.JS – The API
Our Firefox* Prototype
Our SIMD prototype delivers 3x~4x Mandelbrot speedup†
† Initial support for float32x4 and int32x4
Demo: Combining SIMD and Higher-Level Parallelism
Our Chromium* Prototype (WW: Number of WebWorkers)
SIMD speedup is nicely multiplied by WebWorkers†
† Source: Intel® Peter Jensen: https://github.com/PeterJensen/mandelbrot
SIMD Speedups on our Chromium* Prototype
[Figure: bar chart, SIMD x-times faster than non-SIMD (y-axis 0 to 14)]
Benchmarks: Transpose4x4, AOBench, Mandelbrot, MatrixMultiplication, VertexTransform, Average, ShiftRows, Matrix4x4Inverse
Platforms:
• 3rd Generation Intel® Core™ i7 processor (3667U) @ 2.00 GHz, 32-bit, Ubuntu* 13
• 3rd Generation Intel® Core™ i7 processor (3667U) @ 2.00 GHz, 64-bit, Ubuntu* 13
• Intel® Atom™ processor Z3770 @ 1.46GHz, Android* 4.4
Bar values, in extraction order: 3.2 3.6 3.8 3.9 4.6 5.0 6.0 9.5 3.2 3.8 3.4 6.1 6.5 5.0 5.6 11.8 6.8 3.1 2.7 4.5 4.2 3.8 5.4 9.3
Excellent early results while still focused on functionality
Theoretical speedup limit is 4
SIMD.JS benchmarks: https://github.com/johnmccutchan/ecmascript_simd/tree/master/src/benchmarks
SIMD.JS Proposal and Polyfill API
SIMD Number
(Google’s John McCutchan & Intel’s Peter Jensen):
http://wiki.ecmascript.org/doku.php?id=strawman:simd_number
Polyfill API:
https://github.com/johnmccutchan/ecmascript_simd
float32x4, int32x4, Float32x4Array, Int32x4Array
Constructors:
float32x4(x,y,z,w) float32x4.zero() float32x4.splat(s)
Operations:
abs, neg, add, sub, mul, div, clamp, min, max, reciprocal, reciprocalSqrt,
scale, sqrt, shuffle, shuffleMix, withX, withY, withZ, withW, lessThan, lessThanOrEqual,
equal, notEqual, greaterThanOrEqual, greaterThan, bitsToInt32x4, toInt32x4, …
The joint Google*/Intel/Mozilla* SIMD.JS proposal was approved to advance to the next stage of ECMAScript* TC39 standardization†
† A copy of the TC39 Presentation: http://esdiscuss.org/notes/2014-07/simd-128-tc39.pdf
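To give a feel for the constructors and operations listed above, here is a toy emulation in plain JavaScript. It borrows the names `float32x4`, `zero`, `splat`, `add`, and `mul` from the slide, but it is not the polyfill from the linked repository, and how the real API attaches its operations may differ. The `average` function and its assumption of a non-empty list whose length is a multiple of 4 are choices made for this sketch.

```javascript
// Toy emulation of a 4-lane float vector; NOT the real polyfill.
// Math.fround rounds each lane to single precision, as a float32 would.
function float32x4(x, y, z, w) {
  return {
    x: Math.fround(x),
    y: Math.fround(y),
    z: Math.fround(z),
    w: Math.fround(w)
  };
}
float32x4.zero = function () { return float32x4(0, 0, 0, 0); };
float32x4.splat = function (s) { return float32x4(s, s, s, s); };
float32x4.add = function (a, b) {
  return float32x4(a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w);
};
float32x4.mul = function (a, b) {
  return float32x4(a.x * b.x, a.y * b.y, a.z * b.z, a.w * b.w);
};

// Average of a list using four running partial sums, one per lane.
// Assumes a non-empty list whose length is a multiple of 4.
function average(list) {
  var sum = float32x4.zero();
  for (var i = 0; i < list.length; i += 4) {
    var v = float32x4(list[i], list[i + 1], list[i + 2], list[i + 3]);
    sum = float32x4.add(sum, v);
  }
  // Horizontal step: combine the four lane sums at the end.
  return (sum.x + sum.y + sum.z + sum.w) / list.length;
}

var avg = average([1, 2, 3, 4, 5, 6, 7, 8]);
var doubled = float32x4.mul(float32x4.splat(2), float32x4(1, 2, 3, 4));
```

Keeping four partial sums and combining them once at the end is the usual way to vectorize a reduction: the loop body stays a pure lane-wise add.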
Emscripten now targets SIMD.JS
Emscripten generates SIMD.JS from C++ SIMD intrinsics & auto-vectorized code
Near-native SIMD.JS speedup
[Figure: bar chart, Speedup over Scalar JS (y-axis 0 to 10), C/C++ vs. JavaScript*]
• Scalar JS: 1.00
• Scalar C++: 2.03
• SIMD JS: 7.18
• SIMD C++: 8.13
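The "near-native" claim can be checked with a little arithmetic on the four chart values. The snippet below assumes the values pair with the labels in the order they appear in the chart (Scalar JS, Scalar C++, SIMD JS, SIMD C++); the variable names are made up for this illustration.

```javascript
// Speedups over scalar JS, as read from the chart on this slide.
// The pairing of values to labels is an assumption based on their order.
var speedup = { scalarJS: 1.00, scalarCpp: 2.03, simdJS: 7.18, simdCpp: 8.13 };

// Fraction of native SIMD performance reached by SIMD.JS (about 0.88).
var simdJsVsNative = speedup.simdJS / speedup.simdCpp;

// The same comparison for scalar code (about 0.49).
var scalarJsVsNative = speedup.scalarJS / speedup.scalarCpp;

// Gain from SIMD within each language.
var simdGainJs = speedup.simdJS / speedup.scalarJS;    // 7.18
var simdGainCpp = speedup.simdCpp / speedup.scalarCpp; // about 4.0
```

Under that reading, scalar JavaScript runs at roughly half of native speed on this benchmark while SIMD.JS reaches close to nine tenths of native SIMD, so the gap narrows considerably once both sides vectorize.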
Crosswalk in Brief
Application Runtime
• Open Source, using Blink* & Chromium*
• Today on Android* and Tizen*
• Easy addition of extensible APIs
• Easy access to device APIs
• Intel® platform capabilities
• Latest HTML5 features in packaged web apps
• Focuses on security, performance and standards compliance
• Based on web technologies: HTML5, CSS3, JavaScript*
• Updated & released to the latest Chromium every 6 weeks
Follow us at @xwalk_project
crosswalk-project.org
Intel® XDK – Cross-platform Development Kit
Develop, debug, profile, and build responsive web & hybrid apps
Remote debugging & profiling
Free at http://xdk.intel.com
Toward Perceptual Computing†
Devices sense & perceive user actions in a natural & intuitive way
• Speech Recognition
• Close-Range Tracking
• Gesture Recognition
• 2D/3D Object Tracking
• Facial Analysis
† Source: Intel® Perceptual Computing SDK: www.intel.com/software/perceptual
Reinventing Everyday Usages
Perceptual Computing opens up new dimensions in interacting with machines
• Learning & Education
• 3D Scanning and Sharing: Scan it, Share it, Customize & Print it
• Immersive Collaboration
• Gaming
• Out-of-reach Device Input
Demos: Media Capture Depth Stream Extension†
† Source: Intel® Ningxin Hu: https://github.com/huningxin/depth_stream_examples
WebRTC Google* Code: http://webrtc.googlecode.com/svn/trunk/samples/js/demos/html/
Magic Xylophone: Soundstep*.com: http://www.soundstep.com/blog/experiments/jsdetection/
Enabling 3D Camera on Web Platform
3D Camera
• Beyond color: additional per-pixel distance
• Intel® RealSense™ on PC & tablets soon
Applications
• Real-time hand/finger/object tracking
• 3D scanning
• Video conferencing
Depth on Web Platform†
• Media Capture Depth Stream Extension
• Rendering & post-processing: <video>, <canvas>, WebGL* and SIMD.JS
• Streaming: transmit as MediaStream via WebRTC RTCPeerConnection
† Source: Intel® Ningxin Hu: https://github.com/huningxin/depth_stream_examples
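As one example of the post-processing step, the sketch below turns a frame of per-pixel distances into grayscale RGBA pixels of the kind a canvas can display. The data layout is an assumption made for this illustration and is not taken from the proposed extension: depth is taken to be a `Uint16Array` of distances in millimeters, with 0 meaning "no reading". The function name `depthToRgba` and the `maxDepthMm` parameter are likewise hypothetical.

```javascript
// Converts a depth frame to RGBA grayscale pixels (near = bright).
// Assumptions (not from the talk): depth holds distances in millimeters
// as 16-bit unsigned values, and 0 means the sensor had no reading.
function depthToRgba(depth, maxDepthMm) {
  var rgba = new Uint8ClampedArray(depth.length * 4);
  for (var i = 0; i < depth.length; i++) {
    var d = depth[i];
    // Missing or out-of-range pixels are drawn black.
    var gray = (d === 0 || d > maxDepthMm)
      ? 0
      : Math.round(255 * (1 - d / maxDepthMm));
    rgba[4 * i] = gray;     // red
    rgba[4 * i + 1] = gray; // green
    rgba[4 * i + 2] = gray; // blue
    rgba[4 * i + 3] = 255;  // fully opaque
  }
  return rgba;
}

// Four sample pixels: no reading, 250 mm, 1000 mm, beyond range.
var pixels = depthToRgba(new Uint16Array([0, 250, 1000, 2000]), 1000);
```

In a browser, the returned array could be wrapped in an `ImageData` object and drawn with the 2D canvas `putImageData` call. The per-pixel loop is also the kind of uniform arithmetic that maps well onto SIMD lanes.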
Proposed Media Capture Depth Stream Extension†
[Diagram: the Web Application receives an RGB Stream and a Depth Stream from the Browser or HTML5 runtime through the getUserMedia (WebRTC) API]
† Source: http://w3c.github.io/mediacapture-depth/
Wireless Display for the Web
Unlock exciting new user experiences in HTML5
• Gaming
• Presentation
• Media Sharing/Casting†
† Big Buck Bunny video: http://www.bigbuckbunny.org/
HTML5 Presentation API Proposal†
• Connects web content to screens around you
• Hides display connection technologies from the developer
• Apple* AirPlay*, Microsoft* PlayTo*, Chromecast*, Miracast*, Intel® WiDi
• Simple, high level API, easy to use
http://webscreens.github.io/presentation-api/
New standards-based feature for the cross-platform web
† Source: Intel® Dominik Röttsches
Intel® XDK IoT Edition: Companion Apps
Streamlined Workflow
Design, Test, and Build Tools
• Quick start samples and templates
• Built-in editor and emulators
• UI Frameworks and Apache Cordova* APIs
• Test and debug tools
• Integration with Cloud Services APIs
Design and build cross-platform companion apps easily for Android*, iOS*, and Windows*
Intel® XDK IoT Edition: JavaScript* apps on IoT devices
Integrated Development Environment
Create, Debug, and Run Tools
• JavaScript allows easy on-board app development and deployment for many IoT devices
• Use JavaScript to define behavior of IoT device
• Deploy, run, debug on IoT device with JavaScript
• Integration with cloud, web services, and sensors through JavaScript APIs
Workflow between the Development System (Development Platform) and the IoT Device: Edit JavaScript app, Send app to device, Run app remotely, Remote debug
Demo: Programming Internet of Things using Intel® XDK IoT Edition
RGB Lighting†
Internet of Things (IoT) Device (Intel® Galileo):
• PWM LED Controller on I2C bus
• RGB LED
• Node.js with Socket.io server
HTML App (Lenovo* K900):
• Socket.io connection to IoT device
• Change lighting color
• Cordova* App
Both made using: Intel® XDK IoT Edition
† Source: Intel® Dan Yocom: http://xdk-software.intel.com/iot_edition_demo_video
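The device side of such a demo boils down to turning a color message into LED brightness levels. The sketch below shows one way that logic might look; it is not the demo's actual code. The `"color"` event name, the `"#rrggbb"` message format, and the `setPwm` callback are all hypothetical, and a small stand-in object replaces the Socket.io socket and the I2C PWM controller so the sketch runs without hardware or extra packages.

```javascript
// Parses "#rrggbb" into PWM duty cycles in the range 0..1.
function colorToDuty(hex) {
  var m = /^#([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!m) return null; // reject malformed input from the network
  return {
    r: parseInt(m[1], 16) / 255,
    g: parseInt(m[2], 16) / 255,
    b: parseInt(m[3], 16) / 255
  };
}

// Wires a socket-like object (anything with an .on method) to a PWM
// setter. The "color" event name and setPwm callback are hypothetical.
function attachColorHandler(socket, setPwm) {
  socket.on("color", function (hex) {
    var duty = colorToDuty(hex);
    if (duty) setPwm(duty);
  });
}

// Stand-ins so the sketch runs without Socket.io or real hardware.
var handlers = {};
var fakeSocket = {
  on: function (name, fn) { handlers[name] = fn; }
};
var lastDuty = null;
attachColorHandler(fakeSocket, function (duty) { lastDuty = duty; });

// Simulate the companion app sending a new lighting color.
handlers.color("#ff0080");
```

Keeping the parsing in a pure function separate from the socket wiring means the same code can be exercised on the development system before it is sent to the device.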
Summary
• HTML5 is closing the gaps with native models
• SIMD in JavaScript* enables a large new class of high-performance apps
• JavaScript is about to get a lot faster for such domains as gaming
• Depth Camera support in HTML5 WebRTC enables exciting use cases
• JavaScript is proliferating rapidly in Internet of Things
• Intel® XDK supports end-to-end programming for Internet of Things
• HTML5 is the application model of the future
Call to Action
• Download Firefox* Nightly and experience† the benefits of SIMD.JS
• Leverage the power of SIMD.JS through Intel® XDK and Crosswalk
• Download Intel® XDK free at http://xdk.intel.com
† SIMD.JS demos: http://peterjensen.github.io/idf2014-simd
Intel® Developer Zone
Tools. Knowledge. Community.
• Free tools and code samples
• Technical articles, forums and tutorials
• Connect with Intel and industry experts
• Get development support
• Build relationships
software.intel.com
Legal Disclaimer
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm
Intel, Core, Atom, Xeon Phi, RealSense, Look Inside and the Intel logo are trademarks of Intel Corporation in the United States and other countries. *Other names and brands may be claimed as the property of others.
Copyright ©2014 Intel Corporation.