Project Discussion
Multi-Core Architectures and Programming
Oliver Reiche, Christian Schmitt, Frank Hannig
Hardware/Software Co-Design, University of Erlangen-Nürnberg
May 15, 2014
Administrative Trivia
Tutors
• Oliver Reiche
• Christian Schmitt
• Sascha Roloff
• Frank Hannig
Place
Room 01.255-128
Department of Computer Science
Cauerstr. 11
91058 Erlangen
GPU Hardware
Nvidia GPUs
• codesigns30: GeForce 8800 GTS 512
• codesigns42: GeForce GTX 285
• codesigns43: GeForce GTX 285
• codesigns46: Tesla C2050 (SSH only)
• codesigns46: Tesla K20 (SSH only)
AMD GPUs
• codesignsXX: Radeon HD 5870
• codesignsXX: Radeon HD 6970
ARM GPUs
Hardware
Tilera
• codesigns35: Tilera TILEPro64
AMD CPU
• codesigns48: AMD Opteron Quad-Core (48 cores)
OpenMPI Cluster
• all machines in room 02.133-128:
  • 20 × Intel Core i7, 4 cores (8 SMT cores), 8 GiB RAM
Software
Nvidia CUDA 6.0
• OpenCL 1.1
AMD APP SDK 2.9
• OpenCL 1.2
Android Renderscript
• SDK 17 (Android 4.2.2)
• NDK 9d
Tilera MDE 3.0
• Threading Building Blocks 3.0
MPI
• OpenMPI 1.6
Environment CUDA
• set environment for CUDA:
    module load cuda
• CUDA SDK (/opt/cuda/sdk)
  • examples (sources) in /opt/cuda/sdk/C/src
  • binaries in /opt/cuda/sdk/C/bin/linux/release
• link against CUDA SDK: add the following to your Makefile:
    # set directory for common.mk
    CUDA_SDK_PATH ?= /opt/cuda/sdk
    ROOTDIR := $(CUDA_SDK_PATH)/C/src
    ROOTBINDIR := bin
    ROOTOBJDIR := obj
    include $(CUDA_SDK_PATH)/C/common/common.mk

    tidy clobber clean:
    	@rm -rf bin obj
Environment OpenCL
• set environment for OpenCL on Nvidia GPUs:
    module load cuda
• CUDA SDK (/opt/cuda/sdk)
  • examples (sources) in /opt/cuda/sdk/OpenCL/src
  • binaries in /opt/cuda/sdk/OpenCL/bin/linux/release
• link against CUDA SDK: add the following to your Makefile:
    # set directory for common.mk
    CUDA_SDK_PATH ?= /opt/cuda/sdk
    ROOTDIR := $(CUDA_SDK_PATH)
    ROOTBINDIR := bin
    ROOTOBJDIR := obj
    include $(CUDA_SDK_PATH)/OpenCL/common/common_opencl.mk

    tidy clobber clean:
    	@rm -rf bin obj
Environment Tilera
• set environment for Tilera:
    module load tilera
• Tilera MDE: /opt/tilera
Environment Renderscript
Either use SDK or NDK:
SDK
• simply use Eclipse
NDK
• link against native Renderscript (C++)
• add the following to your CMakeLists.txt:
    find_package(RenderScript REQUIRED)
    file(GLOB PROJECT_CPP *.cpp)
    file(GLOB PROJECT_RS *.rs *.fs)
    rs_wrap_scripts(PROJECT_CPP ${PROJECT_RS})
    rs_definitions()
    rs_include_directories()
    rs_link_libraries(${PROJECT_NAME} stdc++)
    rs_add_executable(${PROJECT_NAME} ${PROJECT_CPP})
Environment OpenMPI
• set environment for OpenMPI:
    export PATH=/usr/lib64/mpi/gcc/openmpi/bin:${PATH}
    export LD_LIBRARY_PATH=/lib64:/usr/lib64:${LD_LIBRARY_PATH}
  advice: put this into your shell's config file (e.g. .bashrc)
• execute programs:
    mpirun -np 32 --machinefile mymachines myprogram
  A machine file will be provided.
Login
Each group gets one account: SSH + (Mercurial | Git | Subversion)
SSH
• one user account per group: mappraktX
• external login via gateway codesigns14:
    ssh mappraktX@codesigns14.informatik.uni-erlangen.de
• from there all codesignsXX servers are reachable
Repositories
Mail us your public SSH key and choose a repository type:
Mercurial
    hg clone ssh://[email protected]/seminar/map14/mappraktX
Git
    git clone ssh://[email protected]/seminar/map14/mappraktX.git
Subversion
    svn co svn+ssh://[email protected]/seminar/map14/mappraktX
Tentative Schedule
• status meetings every two weeks
• project presentations
  • CW 28: 07.07.14 – 11.07.14
  • CW 29: 14.07.14 – 18.07.14
• project: up to 2 students form a group
• project presentation:
  • duration: (20+5) min
  • group presentation
Projects
Each group selects and implements one application:
• mathematical problems
  • N-Queens
• image processing
  • Harris corner detection
  • optical flow calculation
  • Viola Jones object detection
  • bilateral grid filter
  • local Laplacian
• image compression
  • JPEG, JPEG 2000 (complex)
• partial differential equation (PDE) solver
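To give a feel for how one of the listed applications might be split across cores, here is a minimal C++ sketch for the N-Queens problem. It is an illustration under stated assumptions, not a reference solution: the function names solve and count_n_queens are hypothetical, and starting one std::thread per first-row column is a simple choice made for brevity, not a tuned work distribution. Each thread counts the solutions that begin with a queen in "its" column, using the usual bitmask backtracking.

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: counts all ways to complete a partial placement.
// cols, diag1, diag2 are bitmasks of the columns attacked in the current row
// (by queens in the same column and on the two diagonal directions).
static std::uint64_t solve(int n, std::uint32_t cols,
                           std::uint32_t diag1, std::uint32_t diag2) {
    const std::uint32_t full = (1u << n) - 1u;
    if (cols == full) return 1;  // every column holds a queen: one solution
    std::uint64_t count = 0;
    std::uint32_t free_cols = full & ~(cols | diag1 | diag2);
    while (free_cols != 0) {
        const std::uint32_t bit = free_cols & (~free_cols + 1u);  // lowest free column
        free_cols ^= bit;
        count += solve(n, cols | bit,
                       ((diag1 | bit) << 1) & full,
                       (diag2 | bit) >> 1);
    }
    return count;
}

// Hypothetical entry point: one thread per first-row column (an assumption
// for illustration). Returns 0 for board sizes outside 1..31.
std::uint64_t count_n_queens(int n) {
    if (n < 1 || n > 31) return 0;
    const std::uint32_t full = (1u << n) - 1u;
    std::vector<std::uint64_t> partial(static_cast<std::size_t>(n));
    std::vector<std::thread> workers;
    for (int c = 0; c < n; ++c) {
        workers.emplace_back([&partial, n, full, c] {
            const std::uint32_t bit = 1u << c;
            // Each thread writes only its own slot, so no locking is needed.
            partial[static_cast<std::size_t>(c)] =
                solve(n, bit, (bit << 1) & full, bit >> 1);
        });
    }
    for (auto& w : workers) w.join();
    std::uint64_t total = 0;
    for (const auto v : partial) total += v;
    return total;
}
```

Calling count_n_queens(8) yields 92, the known number of solutions for the 8×8 board. On older toolchains the code may need to be compiled with -pthread. The subproblems are independent, so the same decomposition could in principle be mapped to MPI ranks or GPU threads, although load balancing would then need more care.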
Projects
Suggest your own topic, e.g.
• neural networks, ant simulation, pattern matching
• computational fluid dynamics (CFD)
Free choice of target architecture and programming environment:
• GPU / embedded GPU / Tilera / 48-core CPU / MPI cluster
• CUDA / OpenCL / Renderscript / TBB / pthreads / MPI
Tell us your choice by Thursday next week (22 May 2014)