Using Power to Improve C Programming Education
Jonas Skeppstedt
Department of Computer Science Lund University
Lund, Sweden
[email protected] jonasskeppstedt.net
Outline
Background and Problem
Our approach
Forsete — an Automatic Grader
Advantages with the Power Architecture
Conclusion and near future
Background and Problem 1(2)
There are two courses on C in Lund:
C Programming — focus on clean code plus ISO C Standard.
Algorithm Implementation — focus on efficient C.
The C11 atomic types, memory model, and multithreading are taught in the Multicore Programming course.
Background and Problem 2(2)
Previously the programming assignments were graded manually by teaching assistants during weekends.
The grading is very strict so most students need multiple iterations.
Problems with this approach:
72-hour latency from hand-in to rejection, with an occasional pass.
Since passing the assignments is required to sit the exam, this was often stressful for students.
It costs money to pay the TAs.
A Different Approach — Automatic Grading
To eliminate these problems I wrote an automatic grader which cuts the latency to a few minutes (email plus a grading queue).
Students can try any number of times and almost all were finished before the written exam.
A new challenge is to motivate students to perform their best even though only a machine sees their code.
A Competition: Memory Efficiency
Assignments which pass all tests are assigned a score.
The score is the combined size in bytes of the code and static data generated from their file.
Assignments with the same score are sorted by a timestamp.
There is a new assignment each week and ranks are accumulated:
RPN calculator
Find longest word in input
Polynomial multiplication
After two assignments, three students each had accumulated four points.
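The ordering described on this slide can be sketched as a comparator for the standard qsort function. This is only an illustration, not Forsete's actual code: struct entry, its field names, and rank_entries are hypothetical, and the sketch assumes that a smaller score ranks higher and that the earlier of two equal-score submissions wins the tie.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical record for one submission that passed all tests. */
struct entry {
	const char *user;	/* hypothetical user name */
	size_t score;		/* bytes of code plus static data */
	time_t handed_in;	/* when the submission arrived */
};

/* Ascending score; on a tie the earlier submission ranks first (assumed). */
static int cmp_entry(const void *pa, const void *pb)
{
	const struct entry *a = pa;
	const struct entry *b = pb;

	if (a->score != b->score)
		return a->score < b->score ? -1 : 1;
	if (a->handed_in != b->handed_in)
		return a->handed_in < b->handed_in ? -1 : 1;
	return 0;
}

/* Sort a high score list in place, best entry first. */
void rank_entries(struct entry *list, size_t n)
{
	qsort(list, n, sizeof list[0], cmp_entry);
}
```

After the call, the index of an entry plus one is its rank for that assignment.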
The Prize
The Automatic Grader Forsete — forsete.net
Forsete is a judge in Nordic mythology who is always fair.
The Forsete program runs as root, fetches mail, and grades the code.
The score is then sent back with disassembled Power machine code.
You can try it by sending an email of the form:
Subject: assignment poly by username
Make the code as small as possible — my score is 735 bytes.
Sample input can be found at the site of the course book
Writing Efficient C Code: A Thorough Introduction, 2nd ed.:
writing-efficient-c-code.com
Forsete
Checks the source code against the Linux kernel style guide.
Creates a random problem for input, runs a reference implementation, and records heap usage by the reference implementation.
Forks to compile the code.
Forks, limits stack area, changes root directory and switches Unix uid.
Executes the program and checks heap usage and output for the random input and for corner cases.
A timeout kills programs that run too slowly. This happened often and was a valuable lesson for many students.
Heap usage may be at most 4 times that of the reference implementation, and no leaks are allowed.
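The fork, stack limit, and timeout steps above can be sketched with standard POSIX calls. This is a minimal sketch under stated assumptions and not Forsete's code: run_limited, heap_ok, the verdict names, and the two demo jobs are hypothetical, the child runs a C function where a real grader would exec the compiled program, and the timeout is implemented with alarm() in the child, which is only one of several possible designs.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum verdict { PASS, FAIL, TIMEOUT, ERROR };

/* Run job() in a child with a lowered stack limit and a wall-clock timeout. */
enum verdict run_limited(int (*job)(void), rlim_t stack_bytes, unsigned seconds)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return ERROR;

	if (pid == 0) {
		struct rlimit rl;

		/* Lower only the soft limit, never above the hard limit. */
		if (getrlimit(RLIMIT_STACK, &rl) != 0)
			_exit(127);
		rl.rlim_cur = stack_bytes < rl.rlim_max ? stack_bytes : rl.rlim_max;
		if (setrlimit(RLIMIT_STACK, &rl) != 0)
			_exit(127);

		alarm(seconds);	/* default action of SIGALRM kills the child */
		_exit(job() == 0 ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) < 0)
		return ERROR;
	if (WIFSIGNALED(status))
		return WTERMSIG(status) == SIGALRM ? TIMEOUT : FAIL;
	return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? PASS : FAIL;
}

/* Heap rule: at most 4 times the reference peak and nothing leaked. */
int heap_ok(size_t peak, size_t reference_peak, size_t leaked)
{
	return leaked == 0 && peak <= 4 * reference_peak;
}

/* Demo jobs standing in for a student's program. */
int quick_job(void)
{
	return 0;
}

int endless_job(void)
{
	volatile unsigned long spin = 0;

	for (;;)
		spin++;
	return 0;	/* never reached */
}
```

With these definitions, run_limited(quick_job, 1 << 20, 5) yields PASS and run_limited(endless_job, 1 << 20, 1) yields TIMEOUT after about one second. A hostile program could cancel its own alarm, so a timer in the parent would be the more robust choice.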
Encouraging Simplicity and Elegance
We want students to learn to write simple and elegant C code.
Code efficiency is the focus of a different course (EDAF15).
Elegant code is often memory efficient.
Checking the high score list gives important feedback.
For the Longest Word, the scores ranged from 189 to 767 bytes.
So we need to create a desire to scrutinize machine code. How?
Advantages with the Power Architecture 1(3)
The generated code should be relatively predictable and easy to read, including register usage. Power advantages:
fixed-size instructions, which simplify reasoning about size,
large register sets, and
regular addressing modes.
Availability of mature optimizing compilers: gcc -Os is great.
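To make the point about fixed-size instructions concrete: on the 970MP and POWER8 every instruction is encoded in four bytes, so the code part of a score is four times the instruction count. The helper below is a hypothetical illustration of that arithmetic and not part of Forsete; it ignores any alignment padding the toolchain may add.

```c
#include <stddef.h>

enum { POWER_INSN_BYTES = 4 };	/* fixed 32-bit instruction encoding */

/* Estimate a score from an instruction count and the static data size. */
size_t estimated_score(size_t instructions, size_t static_data_bytes)
{
	return POWER_INSN_BYTES * instructions + static_data_bytes;
}
```

For example, 40 instructions plus 12 bytes of static data give an estimate of 172 bytes, and removing one instruction always saves exactly four bytes.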
Anton Klarén, winner of the 2015 EDAA25 Lund University Memory Efficient C Code Programming Competition:
The gcc compiler for Power does not generate any instruction that you don’t understand what it does or why it is there!
Advantages with the Power Architecture 2(3)
Easy access to detailed online documentation was also important.
Also, the course book introduces Power.
Good development platforms are available: for example a POWER8 server or, as in Lund, several 4-way multiprocessors based on IBM’s 970MP clocked at 2.5 GHz.
We use Power not only for the C Programming course but also in
Multicore Programming
Algorithm Implementation
where development machines with good performance are essential.
Advantages with the Power Architecture 3(3)
In the Multicore Programming course, the advanced memory model of Power lets students explore what theory really means in terms of performance.
Forsete was used for a parallel graph problem (dataflow analysis), and there the score is execution time.
The winners were Valdemar Roxling and again Anton Klarén.
Availability of detailed pipeline simulators is yet another important advantage for Power when selecting a platform for CS education.
The pipeline visualizer (scrollpv) from IBM Austin has been invaluable in making students understand the performance of
superscalar processors and branch prediction, the reorder buffer (global completion table) and rename registers.
MSc Theses Using Power
Karl Hylén: Processor Models for Instruction Scheduling using Constraint Programming — first ever work in the area with
measurements on a real machine.
Anton Botvalde and Andreas Larsson: Performance Evaluation of ISO C restrict on the Power Architecture — noted why a deleted floating point load instruction could make a program slower —
valuable insight for compiler writers.
For both of these, using the Power architecture was crucial primarily because of the interesting machine, the detailed documentation from IBM of the 970MP pipelines, and IBM’s pipeline simulator.
Conclusion and Near Future
Also for universities, the Power Architecture is a fantastic platform.
In the Optimizing Compilers course in September we will use Power.
My book An Introduction to the Theory of Optimizing
Compilers with Performance Measurements on Power will be available in August.
It will have comparisons of clang, gcc, and my C compiler, which was validated for ISO C99 conformance in 2003.
The mentioned M.Sc. theses can be downloaded from:
jonasskeppstedt.net/theses
Resources and Remarks
The pipeline simulator is available as the Performance Simulator for Linux on Power (sim_ppc) in the SDK for Linux on Power at IBM.
Programming assignments with math problems are most appreciated.
It can obviously be dangerous to execute unknown arbitrary code.
Changing the root directory, switching Unix uid, and disabling the network for that uid makes it safe.
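The first two precautions can be sketched with standard Unix calls. This is a hedged illustration rather than Forsete's code: enter_jail and its parameters are hypothetical, the caller is assumed to be a freshly forked child running as root, and disabling the network for the uid is a system configuration step that is not shown.

```c
#define _DEFAULT_SOURCE	/* exposes chroot() in glibc's <unistd.h> */
#include <sys/types.h>
#include <unistd.h>

/*
 * Confine the calling process to jail_dir and drop to an unprivileged id.
 * Returns 0 on success and -1 at the first step that fails.
 */
int enter_jail(const char *jail_dir, uid_t uid, gid_t gid)
{
	if (chdir(jail_dir) != 0)	/* move inside the jail first */
		return -1;
	if (chroot(".") != 0)		/* needs root, so it precedes setuid */
		return -1;
	if (setgid(gid) != 0)		/* group first: not possible after setuid */
		return -1;
	if (setuid(uid) != 0)		/* finally give up root */
		return -1;
	if (setuid(0) == 0)		/* regaining root must fail */
		return -1;
	return 0;
}
```

The order matters: chroot requires root privileges, so it has to happen before the uid switch, and the final check confirms that the switch cannot be undone. A production version would also clear supplementary groups with setgroups() before calling setgid().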
It is important to keep competitions short and intensive: the best competitors tend to spend a lot of time on them, and that cannot go on for much more than three weeks.