
My last several articles have covered different software packages that are useful to scientists trying to do computational science. I tried to explore as broad a spectrum of subjects as I could in those pieces, and I even covered some basic programming tools like MPI and SciPy.

But, I always have been limited by the amount of space a venue like this allows. All I can do is provide a taste of what is out there and hope that readers take it away and learn more on their own. Also, in many cases, people find themselves doing research in areas that never have been explored by anyone else before. This means there will not be an appropriate software package, and researchers will need to write their own software from scratch.

One major problem for computational science researchers is that they simply do not have the time to attend normal classes over the span of a term or two in order to learn the skills they need to do their work. They need to be able to jump-start their research and essentially go from zero to 100 mph in no time flat. Part of my day job is to help them do this. I provide crash courses in most of the subjects that they may need. But, what can they do when they leave and try to apply this information several days or weeks later? Enter the Software Carpentry site (software-carpentry.org), a resource that should be on every researcher’s bookmark list. I have no association with the author and maintainer of the site. I’m just glad to have a high-quality source of information to which I can point my users.

The first level of resources available is a set of self-paced on-line workshops. These workshops are distributed under a Creative Commons license, specifically the Creative Commons Attribution License. This means you are free to use the material and remix it, as long as you properly attribute the author. These workshops cover a vast number of subjects and are available as both PDF and PowerPoint files. For some of the workshops, video screencasts even are available, so it’s almost like having an instructor right there with you. Each topic is broken down into smaller sections to make it easier to digest. Additionally, exercises are available so you can review the material.

Many new researchers, graduate students and post-docs have had little or no experience in computational science at all. Many never even have seen any type of UNIX environment. This is quite a stumbling block, as most high-performance computing centers that I know of run Linux. So you probably will want to start with the workshop The Shell. This workshop is broken down into the following sections:

■ Introduction

■ Files and Directories

WWW.LINUXJOURNAL.COM / MAY 2012 / 23

[

UPFRONT

]

■ Creating and Deleting

■ Pipes and Filters

■ Permissions

■ Finding Things

■ Job Control

■ Variables

■ Secure Shell (SSH)

This list of items should make new Linux users comfortable enough to use the command line effectively. Because most HPC clusters are accessed through an SSH connection, being comfortable on the command line is essential.

The next workshop you should look at is the Version Control workshop. To my mind, this is one of the most important subjects to learn for computational science research. This is a field where code constantly is being toyed with, by many different people over long periods of time. It is of utmost importance to be able to back out experimental changes in the code when something breaks or when you’ve gone down the wrong path in your research. But, in my experience, almost no one uses a version control system. So, just to make my life easier as a research consultant, please go ahead and check out this particular workshop. It will be one of the most useful workshops you could attend.

As far as programming workshops go, the only language explicitly covered is Python. The Python workshop is relatively complete, and it covers the following:

■ Basics: running Python, variables, comparison operators.

■ Control Flow: while loops and conditionals.

■ Lists: creating, deleting and maintaining lists, for loops.

■ Input and Output: dealing with files.

■ Strings: handling strings, formatting, concatenation.

■ Aliasing: what it is and how it can cause problems.

■ Functions: what they are and how to use them effectively.

■ First-Class Functions: binding functions to variables, passing functions to functions.

■ Libraries: importing modules, dealing with namespaces.

■ Tuples: creating and indexing, unpacking lists.

■ Slicing: slicing vs. indexing, and so on.

■ Text: how lines and characters are stored, dealing with Unicode.
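To give a flavor of a few of these topics, here is a minimal sketch of my own. It is not taken from the workshop material, and all of the names in it (original, alias, apply_to_all and so on) are invented for illustration. It shows how aliasing can surprise you, how a function can be passed to another function, and the difference between indexing and slicing:

```python
# Aliasing: two names bound to the same list object.
original = [1, 2, 3]
alias = original            # no copy is made here
alias.append(4)             # the change shows up through both names

# Slicing the whole list makes a new (shallow) copy instead.
independent = original[:]
independent.append(5)       # only the copy changes


# First-class functions: a function is a value you can pass around.
def apply_to_all(func, values):
    """Return a new list with func applied to each item in values."""
    return [func(v) for v in values]


def square(x):
    """Return x multiplied by itself."""
    return x * x


squares = apply_to_all(square, original)

# Tuples and unpacking: bind two names in one assignment.
low, high = (min(original), max(original))

# Indexing yields one item; slicing yields a new list.
first_item = original[0]
first_two = original[:2]

print(original)     # [1, 2, 3, 4]
print(independent)  # [1, 2, 3, 4, 5]
print(squares)      # [1, 4, 9, 16]
```

The append through alias is the kind of problem the Aliasing section warns about: nothing in the code says original is being modified, yet it is.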


Python is relatively similar to other languages (like C), so you should be able to apply what you learn here to those other languages with just minor syntax translations. Also, Python is growing in popularity in scientific programming circles due to its clean formatting rules and the relative ease of incorporating external high-performance libraries written in C or FORTRAN. In this sense, you almost can consider Python to be a glue language, but it has quite a lot of capability available directly through external libraries like NumPy and SciPy. You could do worse as a computational scientist than learn Python. With its growing popularity, there is also a greater chance that the specific problem area you’re researching already has tools or libraries available.

Once you have at least one language under your belt, it is time to learn more of the details involved in programming itself. The next set of workshops covers the following:

■ Program Design: goes through a simple example of designing, debugging and improving a program.

■ Testing: how you should test your software, handle exceptions and do unit tests.

■ Make: how to use rules, patterns and macros to build your software.
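As a rough illustration of the Testing topic, here is a small sketch using Python's standard unittest module. The mean function and the TestMean class are hypothetical examples of my own, not code from the workshop. The sketch shows a function that raises an exception on bad input, along with unit tests for both the normal case and the error case:

```python
import unittest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if len(values) == 0:
        # Fail loudly rather than dividing by zero later on.
        raise ValueError("mean() requires at least one value")
    return sum(values) / float(len(values))


class TestMean(unittest.TestCase):
    """Unit tests for the hypothetical mean() function above."""

    def test_typical_values(self):
        self.assertAlmostEqual(mean([1, 2, 3, 4]), 2.5)

    def test_single_value(self):
        self.assertAlmostEqual(mean([7]), 7.0)

    def test_empty_input_raises(self):
        # The error case deserves a test just as much as the normal case.
        with self.assertRaises(ValueError):
            mean([])
```

If you save this to a file, running Python's unittest module on that file (python -m unittest followed by the module name) should discover and run the three tests.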

These topics cover a lot of the extra items you need to know in order to program effectively, but they aren’t strictly programming proper. That topic is covered by the following workshops:

■ Sets and Dictionaries: using associative data structures to represent data that doesn’t really fit into a list.

■ Regular Expressions: how to use regular expressions for pattern matching.

■ Databases: an introduction to SQL.

■ Data Management: an introduction to managing your data.

■ Matrix Programming: using numpy to handle numerical processing.

■ Multimedia Programming: programming using sound, pictures and other media files.

■ Spreadsheets: using spreadsheets for analysis and visualization.
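Several of these topics fit together naturally. The following is a minimal sketch of my own, assuming some made-up instrument readings (the LINES data, the PATTERN expression and the table layout are all invented for illustration and do not come from the workshops). It uses a regular expression to pull fields out of text, a set and a dictionary to organize the results, and an in-memory SQLite database, through Python's standard sqlite3 module, to query them with SQL:

```python
import re
import sqlite3

# Hypothetical instrument readings, invented for this example.
LINES = [
    "2012-05-01 sensor=alpha temp=21.5",
    "2012-05-01 sensor=beta temp=19.0",
    "2012-05-02 sensor=alpha temp=22.5",
    "this line is malformed and is skipped",
]

# Regular expression: a date, a sensor name and a decimal temperature.
PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) sensor=(\w+) temp=(-?\d+(?:\.\d+)?)$"
)


def parse(lines):
    """Return (date, sensor, temp) tuples for lines that match PATTERN."""
    records = []
    for line in lines:
        match = PATTERN.match(line)
        if match:
            date, sensor, temp = match.groups()
            records.append((date, sensor, float(temp)))
    return records


records = parse(LINES)

# A set keeps the distinct sensor names.
sensors = set(sensor for _, sensor, _ in records)

# A dictionary groups the readings by sensor name.
by_sensor = {}
for _, sensor, temp in records:
    by_sensor.setdefault(sensor, []).append(temp)

# The same data in an in-memory SQLite database, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (day TEXT, sensor TEXT, temp REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", records)
averages = conn.execute(
    "SELECT sensor, AVG(temp) FROM readings GROUP BY sensor ORDER BY sensor"
).fetchall()
conn.close()

print(averages)  # [('alpha', 22.0), ('beta', 19.0)]
```

The dictionary and the SQL query answer the same kind of question here. Which one is the better fit likely depends on how much data you have and whether it needs to outlive the program.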

With these workshops, you will learn many of the programming elements and structures that will be of use to you in scientific programming. After this, you should have covered enough, hopefully, to be able to program a solution to the problem you are studying. Again, all of these workshops include exercises, so you actually can try applying what you have learned. I’m a firm believer that you don’t learn anything until you actually try to use it.

In-person workshops and boot camps also are available. Because all of the material is available for free reuse, you simply can use the workshop materials to put on your own workshop or boot camp.

The team behind Software Carpentry also is available to do in-person boot camps. You can contact them through the Web site to make arrangements. These boot camps are two- or three-day crash courses to cover the bulk of the material, and they’re always being offered at different places around the globe. Follow the blog to see when one is being offered in your neck of the woods. If you do decide to run your own, the team at Software Carpentry is happy to help out and spread the word through its network. A forum is available at the Web site for each of the workshop topics where you can discuss the material with other attendees or other presenters.

Finally, I suggest that you actually subscribe to the blog RSS feed. New workshops always are being added, and new boot camps always are being planned. Watching the RSS feed will keep you informed about these additions. As always, feel free to contact me if you have anything specific you’d like to see covered here. Hopefully, I’ve been able to plant the seed and give you ideas on how you can pick up the skills you need.

—JOEY BERNARD
