C Programming 2014/2015 – Lab session #3
Davide Ceresoli,
Thursday April 16, 2015 – Room 310
Reminder
To use the lab PCs, the username is progc, the password is marzo and the domain must be set to CSD. Compile: F9. Run: Ctrl + F9. Curly braces: Shift + Alt Gr + è and Shift + Alt Gr + +.
1
Startup
Copy the file lab03.zip from the Z: shared drive and expand the archive inside the Desktop/pocketcpp/pocketcpp folder. Copy also the Gnuplot and Jmol zip files and expand them.
2
Printing with color
The DOS/Windows command prompt and the Linux and Mac OS X terminals support the so-called ANSI escape sequences to change the color, clear the screen, and move the cursor. I’ve provided you with an include file (#include "ansi_terminal.h"). For example:
example_ansi.cc
#include <iostream>
#include <string>
#include "ansi_terminal.h"
using namespace std;

int main()
{
    cout << ansi_green << "Hello, " << ansi_blue << "world!" << endl;
    cout << ansi_normal;

    string s;
    cout << "press ENTER to continue";
    getline(cin, s);

    // clear screen
    cout << ansi_clear;
    cout << ansi_cursor_down << ansi_cursor_down << ansi_cursor_right;
    cout << ansi_bold << ansi_bg_yellow << "Bye!" << endl;

    // reset
    cout << ansi_normal;
    return 0;
}
3
Working with files
In C++, files are treated as data streams that can receive or provide data. We have already seen two special streams, cout and cin, which let you print on the screen and read data from the keyboard. To use files in C++ you must add #include <fstream>.
To read from file:
// ifstream = Input File Stream
ifstream fin("name_of_the_file.dat");
if (!fin) { cout << "error opening file" << endl; exit(1); }

// read one double
double x;
fin >> x;

// read an entire line
string s;
getline(fin, s);

// don’t forget to close the file
fin.close();
To write to file:
// ofstream = Output File Stream
ofstream fout("name_of_the_file.dat");
if (!fout) { cout << "error opening file" << endl; exit(1); }

// write one double
double x = 1.5;
fout << x << endl;

// don’t forget to close the file
fout.close();
4
Vector/arrays
Up to now we have been working with scalar types. Now we will introduce vectors (a.k.a. arrays), which store a number of values indexed by a subscript.
• In C/C++ the array index starts from zero (not from one!).
• Therefore, if there are n elements in the array, the last one is at index n−1.
• There is no check if you try to read or write outside the array bounds. As a consequence, your code might crash!
• To subscript an array, C/C++ uses the square brackets.
• C/C++ doesn’t have powerful vector, matrix, grid, tensor, etc... types and it is better to use an external library.
4.1
C-style static-size arrays
double v[10]; // a vector of 10 doubles
v[0] = 7.0;
v[5] = v[0] + sqrt(2.0);
v[10] = 0.0; // possible crash: out of bounds

int sudoku[9][9]; // a 9x9 array of integers
for (int i = 0; i < 9; i++)
{
    for (int j = 0; j < 9; j++)
    {
        sudoku[i][j] = 0; // make an empty Sudoku board
    }
}
4.2
C++-style dynamic-size arrays
If you don’t know the size of your arrays beforehand, C++ provides the vector<T> type. You must #include <vector>. The C++ vector is one-dimensional and can be resized.
vector<double> v; // a vector with 0 doubles
int natoms;
cin >> natoms;
v.resize(natoms); // resize it
v[0] = 1.0;
v[76] = 1/v[32]; // if size is 76 or smaller, the
                 // code can still crash
v.push_back(40.0); // extend vector by one element
for (int i = 0; i < v.size(); i++) // output the vector
{
    cout << v[i] << endl;
}
There is no multidimensional vector type in the standard C++ library.
4.3
XYZ file format
The XYZ format is the simplest format to represent a molecule. The first line contains the number of atoms. The second line is a free-format comment. From the third line on, each line contains the atom symbol followed by its x, y, z coordinates in angstrom. For example:
water.xyz
3
water molecule
H 0.783837 -0.492236 -0.000000
O -0.000000 0.062020 -0.000000
H -0.783837 -0.492236 -0.000000
xyz.cc
// read a XYZ file and return atoms, x, y, z
void read_xyz(string filename, vector<string>& atoms,
              vector<double>& x, vector<double>& y, vector<double>& z)
{
    ifstream fin(filename.c_str());
    if (!fin)
    {
        cout << "error opening file: " << filename << endl;
        exit(1);
    }

    // read number of atoms
    int natoms;
    fin >> natoms;

    // skip the rest of the first line, then the comment line
    string dummy;
    getline(fin, dummy);
    getline(fin, dummy);

    // read atoms and coordinates
    atoms.resize(natoms);
    x.resize(natoms); y.resize(natoms); z.resize(natoms);
    for (int i = 0; i < natoms; i++)
    {
        fin >> atoms[i] >> x[i] >> y[i] >> z[i];
        //cout << i << " " << atoms[i] << endl;
        // fail() instead of !good(): reaching the end of the file right
        // after the last coordinate is not an error
        if (fin.fail())
        {
            cout << "error reading file: " << filename << endl;
            exit(1);
        }
    }

    // close file
    fin.close();
}
Here is the symmetric routine that writes a XYZ file given the atoms and the coordinates:
xyz.cc
// write a XYZ file
void write_xyz(string filename, vector<string>& atoms,
               vector<double>& x, vector<double>& y, vector<double>& z)
{
    ofstream fout(filename.c_str());
    if (!fout)
    {
        cout << "error opening file: " << filename << endl;
        exit(1);
    }

    // write number of atoms
    int natoms = atoms.size();
    fout << natoms << endl;

    // comment line
    fout << "written by xyz.cc" << endl;

    // one line per atom (setw and setprecision need #include <iomanip>)
    for (int i = 0; i < natoms; i++)
    {
        fout << setw(4) << left << atoms[i]
             << fixed << setprecision(8) << right
             << setw(20) << x[i]
             << setw(20) << y[i]
             << setw(20) << z[i] << endl;
    }

    // close file
    fout.close();
}
4.4
C++-style matrix and vectors
There are a lot of specialized math libraries for C++. We will use a very simple one, the Template Numerical Toolkit (TNT) library. You must add #include "tnt/tnt.h" and #include "tnt/tnt_linalg.h" to the list of header files. The TNT library provides mathematical vector and matrix types and basic linear algebra routines (e.g. eigenvalues). Here is an example taken from TNT:
example_tnt.cc
#include <iostream>
#include "tnt/tnt.h"
#include "tnt/tnt_linalg.h"
using namespace std;

int main()
{
    const int N = 10, M = 20;

    // create MxN matrix, all zeros
    TNT::Matrix<double> A(M, N, 0.0);

    // initialize array values
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            A[i][j] = 1.0/double(i+j+1);

    // vector
    TNT::Vector<double> B(M);
    for (int i = 0; i < M; i++)
        B[i] = double(i);

    return 0;
}
5
Random numbers
I’ve created a small include file that provides convenient functions to generate random numbers. You must add #include "myrandom.h" to the list of include files. Here it is. The comments are self-explanatory:
myrandom.h
// myrandom.h - random number functions
#include <cstdlib>
#include <cmath>
#include <ctime>

// seed the generator with the current time, so that every run
// produces a different sequence
void randomize()
{
    srand(time(NULL));
}

// generate a random integer in [0,b]
int random_number(int b)
{
    return rand() % (b+1);
}

// generate a random integer in [a,b]
int random_number(int a, int b)
{
    return a + rand() % (b-a+1);
}

// generate a uniformly distributed real random number in [0,1)
// (dividing by RAND_MAX + 1 keeps 1.0 out of the range)
double random_number()
{
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

// generate a uniformly distributed real random number in [0,b)
double random_number(double b)
{
    return b * (double)rand() / ((double)RAND_MAX + 1.0);
}

// generate a uniformly distributed real random number in [a,b)
double random_number(double a, double b)
{
    return a + (b-a) * (double)rand() / ((double)RAND_MAX + 1.0);
}

// generate a normally distributed random number centered in m,
// with standard deviation s (polar Box-Muller method)
double random_number_gaussian(double m=0.0, double s=1.0)
{
    double x1, x2, w, y1;
    static double y2;
    static int use_last = 0;

    if (use_last) /* use value from previous call */
    {
        y1 = y2;
        use_last = 0;
    }
    else
    {
        do
        {
            x1 = 2.0 * random_number() - 1.0;
            x2 = 2.0 * random_number() - 1.0;
            w = x1 * x1 + x2 * x2;
        } while ( w >= 1.0 || w == 0.0 ); /* w == 0 would break log(w)/w */

        w = sqrt( (-2.0 * log( w ) ) / w );
        y1 = x1 * w;
        y2 = x2 * w;
        use_last = 1;
    }
    return m + y1*s;
}
Please take 10 minutes to look at and review example_ansi.cc, xyz.cc, example_tnt.cc, myrandom.h and example_random.cc, and ask questions.
6
Exercises
6.1
Remove hydrogens from a molecule
Modify the xyz.cc program. Count and output the number of hydrogen atoms. Create new arrays and copy from the original arrays everything except the hydrogen atoms. Write the arrays to a new XYZ file. Test it with the Viagra™ molecule. Check your results with Jmol (http://www.jmol.org).
6.2
Dice roll
Call the random_number(5) function to roll a “special die” that has numbers from 0 to 5 (instead of 1 to 6). If you roll n dice, let’s define the score as the sum of all dice. The score is in the [0, 5n] range. What is the probability of scoring m?
To solve this problem, perform a huge number (10000) of dice rolls and make a histogram of the resulting score (hint: use a vector<int> histo(5*n+1)). Then divide by the number of rolls and write the result to a file. Check the results manually in the case of 2 and 3 dice. Do it for 10 dice. What is the resulting distribution? Why? Which theorem of statistics are we illustrating?
6.3
Random integration
Modify the integral_trapz.cc file and add a subroutine that evaluates the integral with the following formula:

$$ \int_a^b f(x)\,dx \simeq \frac{b-a}{N} \sum_{i=1}^{N} f(x_i) \qquad (1) $$

where $x_i$ is a real random number in the interval [a, b]. (Hint: use the random_number(double a, double b) function.) Calculate $\int_0^{10} \left( \exp(x) + x \right) dx$. How many points N do you need to achieve the same accuracy as the trapezium rule?
trapz 10 points: 24296.2
trapz 100 points: 22094.2
trapz 1000 points: 22075.6
analytic: 22075.5
6.4
Statistics
Read a single-column data file into a vector<double>. Since you don’t know the number of data points, start with an empty vector and use push_back() to append the data; the vector is resized automatically. Then compute (N is the number of data points):

mean: $$ m = \frac{1}{N} \sum_i X_i \qquad (2) $$

variance: $$ v = \frac{1}{N-1} \left[ \sum_i X_i^2 - \frac{1}{N} \Big( \sum_i X_i \Big)^2 \right] \qquad (3) $$

naive std. dev.: $$ \sigma = \sqrt{v/N} \qquad (4) $$

The naive standard error is significant only if the data points are uncorrelated. In addition to that, calculate and plot the auto-correlation function (with the data indexed from zero, as in C++):

$$ C(i) = \frac{1}{v(N-1)} \sum_{t=0}^{N-i-1} (X_t - m)(X_{t+i} - m) \qquad (5) $$

and the auto-correlation time κ:

$$ \kappa = 1 + 2 \sum_{i \ge 1,\; C(i) > 0} C(i) \qquad (6) $$

i.e. sum the values of C(i) until C(i) becomes negative. Finally, the best estimate of the standard deviation is given by:

$$ \tilde{\sigma} = \sqrt{v \kappa / N} \qquad (7) $$

Use the program on the data1.dat file, which contains uncorrelated (stochastic) data points, and on the data2.dat file, which contains correlated data points.
6.5
Polynomial fitting
Given a set of n data pairs $(x_i, y_i)$, the best-fitting polynomial of degree m can be found in the following way:
• Let’s define a matrix X of size n×(m+1), where $X_{j,i} = (x_j)^i$, with i = 0, …, m.
• Let’s define a column vector Y of size n, where $Y_i = y_i$.
• The polynomial coefficients are the solution of the matrix equation (linear system) X C = Y, where C is a column vector of size m+1.
• The system is clearly over-determined and in general it has no exact solution. However, we seek a solution in the least squares sense:

$$ \min_C \| X C - Y \|^2 \qquad (8) $$
Write a program that reads the two-column file data_to_fit.dat and builds the X matrix (of TNT::Matrix type) and the Y vector (of TNT::Vector type). The coefficients are found with the following piece of code:
TNT::Linear_Algebra::QR<double> lsq(X);
TNT::Vector<double> C = lsq.solve(Y);
Output the C coefficients and plot the data and the fitting polynomial with gnuplot. Fit with a first, second and third order polynomial.
Appendix: using Gnuplot
gnuplot> plot 'data.dat' using 1:2 with points
gnuplot> plot sin(x), 'data_fit.dat'
gnuplot> set xrange [1:20]
gnuplot> quit