
Numerical Simulation and Parallel Programming

Notebook

Version 1.02

Haide College, OUC

2023.08


0 Information

Authors: Fukaya Takeshi, Iwashita Takeshi (Hokkaido University), ZHANG Linjie, FU Kai (Ocean University of China)

E-mails: zhanglinjie@ouc.edu.cn, kfu@ouc.edu.cn

Outcomes

1. Learn numerical simulations for simple physical models described by differential equations.

2. Learn the basics of MPI (Message Passing Interface) parallel programming.


Keywords: Heat conduction equation, finite difference method, MPI parallelization

1 Schedule

This part has 30 credit hours in total.

1-4: How to program on Linux server.

5-8: Learn numerical simulation for 1-D heat conduction.

9-12: Learn numerical simulation program for 2-D heat conduction.

13-16: Learn the basics of parallel programming with MPI.

17-24: Parallelize the 2-D heat conduction simulation program with MPI.

It is recommended that you make use of the class time effectively. Keep up with the programming even if you don't understand something.

2 How to submit a report

To be explained at the beginning (or in the middle) of class.

• Time Limit: Further notice.

• How to submit: Print on A4 paper. Single-sided or double-sided, black-and-white or color printing are all acceptable.

3 Academic Integrity

It is unacceptable to submit work that is not your own. Assignments will be checked for similarity and copying to ensure that students' work is their own. Information sources must be paraphrased (put in your OWN words, not copied directly) and the reference must be included. You are also responsible for complying with any additional rules related to assessments communicated to you by your instructors. Assignments that are found to breach this policy may be given zero. Further actions may be taken.


4 Precautions about this part

• Represent numerical values (for example, temperature values) in double precision.

• You will not lose points if your program has to be recompiled every time a setting changes.

• In the sample programs, spaces inside strings are ordinary half-width (ASCII) spaces.

5 Precautions about report

• When visualizing the computing results, use the actual coordinates or time, instead of the number of steps (space or time).

• Don’t insert too many pictures or graphs unless it is necessary.

• All pictures and graphs should be mentioned in the report.

• If you cite a paper or a website, identify it as a reference.

• Formulas should be concise, and be explained in words.


1 How to program on Linux server

Important information about the server

• Server IP address: Further notice

• User Account: Further notice

• Password: Further notice

Basic Linux Commands

The server runs Linux (a type of operating system), which is considerably different from Windows and similar to macOS. In order to be more productive in this class, you need to learn some basics of how to get things done in a Linux system.

1. ls (short for list). To find out what is in your current directory, type

$ ls

To list all files in your directory including those whose names begin with a dot (hidden files), type

$ ls -a

2. mkdir (make directory). To make a subdirectory called test in your current working directory, type

$ mkdir test

3. cd (change directory)

To change to the directory you have just made, type

$ cd test

Exercise 1a: Make another directory called backups inside the test directory.

4. The directories . and ..

In Linux, "." means the current directory and ".." means the parent of the current directory, so typing

$ cd .. (NOTE: there is a space between cd and the dot)

will take you one directory up the hierarchy.

5. pwd (print working directory). To find out the absolute pathname of your working directory, type

$ pwd

6. ~ (your home directory). To list the contents of your home directory, type:

$ ls ~

7. cp (copy). Use "echo" and ">" to create a file. Then change to directory "test" and copy the file NumSim.txt to the current directory:

$ echo "Numerical simulation" > NumSim.txt

$ cd ~/test

$ cp ~/NumSim.txt .

8. mv (move). mv file1 file2 moves (or renames) file1 to file2. mv file1 dir moves file1 to directory dir. For example:

$ cp NumSim.txt NumSim.bak.txt

$ mkdir backups

$ mv NumSim.bak.txt backups

9. rm (remove), rmdir (remove directory). As an example, create a copy of the NumSim.txt file and then delete it.

$ cp NumSim.txt temp.txt

$ ls (to check if it has created the file)

$ rm temp.txt

$ ls (to check if it has deleted the file)

Summary

ls                list files and directories
ls -a             list all files and directories
mkdir             make a directory
cd directory      change to named directory
cd                change to home directory
cd ~              change to home directory
cd ..             change to parent directory
pwd               display the path of the current directory
cp file1 file2    copy file1 and call it file2
mv file1 file2    move or rename file1 to file2
rm file           remove a file
rmdir directory   remove a directory

Table 1.1: Summary of basic Linux commands


Review of C programming

While reviewing C language programming, we will try to compute

    C ← AB + C    (1.1)

as an example, where A, B, C are n × n matrices. There are 6 steps:

2. Allocate memory dynamically for A, B, C.

3. Initialize elements of matrix A, B, C.

4. Compute each element of matrix C with Eq. (1.1).

5. Display execution time and FLOPS.

6. Free memory, etc.

In step 1, the dimension n of the matrices is given from the command-line arguments,

$ ./a.out 1000

where 1000 is the given value of n. In the program, we can use

n = atoi (argv[1]);

to read the command-line argument 1000 and save it to variable n.

In step 2, it is straightforward to treat a matrix as a 2-D array in C programming. But it is a little complicated to create a 2-D array dynamically, so we will treat each matrix as a 1-D array of size n². Let a_{i,j} be the element in row i and column j of matrix A; then the corresponding element in the 1-D array is A[j*n + i] (0 ≤ i, j ≤ n − 1). We use column-major order here. After declaring a double-type pointer *A, we allocate memory space for A with the following code.

A = (double *)malloc (sizeof (double)*n*n);

Note that if we don't know the size of an array until the program is executed, we have to allocate memory space dynamically.

In step 3, in order to make it easier to verify the correctness of our program, the initial values of the matrices A, B, C are set as

    a_{ij} = 1/√n,  b_{ij} = 1/√n,  c_{ij} = 0.    (1.2)

(It is not difficult to find the values of matrix C after Eq. (1.1): since Σ_k (1/√n)(1/√n) = n · (1/n) = 1, every element of C becomes 1.)

Now consider step 4, which is the main part of this program. The calculation of Eq. (1.1) is

    c_{ij} ← c_{ij} + Σ_{k=0}^{n−1} a_{ik} b_{kj}   (for each i, j).    (1.3)

So, we can write the program like

for (i = 0; i < n; i++)
{
    for (j = 0; j < n; j++)
    {
        /* Compute each element of matrix C */
    }
}
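For reference, here is a minimal sketch of how the innermost computation might be filled in, using the column-major indexing described above (the loop order i → j → k is just one of the orders examined in Task 1.2, and the temporary variable c is our own choice):

for (i = 0; i < n; i++)
{
    for (j = 0; j < n; j++)
    {
        double c = C[j*n + i];            /* c_ij, column-major */
        for (k = 0; k < n; k++)
        {
            c += A[k*n + i] * B[j*n + k]; /* a_ik * b_kj */
        }
        C[j*n + i] = c;
    }
}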


We can confirm whether our program runs correctly by checking the values of matrix C after the calculation.

For measuring the execution time, we can write a function get_time as follows.

#include <sys/time.h>

double get_time ()
{
    struct timeval tv;
    gettimeofday (&tv, NULL);
    return tv.tv_sec + (double)tv.tv_usec*1e-6;
}

Then we can get the elapsed time (in seconds) like

t0 = get_time ();
/* process to be measured */
t1 = get_time ();
time = t1 - t0;

In addition to the execution time, we can calculate FLOPS (Floating-point Operations Per Second) too.

    FLOPS = (number of floating-point operations) / (elapsed time in seconds)    (1.4)

Since the number of floating-point operations in Eq. (1.1) is 2n³, the FLOPS of our program can be computed as

flops = 2.0* (double)n* (double)n* (double)n/time;

Then we display the execution time and FLOPS.

printf ("n = %d,  time (sec) = %8.3e,  FLOPS = %8.3e\n", n, time, flops);

Finally, free the memory allocated in step 2.

free (A); free (B); free (C);


Task 1.1

Most compilers have optimization options. Depending on these options, our program is optimized at compile time. Please check how the program performance is influenced by the optimization option.

Write a program that performs the calculation in Eq. (1.1). Then compile it like

$ gcc main.c -OX -lm

The letter after the hyphen is the uppercase letter O, and the X part is a number from 0 (zero) to 3. The higher the number, the stronger the compiler optimization (0 means no optimization). "-lm" may be necessary if you include math.h in your program.

Set n = 1000, 2000, 3000, change the optimization option from -O0 to -O3, and measure the execution time and FLOPS for each case.

Task 1.2

The main part of this program (the part that calculates Eq. (1.1)) is a triple loop structure, and the loop variables are i, j, k. The order of these loops can be changed. For example, j → k → i does not change the final result, but the memory access pattern (the way of reading/writing the elements of the arrays) will be different. Set n = 1000, 2000, 3000, change the loop order (6 orders in total), and check the performance. Note that, in order to eliminate the effects of compiler optimization, please compile your program with optimization option -O0. If the execution time (performance) differs, please explain it as far as you can understand.


2 Numerical simulation for 1-D heat conduction

Computer simulations are widely used in the fields of engineering and the natural sciences. Generally, when simulating a certain physical phenomenon, four steps are included.

1. Modeling: derive one or more partial differential equations (PDEs) that describe the phenomenon.

2. Discretization: transform the partial differential equations into discrete form.

3. Calculation: solve the discretized problem on a computer to get numerical solutions of the partial differential equations.

4. Visualization: display the calculation results in a visible form.

We will experience the above flow through a simple physical phenomenon as an example.

Problem setting: One-dimensional heat conduction

Consider the conduction of heat along a stick of length L as shown in Fig. 2.1. We do not care about the cross-sectional area since it is very small relative to the length. Let the left endpoint be the origin (x = 0), and take the x-axis in the length direction. Heat conduction happens along the ±x direction. In addition, heat exchange between this stick and the outside world occurs only at the end points.

Fig 2.1: A stick

Derivation of partial differential equations

Let u(x, t) denote the temperature at position x at time t, 0 ≤ x ≤ L, 0 ≤ t ≤ T (we consider heat conduction up to time t = T). This 1-D heat conduction phenomenon can be described by a partial differential equation. (See Task 2.3 for the derivation.)

    ∂u(x, t)/∂t = α ∂²u(x, t)/∂x².    (2.1)


Here α is the thermal diffusivity. Eq. (2.1) is solvable when an initial condition u(x, 0) (the temperature distribution of the object at time t = 0) and boundary conditions (conditions on the temperature at the left and right ends) are given.

Discretization

Most of the PDEs that appear in engineering and the natural sciences are difficult to solve analytically. Therefore, instead of finding an analytical solution, a computer is used to get a numerical solution.

In general, the domain in which a PDE is defined is continuous, and the solution is continuous too. In the case of Eq. (2.1), they are [0, L] × [0, T] and u(x, t), respectively. They cannot be handled directly on a computer, so "discretization" is required.

Now we will study how to discretize the 1-D heat conduction problem given in Eq. (2.1). As mentioned before, it is not possible to handle all points (x, t) with 0 ≤ x ≤ L, 0 ≤ t ≤ T on a computer. Therefore, we divide the spatial domain x with step size ∆x and the time domain t with step size ∆t. (See Fig. 2.2 and Fig. 2.3.)

    x_i := i∆x  (i = 0, 1, ..., L/∆x =: Mx),    (2.2)
    t_n := n∆t  (n = 0, 1, ..., T/∆t =: N).    (2.3)

Then we compute the solutions on the discretized points (x_i, t_n) only, that is

    u_i^n := u(x_i, t_n).    (2.4)

Fig 2.2: Spatial discretization
Fig 2.3: Temporal discretization

Now, we will compute the solutions on (Mx + 1) × (N + 1) discrete points. We approximate the derivative with a forward difference in time and with a central difference in space. Then we get

    (u_i^{n+1} − u_i^n)/∆t = α (u_{i+1}^n − 2u_i^n + u_{i−1}^n)/(∆x)².    (2.5)

See Task 2.4 for details. Note that Eq. (2.5) is defined only for the points i = 1, 2, ..., Mx − 1; the end points are not included. For the end points (i = 0, Mx), we have

    u_0^n := u(0, t_n),  u_{Mx}^n := u(L, t_n)    (2.6)

by applying the given boundary conditions.

Calculation of numerical solution

Discretization has made it possible to handle the PDE problem on a computer.

First, transform Eq. (2.5) into the following form

    u_i^{n+1} = ··· .    (2.7)


Now we can see that only values at time step n appear on the right side. In other words, if u_i^n is known, the values u_i^{n+1} can be easily obtained from Eq. (2.7). (At a given point i, the values at time step n + 1 can be obtained easily from the values at the previous step n; we call this an explicit method.) The values at time step n = 0 are given by the initial condition,

    u_i^0 := u(x_i, 0).    (2.8)

Therefore, we take the values at time step n = 0 as a starting point, and then compute the values at the other time steps n = 1, 2, ... with Eq. (2.7) in order.

See Program 2.1 for the skeleton. Note that, in this program, instead of setting the step sizes ∆x and ∆t, we use the numbers of discrete points in each dimension, Mx and N, which can be specified from run-time arguments. print_data(...) is a function (described later) that outputs the calculation results to a file.

Program 2.1: Overview of 1-D heat conduction simulation program by finite difference method

/* Include required header files */
#define L 1.0       // Length of bar
#define T 1.0       // Maximum time to simulate
#define alpha 1.0   // Thermal diffusivity
#define T_SPAN 20   // Result output frequency (adjust appropriately)

int main(int argc, char *argv[])
{
    /* Declaration of variables (add each necessary variable) */
    double *u, *uu, dx, dt;
    int Mx, N, i, n;

    /* Get the values of Mx and N from the arguments */

    /* Dynamic allocation of memory, variable setting, etc. */

    for(i = 0; i <= Mx; i++)
    {
        /* Set initial conditions for u */
    }
    print_data(...); // Output the initial state to a file

    for(n = 1; n <= N; n++)
    {
        for(i = 1; i < Mx; i++)
        {
            /* Calculate the value (uu) of the next time step from the current value (u) */
        }
        for(i = 1; i < Mx; i++)
        {
            /* Copy the value of uu to u */
        }
        if(n%T_SPAN == 0)
        {
            print_data(...); // Output to a file at a regular frequency
        }
    }

    /* Free memory */

    return 0;
}

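As a reference for the update step, here is a minimal sketch of what the two inner loops of Program 2.1 could look like once Eq. (2.5) is rearranged into the form of Eq. (2.7); the coefficient variable r is our own naming, not part of the handout:

/* r = alpha*dt/(dx*dx); assumed precomputed after dx, dt are set */
for(i = 1; i < Mx; i++)
{
    /* explicit update: u_i^{n+1} = u_i^n + r*(u_{i+1}^n - 2 u_i^n + u_{i-1}^n) */
    uu[i] = u[i] + r*(u[i+1] - 2.0*u[i] + u[i-1]);
}
for(i = 1; i < Mx; i++)
{
    u[i] = uu[i]; /* boundary values u[0], u[Mx] stay fixed */
}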


Visualization

When analyzing the calculation results, it is important to show them in a visually easy-to-see form, that is, visualization. In this experiment, visualization will be performed using the free software gnuplot, which can graph data stored in files.

First, save the calculation results (temperature in this sample) of a certain time step to a file (see Program 2.2). The function print_data creates a file named data_******.dat, and outputs the x-coordinate and computed temperature (separated by a space) to this file. (Lines beginning with # are ignored as comments when plotting with gnuplot.)

Program 2.2: Function that saves the calculation result to a file

void print_data(int Mx, int n, double dx, double dt, double *u)
{
    FILE *fp;
    char sfile[256];
    int i;

    sprintf(sfile, "data_%06d.dat", n);
    fp = fopen(sfile, "w");
    fprintf(fp, "#time = %lf\n", (double)n * dt);
    fprintf(fp, "#x u\n");
    for(i = 0; i <= Mx; i++)
    {
        fprintf(fp, "%lf %12lf\n", (double)i*dx, u[i]);
    }
    fclose(fp);
    return;
}

Now we can use gnuplot to display the data in a single file. First, create the script file shown in Program 2.3 in the same folder as your program. Then run the following command from the terminal.

$ gnuplot sample0.gp

Program 2.3: Script for creating a gif still image (sample0.gp)

1 set terminal gif
2 set output "test_still.gif" #Adjust as needed
3 set xrange [0:1] #Adjust as needed
4 set yrange [0:100] #Adjust as needed
5 plot "***.dat" using 1:2 with line
6 unset output

If it works, a gif still image file named test_still.gif will be found in the same folder.

Since the temperature changes over time, it is a good idea to create a gif animation to visualize the change in time. Here is how it works. First, create the two script files shown in Programs 2.4 and 2.5 in one folder. Then run the following command from the terminal.

$ gnuplot sample1.gp

If it works, a gif animation image file named test.gif will be found in the same folder.

Program 2.4: Script for creating a gif animation (sample1.gp)

reset
set nokey
set xrange [0:1] #Adjust as needed
set yrange [0:100] #Adjust as needed
set size square

set terminal gif animate delay 50 #Display interval can be adjusted by changing 50
set output "test.gif" #Specify the name of the output file

n0 = 0
nmax = 100 #Number of divisions in the time direction
dn = 20 #Step size (the program's T_SPAN value)

load "sample1.plt"

unset output

Program 2.5: Script for creating a gif animation (sample1.plt)

if(exist("n")==0 || n<0) n = n0

title_n = sprintf("t = %f", 1.0*n/nmax)
unset label
set label 1 at graph 0.05,0.95 title_n #The position of the label can be adjusted by changing (0.05, 0.95)

filename = sprintf("data_%06d.dat", n)
plot filename using 1:2 with line

n = n + dn
if(n <= nmax) reread

undefine n

Task 2.1

Set L = 1.0, T = 1.0, α = 1.0.

• Initial conditions: u(0, 0) = 20.0, u(x, 0) = 0.0 (0 < x < L), u(L, 0) = 50.0
• Boundary conditions: u(0, t) = 20.0, u(L, t) = 50.0

Based on the explanation so far, create a simulation of 1-D heat conduction (including file output) and complete the following tasks.

1. Run the program with Mx = 10, N = 1000 and print the results to a file every 20 time steps. Then
   • Check the temperature at t = 0, 0.1, 0.2, 0.5 by displaying it as a still image with gnuplot.
   • Create a gif animation with gnuplot.

2. For a more precise simulation, increase the number of divisions Mx. Note that if we just increase Mx, the simulation may fail (the program works correctly, but the calculated values are meaningless, such as NaN and Inf). Check whether it works with the setting Mx = 50, N = 1000.

3. To prevent the simulation from failing when the number of points in the spatial dimension (Mx) is increased, we should also increase the number of segments in time (N). Therefore, set Mx = 50 and then try N = 2000, 3000, 4000, 5000 separately. Check whether the simulation fails for each N. If the simulation is successful, create a gif animation of the result.


Task 2.2

In order to verify the validity of the simulation results, we compare the numerical solution and the analytical solution in this task. Set L = 1.0, T = 1.0, α = 1.0.

• Initial condition: u(x, 0) = sin(πx)
• Boundary condition: u(0, t) = u(L, t) = 0.0

The analytical solution of the partial differential equation Eq. (2.1) is

    u(x, t) = e^{−π²t} sin(πx).    (2.9)

1. Set Mx = 10, N = 1000 and run the program. Then, at t = 0.1, 0.5, compare the results calculated by the program with Eq. (2.9). Here is an example of the comparison. Add

   replot exp (-pi*pi*0.1)*sin (pi*x)

   to Program 2.3 (after line 5).

2. In general, the accuracy of the simulation improves as the number of divisions increases. Therefore, execute the program for the three cases (Mx, N) = (10, 200), (50, 5000), (100, 20000) and compute

    max_i |u_i^n − ũ_i^n| / |ũ_i^n|.    (2.10)

   Here u is the value obtained by the simulation (numerical solution), and ũ is the value obtained from the analytical solution.


Task 2.3

We will derive the partial differential equation shown in Eq. (2.1) here. Please fill in the blanks.

Let q(x, t) be the heat that flows through a unit cross section in a second, where flow in the positive direction of the x-axis is counted as positive. According to Fourier's law, if κ is the coefficient of thermal conductivity, then

    q(x, t) = (a) ___ .    (2.11)

Next, let h be a small length, and consider the region between x and x + h. The amount of heat stored in this small region in a second is

    ∆q = q(x, t) − q(x + h, t) ≈ (b) ___ .    (2.12)

Note that the Taylor expansion is performed in x, and the terms of order h² and higher are ignored.

If the density of the object is ρ and the specific heat is c_p, the amount of heat required to raise the temperature of this small region by 1 degree is (c) ___ . Assume that all the heat stored in this small region in a second is immediately spent on the temperature change of the object. Denoting α := κ/(c_p ρ), the temperature change per second is

    ∂u(x, t)/∂t ≈ ∆q / (d) ___ = (e) ___ .    (2.13)

Now, we get Eq. (2.1).

Task 2.4

When deriving Eq. (2.5), we approximated the derivatives. For the first-order derivative du(x)/dx, please give the "forward difference", "backward difference", and "central difference" approximations separately. Then, for the second-order derivative d²u(x)/dx², show the approximation obtained by applying the central difference to the first-order derivative twice.

Task* 2.5

In Eq. (2.5), the derivative in the time dimension was approximated by a "forward difference", and in the spatial dimension a "central difference" was selected. Please give the formula with a "backward difference" in the time dimension and a "central difference" in the spatial dimension, and describe how it differs from Eq. (2.5).


3 Numerical simulation for 2-D heat conduction

We will now simulate 2-D heat conduction. It is an advanced version of the 1-D case, and also the basis for the parallelization later.

Problem setting: 2-D heat conduction

Consider heat conduction in the plate-shaped object shown in Fig. 3.1. It is assumed that its thickness is very small, so we can ignore it. We also assume that heat exchange only happens on its edges.

Fig 3.1: Plate-shaped object

Partial differential equations and discretization

Write the temperature at time t at position (x, y) as u(x, y, t). The 2-D heat conduction can be described by

    ∂u(x, y, t)/∂t = α [ ∂²u(x, y, t)/∂x² + ∂²u(x, y, t)/∂y² ].    (3.1)

In this experiment, the initial condition is

    u(x, y, 0) = u_0(x, y),  (0 < x < Lx, 0 < y < Ly),    (3.2)

and the boundary condition is

    u(x, y, t) = g(x, y, t),  ((x, y) ∈ ∂Ω).    (3.3)

Here, ∂Ω is the set of points on the boundary, and u_0(x, y) and g(x, y, t) are given functions. By finding the solution u(x, y, t) we can know how the temperature of the plate changes.


Discretization

First, let the step sizes in the x, y directions be ∆x, ∆y, and the time step size be ∆t; then we have

    x_i := i∆x  (i = 0, 1, ..., Lx/∆x =: Mx),    (3.4)
    y_j := j∆y  (j = 0, 1, ..., Ly/∆y =: My),    (3.5)
    t_n := n∆t  (n = 0, 1, ..., T/∆t =: N).    (3.6)

The discretization in the spatial dimensions is shown in Fig. 3.2.

Fig 3.2: Discretization in 2D

Our purpose is to calculate the numerical solution of Eq. (3.1), i.e.,

    u_{i,j}^{(n)} := u(x_i, y_j, t_n).    (3.7)

Similar to the 1-D case, the derivatives in the x, y dimensions are approximated by central differences, and a forward difference is used for the time derivative. Then we get

    (u_{i,j}^{(n+1)} − u_{i,j}^{(n)})/∆t = α [ (u_{i+1,j}^{(n)} − 2u_{i,j}^{(n)} + u_{i−1,j}^{(n)})/(∆x)² + (u_{i,j+1}^{(n)} − 2u_{i,j}^{(n)} + u_{i,j−1}^{(n)})/(∆y)² ].    (3.8)

By using Eq. (3.8) and the given initial and boundary conditions, the numerical solutions u_{i,j}^{(n)} can be calculated.

Calculation of numerical solution

The calculation is basically the same as in 1-D. First, set the values at time step n = 0 with the initial condition. Then, compute the values of the other steps.

The main difference between the 1-D and 2-D calculations is, perhaps, how to store the values of u_{i,j} in an array. From the contents of "Review of C programming", we know that a 1-D array is more convenient when we allocate memory dynamically. So, we keep using a 1-D array here.
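For instance, one possible layout (our own sketch, not prescribed by the handout) stores u_{i,j} at position i*(My+1)+j of a 1-D array, so that a fixed-i line is contiguous in memory:

/* map the 2-D index (i, j), 0 <= i <= Mx, 0 <= j <= My, onto a 1-D array */
#define IDX(i, j) ((i)*(My + 1) + (j))

u  = (double *)malloc (sizeof (double)*(Mx + 1)*(My + 1));
uu = (double *)malloc (sizeof (double)*(Mx + 1)*(My + 1));

/* interior update of Eq. (3.8), assuming rx = alpha*dt/(dx*dx), ry = alpha*dt/(dy*dy) */
for(i = 1; i < Mx; i++)
{
    for(j = 1; j < My; j++)
    {
        uu[IDX(i,j)] = u[IDX(i,j)]
                     + rx*(u[IDX(i+1,j)] - 2.0*u[IDX(i,j)] + u[IDX(i-1,j)])
                     + ry*(u[IDX(i,j+1)] - 2.0*u[IDX(i,j)] + u[IDX(i,j-1)]);
    }
}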


Visualization

Visualization for the 2-D problem is also similar to the 1-D case learned before. First, output the computed result data to a file:

• Output x, y, u(x, y, t) separated by spaces.
• Output a blank line when x changes.

Then we get an output file like

#x y u(x,y,t)   ※ Comment line (can be omitted)
0.0 0.0 ***
0.0 0.1 ***
0.0 0.2 ***
[omit]
0.0 0.9 ***
0.0 1.0 ***
(Important: a blank line)
0.1 0.0 ***
0.1 0.1 ***
[omit]
0.1 0.9 ***
0.1 1.0 ***
(Important: a blank line)
0.2 0.0 ***
0.2 0.1 ***
[omit]
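A sketch of a 2-D output function in the same style as Program 2.2 (the function name, argument list, and the IDX macro from the sketch above are our own assumptions) might look like:

void print_data_2d(int Mx, int My, int n, double dx, double dy, double *u)
{
    FILE *fp;
    char sfile[256];
    int i, j;

    sprintf(sfile, "data_%06d.dat", n);
    fp = fopen(sfile, "w");
    fprintf(fp, "#x y u(x,y,t)\n");
    for(i = 0; i <= Mx; i++)
    {
        for(j = 0; j <= My; j++)
        {
            fprintf(fp, "%lf %lf %12lf\n", (double)i*dx, (double)j*dy, u[IDX(i,j)]);
        }
        fprintf(fp, "\n"); /* blank line when x changes */
    }
    fclose(fp);
}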

In the case of 2-D heat conduction, a bird's-eye view (as shown in Fig. 3.3) and a color distribution map (as shown in Fig. 3.4) are commonly used.

Fig 3.3: Bird's-eye view
Fig 3.4: Contour plot

Programs 3.1 and 3.2 are examples of outputting a still gif image with a bird's-eye view and a contour plot, respectively. Note that the l in "w l" is short for "line". After creating the script files, run the following command from the terminal.

$ gnuplot ***.gp

If it works, a gif still image file will be found in the same folder.

Program 3.1: Script for creating a gif still image with bird's-eye view (plot_3d_bird_still.gp)

reset
set terminal gif
set output "test_heat_bird_still.gif"
set nokey
set view equal xy
set xlabel 'x'
set ylabel 'y'

set xrange [0:1]
set yrange [0:1]
set zrange [0:100]
set size square

splot 'data_000000.dat' using 1:2:3 w l

unset output

Program 3.2: Script for creating a gif still image as contour plot (plot_3d_map_still.gp)

reset
set terminal gif
set output "test_heat_map_still.gif"
set nokey
set view equal xy
set xlabel 'x'
set ylabel 'y'

set xrange [0:1]
set yrange [0:1]
set zrange [0:100]

set cbrange [0:100] #Adjust as needed
set palette defined (0 'blue', 50 'red', 100 'yellow')
unset surface
unset ztics
set pm3d at b
set view 0,0
set colorbox vertical user origin 0.8, 0.2

splot 'data_000000.dat' using 1:2:3 w l

unset output

The simulation results can be visualized with a gif animation by gnuplot too. You can realize it on the basis of the 1-D scripts. Hint: the following line makes the time information clearer in the gif animation.

set label 1 center at screen 0.5, 0.85 title_n


Task 3.1

Create a simulation program based on the explanations so far, and simulate the state of 2-D heat conduction. Let Lx = Ly = 1.0, T = 1.0, α = 1.0. The initial and boundary conditions are given as:

• u(x, y, 0) = 0.0 ((x, y) ∉ ∂Ω)
• u(x, 0, t) = 20x + 10 (0 ≤ x ≤ Lx, t ≥ 0)
• u(x, Ly, t) = 40x + 40 (0 ≤ x ≤ Lx, t ≥ 0)
• u(0, y, t) = 30y² + 10 (0 < y < Ly, t ≥ 0)
• u(Lx, y, t) = 50y² + 30 (0 < y < Ly, t ≥ 0)

Execute the program with the following parameters and check the results.

1. Mx = My = 10, N = 1000
2. Mx = My = 20, N = 1000
3. Mx = My = 20, N = 4000

If it does not break down, create a gif animation with a bird's-eye view and a color distribution map. (The frequency of outputting the calculation results should be set appropriately so that the gif animation looks natural.)

Task 3.2

Set Lx = Ly = 1.0, T = 1.0, α = 1.0, and the initial and boundary conditions are the same as in Task 3.1. It is known that the simulation does not fail if we compute N with

    N = 2αT [ (Mx/Lx)² + (My/Ly)² ].

Therefore, we will determine N according to this formula and perform the following tasks. (Set the compile option to -O2.)

1. Measure the execution time in the case Mx = My = 100. See "Review of C programming" for the time measurement method. Note: do not include the time for outputting to files, since we only care about the execution time of the calculation.

2. In order to simulate more accurately, make the division in the spatial dimensions finer. Make Mx, My 10 times those in 1, that is, Mx = My = 1000. Let's assume that the execution time is proportional to the number of elements in the array. Please predict the execution time for Mx = My = 1000 from the real time measured in 1. Make a rough prediction, not a precise one.

3. In order to perform the simulation even more accurately, consider making the spatial division 100 times, 1000 times, and 10000 times finer. For each case, predict the execution time in the same way as in 2. At the same time, estimate the memory requirements (bytes). Note that we use two double-precision (double) arrays to store the temperature data.


Task* 3.3

Set Lx = Ly = 1.0, T = 1.0, α = 1.0; the initial condition is u(x, y, 0) = sin(πx) sin(πy), and the boundary condition is u(x, y, t) = 0.0 ((x, y) ∈ ∂Ω).

1. Try to find the analytical solution in this case. (Hint: try to guess from the analytical solution in one dimension (Task 2.2).)

2. Compare the analytical solution with the simulation results at t = 0.1 for Mx = My = 10, 50, 100 separately. Take N from the formula shown in Task 3.2. The method of comparison is the same as in Task 2.2.


4 Basics of parallel programming with MPI

In this part, we will learn the basic techniques of MPI (Message Passing Interface), which is widely used for parallelizing simulation programs.

Introduction

Let's start with an overview of MPI. Strictly speaking, MPI is an API standard for message communication between processes. Implementations of this standard (MPI libraries) are provided by vendors, for free or for a fee. Nowadays, we can use MPI from the C language (or Fortran, etc.).

The parallel model of MPI is SPMD (Single Program Multiple Data) (see Fig. 4.1). The unit of processing is called a process. All of the processes are carried out at the same time. Each MPI process is assigned a unique identification number (starting from 0) called its rank, and has its own global variables and environment. The programs executed on each process are the same (a.out in the case of Fig. 4.1); that is "Single Program". On the other hand, the data held by each process is generally different; that is "Multiple Data".

Fig 4.1: Parallel processing of MPI

Let's illustrate the parallelism mechanism of MPI in more detail with Fig. 4.2. As already mentioned, since the program on each process is the same, all of the processes have the same variables. For example, in Fig. 4.2, both the rank 0 process (hereinafter, process 0) and the rank 1 process (process 1) have a double-type array named "data". However, each process can hold different elements (values).

Each process can only use the data held by itself, since the memory spaces of the processes are independent of each other. If processes want to exchange data with each other, they have to send/receive data (pass messages) using functions given by MPI. See Fig. 4.2 as an example:

• process 0: receives data from process 1 by a function (MPI_Recv)
• process 1: sends data to process 0 by a function (MPI_Send)

There are three kinds of functions provided by MPI. We will focus on one-to-one communication due to the limited credit hours.

• One-to-one communication: functions for sending and receiving data between two processes (e.g. MPI_Send, MPI_Recv)

• Group communication: functions for aggregating data across many processes (e.g. MPI_Reduce, MPI_Gather)
• Others: functions for initialization, termination, information acquisition, time measurement, etc. (e.g. MPI_Init, MPI_Comm_size, MPI_Wtime)

Fig 4.2: Schematic diagram of memory space and data passing of MPI

The minimum knowledge you need for doing MPI parallelization is:

• the basic structure of an MPI program (initialization, termination, acquisition of the rank number, etc.)
• how to compile an MPI program (environment dependent)
• how to run an MPI program (environment dependent)


MPI programming example (1): MPI Hello world

In this program, each process sleeps for my_rank * 2 seconds, where my_rank is its rank number. Then a random number is generated. Along with "Hello world", the rank number and the random number are displayed.

Program 4.1: MPI hello world

1  #include <mpi.h>
2  #include <stdio.h>
3  #include <stdlib.h>
4  #include <sys/time.h>
5  #include <unistd.h> /* for sleep() */
6  int main(int argc, char *argv[])
7  {
8      int nprocs, my_rank;
9      int rand_num;
10     struct timeval s;
11
12     MPI_Init(&argc, &argv);
13     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
14     MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
15
16     sleep(my_rank*2);
17
18     gettimeofday(&s, NULL);
19     srand(s.tv_sec);
20     rand_num = rand()%1000;
21
22     printf("Hello world (rank: %d, random_num: %3d)\n", my_rank, rand_num);
23
24     MPI_Finalize();
25
26     return 0;
27 }

In this program, the lines related to MPI are:

• Line 1: include the header file for using MPI.
• Line 12: MPI initialization (must be done before calling any other MPI function).
• Line 13: get the total number of processes.
• Line 14: get the rank number.
• Line 24: MPI termination.

Compile Program 4.1 with

$ mpicc mpi_parallel_hello.c

Then, run the executable MPI program with

$ mpirun -np P ./a.out

where P is the number of processes that execute the MPI job.

Task 4.1

Set the number of processes P to 1, 2, and 4, run Program 4.1, and check the outputs.


MPI programming example (2): One-to-one communication (in the case of two processes)

In Program 4.1, multiple processes are carried out in parallel, but there is no data exchange. Now, we will learn how to send/receive messages between two processes.

Program 4.2: Sending and receiving data between 2 processes

1  #include <mpi.h>
2  #include <stdio.h>
3  #include <stdlib.h>
4  #include <sys/time.h>
5  #include <unistd.h> /* for sleep() */
6  int main(int argc, char *argv[])
7  {
8      int nprocs;
9      int my_rank, your_rank;
10     int my_num, your_num;
11     MPI_Status stat;
12     struct timeval s;
13
14     MPI_Init(&argc, &argv);
15     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
16     MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
17
18     sleep(my_rank*2);
19
20     gettimeofday(&s, NULL);
21     srand(s.tv_sec);
22     my_num = rand()%1000;
23
24     if(my_rank == 0)
25     {
26         your_rank = 1;
27         MPI_Send(&my_num, 1, MPI_INT, your_rank, 0, MPI_COMM_WORLD);
28         MPI_Recv(&your_num, 1, MPI_INT, your_rank, 1, MPI_COMM_WORLD, &stat);
29     }
30     else
31     {
32         your_rank = 0;
33         MPI_Recv(&your_num, 1, MPI_INT, your_rank, 0, MPI_COMM_WORLD, &stat);
34         MPI_Send(&my_num, 1, MPI_INT, your_rank, 1, MPI_COMM_WORLD);
35     }
36
37     printf("My rank is %d. My num is %3d. Your num is %3d.\n", my_rank, my_num, your_num);
38
39     MPI_Finalize();
40
41     return 0;
42 }

In this program, a random number is generated in each process as in Program 4.1. After that,

• Line 27: process 0 sends its random number to process 1 with MPI_Send.
• Line 28: process 0 receives from process 1 with MPI_Recv.
• Line 33: process 1 receives from process 0 with MPI_Recv.
• Line 34: process 1 sends its random number to process 0 with MPI_Send.

MPI_Send and MPI_Recv used in this example are one-to-one communication functions provided by MPI.

• int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)

Function for sending data; blocks until sending is completed.


– buf: a pointer to the buffer that contains the data to be sent.
– count: the number of elements in the buffer. If the data part of the message is empty, set the count parameter to 0.
– datatype: the data type of the elements in the buffer array, for example MPI_INT, MPI_DOUBLE, MPI_LONG, MPI_CHAR.
– dest: the rank of the destination process within the communicator that is specified by the comm parameter.
– tag: the message tag that can be used to distinguish different types of messages. Must be the same number on the sending side and the receiving side.
– comm: the handle to the communicator, basically MPI_COMM_WORLD.

• int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)

Function for receiving data; blocks until receiving is completed.

– buf: a pointer to the buffer for storing the received data.
– count: the number of elements to be received.
– datatype: the data type of the elements in the buffer array.
– source: the rank of the sending process within the communicator that is specified by the comm parameter.
– tag: the message tag that can be used to distinguish different types of messages. Must be the same number on the sending side and the receiving side.
– comm: the handle to the communicator, basically MPI_COMM_WORLD.
– status: on return, contains a pointer to an MPI_Status structure.

Note that some MPI message-passing functions, such as MPI_Send and MPI_Recv, do not move to the next step until the sending/receiving is completed (library dependent). This is called "blocking communication". Therefore, if both processes call MPI_Send first, as in Fig. 4.3, then neither of them can complete its sending mission since the other one is not ready to receive, so they wait forever. This state is called deadlock, and it should be treated very carefully in MPI programming. To avoid it, we can let one process send first and the other receive first, as shown in Fig. 4.4. There are other solutions, such as using MPI_Sendrecv or "non-blocking communication" (asynchronous communication) functions.

Fig 4.3: Deadlock
Fig 4.4: The right way

Task 4.2

Set the number of processes to 2, run Program 4.2, and check the outputs.


MPI programming example (3): One-to-one communication (in the case of multiple processes)

Next, consider sending/receiving messages when the number of processes is general (possibly greater than 2).

Program 4.3: Sending and receiving data between processes (when the number of processes is general)

1  #include <mpi.h>
2  #include <stdio.h>
3  #include <stdlib.h>
4  #include <sys/time.h>
5  #include <unistd.h> /* for sleep() */
6  int main(int argc, char *argv[])
7  {
8      int nprocs;
9      int my_rank, prev_rank, next_rank;
10     int my_num, prev_num, next_num;
11     MPI_Status stat;
12     struct timeval s;
13
14     MPI_Init(&argc, &argv);
15     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
16     MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
17
18     sleep(my_rank*2);
19
20     gettimeofday(&s, NULL);
21     srand(s.tv_sec);
22     my_num = rand()%1000;
23
24     prev_rank = (my_rank-1+nprocs)%nprocs;
25     next_rank = (my_rank+1)%nprocs;
26
27     MPI_Sendrecv(&my_num, 1, MPI_INT, next_rank, 0,
28                  &prev_num, 1, MPI_INT, prev_rank, 0,
29                  MPI_COMM_WORLD, &stat);
30     MPI_Sendrecv(&my_num, 1, MPI_INT, prev_rank, 1,
31                  &next_num, 1, MPI_INT, next_rank, 1,
32                  MPI_COMM_WORLD, &stat);
33
34     printf("My rank is %d. My num is %3d. Prev num is %3d. Next num is %3d.\n",
35            my_rank, my_num, prev_num, next_num);
36
37     MPI_Finalize();
38
39     return 0;
40 }

In Program 4.3, each process exchanges data with its neighbors as shown in Fig. 4.5. It is assumed that process 0 and process nprocs − 1 are adjacent, like a ring.

As mentioned previously, we should avoid deadlock when sending/receiving data. Therefore, for simplicity, we use the function MPI_Sendrecv here. MPI_Sendrecv is basically a combination of MPI_Send and MPI_Recv. The send and the receive are processed in an appropriate order so that deadlock does not occur inside the function. The sender and the receiver do not have to be the same process, nor does the data.

• int MPI_Sendrecv(const void *sendbuf, int sendcount, MPI_Datatype sendtype, int dest, int sendtag, void *recvbuf, int recvcount, MPI_Datatype recvtype, int source, int recvtag, MPI_Comm comm, MPI_Status *status)

Function for sending and receiving data.

– sendbuf: initial address of the send buffer.
– sendcount: number of elements in the send buffer.
– sendtype: type of the elements in the send buffer.
– dest: rank of the destination.
– sendtag: send tag.
– recvbuf: initial address of the receive buffer.
– recvcount: number of elements in the receive buffer.
– recvtype: type of the elements in the receive buffer.
– source: rank of the source.
– recvtag: receive tag.
– comm: the handle to the communicator, basically MPI_COMM_WORLD.
– status: on return, contains a pointer to an MPI_Status structure.

In Program 4.3,

• lines 27–29: send data to the next process and receive data from the previous process;
• lines 30–32: send data to the previous process and receive data from the next process.

A common mistake is to send data to the next process and also receive data from the next process, which generally causes deadlock.

Fig 4.5: Sending and receiving in the case of more than 2 processes

Task 4.3

Set the number of processes to 2, 3, and 4, run Program 4.3, and check the outputs.


MPI programming example (4): Compute the sum of array values in parallel

The serial version is given in Program 4.4. It computes the sum of the elements in array val. Since val[i] = i, the sum can be checked easily (it is N(N − 1)/2 = 49995000 for N = 10000). Please don't calculate the sum with this formula directly in your program.

Program 4.4: Computing the sum of array elements (serial version)

1  #include <stdio.h>
2  #define N 10000
3  int main(int argc, char *argv[])
4  {
5      double val[N];
6      int i;
7      double sum;
8
9      for(i = 0; i < N; i++)
10     {
11         val[i] = (double)i; // initialization
12     }
13
14     sum = 0.0;
15     for(i = 0; i < N; i++)
16     {
17         sum += val[i];
18     }
19
20     printf("sum is %8lf\n", sum);
21
22     return 0;
23 }

Now we consider the parallelization of lines 14–18, which are the main calculation part of Program 4.4.

• Let P be the number of processes. Each process only computes the sum of N/P elements. The last process is in charge of the remainder.
• Let process 0 gather the results calculated by each process. That is, process 0 receives data from the others with MPI_Recv, and the other processes send data to process 0 with MPI_Send.
• Let process 0 calculate the total sum and output it.

Task 4.4

Parallelize Program 4.4, run it with 2, 3, and 4 processes separately, and check the results.

Task* 4.5

Since each process only computes the sum of a portion of array val, it is more reasonable for each process to store only those elements used by itself. Please improve Program 4.4 to reduce the memory required by each process. Then execute it with 2, 3, and 4 processes, and check the results.


Supplementary explanation of Task 4.4

First, compile the serial version (Program 4.4), then run it.

$ gcc main.c
$ ./a.out

The output is

sum is ****.***

There is only one process now. Now try the following command without recompiling the program.

$ mpirun -np 4 ./a.out

The output is

sum is ****.***
sum is ****.***
sum is ****.***
sum is ****.***

The command mpirun starts 4 processes this time. They execute the same program (a.out). That is, the same computation is repeated 4 times, which is meaningless.

Consider the parallelization of Program 4.4. First, insert

#include <mpi.h>
MPI_Init (&argc, &argv);
MPI_Finalize ();

at the proper positions. Then get the number of processes (n_procs) and the rank (my_rank).

MPI_Comm_size (MPI_COMM_WORLD, &n_procs);
MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);

Note that the rank of a process (my_rank) starts from 0 and ends with n_procs-1. We can confirm whether they are acquired correctly with the following code.

printf ("my_rank is %d,  n_procs is %d,  sum is %8lf\n", my_rank, n_procs, sum);

Now, divide the task among P processes. Our purpose is to calculate

    S = Σ_{i=0}^{N−1} val[i].

Rewrite it as

    S = Σ_{p=0}^{P−1} S_p,
    S_p = Σ_{i=N_p}^{N_{p+1}−1} val[i]  (p = 0, 1, ..., P − 1),  N_0 = 0, N_P = N.


Note that the calculation of each S_p is independent of the others. In other words, it is possible to calculate all S_p at the same time. Therefore, let P equal the number of processes (n_procs), and let each process calculate S_p, where p is the rank of the process.

If N is divisible by n_procs, we set l_n = N/n_procs as the number of elements that one process is in charge of (hereinafter referred to as l_n). If it is not divisible, let the last process (rank n_procs-1) be in charge of the remainder.

l_n = N / n_procs;
if (my_rank == n_procs-1)
{
    l_n += N%n_procs;
}

Then the range of indexes that each process is responsible for is determined as follows.

l_istart = (N/n_procs) * my_rank;
l_iend = l_istart + l_n - 1;

Hence the loop in Program 4.4 (line 15) should be modified to

for (i = l_istart; i <= l_iend; i++)

Note that each process only computes and holds a partial sum (S_p). In order to compute the sum of all the array elements (S), we let a certain process (for example, process 0) gather the computed S_p from the other processes and add them together.

In detail, first, let every process except process 0 send its partial sum to process 0.

MPI_Send (&sum, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);

Then, let process 0 receive the partial sum from each process i (i = 1, 2, ..., n_procs-1) in a loop and add it to sum.

MPI_Recv (&tmp_sum, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD, &stat);
sum += tmp_sum;

Finally, let process 0 output the result, which is the sum of all elements in array val. Making only rank 0 print the value of sum lets us see the result exactly once.

if (my_rank == 0)
{
    printf ("sum is %8lf\n", sum);
}

After modifying the program so far, recompile it, and execute it with 4 processes.

$ mpicc main.c
$ mpirun -np 4 ./a.out

Change the number of processes, and check whether the results are consistent.


Supplementary explanation of Task 4.5

Looking back at the program achieved in Task 4.4, we can see that each process only uses part of array val, from l_istart to l_iend. However, each process stores the whole array; that is, memory is wasted. This becomes a problem when we want to handle large problems. Therefore, we will improve the program from Task 4.4 so that only the required memory is allocated.

First, allocate memory dynamically,

val = (double *)malloc (sizeof (double)*l_n);

where l_n is the number of elements referenced by each process.

Now, consider initializing the elements of array val. In Task 4.4 it was

val[i] = (double)i;

However, the size of val is now l_n, not N. Therefore, the range of the index i on the left side (val[i]) is from 0 to l_n-1, while the index i on the right side ((double)i) runs from l_istart to l_iend. There is a gap between the local index and the global index (see Fig. 4.6), which needs to be dealt with carefully.

Fig 4.6: Global index & local index

A conversion between the local index and the global index is needed. Hereinafter, the local index is referred to as l_i, and the global index as g_i. The conversions between them are

g_i = (N/n_procs)*my_rank + l_i;
l_i = g_i - (N/n_procs)*my_rank;

By using this, the initialization part becomes

for (l_i = 0; l_i < l_n; l_i++)
{
    val[l_i] = (double)((N/n_procs)*my_rank + l_i);
}


5 Parallelization of 2-D heat conduction simulation with MPI

Parallelization policy: Data distribution and communication

In the 2-D heat conduction simulation, the values u_{i,j} (1 ≤ i ≤ Mx − 1, 1 ≤ j ≤ My − 1) are calculated at each time step. For the calculation of u_{i,j}, only the values of the previous time step are necessary. There is no dependency among the unknowns at different locations; in other words, it is possible to calculate u_{i,j} at different points at the same time. Therefore, we divide the whole area into several regions and let each process be in charge of one region.

There are various ways of dividing the area. We consider the division shown in Fig. 5.1. Each process only computes the u_{i,j} in its region, l_istart ≤ i ≤ l_iend, 1 ≤ j ≤ My − 1. (A sketch of how each rank's range could be computed follows below.)

Fig 5.1: Dividing the area into 3 regions (in the case of 3 processes)
Fig 5.2: Overlapping regions of process 1 (black dots)

Note that, in order to calculate u_{i,j} at point (i, j), we need the values at this point and at the four adjacent points (i ± 1, j) and (i, j ± 1) at the previous time level. Fig. 5.2 is an example focusing on process 1. Note that the values at the black dots are not held by process 1; they are held by process 0 and process 2. So, process 1 has to exchange the values in the "overlapping region" with its neighbor processes.
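As a rough sketch (our own, following the same splitting rule as in the supplementary explanation of Task 4.4), the range of i owned by each rank might be computed like:

/* split the i-direction (interior points 1..Mx-1) among n_procs ranks */
l_n = (Mx - 1) / n_procs;
l_istart = 1 + l_n * my_rank;
if (my_rank == n_procs - 1)
{
    l_n += (Mx - 1) % n_procs;   /* last rank takes the remainder */
}
l_iend = l_istart + l_n - 1;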


Precautions for programming: Sending and receiving multiple values

In this program, it is necessary to send/receive multiple values in the overlapping region. The functions MPI_Send/Recv/Sendrecv can send/receive multiple values at once if they are contiguous in memory. The initial address is specified with the argument buf, and the number of values with the argument count.
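For example, with the IDX layout sketched in Section 3 (where a fixed-i line u[IDX(i,0)]..u[IDX(i,My)] is contiguous), the halo exchange might look like the following; the variable names follow the sketches above and are our own assumptions, and the pattern mirrors Program 4.3:

/* send my last owned line to next_rank;
   receive prev_rank's last owned line into my halo line l_istart-1 */
MPI_Sendrecv(&u[IDX(l_iend, 0)],     My+1, MPI_DOUBLE, next_rank, 0,
             &u[IDX(l_istart-1, 0)], My+1, MPI_DOUBLE, prev_rank, 0,
             MPI_COMM_WORLD, &stat);
/* send my first owned line to prev_rank;
   receive next_rank's first owned line into my halo line l_iend+1 */
MPI_Sendrecv(&u[IDX(l_istart, 0)],   My+1, MPI_DOUBLE, prev_rank, 1,
             &u[IDX(l_iend+1, 0)],   My+1, MPI_DOUBLE, next_rank, 1,
             MPI_COMM_WORLD, &stat);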

Precautions for programming: Order of sending and receiving

In this program, each process has to exchange data with its neighbor processes. We should be very careful to avoid deadlock. Note that the first process and the last process only have one neighbor each. Of course, we can use "if else" to handle this, but there is an easier way that avoids branching: prepare a variable prev_rank holding the rank of the previous process, and next_rank holding the rank of the next one. For the first process, prev_rank can be set to MPI_PROC_NULL, and likewise next_rank for the last process. (Sends to and receives from MPI_PROC_NULL complete immediately and do nothing.)
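A minimal sketch of this setup (variable names as above):

prev_rank = (my_rank == 0)          ? MPI_PROC_NULL : my_rank - 1;
next_rank = (my_rank == nprocs - 1) ? MPI_PROC_NULL : my_rank + 1;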

Precautions for programming: Outputting and visualization

We should avoid multiple processes accessing the same file at the same time. A relatively simple way is to let each process output to a different file. For example, process 0 outputs to file "data_00_001000.dat", process 1 outputs to file "data_01_001000.dat", etc. We can then put them together when visualizing with gnuplot.

splot for [i=0:2] sprintf("data_%02d_001000.dat", i) using 1:2:3 w l
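On the C side, the per-rank file name can be built with sprintf; a small sketch following the naming pattern above:

char sfile[256];
sprintf(sfile, "data_%02d_%06d.dat", my_rank, n); /* rank and time step in the name */
fp = fopen(sfile, "w");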

Precautions for programming: Measurement of execution time

Since multiple processes run in parallel, we should make sure that all processes start the measurement at the same time, and wait for the slowest one.

MPI_Barrier (MPI_COMM_WORLD);
t_start = MPI_Wtime ();
/* processing to be measured */
MPI_Barrier (MPI_COMM_WORLD);
t_end = MPI_Wtime ();
time = t_end - t_start;

MPI_Barrier is a function that synchronizes the processes. MPI_Wtime is a time acquisition function. Both of them are provided by MPI.


Task 5.1

Parallelize the 2-D heat conduction simulation program created before. Don't worry about wasted memory in this task. The parameters are basically the same as in Task 3.1: Lx = Ly = 1.0, T = 1.0, α = 1.0, Mx = My = 200, N = 160000. The initial and boundary conditions are

• u(x, y, 0) = 0.0 ((x, y) ∉ ∂Ω)
• u(x, 0, t) = 20x + 10 (0 ≤ x ≤ Lx, t ≥ 0)
• u(x, Ly, t) = 40x + 40 (0 ≤ x ≤ Lx, t ≥ 0)
• u(0, y, t) = 30y² + 10 (0 < y < Ly, t ≥ 0)
• u(Lx, y, t) = 50y² + 30 (0 < y < Ly, t ≥ 0)

Set the number of processes from 1 to 8. Output the bird's-eye view of the solution at the final time (T), and measure the execution time for each case. Note:

• When testing whether the program is correct, set Mx, My, N to small values.
• When measuring time, do not include input/output time.

Task 5.2

Modify the program in Task 5.1 to avoid wasting memory. Then change the number of processes from 1 to 8 and execute it to verify the operation. The settings and conditions are the same as in Task 5.1.

Task* 5.3

Consider a heat source inside the plate. The partial differential equation becomes

    ∂u(x, y, t)/∂t = α [ ∂²u(x, y, t)/∂x² + ∂²u(x, y, t)/∂y² ] + S(x, y),    (5.1)

where S(x, y) is the strength of the heat source at position (x, y) and is time-independent.

Set Lx = Ly = 1.0, T = 1.0, α = 1.0, Mx = My = 40, N = 10000. The initial condition is u(x, y, 0) = 0.0, and the boundary condition is u(x, y, t) = 0.0, (x, y) ∈ ∂Ω. There is a heat source S(0.5, 0.5) = 40000 (zero elsewhere). Modify your program to simulate the state of heat conduction under these conditions.

Task* 5.4

Consider the heat conduction problem (5.1) with Lx = Ly = 1.0, α = 1.0 and the analytical solution

    u(x, y, t) = −e^{−2π²αt} sin(πx) sin(πy) + 20xy(x − 1)(y − 1).    (5.2)

The boundary condition is u(x, y, t) = 0.0, (x, y) ∈ ∂Ω, and the initial condition is obtained by setting t = 0 in Eq. (5.2). The heat source is S(x, y) = 40α(x(1 − x) + y(1 − y)). Get simulation results at t = 0.1 using Mx = My = 10, 50, 100 separately. Take N from the formula shown in Task 3.2. Compare the results with the analytical solution using figures, compute the errors according to Eq. (2.10), and analyze them.


6* Computation of steady-state heat conduction

For the heat conduction problem, if the boundary conditions are time-independent, the system arrives at a steady state after sufficient time. If our purpose is to find this steady state, we set the time change of u, ∂u/∂t, to 0. Therefore, in the 1-D case, the problem becomes

    d²u(x)/dx² = 0.    (6.1)

As before, discretizing it with a central difference (number of divisions: M), we get

    (u_{i+1} − 2u_i + u_{i−1})/∆x² = 0  ⇔  2u_i − u_{i+1} − u_{i−1} = 0,  (1 ≤ i ≤ M − 1),    (6.2)

where u_0 and u_M are determined by the boundary conditions. This is a linear system with M − 1 unknowns.

There are various methods for solving this linear system, for example LU decomposition (Gaussian elimination), the Jacobi method, and the CG method. In general, when the coefficient matrix is large and sparse (mostly zero components), LU decomposition is not practical. We will solve the system with the conjugate gradient (CG) method.

The algorithm of the CG method is shown in Algorithm 1. The initial approximate solution is x_0 = 0, and ε = 10⁻¹² is a tolerance for the convergence judgment.

Algorithm 1 CG method (Conjugate Gradient)

 1: x_0 = 0
 2: r_0 = b, p_0 = r_0
 3: for k = 0, 1, 2, ... do
 4:     q_k = A p_k
 5:     α_k = (r_k^T r_k) / (p_k^T q_k)
 6:     x_{k+1} = x_k + α_k p_k
 7:     r_{k+1} = r_k − α_k q_k
 8:     if ||r_{k+1}||_2 / ||b||_2 ≤ ε then
 9:         convergence
10:     end if
11:     β_k = (r_{k+1}^T r_{k+1}) / (r_k^T r_k)
12:     p_{k+1} = r_{k+1} + β_k p_k
13: end for
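For reference, here is a minimal C sketch of Algorithm 1 applied to the system of Eq. (6.2); this is our own illustration, and instead of storing A it exploits the tridiagonal structure in the matrix-vector product. The boundary values here follow Task 6.2 and are assumptions of the sketch.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* q = A p for the (M-1)x(M-1) tridiagonal matrix of Eq. (6.2): A = tridiag(-1, 2, -1) */
void matvec(int m, const double *p, double *q)
{
    for (int i = 0; i < m; i++)
    {
        q[i] = 2.0*p[i];
        if (i > 0)   q[i] -= p[i-1];
        if (i < m-1) q[i] -= p[i+1];
    }
}

double dot(int m, const double *a, const double *b)
{
    double s = 0.0;
    for (int i = 0; i < m; i++) s += a[i]*b[i];
    return s;
}

int main(void)
{
    int M = 100, m = M - 1;                 /* number of divisions / unknowns */
    double u0 = 0.2, uM = 1.0;              /* boundary conditions (Task 6.2) */
    double *x = calloc(m, sizeof(double));  /* x_0 = 0 */
    double *r = calloc(m, sizeof(double));
    double *p = calloc(m, sizeof(double));
    double *q = calloc(m, sizeof(double));

    r[0] = u0; r[m-1] = uM;                 /* r_0 = b: boundary values moved to the RHS */
    for (int i = 0; i < m; i++) p[i] = r[i];
    double rr = dot(m, r, r), bb = rr;

    for (int k = 0; k < 10000; k++)
    {
        matvec(m, p, q);
        double alpha = rr / dot(m, p, q);
        for (int i = 0; i < m; i++) { x[i] += alpha*p[i]; r[i] -= alpha*q[i]; }
        double rr_new = dot(m, r, r);
        if (sqrt(rr_new/bb) <= 1e-12) break; /* ||r_{k+1}||_2/||b||_2 <= eps */
        double beta = rr_new / rr;
        rr = rr_new;
        for (int i = 0; i < m; i++) p[i] = r[i] + beta*p[i];
    }
    for (int i = 0; i < m; i++) printf("%lf %lf\n", (double)(i+1)/M, x[i]);
    free(x); free(r); free(p); free(q);
    return 0;
}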


Task* 6.1

Write the unknowns of the simultaneous equations of Eq. (6.2) as the vector

    x = (u_1, u_2, ..., u_{M−2}, u_{M−1})^T.

Then Eq. (6.2) can be rewritten as

    Ax = b.

Please give the elements of the coefficient matrix A and the right-hand-side vector b.

Task* 6.2

Use the conjugate gradient method to find the steady state of 1-D heat conduction. The boundary conditions are u_0 = 0.2, u_M = 1.0. Compare the result with what we simulated before.

