Assignment 2, Deadline 15 November 2019
General Remarks
• The complete assignment must be written as a single Jupyter notebook
that contains the code, relevant output, graphics, and all explanations.
• The notebook must be executable without errors. Before submitting,
reload the notebook and run it from scratch to confirm that it runs
without errors. Any submitted notebook that does not run correctly will
automatically receive a mark of 0.
• While you can (and should) include results from longer runs in your
notebook, make sure that the parameter choices in your submitted notebook
are such that it does not take longer than 2 minutes to run on your
computer. We will most likely use much faster machines for marking, but
notebooks with a substantially longer running time will be rejected.
• Your code must follow the PEP8 coding guidelines, with the only exception
that lines can be longer than 80 characters. Please do not make them much
longer than 100 characters, so that they remain readable (consider this a
soft limit). In severe cases, failure to adhere to the PEP8 coding
guidelines can result in substantial point reductions.
• Correctness of the code counts for 40% of the mark. You will receive
30% for presentation, which includes sensible graphs and thorough documentation.
The remaining 30% is for efficient coding. However, a submission without
documentation will still count as a fail, as documentation is a mandatory
component.
• You will be able to submit your Jupyter notebooks from a few days before
the deadline onwards to the Moodle course page. A corresponding
announcement will be made in time.
The following two questions are computational exercises that require you to
develop code implementations with PyOpenCL. It is expected that you develop
efficient implementations that make use of advanced CPU parallelisation features.
If you have problems with OpenCL CPU drivers on your own computers
you can use the Microsoft Azure Notebook platform.
Question 1: OpenCL CSR matrix-vector product
Given a sparse matrix in CSR format, develop a class that derives from SciPy's
LinearOperator and is initialised with the data, indices, and indptr arrays
that describe the sparse matrix. The class shall provide an efficient
OpenCL-accelerated matrix-vector product. Make use of efficient parallelisation and
SIMD features provided by Intel’s AVX2 technology. Make sure that your data
movements are efficient. In particular, the sparse matrix data should only be
transferred to the device once and not during each matrix/vector product.
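
When developing the OpenCL kernel, it can help to have a slow but obviously correct reference to compare against. The following is a minimal pure-Python sketch of the CSR product; the function name `csr_matvec_reference` and the small 3 × 3 test matrix are illustrative assumptions, not part of the assignment. The body of the outer loop is roughly the work a single work-item would do if one work-item were assigned per row, which is only one of several possible parallelisation strategies.

```python
def csr_matvec_reference(data, indices, indptr, x):
    """Reference CSR product y = A @ x in pure Python (not optimised)."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        # The nonzeros of this row are data[indptr[row]:indptr[row + 1]],
        # and indices[k] is the column of the entry data[k].
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


# Hypothetical 3 x 3 example matrix: [[1, 0, 2], [0, 3, 0], [4, 0, 5]]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
indices = [0, 2, 1, 0, 2]
indptr = [0, 2, 3, 5]
result = csr_matvec_reference(data, indices, indptr, [1.0, 1.0, 1.0])
```

Comparing the output of an OpenCL implementation with such a reference on small random matrices is a cheap correctness check before measuring performance.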
Question 2: Potentials generated by random particles.
We consider the Laplace equation
$$-\nabla^2 u = f$$
on the unit square with zero boundary conditions. Based on the finite-difference
5-point stencil discretization discussed in the lecture, develop a LinearOperator
that takes a vector u of values on an M × M grid $(x_i, y_j)$ with
$x_i = \frac{i}{M+1}$, $i = 1, \dots, M$, and $y_j$ defined in the same way,
and returns the approximate evaluation of $-\nabla^2 u$ for this vector.
The evaluation of the 5-point stencil should be parallelized using PyOpenCL
without storing the matrix explicitly.
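
A matrix-free stencil evaluation can be sketched in pure Python as follows. This is an unoptimised reference under stated assumptions, not the expected PyOpenCL solution: it assumes row-major storage with u[j * m + i] holding the value at grid point (x_i, y_j), mesh width h = 1/(M+1), and the standard scaling (4u − left − right − down − up)/h². The function name `apply_neg_laplacian` is hypothetical. In an OpenCL version, each (i, j) pair would typically map to one work-item.

```python
import math


def apply_neg_laplacian(u, m):
    """Apply the 5-point stencil for -laplace(u) to a flat list of m * m values.

    Assumes row-major storage (u[j * m + i]) and zero boundary values.
    """
    h = 1.0 / (m + 1)
    out = [0.0] * (m * m)
    for j in range(m):
        for i in range(m):
            k = j * m + i
            # Neighbours outside the grid lie on the boundary, where u = 0.
            left = u[k - 1] if i > 0 else 0.0
            right = u[k + 1] if i < m - 1 else 0.0
            down = u[k - m] if j > 0 else 0.0
            up = u[k + m] if j < m - 1 else 0.0
            out[k] = (4.0 * u[k] - left - right - down - up) / (h * h)
    return out


m = 8
h = 1.0 / (m + 1)
# sin(pi x) sin(pi y) sampled on the grid is an eigenvector of the stencil,
# with eigenvalue 8 sin^2(pi h / 2) / h^2, which gives a simple check.
u = [math.sin(math.pi * (i + 1) * h) * math.sin(math.pi * (j + 1) * h)
     for j in range(m) for i in range(m)]
eigenvalue = 8.0 * math.sin(math.pi * h / 2.0) ** 2 / (h * h)
au = apply_neg_laplacian(u, m)
max_error = max(abs(a - eigenvalue * v) for a, v in zip(au, u))
```

The eigenvector check is useful because it tests the interior stencil and the boundary handling at the same time, without needing an assembled matrix.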
The CG iterative solver in SciPy only requires a LinearOperator that implements
matvec. As right-hand side you can use the simple function f = 1.
For different values of M, plot the convergence curve of the relative residual
$r = \|b - A x_j\|_2 / \|b\|_2$, where $x_j$ is the j-th CG iterate (a semilogy
plot is best for this). Also, investigate how the number of iterations grows
as M increases. Finally, produce nice plots of the solution u on the unit square.
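
To make the residual bookkeeping concrete, here is a small sketch that spells out CG by hand in pure Python and records the relative residual after every iteration. It is not the SciPy solver: with SciPy's cg, the same history can be collected through the callback argument, which receives the current iterate. The names `cg_with_history` and `matvec_1d` are hypothetical, and the 1D operator is only a stand-in for the 2D stencil operator so that the example stays self-contained. Note that the history uses the recursively updated CG residual, which can differ slightly from the true residual b − A x_j in floating point.

```python
import math


def matvec_1d(u):
    """Stand-in SPD operator: 1D stencil 2 u_i - u_{i-1} - u_{i+1}."""
    n = len(u)
    return [2.0 * u[i]
            - (u[i - 1] if i > 0 else 0.0)
            - (u[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def norm2(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(t * t for t in v))


def cg_with_history(matvec, b, tol=1e-8, max_iter=1000):
    """Plain CG from x_0 = 0; returns the iterate and the residual history."""
    x = [0.0] * len(b)
    r = list(b)  # r_0 = b - A x_0 = b
    p = list(r)
    rs_old = sum(t * t for t in r)
    b_norm = norm2(b)
    history = [math.sqrt(rs_old) / b_norm]
    for _ in range(max_iter):
        if history[-1] <= tol:
            break
        ap = matvec(p)
        alpha = rs_old / sum(pk * apk for pk, apk in zip(p, ap))
        x = [xk + alpha * pk for xk, pk in zip(x, p)]
        r = [rk - alpha * apk for rk, apk in zip(r, ap)]
        rs_new = sum(t * t for t in r)
        history.append(math.sqrt(rs_new) / b_norm)
        p = [rk + (rs_new / rs_old) * pk for rk, pk in zip(r, p)]
        rs_old = rs_new
    return x, history


b = [1.0] * 20
x, history = cg_with_history(matvec_1d, b)
true_residual = [bk - ak for bk, ak in zip(b, matvec_1d(x))]
```

The list `history` is what one would pass to a semilogy plot, and `len(history) - 1` is the iteration count to compare across different grid sizes.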