Multiprocessors and Reconfigurable Hardware: practica on multiprocessors

  Practicum assignment

Preferably use Visual C++ (the Community edition is free).
Consider the following basic algorithms:

        element-wise sum of two arrays, convolution/filter on a 1-dimensional signal, reduction of an array (e.g. sum of all elements), dot product of two vectors (arrays), matrix multiplication
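
As a starting point, the sequential baselines could look like the sketch below. The function names, the use of std::vector<float>, the "valid" boundary handling in the convolution, and the row-major square matrices are all assumptions made for illustration; the assignment does not prescribe them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise sum of two arrays: c[i] = a[i] + b[i].
std::vector<float> add_arrays(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) c[i] = a[i] + b[i];
    return c;
}

// 1-D convolution, "valid" part only (assumed boundary handling):
// the output has signal.size() - kernel.size() + 1 samples.
std::vector<float> convolve_1d(const std::vector<float>& signal, const std::vector<float>& kernel) {
    assert(!kernel.empty() && signal.size() >= kernel.size());
    const std::size_t k = kernel.size();
    std::vector<float> out(signal.size() - k + 1, 0.0f);
    for (std::size_t i = 0; i < out.size(); ++i)
        for (std::size_t j = 0; j < k; ++j)
            out[i] += signal[i + j] * kernel[k - 1 - j];  // kernel is flipped (convolution, not correlation)
    return out;
}

// Reduction: sum of all elements, accumulated left to right.
float reduce_sum(const std::vector<float>& a) {
    float sum = 0.0f;
    for (float v : a) sum += v;
    return sum;
}

// Dot product of two vectors.
float dot(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
    return sum;
}

// C = A * B for row-major n x n matrices stored in flat arrays.
std::vector<float> matmul(const std::vector<float>& A, const std::vector<float>& B, std::size_t n) {
    assert(A.size() == n * n && B.size() == n * n);
    std::vector<float> C(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
    return C;
}
```

These versions are deliberately naive so that they can serve as the reference for both the correctness check and the speedup calculation.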

Implement and compare the performance based on the following approaches:

        sequential, auto-parallelized/auto-vectorized by the compiler, explicitly vectorized (SIMD), multithreaded, OpenCL (GPU), and OpenMP.
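
To give an idea of how small the step from a sequential loop to one of these approaches can be, here is a minimal sketch of the dot product with OpenMP. The function name dot_openmp is hypothetical. The code assumes OpenMP is enabled in the compiler (/openmp in Visual C++, -fopenmp in GCC and Clang); without that switch the pragma is ignored and the loop simply runs sequentially.

```cpp
#include <cassert>
#include <vector>

// Dot product with an OpenMP parallel reduction.
// Each thread accumulates a private partial sum; OpenMP combines them at the end.
float dot_openmp(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    // Signed loop index: the OpenMP 2.0 support in Visual C++ requires it.
    const int n = static_cast<int>(a.size());
    float sum = 0.0f;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Because the partial sums are combined in a different order than in the sequential loop, the result can differ in the last few bits for general floating-point input.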

Compare the outputs of all versions on the same random input data to verify that they compute the same result.
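
A possible way to set up this check is sketched below. The helper names (random_vector, nearly_equal, all_nearly_equal), the value range, and the default tolerance of 1e-4 are assumptions, not requirements. A fixed seed makes every version see the same input, and a relative tolerance is used because vectorized and parallel versions add floating-point numbers in a different order, so bit-exact equality cannot be expected.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Reproducible random input: the same seed always gives the same data.
std::vector<float> random_vector(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
    std::vector<float> v(n);
    for (float& x : v) x = dist(gen);
    return v;
}

// True if x and y agree within a tolerance that is relative for large
// values and absolute for values of magnitude below 1.
bool nearly_equal(float x, float y, float rel_tol = 1e-4f) {
    assert(rel_tol >= 0.0f);
    const float scale = std::max({1.0f, std::fabs(x), std::fabs(y)});
    return std::fabs(x - y) <= rel_tol * scale;
}

// Element-wise comparison of two result arrays.
bool all_nearly_equal(const std::vector<float>& a, const std::vector<float>& b,
                      float rel_tol = 1e-4f) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (!nearly_equal(a[i], b[i], rel_tol)) return false;
    return true;
}
```

For long reductions the accumulated rounding error grows with the array size, so the tolerance may need to be loosened for very large inputs.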
Measure the execution time and calculate the computational performance (operations per second), the memory bandwidth (bytes per second), and the number of cycles per basic instruction (query the clock frequency for this).
Compare these metrics across all versions.
Calculate the speedup of each version relative to the sequential version compiled with full optimization (-O2 flag; /O2 in Visual C++).
Also compare the fully optimized sequential version with sequential builds at lower optimization levels.
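
The measurements above could be derived as in the following sketch. The names (time_seconds, Metrics, compute_metrics, speedup) are hypothetical, and taking the best of several runs is just one common choice. The operation and byte counts have to be worked out by hand per algorithm: for example, a dot product of n floats performs 2n floating-point operations (one multiply and one add per element) and reads 2n * 4 bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <limits>

// Run the kernel several times and return the best wall-clock time in seconds.
template <class Kernel>
double time_seconds(Kernel&& kernel, int repeats) {
    assert(repeats > 0);
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < repeats; ++r) {
        const auto t0 = std::chrono::steady_clock::now();
        kernel();
        const auto t1 = std::chrono::steady_clock::now();
        best = std::min(best, std::chrono::duration<double>(t1 - t0).count());
    }
    return best;
}

struct Metrics {
    double ops_per_second;    // computational performance
    double bytes_per_second;  // memory bandwidth
    double cycles_per_op;     // cycles per basic operation
};

// ops and bytes are the totals for one kernel run; clock_hz is the CPU clock frequency.
Metrics compute_metrics(double seconds, double ops, double bytes, double clock_hz) {
    assert(seconds > 0.0 && ops > 0.0);
    return Metrics{ops / seconds, bytes / seconds, seconds * clock_hz / ops};
}

// Speedup of a variant relative to the reference (sequential, fully optimized) version.
double speedup(double t_reference, double t_variant) {
    assert(t_variant > 0.0);
    return t_reference / t_variant;
}
```

Make sure the compiler cannot discard the kernel's result (for example by printing or checking it afterwards), otherwise an optimized build may time an empty loop.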

Additional questions:

          How many vector registers does your CPU have?
          Try to measure the time (in cycles) for a scalar and a vector operation.
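
One possible approach to the second question, sketched under several assumptions: an x86 CPU with SSE, the __rdtsc intrinsic (from <intrin.h> in Visual C++, <x86intrin.h> in GCC and Clang), and hypothetical function names. Each loop is a dependent chain of additions, so it estimates the latency of one addition rather than the throughput. On most modern CPUs the time-stamp counter ticks at a fixed reference rate, which can differ from the actual core clock, so treat the numbers as an approximation.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>     // __rdtsc and SSE intrinsics on Visual C++
#else
#include <x86intrin.h>  // __rdtsc and SSE intrinsics on GCC/Clang (x86 only)
#endif

// Average time-stamp-counter ticks per scalar float addition.
double ticks_per_scalar_add(int iterations) {
    assert(iterations > 0);
    volatile float one = 1.0f;  // volatile read: the compiler cannot constant-fold the loop
    const float step = one;
    float x = 0.0f;
    const std::uint64_t start = __rdtsc();
    for (int i = 0; i < iterations; ++i) x += step;  // each add depends on the previous one
    const std::uint64_t stop = __rdtsc();
    volatile float sink = x;  // keep the result alive so the loop is not removed
    (void)sink;
    return static_cast<double>(stop - start) / iterations;
}

// Average time-stamp-counter ticks per 4-wide SSE float addition.
double ticks_per_vector_add(int iterations) {
    assert(iterations > 0);
    volatile float one = 1.0f;
    const __m128 step = _mm_set1_ps(one);
    __m128 x = _mm_setzero_ps();
    const std::uint64_t start = __rdtsc();
    for (int i = 0; i < iterations; ++i) x = _mm_add_ps(x, step);
    const std::uint64_t stop = __rdtsc();
    volatile float sink = _mm_cvtss_f32(x);  // read one lane to keep the result alive
    (void)sink;
    return static_cast<double>(stop - start) / iterations;
}
```

Use a large iteration count (millions) so that the cost of reading the counter is negligible, and compile with optimization enabled; in an unoptimized build the loop overhead dominates the measurement.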

