c++ - OpenMP recursive tasks
Consider the following program, which calculates Fibonacci numbers.
It uses OpenMP tasks for parallelisation.
    #include <iostream>
    #include <omp.h>
    using namespace std;

    int fib(int n)
    {
        if(n == 0 || n == 1)
            return n;

        int res, a, b;
        #pragma omp parallel
        {
            #pragma omp single
            {
                #pragma omp task shared(a)
                a = fib(n-1);
                #pragma omp task shared(b)
                b = fib(n-2);
                #pragma omp taskwait
                res = a+b;
            }
        }
        return res;
    }

    int main()
    {
        cout << fib(40);
    }
I use gcc version 4.8.2 on Fedora 20.
When I compile the above program with g++ -fopenmp name_of_program.cpp -Wall and run it, htop shows only two (sometimes three) threads running. The machine I'm running the program on has 8 logical CPUs. My question is: what do I need to do to offload the work onto all 8 threads? I tried export OMP_NESTED=TRUE, but that leads to the following error while running the program:
libgomp: Thread creation failed: Resource temporarily unavailable
The point of the program is not to compute Fibonacci numbers efficiently, but to use tasks or something similar in OpenMP.
With OMP_NESTED=FALSE, a team of threads is assigned to the top-level parallel region only, and no extra threads are created at each nested level, so at most two threads do useful work.
With OMP_NESTED=TRUE, a team of threads is assigned at each level. On your system there are 8 logical CPUs, so the team size is probably 8. The team includes the one thread that entered the region from outside, so only 7 new threads are launched. The recursion tree for fib(n) has roughly fib(n) nodes. (A nice self-referential property of fib!) So the code might create on the order of 7*fib(n) threads, which can quickly exhaust resources.
The fix is to use a single parallel region around the entire task tree. Move the omp parallel and omp single logic into main, outside of fib. That way a single thread team works on the entire task tree.
The general point is to distinguish potential parallelism from actual parallelism. The task directives specify potential parallelism, which might or might not be used during an execution. An omp parallel (for all practical purposes) specifies actual parallelism. Usually you want the actual parallelism to match the available hardware, so as not to swamp the machine, but you want the potential parallelism to be much larger, so that the run-time can balance the load.