Channel: ARM Connected Community: Message List

Re: About the Global Task Scheduling of big.LITTLE MP: when we use a multiprocessing technology such as OpenMP on a big.LITTLE platform, which CPU will process the task?


I agree with Pete above - the OS will move OpenMP threads to big or LITTLE cores based on the dynamically measured performance and load history of each thread. A few other points on OpenMP that might be helpful:

 

1.  OpenMP assumes that all processing elements (CPUs) are of the same type and capability. Ideally, the programmer isn't supposed to care where a work item (a loop iteration, etc.) ends up.

2.  OpenMP implementations are split into a run-time component and a library component. An application links with the library and typically uses environment variables to influence the run-time in limited ways.

3.  Typically, the run-time component (under direction from the library) spawns one worker thread per CPU. The number of these threads can be modified using environment variables (OMP_NUM_THREADS, for example). The library and application combination will farm out jobs to these worker threads as needed. The key thing to bear in mind is that, by default, the affinity of these worker threads is not set - they have no preference assigned for big or LITTLE cores. So on an N-CPU system, N worker threads will be spawned, but where they run is up to the scheduler. The OS scheduler is expected to 'do the right thing' and schedule each thread to the appropriate core. In this default case, the worker threads would end up on suitable processors on a big.LITTLE system, as directed by our scheduler modifications, depending on how the load of these threads varies.

    It is possible to use OpenMP directives to pin worker threads to specific CPUs. If this is done, the big.LITTLE MP extensions will respect this affinity expression and the threads will not be migrated, irrespective of the load they accumulate. So if the threads are pinned to the wrong size of core, the result will be suboptimal. It is probably better to let the OS make the decision unless some threads are known to have low CPU performance needs, or are known to be memory bound, for example.
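To see the difference between the default case and the pinned case on your own board, something like the sketch below may help. It is only an illustration, not how the big.LITTLE MP scheduler or any particular OpenMP run-time works internally. It assumes Linux with glibc (`sched_getcpu`, `sched_getaffinity` and `sched_setaffinity` are Linux-specific calls), and the helper names `first_allowed_cpu`, `pin_current_thread` and `report_worker_placement` are made up for this example. It pins from inside the parallel region through the OS rather than through OpenMP's own binding controls.

```cpp
// Linux-only sketch: glibc exposes the affinity calls under _GNU_SOURCE.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <cstdio>

#ifdef _OPENMP
#include <omp.h>
#else
// Fallback so the sketch still builds without OpenMP enabled (one thread).
static inline int omp_get_thread_num() { return 0; }
#endif

// Hypothetical helper: lowest-numbered CPU the calling thread may run on,
// or -1 if the affinity mask cannot be read.
int first_allowed_cpu()
{
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

// Hypothetical helper: restrict the calling thread to a single CPU.
// Returns 0 on success and -1 on failure.
int pin_current_thread(int cpu)
{
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);  // pid 0 = calling thread
}

// Print the CPU each OpenMP worker is running on at this instant.
// pin_cpu < 0:  leave placement to the OS scheduler (the default case).
// pin_cpu >= 0: pin every worker to that CPU first, for illustration only.
void report_worker_placement(int pin_cpu)
{
    #pragma omp parallel
    {
        if (pin_cpu >= 0 && pin_current_thread(pin_cpu) != 0)
            std::perror("sched_setaffinity");
        std::printf("worker %d on CPU %d\n", omp_get_thread_num(), sched_getcpu());
    }
}
```

Call `report_worker_placement(-1)` from `main` and build with OpenMP enabled (`g++ -fopenmp` with GCC). In the default case the reported CPUs can differ from run to run, because the scheduler is free to migrate the workers. Passing a CPU number keeps every worker on that CPU whatever its load. Which CPU numbers are big and which are LITTLE depends on the platform, so the sketch does not assume any mapping. For real use, the standard `OMP_PROC_BIND` and `OMP_PLACES` environment variables let you express binding without code changes.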

 

Hope this helps - this is an interesting question.

