The Intel Xeon Phi Knights Landing manycore processor comes with interesting new features: on-chip high-bandwidth memory and several user-selectable NUMA configurations. In this paper, we examine how these features affect applications that target the Open Community Runtime (OCR), an asynchronous task-based runtime system for future parallel architectures.
We have extended our OCR implementation to make it NUMA-aware and to allow it to use the high-bandwidth memory. We have conducted a range of experiments comparing OpenMP, TBB, our OCR implementation, and the reference OCR implementation on different machine configurations, using a memory-intensive seismic simulation.