The distribution of workload among available computational units is an essential problem for every parallel system. It has been studied thoroughly from many perspectives, such as thread scheduling in operating systems, task scheduling in frameworks for parallel computations, or constrained scheduling in real-time systems.
However, each system has unique properties and requirements, so no universal scheduler can accommodate all of them. In this paper, we propose methods for task scheduling in parallel frameworks that process highly heterogeneous tasks consisting of both computational and CPU-blocking operations.
Furthermore, we extend the idea of heterogeneous tasks to design a combined scheduler for multi-core CPUs and many-core GPUs. Our methods significantly increase the utilization of hardware resources, which improves both speedup and throughput.