References

[1]

CERNLIB - CERN Program Library. https://cernlib.web.cern.ch/cernlib.

[2]

NVIDIA CUDA Toolkit. http://developer.nvidia.com/cuda-toolkit.

[3]

Intel Threading Building Blocks. http://www.threadingbuildingblocks.org/.

[4]

The OpenMP API specification for parallel programming. http://www.openmp.org/.

[5]

Structure of arrays, or SoA, is a layout that separates the elements of a structure into one parallel array per field. https://en.wikipedia.org/wiki/AOS_and_SOA.

[6]

Sequential back-end defined in Thrust.

[8]

Muriel Pivk and Francois R. Le Diberder. SPlot: A Statistical tool to unfold data distributions. Nucl. Instrum. Meth., A555:356–369, 2005. arXiv:physics/0402083, doi:10.1016/j.nima.2005.08.106.

[9]

A.C. Genz and A.A. Malik. Remarks on algorithm 006: An adaptive algorithm for numerical integration over an N-dimensional rectangular region. Journal of Computational and Applied Mathematics, 6(4):295–302, 1980. URL: http://www.sciencedirect.com/science/article/pii/0771050X8090039X, doi:10.1016/0771-050X(80)90039-X.

[10]

Jarle Berntsen, Terje O. Espelid, and Alan Genz. An adaptive algorithm for the approximate calculation of multiple integrals. ACM Trans. Math. Softw., 17(4):437–451, Dec 1991. URL: http://doi.acm.org/10.1145/210232.210233, doi:10.1145/210232.210233.