This commit does some preparation for a new Vulkan compute shader
"blue_noise_filter.glsl". Note that every call of "filter" is followed
by a call to "min_max". The goal is to combine a single invocation of
Vulkan "filter" and log(n) invocations of Vulkan "min_max" in the same
command buffer, which may help with performance. This will be achieved
by passing the "max_in_buf" to the new "filter" compute shader, which
will hold the results of applying the precomputed Gaussian. This buffer
will then be copied to "min_in_buf", after which everything is set up to
call the Vulkan "min_max" compute shader log(n) times.
Note that log(n) comes from the fact that the Vulkan "min_max" compute
shader does a "reduce" on the input buffers where each SIMD invocation
compares two values and reduces them to one. Doing this approximately
log(n) (log base 2) times gradually reduces the input to a single minimum
and a single maximum. This works because the same "min_max" shader has
two separate "layouts" in which the "in" and "out" buffers are swapped,
so alternating between the layouts on each call ensures that the proper
buffers are reduced. (This work has already been done. What's
left is to combine the "filter" and "min_max" Vulkan compute shaders
into the same Vulkan command buffer. But first, the actual setup for the
new Vulkan "filter" compute shader still needs some work.)
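
The reduction described above can be modelled on the CPU. The sketch
below is not the shader code; it only illustrates why roughly log2(n)
passes are needed and how swapping the "in" and "out" buffers after each
pass mirrors the two layouts. The names MinMax and reduce_min_max are
hypothetical, and float is an assumed element type.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical result type for this illustration only.
struct MinMax {
    float min;
    float max;
    std::size_t passes;  // number of reduction passes that were run
};

// CPU model of the pairwise reduction. Precondition: values is non-empty.
MinMax reduce_min_max(const std::vector<float>& values) {
    // Both "in" buffers start as copies of the filter output.
    std::vector<float> min_in = values, max_in = values;
    std::vector<float> min_out, max_out;
    std::size_t n = values.size();
    std::size_t passes = 0;

    while (n > 1) {
        const std::size_t half = (n + 1) / 2;
        min_out.resize(half);
        max_out.resize(half);
        // Each loop iteration stands in for one shader invocation that
        // compares two values and writes one.
        for (std::size_t i = 0; i < half; ++i) {
            const std::size_t a = 2 * i;
            const std::size_t b = 2 * i + 1;
            if (b < n) {
                min_out[i] = std::min(min_in[a], min_in[b]);
                max_out[i] = std::max(max_in[a], max_in[b]);
            } else {
                // Odd element count: the last value is carried over.
                min_out[i] = min_in[a];
                max_out[i] = max_in[a];
            }
        }
        // Swapping plays the role of switching to the other layout.
        std::swap(min_in, min_out);
        std::swap(max_in, max_out);
        n = half;
        ++passes;
    }
    return {min_in[0], max_in[0], passes};
}
```

With 8 input values this model runs 3 passes (8 -> 4 -> 2 -> 1), which
matches log2(8).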
This commit combines the minmax execution via Vulkan compute. The
previous implementation executed compute in vulkan_minmax with a new
command buffer each time. This implementation combines all required
executions of compute in vulkan_minmax in a single command buffer and
uses a pipeline to ensure the enqueued compute calls stay in order.
When only a single item in the pbp buffer changes, update only that item
in the staging buffer and the device buffer before executing Vulkan
compute on the buffers.
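
A rough host-side sketch of the single-item update, using plain vectors
in place of the real staging and device buffers. CopyRegion,
single_item_region and update_single_item are hypothetical names, and
int is an assumed element type. In actual Vulkan code the same offsets
and size would presumably go into a VkBufferCopy region passed to
vkCmdCopyBuffer instead of the memcpy used here.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a buffer copy region (byte offsets and size).
struct CopyRegion {
    std::size_t src_offset;
    std::size_t dst_offset;
    std::size_t size;
};

// Region covering exactly one item at the given index.
CopyRegion single_item_region(std::size_t index, std::size_t item_size) {
    return {index * item_size, index * item_size, item_size};
}

// Writes one item to the staging copy, then copies only that item to the
// "device" copy instead of re-uploading the whole buffer.
// Precondition: index is in range for both vectors.
void update_single_item(std::vector<int>& staging,
                        std::vector<int>& device,
                        std::size_t index, int value) {
    staging[index] = value;
    const CopyRegion r = single_item_region(index, sizeof(int));
    std::memcpy(reinterpret_cast<char*>(device.data()) + r.dst_offset,
                reinterpret_cast<const char*>(staging.data()) + r.src_offset,
                r.size);
}
```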
Order of backends to use:
OpenCL -> Vulkan -> CPU threads
Unless I figure out a way to make Vulkan faster, OpenCL will be the
default backend used, or at least it will have higher priority than
Vulkan if both OpenCL and Vulkan are available.
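
The priority order above amounts to a simple fall-through. A minimal
sketch, assuming availability is already known as two booleans; the
Backend enum and choose_backend are hypothetical names, not the
project's actual API:

```cpp
// Hypothetical backend identifiers for this illustration.
enum class Backend { OpenCL, Vulkan, CPUThreads };

// Picks the first available backend in the order
// OpenCL -> Vulkan -> CPU threads.
Backend choose_backend(bool opencl_available, bool vulkan_available) {
    if (opencl_available) {
        return Backend::OpenCL;
    }
    if (vulkan_available) {
        return Backend::Vulkan;
    }
    return Backend::CPUThreads;  // always-available fallback
}
```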
It was assumed in the previous commit's message that the shader data was
stored on the stack. In actuality, std::vector<char> uses dynamically
allocated memory, which means the data is on the heap, not the stack.
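
A small sketch of the distinction, assuming the default allocator: the
std::vector<char> object itself (its bookkeeping members) lives wherever
it is declared, while the bytes returned by data() live in a separate
allocation. storage_is_outside_object is a hypothetical helper written
only for this illustration.

```cpp
#include <cstdint>
#include <vector>

// Returns true when the element storage does not lie inside the bytes of
// the vector object itself, i.e. it was allocated separately.
bool storage_is_outside_object(const std::vector<char>& v) {
    const auto obj_begin = reinterpret_cast<std::uintptr_t>(&v);
    const auto obj_end = obj_begin + sizeof(v);
    const auto data = reinterpret_cast<std::uintptr_t>(v.data());
    return data < obj_begin || data >= obj_end;
}
```

For a local non-empty vector this returns true: only the small vector
object is on the stack, and its size does not grow with the shader data.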