When max_swapchain_images was originally added it worked properly,
but was subsequently broken by using the value to specify the number
of buffers allocated.
Due to how the dispmanx driver works, only 2 buffers are ever actually
used, so the 3rd buffer in the "swapchain" ended up doing nothing.
Fix this by restoring it to the original intent, that is, if
max_swapchain_images <= 2 wait for vsync after the flip (reducing
lag), otherwise wait at the last possible moment (increasing lag).
Additionally, fix up some unnecessary void* usage where type safety
could be maintained.