BUG/MEDIUM: queues: Fix arithmetic when feeling non_empty_tgids
Fix the arithmetic when pre-filling non_empty_tgids when we still have
more than 32/64 thread groups left, to get the right index, we of course
have to divide the number of thread groups by the number of bits in a
long.
This bug was introduced by commit
7e1fed4b7a8b862bf7722117f002ee91a836beb5, but hopefully was not hit
because it requires to have at least as much thread groups as there are
bits in a long, which is impossible on 64bits machines, as MAX_TGROUPS
is still 32.