amdkfd: use sizeof(long) granularity for the pasid bitmask
All the bit operations (such as find_first_zero_bit()) read sizeof(long) bytes
at a time. If we allocated less than sizeof(long) bytes for the bitmask we
would be accessing invalid memory when working with the bitmask.
Change the allocator to allocate sizeof(long) multiples for the bitmask.