The participants will be contacted through a mailing list and the participation rate is relatively low, so I'm afraid that using a token might not give a good balance of answers for each question.
Agreed. If you send it out to a big group with a low response rate, the idea of pre-defining the assignment is not a good option. I only suggested it because I assumed you had a small sample of people you know well (like an internal team, etc.).
In your case, I don't think this makes sense.
I think the only thing you could really do is to actively observe the number of people coming in and adapt accordingly.
E.g. you could create a random number between 1 and 50, so that 1-10 is question 1, 11-20 is question 2, etc.
This way, when you see that one question has collected a lot of answers and another one only a few, you could adapt the relevance equations that show the respective questions based on the random number.
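To make the range idea concrete, here is a minimal sketch in plain Python (not LimeSurvey expression syntax). The names BOUNDS, question_for and assign are made up for illustration, and I am assuming five questions with equal slices of 1-50 to start with. In the survey itself, each range would become the relevance equation of the respective question.

```python
import random

# Hypothetical starting ranges: (inclusive upper bound, question).
# 1-10 -> Q1, 11-20 -> Q2, ... 41-50 -> Q5.
BOUNDS = [(10, "Q1"), (20, "Q2"), (30, "Q3"), (40, "Q4"), (50, "Q5")]


def question_for(number, bounds=BOUNDS):
    """Return the question whose range contains the given random number."""
    for upper, question in bounds:
        if number <= upper:
            return question
    raise ValueError(f"{number} is above the highest bound")


def assign(rng=random):
    """Draw a number between 1 and 50 (inclusive) and map it to a question."""
    return question_for(rng.randint(1, 50))


# Manual adjustment: Q1 is over-represented, so shrink its range to 1-5
# and let Q2 take 6-20 instead of 11-20.
ADJUSTED = [(5, "Q1"), (20, "Q2"), (30, "Q3"), (40, "Q4"), (50, "Q5")]
```

With the adjusted bounds, a participant who draws 8 lands on Q2 instead of Q1, which is the effect you would get by editing the relevance equations by hand.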
Another thought: depending on the version you use, you could check the plugin from Denis that allows you to access statistics within the survey (in case of LS3). In LS4/5 a similar approach seems to be already implemented in LS without a plugin. But I am not familiar with it, because I haven't used it myself yet and I have seen only a few examples of it in the forum so far.
With this, it should be possible to create a "least filled/lowest bucket" system that spills over to other buckets when one is full.
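The logic of such a bucket system could look roughly like the sketch below. It is plain Python, not the plugin's or LimeSurvey's API; pick_least_filled, counts and quotas are hypothetical names, and I am assuming you can read the current number of answers per question from the statistics.

```python
import random


def pick_least_filled(counts, quotas, rng=random):
    """Pick the question with the fewest answers that is still below its quota.

    counts: answers collected so far per question (assumed to come from
            the in-survey statistics).
    quotas: maximum number of answers wanted per question.
    Returns None when every bucket is full.
    """
    # Full buckets drop out, so new participants spill over to the others.
    open_buckets = {q: n for q, n in counts.items() if n < quotas[q]}
    if not open_buckets:
        return None
    lowest = min(open_buckets.values())
    # Break ties at random so no question is systematically preferred.
    candidates = [q for q, n in open_buckets.items() if n == lowest]
    return rng.choice(candidates)
```

For example, with counts of 3, 1 and 2 for Q1 to Q3 and a quota of 5 each, the next participant would get Q2. Once Q2 reaches its quota, it is skipped and the lowest of the remaining questions is used.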
But this could be quite a lot of work, and if this is just for one project, the manual approach might be the one that does the trick.