In the sequential version of lil-gp all the subpopulations reside on the same CPU and thus share the same random number generator. Lil-gp provides several ways to initialize the random number generator. Parallel lil-gp uses only one of them in order to ensure correction operations.
Why does the generation of random number become an issue? Well in this case the problem mainly occurs at the moment that the subpopulations are being generated at random. If all the CPU's used in a parallel experiment were to use the same seed this would result in the same individuals being generated on all the CPU's (provided that each CPU processes the same number of subpopulations). This seriously limit the performance of the genetic programming loop as well as that of the parallel program since it would compute the same things several times in parallel.
To circumvent this problem each CPU is provided a different seed
at the start of the experiment. This is done in very simple way at the
moment by incrementing the seed each time it has been send. People who
suspect this as the cause of bad results should have a look at the send_parameter
function. Please let me know if any scientifically justified methods exist
for that matter I will be happy to integrate it into parallel lil-gp.
Previously the distribution of subpopulations on CPU's
and the need for exchanges between different CPU's has been introduced.
One can identify 4 kinds of exchanges depending on the source and destination
subpopulations only. There are 2 additional variations when if one considers
the element being exchanged, an entire individual or a tree.
|Destination: local||Destination: not local|
|Origin: not local||
These four kind of exchanges will be described very briefly.
Above the four kinds of exchanges have been described shortly. What was not mentioned there is that one can adopt 2 different strategies when it comes to importing trees or individuals.
Indeed, no one can predict the evolution of the individuals of a population. In some experments the individuals tend to become bigger hence requiring more computation time when being evaluated. Many other reasons that can lead to an unequal balance of the computational work load exist (not necessarily GP related ones). A direct result of this is that there is no guarantee that the different nodes in de PVM are processing generations at the same pace (give identical hardware that is).
Since the different subpopulations can evolve at different speed it is possible that one node has to wait for a tree/individual that should be send by another node. Or on the other hand on could decide to discard this exchange and move to the next one. This has the advantage of avoiding CPU idle time but makes it impossible to reproduce exactly some runs. The parallel kernel makes it possible to chose between the 2 import strategies: synchronous and asynchronous imports.
In order to make it possible to offer both operation modes the parallel kernel handles exchanges differently than the original kernel. In synchronous mode it is necessary for the kernel to execute the exchange exports first. If this is not done deadlocks are likely to occur if one wants any exchange topology to be possible. In a second step the kernel will receive any trees and individuals that might have been send to it. Only after these two steps the exchanges will be done in a way similar to the one used by the original kernel.
The differences between synchronous and asynchronous mode are represented in the table underneath.
|Export||Examine the exchanges and send to other populations||Idem|
|Import||Wait until all the trees/individuals specified in the exchanges have been received. Also check the generation of elements that are being received.||Look if something arrived, if so we import it otherwise we end the import step. No generation check is done. Elements do not need to be from any precise generation.|
|Local and imports(bis)||Execute the local exchanges or insert the received individuals/trees in the local populations.||If individuals/trees were not received we skip this exchange.|
Asynchronous operations should be faster but I honestly did not notice are real difference yet. The price to pay is the your experiments are depending on factors you can not control (load of the network and cpu load). Synchronous operations the other hand can be reproduced but are generally slower (maybe not from the genetic point of view of course). But you now can repeat your experminents.