The current Groovy implementation always boxes and unboxes values, regardless of the types of primitive parameters. This is why arithmetic operations are slow. I've come up with some experiments to optimise primitives in Groovy.
The idea is to generate a primitive call site, or primitive calls, when:
1. The typing is obvious. For example,
int a = 5
int b = a * (a + 5)
In the above example, the second statement will have one box/unbox call and one primitive call, because of the type of (a + 5).
The sub-expression (a + 5) can be generated as Object call((int)a, (int)5), because the types of both a and the literal 5 are obvious.
But we won't know the type of the result, so a * (...) will be generated using Object call((int)a, Object) rather than Object call(int, int).
One may argue that this is not worthwhile, since for a very long and complex expression the optimisation will be of little use. For example,
private double a(int i, int j) {
return 1 / ((i+j)*(i+j+1)/2.0D +i+1)
}
The above optimisation can only do its job for the two innermost expressions. But you may be surprised that it's actually ~23% faster! Thus IMHO, it's worth doing.
2. The second part tries to optimise that long expression. The entire expression can be replaced with all-primitive calls if and only if the meta-classes and meta-methods involved are "DEFAULT". I hope I can detect this via a property of CallSite, then generate both (1) a fast path and (2) the original path. This second idea has not been fully tested yet, so wait and see what can be done.
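To make the first idea concrete, here is a rough Java sketch of what the generated calls for int b = a * (a + 5) might look like. The class and method names (PrimitiveCallSketch, plus, multiply) are hypothetical stand-ins I made up for illustration, not Groovy's real call site API. The point is only that (a + 5) can take an (int, int) signature, while the outer multiplication has to accept an Object on the right because the inner result type is unknown.

```java
// Hypothetical stand-ins for generated call-site overloads; not Groovy's actual API.
final class PrimitiveCallSketch {

    // Both operand types are statically obvious, so the arguments are not boxed.
    // The result is still returned as Object because its type is not known statically.
    static Object plus(int left, int right) {
        return left + right; // boxed once, on return
    }

    // The left operand is a known int; the right one is the untyped result of a previous call.
    static Object multiply(int left, Object right) {
        if (right instanceof Integer) {
            int r = (Integer) right; // one unbox of the inner result
            return left * r;
        }
        // Assumed fallback for other numeric results: compute in double.
        return left * ((Number) right).doubleValue();
    }

    // Corresponds to: int b = a * (a + 5)
    static int compute(int a) {
        Object inner = plus(a, 5);           // primitive call: call(int, int)
        return (Integer) multiply(a, inner); // mixed call: call(int, Object)
    }
}
```

With a = 5, compute returns 50, the same value as the Groovy snippet above. Only the inner result is boxed and unboxed, instead of every operand.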
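The fast path / original path split could look roughly like the Java sketch below. This is only an assumption about the shape of the generated code. The defaultSemantics flag is a hypothetical placeholder for whatever CallSite property would say that the meta-classes and meta-methods are still "DEFAULT", and invoke is a simplified stand-in for the boxed dispatch, not the real Groovy runtime.

```java
// A sketch of guarded code generation; all names here are hypothetical.
final class GuardedExpressionSketch {

    // Placeholder for "meta-class and meta-methods are DEFAULT".
    static volatile boolean defaultSemantics = true;

    // Fast path: the whole expression compiled to primitive arithmetic.
    static double fastPath(int i, int j) {
        return 1 / ((i + j) * (i + j + 1) / 2.0D + i + 1);
    }

    // Simplified stand-in for a boxed, dynamically dispatched binary operation.
    static Object invoke(char op, Object l, Object r) {
        if (op != '/' && l instanceof Integer && r instanceof Integer) {
            int x = (Integer) l;
            int y = (Integer) r;
            return op == '+' ? x + y : x * y; // int result, boxed as Integer
        }
        double x = ((Number) l).doubleValue();
        double y = ((Number) r).doubleValue();
        switch (op) {
            case '+': return x + y;
            case '*': return x * y;
            default:  return x / y;
        }
    }

    // Original path: every operation goes through boxed dispatch.
    static double originalPath(int i, int j) {
        Object sum = invoke('+', i, j);
        Object sumPlusOne = invoke('+', invoke('+', i, j), 1);
        Object product = invoke('*', sum, sumPlusOne);
        Object half = invoke('/', product, 2.0D);
        Object denominator = invoke('+', invoke('+', half, i), 1);
        return (Double) invoke('/', 1, denominator);
    }

    // Generated method body: check the guard once, then pick a path.
    static double a(int i, int j) {
        return defaultSemantics ? fastPath(i, j) : originalPath(i, j);
    }
}
```

Both paths should give the same answer as long as nobody has changed the arithmetic meta-methods. The guard exists so that the original path still runs when somebody has.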