It really does seem like there is some-kind of sweet spot in which the O(n log m) version of the FFT algorithm is faster than the O(n log n). Perhaps this make sense…if m is very small in comparison to n, the sheer number of Fourier Transforms and their associated overhead outweighs the asymptotic benefit, but if m is large compared to n, O(n log m) is roughly the same as O(n log n) and the increased complexity means the former takes longer. In any case, it’s back to basics and I’m currently implementing the Fourier Transform myself for two reasons: firstly so that I can verify the above results aren’t due to FFTW’s “plans” and secondly as it should providing a starting point for implementing the NTT myself, which I think may be necessary in order to fully understand what’s going on. Some more research on the NTT has yielded some potential bounds on the input in order to prevent any overflow, but the more I read the less clear the workings of the library become. Understanding the NTT has also required a bit of a “back to basics” approach…much of this morning was spent trying to get some basic number theory concepts clear in my head, which should hopefully help with the NTT as a whole.
So at the end of the first week, I feel it’s been a bit up and down. At times you seem to be getting a lot done but there are also times when it’s hard going. I guess that’s the nature of anything like this.
Plan for next week: get some definitive results on the algorithms for exact pattern matching with don’t cares.