Not so fast

Progress made in all of the intended areas today. Managed to get something working with NTTs using the library, though there’s a question over false positives. I do seem to be getting some of those now, but that may be an implementation error. Something to look into tomorrow I think.

I’ve been putting together a test harness today as well in order to run the algorithms and benchmark them. Early results are a little unexpected. So far it seems that the theoretically faster O(n log m) FFT-based algorithm is generally slower than the O(n log n) version (though it does have some advantages — the smaller memory footprint means I’ve been able to test a few larger inputs). However, there do seem to be *some* input sizes that reverse the trend. Perhaps there’s a sweet spot of pattern size in relation to the text size. More work is needed, but I think the problem may be down to the sheer number of Fourier transforms being performed. Perhaps some tweaking with FFTW is required. Annoyingly I thought I’d realised my mistake…I was re-computing the Fourier Transforms of the pattern for each substring of the text (as I was using the original algorithm as a black box)…but after fixing there was only a fairly modest improvement.

Leave a Reply

Your email address will not be published. Required fields are marked *


You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">