Not so fast |

Progress made in all of the intended areas today. Managed to get something working with NTTs using the library, though there’s a question over false positives. I do seem to be getting some of those now, but that may be an implementation error. Something to look into tomorrow I think.

I’ve been putting together a test harness today as well in order to run the algorithms and benchmark them. Early results are a little unexpected. So far it seems that the theoretically faster O(n log m) FFT-based algorithm is generally slower than the O(n log n) version (though it does have some advantages — the smaller memory footprint means I’ve been able to test a few larger inputs). However, there do seem to be *some* input sizes that reverse the trend. Perhaps there’s a sweet spot of pattern size in relation to the text size. More work is needed, but I think the problem may be down to the sheer number of Fourier transforms being performed. Perhaps some tweaking with FFTW is required. Annoyingly I thought I’d realised my mistake…I was re-computing the Fourier Transforms of the pattern for each substring of the text (as I was using the original algorithm as a black box)…but after fixing there was only a fairly modest improvement.

Ben Smithers

Rambling thoughts on Programming / Bioinformatics / Personal Life

Not so fast

Leave a Reply Cancel reply