Swish-e 2.4.3


Updated Swish-e to version 2.4.3 in pkgsrc.

Notable changes:

  • “Fixed” libxml2’s change in UTF8Toisolat1() return value

    Bernhard Weisshuhn supplied a patch to parser.c that checks the return value of UTF8Toisolat1(). It seems that libxml2 now returns the number of characters converted instead of zero for success. See http://bugzilla.gnome.org/show_bug.cgi?id=153937

  • Added swish-config and pkg-config

    Swish now provides a swish-config script and config file for the pkg-config utility. These tools help when building programs that link with the swish-e library.

  • Added SwishFuzzy function

    SwishFuzzy function (SWISH::API::Fuzzy) lets you stem a word without first searching. This might be helpful for playing with queries prior to the search.

  • Fixed Buzzwords (and other word lists entered in the config)

    Words entered in config were not converted to lower case before storing in the index.

  • Fixed metaname mapping problem in Merge

    Peter Karman found an error when merging indexes where the source indexes had the same metanames, but listed in a different order in their config files. Words would then be indexed under the wrong metaID number in the output index.

  • Added -R option to support IDF word weighting in ranking. (karman)

    Added an Inverse Document Frequency calculation to the getrank() routine. This allows the frequency of a word relative to the other words in the query to affect the ranking of documents.

  • swish.cgi now kills swish-e on timeout

    The example script swish.cgi uses an alarm (on platforms that support alarm) to abort processing after some number of seconds, but it was not killing the child process, swish-e. Bill Schell submitted a patch to kill the child when the alarm triggers.

  • The template search.tt was renamed to swish.tt

    The template was renamed because it’s used by swish.cgi, not by search.cgi, which was confusing.

  • Updates to search.cgi

    The example script search.cgi was updated to work better with mod_perl and to use external template files and style sheets.

  • New MS Word Filter

    James Job provided the SWISH::Filter::Doc2html filter that uses the wvWare (http://wvware.sourceforge.net/) program for filtering MS Word documents. If both catdoc and wvWare are installed, then wvWare will be used.

  • Change in way symbolic links are followed

    John-Marc Chandonia pointed out that if a symlink is skipped by FileRules, then the actual file/directory is marked as “already seen” and cannot be indexed by other links or directly. Now, files and directories are not marked “already seen” until after passing FileRules (i.e., after a file is actually indexed or a directory is processed).

  • UseStemming didn’t take no for an answer

    UseStemming was coded as an alias for FuzzyIndexingMode when Snowball was compiled in (the default), but “no” doesn’t always mean no when the Norwegian stemmer is available.

  • Updated index_hypermail.pl

    Updated to work with the latest version of hypermail (pre-2.1.9).

  • Fixed segfault when generating warnings while parsing

    parser.c was calling warning() incorrectly, and -Wall was not catching this!
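
To make the Merge fix above a bit more concrete, here is a minimal Python sketch of remapping metaIDs by metaname rather than by position. It is not Swish-e's code: the function build_id_remap, the sample metanames, and the scheme of numbering metaIDs from 1 in config order are all assumptions made for illustration.

```python
def build_id_remap(source_metanames, output_ids):
    """Return {source metaID: output metaID}, matching on metaname.

    source_metanames: metanames in the order the source index lists them
                      (hypothetical: metaID = position, starting at 1).
    output_ids:       shared dict of metaname -> metaID in the merged index;
                      new names are appended as they are first seen.
    """
    remap = {}
    for source_id, name in enumerate(source_metanames, start=1):
        if name not in output_ids:
            output_ids[name] = len(output_ids) + 1
        remap[source_id] = output_ids[name]
    return remap


# Two hypothetical source indexes with the same metanames in different order.
index_a = ["title", "author", "body"]
index_b = ["body", "title", "author"]

output_ids = {}
remap_a = build_id_remap(index_a, output_ids)
remap_b = build_id_remap(index_b, output_ids)
```

Copying index_b's IDs unchanged would file its "body" words (source metaID 1) under "title" in the merged index, which is the kind of mix-up the changelog entry describes. Looking each name up in the shared table avoids that.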

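For readers who have not met IDF weighting, which the -R option above enables, here is a rough Python sketch of the general idea. It uses the textbook formula log(N / df), which is an assumption on my part. The actual calculation in getrank() may well differ, and the names idf and score, like the sample numbers, are made up.

```python
import math


def idf(total_docs, docs_with_word):
    """Textbook inverse document frequency: rarer words get a larger weight."""
    return math.log(total_docs / docs_with_word)


def score(term_counts, doc_freqs, total_docs):
    """Sum of (occurrences in this document) * IDF over the query words."""
    return sum(
        count * idf(total_docs, doc_freqs[word])
        for word, count in term_counts.items()
    )


# Hypothetical collection: 1000 documents, "the" is common, "stemmer" is rare.
total_docs = 1000
doc_freqs = {"the": 900, "stemmer": 10}

# Occurrences of each query word in two hypothetical documents.
doc_x = {"the": 5, "stemmer": 1}
doc_y = {"the": 1, "stemmer": 3}

score_x = score(doc_x, doc_freqs, total_docs)
score_y = score(doc_y, doc_freqs, total_docs)
```

Both documents contain six query-word occurrences, but doc_y scores higher because its matches are mostly on the rare word.
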
The complete list of revisions is at http://swish-e.org/docs/changes.html.
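
The swish.cgi timeout fix mentioned above comes down to one rule: when the timer fires, kill the child as well as abandoning the request. swish.cgi itself is Perl and uses alarm; the sketch below is Python and uses the timeout support in the standard subprocess module instead, so read it as an illustration of the idea, not a port of Bill Schell's patch. The function name run_with_timeout and the sample commands are hypothetical.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd; if it exceeds the time limit, kill it.

    Returns (timed_out, stdout_bytes).
    """
    child = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        out, _ = child.communicate(timeout=seconds)
        return False, out
    except subprocess.TimeoutExpired:
        # Without this kill the child would keep running after we give up,
        # which is the behaviour the changelog entry says was fixed.
        child.kill()
        child.communicate()  # reap the killed child so no zombie is left
        return True, b""


# Stand-ins for a slow and a fast search process.
slow_cmd = [sys.executable, "-c", "import time; time.sleep(30)"]
fast_cmd = [sys.executable, "-c", "print('ok')"]

slow_timed_out, _ = run_with_timeout(slow_cmd, 0.5)
fast_timed_out, fast_out = run_with_timeout(fast_cmd, 30)
```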