After switching to Xapian-based search in the library/catalog, an empty
query stopped acting as a match-all query. This commit restores the old
behaviour in that regard.
Returning status code 204 when a search yields no results prevents the
empty results page from being shown, as described in #466. Reverting the
changes in #396 fixes the issue.
The existing example.zim has an improper main page entry and hence an
unusable index, as reported in openzim/libzim#521.
To replace the buggy zim, a new zim has been generated using the latest
zimwriterfs with the latest libzim.
-------------------------------------------------------------------------
The directory used to create the zim is shown below; it contains two
pages from the Wikibooks site:
htmlContent
├── favicon.png
├── FreedomBox for Communities_Offline Wikipedia - Wikibooks, open books for an open world_files
│ ├── index.php
│ ├── load.php
│ ├── poweredby_mediawiki_88x31.png
│ └── wikimedia-button.png
├── FreedomBox for Communities_Offline Wikipedia - Wikibooks, open books for an open world.html
├── Wikibooks_files
│ ├── 234px-Megakaryocyte1.svg.png
│ ├── 287px-ChewyGingerCookies.jpg
│ ├── 36px-Commons-logo.svg.png
│ ├── 40px-Wikiquote-logo.svg.png
│ ├── 41px-Wikispecies-logo.svg.png
│ ├── 46px-Wikisource-logo.svg.png
│ ├── 48px-MediaWiki-2020-icon.svg.png
│ ├── 48px-Phacility_phabricator_logo.png
│ ├── 48px-Wikimedia_Cloud_Services_logo.svg.png
│ ├── 48px-Wikimedia_Community_Logo.svg.png
│ ├── 48px-Wikivoyage-Logo-v3-icon.svg.png
│ ├── 51px-Wiktionary-logo.svg.png
│ ├── 53px-Wikipedia-logo-v2.svg.png
│ ├── 59px-Wikiversity_logo_2017.svg.png
│ ├── 86px-Wikidata-logo.svg.png
│ ├── 88px-Wikinews-logo.svg.png
│ ├── Haskell-logo.png
│ ├── index.php
│ ├── load.php
│ ├── poweredby_mediawiki_88x31.png
│ └── wikimedia-button.png
└── Wikibooks.html
The command for writing the zim:
$ zimwriterfs --welcome=Wikibooks.html --favicon=favicon.png --language=en --title=Wikibooks --description=testZim --creator=test --publisher=test --verbose ./htmlContent ./out/example.zim
Catalog filtering should now be case/diacritics insensitive for all
fields. However, this is not validated for the language, name and
category fields, and for tags, creator & publisher it is validated only
for text supplied in the filter (not for values read from the book).
Catalog filtering by title/description was sensitive to diacritics
present in the query string. Fixed that.
Also enhanced the unit test to validate the insensitivity to diacritics
present in either the title/description or the query string.
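As an illustration only, the kind of case/diacritics-insensitive
comparison described above can be sketched in Python with the standard
library. The helper names `normalize_text()` and `matches()` are
hypothetical, and this is not the code used by the library itself; it
merely assumes that both the stored value and the query string go
through the same normalization before being compared.

```python
import unicodedata


def normalize_text(text):
    """Strip diacritics and case from text (hypothetical helper)."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a more aggressive lower() meant for caseless matching.
    return stripped.casefold()


def matches(field_value, query):
    """Return True if the query occurs in the field, ignoring case/diacritics."""
    return normalize_text(query) in normalize_text(field_value)
```

Because both sides are normalized, it does not matter whether the
diacritics appear in the title/description or in the query string.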
This change fixes the failure of the LibraryTest.filterByPublisher
unit-test broken by the previous commit.
The previous approach used in `publisherQuery()` for building a phrase
query enforcing the specified prefix for all terms fails when both of
the following hold:
1. the input phrase contains a non-word term that Xapian's query parser
doesn't like (e.g. a standalone ampersand character, 1/2, a#1, etc.);
2. the input phrase contains at least three terms that Xapian's query
parser has no issue with.
Using the `quest` tool (shipped with the xapian-tools package on
Ubuntu), the issue can be demonstrated as follows:
```
$ quest -o phrase -d some_xapian_db "Energy & security"
Parsed Query: Query((energy@1 PHRASE 11 Zsecur@2))
Exactly 0 matches
MSet:
$ quest -o phrase -d some_xapian_db "Energy & security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries
$ quest -o phrase -d some_xapian_db 'Energy 1/2 security act'
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries
$ quest -o phrase -d some_xapian_db "Energy a#1 security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries
```
The problem comes from parsing the query with the default operation set
to `OP_PHRASE` (exemplified by the `-o phrase` option in the above
invocations of `quest`). A workaround is to parse the phrase with a
default operation of `OP_OR` and then combine all the terms with
`OP_PHRASE`.
In addition, stemming should be disabled in order to target an exact
phrase match (save for the non-word terms, if any, that are ignored by
the query parser).
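The two-step idea (first extract the terms, then require them to occur
as a phrase) can be modelled with a toy Python sketch. It does not use
Xapian at all: `word_terms()` and `phrase_matches()` are hypothetical
helpers, the `XP` prefix is only an assumed example value, and the regex
tokenizer is a stand-in for the real query parser, whose handling of
inputs such as `1/2` or `a#1` may differ.

```python
import re


def word_terms(text, prefix=""):
    """Extract prefixed, lower-cased word terms from text.

    Tokens without word characters (e.g. a standalone '&') yield no term.
    No stemming is applied, so terms are compared exactly.
    """
    return [prefix + term.lower() for term in re.findall(r"\w+", text)]


def phrase_matches(doc_terms, phrase_terms):
    """Return True if phrase_terms occur consecutively inside doc_terms."""
    n = len(phrase_terms)
    return any(
        doc_terms[i:i + n] == phrase_terms
        for i in range(len(doc_terms) - n + 1)
    )


# Toy usage: the ampersand is dropped in step one, and the three
# remaining terms are then matched as a phrase in step two.
doc = word_terms("Energy & Security Act", prefix="XP")
query = word_terms("energy & security act", prefix="XP")
found = phrase_matches(doc, query)
```

In this model, skipping stemming means that a query for "securities"
would not match a document containing "security".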
Moved the `filter.hasQuery()` check inside `buildXapianQuery()`.
`Library::filterViaBookDB()` only cares if the query that is going to be
run on the book DB would match all documents. The rest of the changes
related to enhancing the usage of Xapian for the catalog search will
happen inside `buildXapianQuery()` and `updateBookDB()`.