The `ServerTest.RandomOnNonExistentBook` unit test was replaced with a
more general one testing multiple 404 scenarios where the content of the
body is checked too.
This test was introduced with the purpose of testing the error message
in the 404 page returned by /random for a non-existent book. The actual
expected output currently present in this new unit-test is too much for
that purpose and may become a maintenance burden if more tests of that
kind are added.
Packages/build-deb CI flows failed on ubuntu-bionic and ubuntu-focal
with the following mismatch in the ServerTest.suggestions unit-test:
```
[ RUN ] ServerTest.suggestions
../test/server.cpp:715: Failure
Expected equality of these values:
r->body
Which is: ...
removeEOLWhitespaceMarkers(expectedResponse)
Which is: ...
With diff:
@@ -2,5 +2,5 @@
{
"value" : "Ray (movie)",
- "label" : "Ray (<b>movie</b>...",
+ "label" : "Ray (<b>movie</b>)",
"kind" : "path"
, "path" : "A/Ray_(movie)"
Test context:
url: /ROOT/suggest?content=zimfile&term=movie
```
For some reason (probably a bug), the implementation of
`Xapian::MSet::snippet()` on those platforms decided that a single closing
parenthesis was too much to include in the snippet and
replaced it with a (longer) ellipsis.
Taking advantage of the need to work around that bug, the functional
coverage of ServerTest.suggestions was enhanced: the
problematic test point was replaced with a new one using a phrase
instead of a single term.
The HumanReadableId can contain special characters (`&`/`=`/...).
As it is used to build a URL in the OPDS template,
we must URL-encode it.
- We don't need to encode the book id: as it is a UUID, it never contains
special characters.
- We don't need to encode the book url as it is read from the library and
the url must already be correctly encoded in the library.xml.
(tests modified accordingly)
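For illustration, a minimal percent-encoding sketch. The helper name `urlEncodeSketch` is hypothetical and is not the function used in the library; it assumes that only RFC 3986 unreserved characters are left untouched.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not the library's actual function): percent-encodes
// every byte except the RFC 3986 unreserved characters.
std::string urlEncodeSketch(const std::string& value)
{
  static const char hex[] = "0123456789ABCDEF";
  std::string encoded;
  for (const unsigned char c : value) {
    if (std::isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
      // Unreserved character: copied as is.
      encoded += static_cast<char>(c);
    } else {
      // Anything else (including `&` and `=`) becomes %XX.
      encoded += '%';
      encoded += hex[c >> 4];
      encoded += hex[c & 0x0F];
    }
  }
  return encoded;
}
```

With such a helper, an id like `wiki&lang=en` can no longer be mistaken for extra query parameters once it is placed in a URL.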
As the name suggests, this endpoint is not smart:
it returns the content as it is, and only if it is present
(no compatibility handling of any kind).
The only "smart" thing is to return a redirect if the entry is a redirect.
As we render the entry's xml in a separate step, we need to pass the
rootLocation to all the internal rendering.
Testing with and without root is not so easy.
I've simply made all server tests use a ROOT prefix.
We can assume that if the ROOT is present everywhere we need it, it will not
be present where we don't need it. (As long as we don't hardcode "ROOT" in the server.)
As a result of this clean-up, the /suggest endpoint also stopped
generating confusing 404 Not Found errors (which, as in /meta's case,
is not that important). Another functional change is that the "term"
parameter became optional.
Book.updateTest is going to be modified so that it relies on
functionality tested by Book.updateFromXMLTest. Hence the order of the
tests now better reflects that dependency.
Deduplicated the mustache templates static/templates/catalog_v2_entries.xml
and static/templates/catalog_v2_complete_entry.xml (the latter was
renamed to static/templates/catalog_v2_entry.xml).
This allows the handle_suggest API to accept two arguments, `start` and
`suggestionLength`, so that it can retrieve
suggestions in the given range rather than the default 0-10 range.
This commit intentionally leaves a FAILING TEST that exposes a misuse
of getResults: the second argument to
getResults is NOT the `end` of the range, but the `maxResultCount` to retrieve.
This will be fixed in the next commit.
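The difference between the two interpretations of the second argument can be sketched as follows. `getResultsSketch` and `getRangeSketch` are hypothetical stand-ins operating on a plain vector; they are not the real search API and only mirror the (start, maxResultCount) semantics described above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a getResults-style call: returns at most
// maxResultCount items, beginning at index start.
std::vector<std::string> getResultsSketch(const std::vector<std::string>& all,
                                          std::size_t start,
                                          std::size_t maxResultCount)
{
  if (start >= all.size()) return {};
  const std::size_t count = std::min(maxResultCount, all.size() - start);
  return std::vector<std::string>(all.begin() + start,
                                  all.begin() + start + count);
}

// Converts a half-open range [start, end) into the (start, count) pair
// expected by getResultsSketch.
std::vector<std::string> getRangeSketch(const std::vector<std::string>& all,
                                        std::size_t start,
                                        std::size_t end)
{
  return end > start ? getResultsSketch(all, start, end - start)
                     : std::vector<std::string>{};
}
```

Passing an `end` value where a count is expected returns too many items whenever `start` is non-zero, which is the kind of mismatch the failing test is meant to reveal.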
Language code to human friendly name translation is now done with the
help of the ICU library. It works if the line
```
-include $(LANGSRCDIR)/resfiles.mk
```
in the file `source/data/Makefile.in` of the icu4c dependency is not
commented out. Currently, the said line is commented out (along with
some other includes) by the `icu4c_custom_data.patch` patch of the
`kiwix-build` tool.
This changes the output of `/catalog/search` as follows:
- Entire search query (rather than only the value of the `q` parameter)
is put in the <title> node.
- Search performed with an empty query presents itself as "All zims".
- The feed id remains stable for identical searches on the same
library.
/catalog/v2/entries is intended to play the combined role of
/catalog/root.xml and /catalog/search of the old OPDS API. Currently,
the latter role is not yet implemented.
Implementation note: instead of tweaking and reusing
`OPDSDumper::dumpOPDSFeed()`, the generation of the OPDS feed is done via `mustache`
and a new template `static/catalog_v2_entries.xml`.
Note: This commit somewhat relaxes validation of non-variable
`<updated>` elements in the OPDS feed - the content of any `<updated>`
element is replaced with the YYYY-MM-DDThh:mm:ssZ placeholder.
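One way such masking could be done is sketched below. The helper name `maskUpdatedSketch` and the use of `std::regex` are assumptions made for illustration; the actual test code may do this differently.

```cpp
#include <regex>
#include <string>

// Hypothetical test helper: replaces the text of every <updated> element
// with a fixed placeholder so that feeds generated at different times
// compare equal.
std::string maskUpdatedSketch(const std::string& feed)
{
  static const std::regex updated("<updated>[^<]*</updated>");
  return std::regex_replace(feed, updated,
                            "<updated>YYYY-MM-DDThh:mm:ssZ</updated>");
}
```

The trade-off is the one stated in the note: an `<updated>` element holding a fixed, known value is masked as well, so its content is no longer checked.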
After switching to Xapian-based search in the library/catalog, an empty
query stopped acting as a match-all query. This commit restores the old
behaviour in that regard.
Returning status code 204 in case of empty results doesn't show the
empty results page as described in #466. Reverting the changes in #396
fixes the issue.
The existing example.zim has an improper main page entry, hence an
unusable index, as reported in openzim/libzim#521.
To replace the buggy zim, a new zim has been generated using the latest
zimwriterfs with the latest libzim.
-------------------------------------------------------------------------
The directory used to create the zim is shown below; it contains two pages
from the Wikibooks site:
htmlContent
├── favicon.png
├── FreedomBox for Communities_Offline Wikipedia - Wikibooks, open books for an open world_files
│ ├── index.php
│ ├── load.php
│ ├── poweredby_mediawiki_88x31.png
│ └── wikimedia-button.png
├── FreedomBox for Communities_Offline Wikipedia - Wikibooks, open books for an open world.html
├── Wikibooks_files
│ ├── 234px-Megakaryocyte1.svg.png
│ ├── 287px-ChewyGingerCookies.jpg
│ ├── 36px-Commons-logo.svg.png
│ ├── 40px-Wikiquote-logo.svg.png
│ ├── 41px-Wikispecies-logo.svg.png
│ ├── 46px-Wikisource-logo.svg.png
│ ├── 48px-MediaWiki-2020-icon.svg.png
│ ├── 48px-Phacility_phabricator_logo.png
│ ├── 48px-Wikimedia_Cloud_Services_logo.svg.png
│ ├── 48px-Wikimedia_Community_Logo.svg.png
│ ├── 48px-Wikivoyage-Logo-v3-icon.svg.png
│ ├── 51px-Wiktionary-logo.svg.png
│ ├── 53px-Wikipedia-logo-v2.svg.png
│ ├── 59px-Wikiversity_logo_2017.svg.png
│ ├── 86px-Wikidata-logo.svg.png
│ ├── 88px-Wikinews-logo.svg.png
│ ├── Haskell-logo.png
│ ├── index.php
│ ├── load.php
│ ├── poweredby_mediawiki_88x31.png
│ └── wikimedia-button.png
└── Wikibooks.html
The command for writing the zim:
$ zimwriterfs --welcome=Wikibooks.html --favicon=favicon.png --language=en --title=Wikibooks --description=testZim --creator=test --publisher=test --verbose ./htmlContent ./out/example.zim
Catalog filtering should now be case/diacritics insensitive for all
fields. However, it is not validated for the language, name and category
fields, and is validated for tags, creator & publisher only for text
supplied in the filter (but not for values read from the book).
Catalog filtering by titles/description was sensitive to diacritics
present in the query string. Fixed that.
Also enhanced the unit test to validate the insensitivity to diacritics
present in either the title/description or the query string.
Now the LibraryTest.filterCheck unit-test validates the actual entries
returned by `Library::filter` (previously only the count of the results
was checked).
The new unit-test fails for a reason that was not anticipated when it was
written. The `Library::filter()` operation returns a correct result
after the call to `removeBookById()` (this was a surprise!) but it has
a side-effect of re-adding an empty book with the id still surviving
in the search DB (the emptiness of this re-created book doesn't allow
it to pass the other filtering criteria, which explains why the result
of `Library::filter()` is correct). Had to add a special check
to the new unit-test against that hidden side-effect of
`Library::removeBookById()` + `Library::filter()` combination.
The search text in the catalog query is interpreted as partial by
default, but partial query mode can be disabled in C++. The latter
possibility is not exposed via the /catalog/search kiwix-serve endpoint,
though.
A mimetype may contain parameters.
The mimetype would then be something like "text/html;foo=bar;foz=baz".
It contains `;` and `=`, which conflict with the same characters
we use to separate the items in our list.
We have to use a more advanced algorithm which takes the context into
account.
Fix #416
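A rough sketch of a context-aware split is given below. It is not the algorithm implemented by this fix. It assumes a hypothetical list format `mimetype=count;mimetype=count`, and it assumes that a new item always begins with a token containing `/` before any `=`, while parameter tokens do not.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical parser for a "mimetype=count;mimetype=count" list in which
// a mimetype may itself carry ";name=value" parameters.
std::map<std::string, unsigned long> parseMimeCounterSketch(const std::string& data)
{
  // Step 1: split on ';', then glue parameter tokens back onto the item
  // they belong to. A token starts a new item only if it looks like
  // "type/subtype...", i.e. has a '/' before any '='.
  std::vector<std::string> items;
  std::istringstream in(data);
  std::string token;
  while (std::getline(in, token, ';')) {
    if (token.empty()) continue;
    const auto slash = token.find('/');
    const auto eq = token.find('=');
    const bool startsItem = slash != std::string::npos
                         && (eq == std::string::npos || slash < eq);
    if (startsItem || items.empty()) {
      items.push_back(token);
    } else {
      items.back() += ";" + token;
    }
  }

  // Step 2: within an item, the LAST '=' separates the mimetype (with its
  // parameters) from the counter. Items without a numeric counter are skipped.
  std::map<std::string, unsigned long> counters;
  for (const auto& item : items) {
    const auto eq = item.rfind('=');
    if (eq == std::string::npos) continue;
    const std::string count = item.substr(eq + 1);
    const bool numeric = !count.empty()
        && std::all_of(count.begin(), count.end(),
                       [](unsigned char c) { return std::isdigit(c) != 0; });
    if (numeric) counters[item.substr(0, eq)] = std::stoul(count);
  }
  return counters;
}
```

Under these assumptions, `text/html;foo=bar;foz=baz=5;image/png=3` yields two items, the first keyed by the full `text/html;foo=bar;foz=baz` string.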
Originally reported against case sensitivity of the Range header
(see issue #387), this fix applies to all request headers (since
according to RFC 7230 all header field names are case-insensitive, see
https://tools.ietf.org/html/rfc7230#section-3.2). However, a
corresponding unit-test was added only for the Range header.
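A common way to get case-insensitive header lookup is to normalize names when storing and when querying, as in the sketch below. `HeaderMapSketch` and `toLowerAscii` are hypothetical names, and lowercasing is an assumed normalization; the actual fix may be implemented differently.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Lowercases a string; header field names are expected to be ASCII.
std::string toLowerAscii(std::string s)
{
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s;
}

// Hypothetical header container: names are normalized to lowercase on both
// insertion and lookup, so "Range", "range" and "RANGE" refer to one entry.
class HeaderMapSketch
{
public:
  void set(const std::string& name, const std::string& value)
  {
    headers_[toLowerAscii(name)] = value;
  }

  // Returns the header value, or an empty string if the header is absent.
  std::string get(const std::string& name) const
  {
    const auto it = headers_.find(toLowerAscii(name));
    return it == headers_.end() ? std::string() : it->second;
  }

private:
  std::map<std::string, std::string> headers_;
};
```

Only the names are normalized; header values are stored unchanged.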