From now on, the `lang` parameter of the /catalog/search,
/catalog/v2/entries, and /catalog/v2/partial_entries endpoints is
interpreted as a comma-separated list of languages.
Before this change RequestContext::get_query() returned a reordered
query string (alphabetically sorted by the parameter names).
This fix facilitiates testing of responses where the request URL appears
in the response.
Multizim search requires that all selected books be in the same
language.
No new URL query parameter was introduced for specifying the intended
search language - `books.filter.lang` can be used for that purpose.
The server_search unit-test was updated to use a slightly cheating
library xml file where the language of example.zim was tweaked from "en"
to "eng" in order to match that of zimfile.zim. Note that this change
drops from the tested server two other goofy ZIM files corner_cases.zim
and poor.zim that have been/are included in ServerTest.
During static resource preprocessing and compilation their cacheid
values are embedded into libkiwix and can be accessed at runtime.
If a static resource is requsted without specifying any cacheid
it is served as dynamic content (with short TTL and the library id
used for the ETag, though using the cacheid for the ETag would
be better).
If a cacheid is supplied in the request it must match the cacheid of the
resource (otherwise a 404 Not Found error is returned) whereupon the
resource is served as immutable content.
Known issues:
- One issue is caused by the fact that some static resources don't get a
cacheid; this is resolved in the next commit.
- Interaction of this change with the support for dynamically customizing
static resources (via KIWIX_SERVE_CUSTOMIZED_RESOURCES env var) was
not addressed.
Before this fix the root URL for a book was assumed to resolve to the
main page. This was not true for ZIM files containing an entry at an
empty path or with a path equal to "/", resulting in issue #826. The
logic behind this behaviour is found in `kiwix::getEntryFromPath()`.
The fix to that issue is a little more general and will result in an
HTTP redirect in any case where `kiwix::getEntryFromPath(zim, path)`
returns an entry with a real path different from the requested one. In
particular, this will affect the behaviour on ZIM files with the old
namespace scheme, where the requested resource - if not found - is also
looked up in the 'A', 'I', 'J', and/or '-' namespaces. Now instead of
returning the contents of that other resource an HTTP redirect response
will be sent.
If `kiwix-serve` is run with the `--nosearchbar` option the toolbar is
disabled (hidden) in its viewer.
Note however that certain actions performed by the viewer merely with
the purpose of keeping the toolbar up-to-date are still carried out.
`--nosearchbar` option of `kiwix-serve` (despite its misleading name)
was used to disable the entire taskbar. This commit accounts for the
existence of that option only partially:
1. Links to books on the welcome/library page are affected - by default
books are displayed in the viewer, but in a kiwix-serve instance run
with --nosearchbar books are loaded in the top window.
2. The `/viewer` endpoint is enabled unconditionally, so if anyone
enters the viewer URL in the address bar they will see books in the
viewer.
Made the viewer respect the `--blockexternal` and `--nolibrarybutton`
options of `kiwix-serve`. Those options are passed to the viewer
via the dynamically generated resource `/viewer_settings.js`.
The only place that the root link is now used is in /skin/index.js,
so added it in static/templates/index.html. But it seems that nothing
prevents us from from switching from aboslute paths to relative paths
in /skin/index.js, which will eliminate the need for the root link
altogether.
As a result of this change content is never decorated by kiwix serve.
This resulted in compiler aided discovery of all call sites where the
default values were used. For OPDS/catalog requests now passing true for the
`raw` parameter, since XML content isn't supposed to undergo any
transformations.
Removed the isHomePage param from one of the variants of
`ContentResponse::build()`. The other overload is dangerous since
failing to review&update all of its call site may result in changed
semantics. Will do it in a couple of separate commits.
The next goal is to redirect old-style /book/path/to/entry URLs to
/content/book/path/to/entry, which seemed pretty trivial.
However, given the current handling of some endpoint URLs, more work was
required to ensure that invalid endpoint URLs (e.g. "/random/number" or
"/suggest/fr") are not interpreted as content URLs. Previously, that was
not a user-observable issue, since the result would be an immediate 404
error (except in certain edge cases, like handling the request for
"/random/number" when there is a book with name "random" containing an
article at path "/number"). With redirection of URLs that were assumed
to refer to content a 404 error would be issued for the
transformed URL ("/content/random/number") which may be confusing.
Therefore this change is to ensure the correct routing of endpoint URL
handling.
Book content is now served under /content/book/...
The old access to book content via a top-level URL /book/... is so far
preserved for backward compatibility.
Redirects were changed to use the new URL scheme. Links in the search results
still use the old scheme.
If the server is initialized with a library.xml file, then the id
specified in the XML file is used (rather than the UUID recorded in the
ZIM file).
Note that in test/data/library.xml the book ids are fake and
different from the real ZIM IDs; that file was created for testing
of the /catalog endpoint which doesn't access ZIM content, so the
the same ZIM file zimfile.zim was added to library.xml three times as
three different books (with unique human-friendly ids). This explains
the diff in test/library_server.cpp.
During work on the kiwix-serve front-end, the edit-save-test cycle is
a multistep procedure:
1. build and install libkiwix
2. build kiwix-tools
3. run kiwix-serve
4. reload the web-page in the browser
When making changes in static resources that are served by kiwix-serve
unmodified, the steps 1-3 can be eliminated if kiwix-serve is capable of
serving resources from the file-system. This commit adds such a
functionality to kiwix-serve. Now, if during startup of kiwix-serve the
environment variable `KIWIX_SERVE_CUSTOMIZED_RESOURCES` is defined it is
assumed to point to a file where every line has the following format:
URL MIMETYPE RESOURCE_FILE_PATH
When a request is received by kiwix-serve and its URL matches any of the
URLs read from the customized resource file, then the resource data is
read from the respective file RESOURCE_FILE_PATH and served with
mime-type MIMETYPE.
Though this feature was introduced in order to facilitate the
development of the iframe-based content viewer, it can also be useful to
users who would like to customize the kiwix-serve front-end on their own
(without re-building all of kiwix-serve).
There is some overlap with a feature of the kiwix-compile-resources
script that also allows to override resources. The differences are:
1. The new way of customizing front-end resources has all such resources
listed in a text file and there is a single environment variable
from which the path of that file is read. kiwix-compile-resources
associates a separate environment variable with each resource.
2. The new way uses regular paths to identify a resource. The
kiwix-compile-resources method encodes the resource path by replacing
any non-alphanumeric characters (including the path separator) with
underscores (so that the resulting resource identifier can be used
to construct the name of the environment variable controlling that
resource).
3. The new method allows adding new front-end resources. The old method
only allows to modify existing resources.
4. The new method allows (actually requires) to specify the URL at which
the overriden resource should be served (similarly, the MIME-type can/must
be specified, too). The old method only allows to override the contents of
a resource.
5. The new method only allows to override front-end resources that are
served without any preprocessing by kiwix-serve at runtime. The old
method allows to override template resources as well (note that
internationalization/translation resources cannot be overriden using the
old method, either).
If we keep a reference to a `Reader` it is better to (share) owning
the reference. Else the reader may be deleted after we create the searcher.
This is especially the case now we are creating the `Reader` at demand
and we don't store it in the library's cache.
We have to reuse the query the user give us to generate the
pagination links.
At search result rendering step we don't have access to the query object.
The best place to know which arguments are used to select books
(and so which arguments to keep in the pagination links) is when we
parse the query to select books.
Fix tests (pagination links) with book selector other than "books.id="
(pattern=jazz&books.query.lang=eng)
libzim's search is not thread safe (mainly because xapian is not).
So we must protect our search objects from multi thread calls.
The best way to do this is to associate a mutex to the `zim::Searcher`
and lock the searcher each time we access object derivated from the
searcher (search, results, iterator, ...)
When ConcurrentCache store a shared_ptr we may have shared_ptr in used
while the ConcurrentCache has drop it.
When we "recreate" a value to put in the cache, we don't want to recreate
it, but copying the shared_ptr in use.
To do so we use a (unlimited) store of weak_ptr (aka `WeakStore`)
Every created shared_ptr added to the cache has a weak_ptr ref also stored
in the WeakStore, and we check the WeakStore before creating the value.
The prefix will be used to parse a "query to select book" in different context.
For now we have only one context : selecting books for the catalog search.
But we will want to select books to do fulltext search on them
(will be done in later commit)
- Throw a exception if we cannot extract from string.
(We throw the same exception as `std::sto*`)
- Add a specialization to extract string from string
- Add some unit test
The story of search_results.css
static/skin/search_results.css was extracted from
static/templates/no_search_result.html before the latter was dropped.
static/templates/no_search_result.html in turn seems to be a copied and
edited version of static/templates/search_result.html.
In the context of exploratory work on the internationalization of
kiwix-serve (PR #679) I noticed duplication of inline CSS across those
two templates and intended to eliminated it. That goal was not fully
accomplished (static/templates/search_result.html remained untouched)
because by that time PR #679 grew too big and the efforts were diverted
into splitting it into smaller ones. Thus search_results.css slipped
into one of those small PRs, without making much sense because nothing
really justifies preserving custom CSS in the "Fulltext search unavailable"
error page.
At the same time, it served as the only case where a link to a cacheable
resource is generated in C++ code (rather than found in a template).
This poses certain problems to the handling of cache-ids. A workaround
is to expel the URL into a template so that it is processed by
`kiwix-resources`. This commit merely demonstrates that solution. But
whether it should be preserved (or rather the "Fulltext search
unavailable" page should be deprived of CSS) is questionable.
In the absence of the "userlang" query parameter in the URL, the value
of the "Accept-Language" header is used. However, it is assumed that
"Accept-Language" specifies a single language (rather than a comma
separated list of languages possibly weighted with quality values).
Example:
Accept-Language: fr
// should work
Accept-Language: fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5
// The requested language will be considered to be
// "fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5".
// The i18n code will fail to find resources for such a language
// and will use the default "en" instead.
At this point a potential issue has been revealed. Now we produce
the final HTML via 2-level template expansion
1. Render parameterized messages
2. Render the HTML template
In which templates we should use double mustache "{{}}" (HTML-escaping)
tags and where we may use triple mustache "{{{}}}" (non-escaping) tags?
Introduced a new resource compiler script kiwix-compile-i18n that
processes i18n string data stored in JSON files and generates sorted C++
tables of string keys and values for all languages.
The "Fulltext search unavailable" error page is now generated using the
static/templates/error.html template. Also added two test cases checking
that error page.