Commit Graph

2173 Commits

Author SHA1 Message Date
Matthieu Gautier 2b38d2cf1b Copy the lrucache test from libzim.
- Adapt lrucache.cpp for the right include path
  and use `kiwix::lru_cache` instead of `zim::lru_cache`.
- Add missing `#include <set>` in lrucache.h
2022-06-02 12:37:52 +02:00
Matthieu Gautier 0081b4d8e7 Make the limit of zim files per search configurable.
The default value is 0, which means no limit.
2022-06-02 12:37:52 +02:00
Matthieu Gautier b74910b2af Limit the number of zim in multizim fulltext search.
We currently limit it to 5, but this will change in the next commit.
2022-06-02 12:37:50 +02:00
Matthieu Gautier cf30233358 Prefix env variable name with `KIWIX_` 2022-06-02 12:23:43 +02:00
Matthieu Gautier f0065fdd6f Introduce Error exception to do i18n 2022-06-02 12:23:42 +02:00
Matthieu Gautier c72132054d Move i18n helper functions 2022-06-02 12:22:28 +02:00
Matthieu Gautier 077ceac5a5 Make the search_rendered handle multizim search.
This introduces an intermediate mustache object to store information
about the request made by the user.
2022-06-02 12:22:28 +02:00
Matthieu Gautier 39d0a56be8 Use selectBooks in handle_search 2022-06-02 12:22:28 +02:00
Matthieu Gautier 76d5fafb72 Introduce `selectBooks`
`selectBooks` allows us to parse a query in a "standard" way to get
the book(s) on which the user wants to work.
2022-06-02 12:22:28 +02:00
Matthieu Gautier 4438106c2f Add a prefix in get_search_filter
The prefix will be used to parse a "query to select book" in different contexts.
For now we have only one context: selecting books for the catalog search.
But we will also want to select books to do fulltext search on them
(this will be done in a later commit).
2022-06-02 12:22:28 +02:00
Matthieu Gautier 76ebfd7ea4 Move get_search_filter and subrange. 2022-06-02 12:22:27 +02:00
Matthieu Gautier 22996e4a6b Allow user to select multiple books when doing search. 2022-06-02 12:22:27 +02:00
Matthieu Gautier 98c54b2279 Handle multiple arguments in RequestContext. 2022-06-02 12:22:27 +02:00
Matthieu Gautier 854623618c Use the newly introduced searcherCache for multizim searcher. 2022-06-02 12:22:25 +02:00
Matthieu Gautier fd0edbba80 Use a set of ids as key for the searcher cache.
It will allow us to cache a searcher for multiple zim files.
2022-05-24 14:55:48 +02:00
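A minimal sketch of the idea in fd0edbba80, assuming a plain `std::map` keyed by a `std::set` of book ids. `Searcher`, `SearcherKey` and `SearcherCache` are hypothetical names used for illustration only; they are not the libkiwix types, and this sketch has no eviction or locking.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <set>
#include <string>

// Hypothetical stand-in for a searcher working on several zim files.
struct Searcher {
  std::set<std::string> bookIds;
};

// A std::set is ordered, so {"a", "b"} and {"b", "a"} compare equal
// and map to the same cache entry.
using SearcherKey = std::set<std::string>;

// Hypothetical cache: no eviction policy, not thread-safe.
class SearcherCache {
 public:
  std::shared_ptr<Searcher> getOrCreate(const SearcherKey& ids) {
    auto it = cache_.find(ids);
    if (it != cache_.end()) {
      return it->second;  // Reuse the searcher built for this set of ids.
    }
    auto searcher = std::make_shared<Searcher>(Searcher{ids});
    cache_.emplace(ids, searcher);
    return searcher;
  }

  std::size_t size() const { return cache_.size(); }

 private:
  std::map<SearcherKey, std::shared_ptr<Searcher>> cache_;
};
```

With this shape, requesting the same books in a different order returns the same cached object rather than building a second searcher.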
Matthieu Gautier f5af0633ec Move the searcher cache into the Library 2022-05-24 14:55:48 +02:00
Matthieu Gautier 740581c55c Link the cache size to the book count.
Unless explicitly set via user env variable.
2022-05-24 14:55:48 +02:00
Matthieu Gautier 582e3ec46d Use a concurrent cache to store Archive cache. 2022-05-24 14:55:48 +02:00
Matthieu Gautier 28fb76bbc2 Remove m_readers in `Library::impl`
It is a deprecated interface and it is a simple wrapper on Archive.
2022-05-24 14:55:48 +02:00
Matthieu Gautier 7c688a4acc Move `getCacheLength` to a generic helper function `getEnvVar` 2022-05-24 14:55:48 +02:00
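A generic environment-variable helper of the kind named in 7c688a4acc could look roughly like the sketch below. The signature, the stream-based parsing and the fallback to the default on a parse failure are all assumptions made for illustration; the actual `getEnvVar` in libkiwix may differ.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical sketch: read an environment variable and convert it to T.
// Returns defaultValue when the variable is unset or cannot be parsed
// (the fallback on parse failure is an assumption of this sketch).
template <typename T>
T getEnvVar(const char* name, const T& defaultValue) {
  const char* raw = std::getenv(name);
  if (raw == nullptr) {
    return defaultValue;
  }
  std::istringstream iss(raw);
  T value;
  if (!(iss >> value)) {
    return defaultValue;
  }
  return value;
}
```

A cache length could then be read with something like `getEnvVar<int>("KIWIX_SKETCH_CACHE_SIZE", 2)`, where the variable name is invented for this example.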
Kelson d4da05e591
Merge pull request #764 from kiwix/pre_multisearch
Preparatory work on multizim
2022-05-23 19:29:08 +02:00
Matthieu Gautier 66b2449800 Remove unnecessary catch
`std::exception` is already caught in `handle_request`.
2022-05-23 19:17:28 +02:00
Matthieu Gautier aad95e3413 Introduce a results intermediate object in the template rendering.
URLs in href must not be HTML-encoded. As we already URL-encode the path, it
is ok to have `'` in the URL.
2022-05-23 19:16:14 +02:00
Matthieu Gautier f0dd34b6db Introduce buildQueryData helper in SearchRenderer 2022-05-23 19:13:25 +02:00
Matthieu Gautier bbdde93f49 Introduce a pagination object to render search result. 2022-05-23 19:12:17 +02:00
Matthieu Gautier cb62da65c3 Raise an exception if something went wrong in the template rendering. 2022-05-23 10:56:39 +02:00
Matthieu Gautier 288b4ae7df Fix count of remote books in `Library::Impl::getBookCount` 2022-05-23 10:56:39 +02:00
Matthieu Gautier 52c12b0c2f Introduce `Library::Impl::getBookCount`
We simply introduce a `getBookCount` which is not protected by a lock.
2022-05-23 10:56:39 +02:00
Matthieu Gautier 4695f47dd2 Introduce operator+= to simplify response creation. 2022-05-23 10:56:39 +02:00
Matthieu Gautier f42f6a60df Use extractFromString to parse request argument.
On top of reusing code, it throws an exception if we cannot convert the given
argument to the type we want.
2022-05-23 10:56:39 +02:00
Matthieu Gautier 717c39f2ef Better ExtractFromString
- Throw an exception if we cannot extract from string.
  (We throw the same exception as `std::sto*`)
- Add a specialization to extract string from string
- Add some unit tests
2022-05-23 10:56:39 +02:00
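The behaviour described in 717c39f2ef can be sketched as follows. This is an illustration under assumptions, not the libkiwix code: the lower-case name `extractFromString`, the stream-based parsing and the error message are hypothetical. The only detail taken from the standard library is that `std::stoi` and its siblings throw `std::invalid_argument` when no conversion can be performed.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical sketch: convert a string to T, throwing the same exception
// type as std::sto* when nothing can be converted.
template <typename T>
T extractFromString(const std::string& str) {
  std::istringstream iss(str);
  T value;
  if (!(iss >> value)) {
    throw std::invalid_argument("no conversion");
  }
  return value;
}

// Specialization: a string extracted from a string is returned unchanged,
// so embedded whitespace is preserved instead of cutting at the first space.
template <>
inline std::string extractFromString<std::string>(const std::string& str) {
  return str;
}
```

A request-argument parser built on such a helper can let the exception propagate to a single handler instead of checking each conversion by hand.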
Matthieu Gautier aa1f73472d Remove unnecessary BookDB helper class.
It was needed to avoid exposing Xapian in the public header.
Now we can remove it and directly use a Xapian db.
2022-05-23 10:56:39 +02:00
Matthieu Gautier 090c2fd31a Move LibraryBase out of public API.
We use composition instead of inheritance to implement Library.
2022-05-23 10:56:39 +02:00
Matthieu Gautier ff2c7b1fb2
Merge pull request #765 from kiwix/unittests_for_search_results_page 2022-05-23 10:55:28 +02:00
Veloman Yunkan 963362e1ea One more test-point for search result pagination 2022-05-18 13:30:42 +04:00
Veloman Yunkan 1a8d874a2c Testing the request for an out-of-bounds page 2022-05-18 13:30:42 +04:00
Veloman Yunkan 8e7658bb10 Almost full coverage of search result pagination
The snippets in the test data had to be updated to account for
pagination-dependent snippet variability of pre-7.2.2 libzim.
2022-05-18 13:28:52 +04:00
Veloman Yunkan 8f2f93371b Changed a test in order to avoid a bug in Xapian
Xapian version 1.4.18 contains a bug in snippet generation caused by
incorrect handling of stemming.

The test-point with a search pattern "beatles" produced snippets with no
highlights of the search term. Debugging showed that the search pattern
"beatles" was transformed to a search term "beatl" which then didn't
match the word "beatles" in the text from which a snippet had to be
extracted.

The test case passed on my development machine as well as for most CI
configurations. However the "Packages / build-deb (ubuntu-bionic)"
variant failed because of a slightly different handling of punctuation
at the snippet boundaries:

Test context:
  url: /ROOT/search?pattern=beatles&content=zimfile
  actual snippet:   ...side "Yellow Submarine" ...........
  expected snippet: ...-side "Yellow Submarine" ...........

The above mismatch resulted in a looser comparison of the snippet contents
and failed the requirement that the snippet MUST contain highlights
(this is how the said bug in Xapian was discovered).

An attempt to change the search pattern to "field" didn't eliminate the
problem. Despite the search pattern itself being in singular form (i.e.
identical to its stemmed version) the plural form "fields" in the
snippet was still not highlighted.

Using an adjective instead of a noun as the search pattern achieved the
desired outcome.
2022-05-18 13:28:52 +04:00
Veloman Yunkan eeca88573b Validation of snippets in search results
The "expected" snippets in the test data must be a union of all possible
snippets produced at runtime for a given (document, search terms) pair
on all platforms of interest:

- Overlapping snippets must be properly merged

- Non-overlapping snippets can be joined with a " ... " in between.
2022-05-18 13:20:27 +04:00
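The two rules in eeca88573b can be illustrated with a small sketch that builds the union of several snippets of one document. It assumes each snippet is given as a `[begin, end)` character range into the document text; `Range` and `unionOfSnippets` are hypothetical names, and the commit does not say how the expected test data was actually produced.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// [begin, end) character offsets into the document text (hypothetical type).
using Range = std::pair<std::size_t, std::size_t>;

// Hypothetical sketch: merge overlapping (or touching) snippet ranges and
// join the remaining non-overlapping pieces with " ... ".
std::string unionOfSnippets(const std::string& text, std::vector<Range> ranges) {
  std::sort(ranges.begin(), ranges.end());

  std::vector<Range> merged;
  for (const Range& r : ranges) {
    if (!merged.empty() && r.first <= merged.back().second) {
      // Overlaps or touches the previous range: extend it.
      merged.back().second = std::max(merged.back().second, r.second);
    } else {
      merged.push_back(r);
    }
  }

  std::string result;
  for (std::size_t i = 0; i < merged.size(); ++i) {
    if (i > 0) {
      result += " ... ";
    }
    result += text.substr(merged[i].first, merged[i].second - merged[i].first);
  }
  return result;
}
```

Treating touching ranges as overlapping is a choice of this sketch; a stricter variant would keep them separate.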
Veloman Yunkan 4521249452 Excluded snippets from search results validation 2022-05-18 13:05:29 +04:00
Veloman Yunkan 21e183c2e4 First test for a non-first page of search results 2022-05-18 12:45:47 +04:00
Veloman Yunkan d56ccbd019 First search results test-point with pagination 2022-05-18 12:45:47 +04:00
Veloman Yunkan 825cf1c948 Added a test-point for a large unpaginated search 2022-05-18 12:45:47 +04:00
Veloman Yunkan 57c31a43a4 Another simple test-point for /search endpoint 2022-05-18 12:45:47 +04:00
Veloman Yunkan 84c68d4d7b Search results pagination bugfix
Search results pagination is disabled for a single page outcome too.
2022-05-18 12:45:47 +04:00
Veloman Yunkan f2cf42427a New unit-test TaskbarlessServerTest.searchResults
This is a preliminary implementation checking only the following
cases:

- no search results
- all search results fitting on a single page

The second test-case fails because of a bug in search renderer (leading
to the pagination footer being pointlessly enabled). Will fix it in the
next commit.
2022-05-18 12:45:47 +04:00
Veloman Yunkan 612ecc975d Support for testing a server without a taskbar
Taskbar injected by a server adds distraction to unit-tests focusing
on the HTML contents of the returned pages. The new test-suite
TaskbarlessServerTest will have taskbar disabled.
2022-05-18 12:45:47 +04:00
Veloman Yunkan ae56d399b7 Explained why search_result.html needs inline CSS
In #727 inline CSS [was extracted](e4a4b2f961)
from `static/templates/no_search_result.html` into a separate stylesheet
resource. The purpose was to later

1. get rid of the custom `static/templates/no_search_result.html` error
   template and use a general purpose error template instead (this was
   accomplished by PR #744).

2. deduplicate the CSS code between `static/templates/no_search_result.html` and
   `static/templates/search_result.html` by making the latter also refer to
   an internal CSS resource rather than containing inline stylesheet code.

While preparing to implement the 2nd point, I figured out that
`kiwix::SearchRenderer` is used as a component in `kiwix-desktop` too,
which probably would be upset by a link to a libkiwix's internal CSS resource.

This commit documents that finding.
2022-05-18 12:45:47 +04:00
Kelson eaa8c3c91c
Merge pull request #776 from kiwix/fix_i18n_windows
Specify utf8 encoding when opening i18n resource file.
2022-05-17 22:50:20 +02:00
Matthieu Gautier 26c06d8c2a Specify utf8 encoding when opening i18n resource file.
Otherwise, on Windows, we will try to open files with the "local" encoding (cp1252).
2022-05-17 18:36:35 +02:00