Commit Graph

1359 Commits

Author SHA1 Message Date
Nikhil Tanwar c1d7cc37fd Add tags in tiles for /nojs endpoint
Adds span elements for tags
2023-03-28 21:49:31 +05:30
Nikhil Tanwar 6071b98fb7 Import book tiles
Tries to copy the same design of tiles as main page with javascript enabled
2023-03-28 21:49:31 +05:30
Nikhil Tanwar dca47d35f7 Introduce /nojs endpoint
Adds /nojs endpoint for fallback.
Currently, it serves an HTML page with the names of the books in the library
2023-03-28 20:25:44 +05:30
Nikhil Tanwar d8656ec149 Introduce HTMLDumper
HTMLDumper class will be used to dump library in HTML format. It inherits from LibraryDumper
2023-03-28 20:25:44 +05:30
Nikhil Tanwar f1873876b2 Extract LibraryDumper from OPDSDumper
This change creates a new common class for dumping the library into various formats: LibraryDumper
2023-03-28 20:25:44 +05:30
Veloman Yunkan eb002ae306 Deprecated Book::getLanguage()
Introduced `Book::getCommaSeparatedLanguages()` instead.
2023-03-08 15:24:53 +01:00
Veloman Yunkan 2550306052 One more usage of Book::getLanguages()
`Book::getLanguages()` is used instead of `Book::getLanguage()` when
determining the set of languages for a collection of books.
2023-03-08 15:24:53 +01:00
Veloman Yunkan 51fcb90dc0 Library::updateBookDB() uses Book::getLanguages() 2023-03-08 15:24:53 +01:00
Veloman Yunkan b1ad319d52 Enter Book::getLanguages() 2023-03-08 15:24:53 +01:00
Veloman Yunkan 5bda7fd45c Support for multilang ZIMs 2023-03-08 15:24:53 +01:00
Veloman Yunkan ac742e9da2 Redirection of slashless root URL
With a non-empty root location, the canonical form of the root URL for a
kiwix server is now required to end with a slash (to match the situation
for an empty root location). This requirement enables usage of relative
URLs on the welcome page and resources/scripts loaded through that page.

A slashless root URL is redirected to the slashful version.
2023-02-22 17:54:20 +04:00
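The redirect rule described in commit ac742e9da2 can be illustrated with a minimal Python sketch. libkiwix itself is written in C++; the function name `root_redirect_target` is hypothetical and only shows that a request for a non-empty root location without the trailing slash is redirected to the form with the slash.

```python
from typing import Optional


def root_redirect_target(request_path: str, root: str) -> Optional[str]:
    """Return the redirect target for a slashless root URL, else None.

    `root` is the configured root location without a trailing slash,
    e.g. "" (empty) or "/ROOT". Hypothetical helper, for illustration only.
    """
    if root and request_path == root:
        # "/ROOT" is not canonical; the canonical form is "/ROOT/".
        return root + "/"
    return None
```

With this sketch, `root_redirect_target("/ROOT", "/ROOT")` yields the slashful URL, while a request that already ends with the slash needs no redirect.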
Veloman Yunkan 2e0124710a `?count=0` OPDS catalog queries return 0 results
... which is a useful way of finding out the total number of results
with the least consumption of resources.
2023-02-10 19:15:29 +01:00
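The idea behind commit 2e0124710a can be sketched in Python as follows. The function `select_page` and its parameters are hypothetical (they are not the libkiwix API); the point is only that a page size of 0 produces an empty page while the total number of matches is still reported.

```python
from typing import List, Sequence, Tuple


def select_page(results: Sequence[str], start: int, count: int) -> Tuple[int, List[str]]:
    """Return (total number of results, entries of the requested page)."""
    total = len(results)
    # With count == 0 the slice is empty, so no entry has to be rendered.
    page = list(results[start:start + count])
    return total, page
```

A client interested only in the total can therefore ask for `count=0` and read the total from the response without receiving any entries.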
Veloman Yunkan e8c8a297b5 Registered MIME-types for .ico and .json
As a result, favicon.ico stopped being considered a compressible resource.
2023-02-10 15:07:00 +01:00
Nikhil Tanwar 12f0614350 Add Serbo-Croatian language name
Adds "srpskohrvatski" as name for "hbs" language tag.
2023-02-10 09:20:23 +05:30
Veloman Yunkan c2fffacbbd Renamed a data member 2023-02-09 10:40:23 +01:00
Veloman Yunkan 02f631fdb6 Got rid of RequestContext::full_url 2023-02-09 10:40:23 +01:00
Veloman Yunkan 05a66ead6e URI-encoding of the root location part
Now the root location is URI-encoded too.

In order to properly test this change the root location in the tests was
changed from "/ROOT" to "/ROOT#?" (or "/ROOT%23%3F" in URI-encoded form),
which is why this commit is so big.
2023-02-09 10:40:07 +01:00
Veloman Yunkan 97f0314fe6 Saving a few CPU cycles
This silly optimization in fact helps to avoid a somewhat more serious
waste of CPU cycles that would otherwise result in the next commit.
2023-02-08 22:16:27 +01:00
Veloman Yunkan a7fe4193e3 Preparing to save a few CPU cycles 2023-02-08 22:16:27 +01:00
Veloman Yunkan 2c5e84b6b3 Simpler fullURL2LocalURL() 2023-02-08 22:16:27 +01:00
Veloman Yunkan 71a66e0528 Passing of unrooted URL into RequestContext()
This change doesn't make much sense on its own - the real goal is to
prepare some ground for easier implementation of URI-encoding of the root
location.
2023-02-08 22:16:27 +01:00
Veloman Yunkan a807ce27f1 URI-encoding when redirecting legacy URLs to /content
Testing of this functionality revealed that the query part containing +
symbols (as replacement for spaces in the parameter values) isn't
forwarded properly as the + symbols are URI-encoded (this is a bug on
the part of the `RequestContext::get_query()` the result of which
already contains URI-encoded +'s).
2023-02-08 22:16:27 +01:00
Veloman Yunkan 2e9bec95b0 Proper URI-encoding in InternalServer::build_redirect()
- Before this change `InternalServer::build_redirect()` only URI-encoded the
  article path, ignoring the book name and/or the root location components of
  the URL.

- In order to be able to test this fix, corner_cases.zim was renamed to
  contain a couple of special URL symbols in its filename. The
  `create_corner_cases_zim_file` script was updated accordingly.
2023-02-08 22:16:09 +01:00
Matthieu Gautier 1ba588272c Get `Waiting` downloads before `Active` ones.
A `Waiting` download can become `Active` while we are fetching the downloads.
In rare cases we could miss a download if we fetched the `Active` ones before
the `Waiting` ones.
2023-02-08 15:42:17 +01:00
Matthieu Gautier 2c3b7409aa Remove the default value of follow parameter in `updateStatus`.
`false` is a pretty bad default value, as most users want to track
the real download.

By removing the default value, we force the user to make a choice.
We could have changed the default value to true, but it would have been
a silent API change and we don't want that.
2023-02-08 15:42:17 +01:00
Matthieu Gautier 18b7b5f277 Mark constant methods as const. 2023-02-08 15:42:17 +01:00
Matthieu Gautier 0e612de4d1 Make `Downloader` return shared_ptr instead of raw pointer.
Returning a raw pointer to internal data is dangerous by nature.
2023-02-08 15:42:17 +01:00
Matthieu Gautier 52ae5c3a5f Make Downloader thread safe. 2023-02-08 15:42:17 +01:00
Matthieu Gautier d1fe1b89ae Do not automatically update the status of existing Download.
The user may already have a pointer to the `Download`, and it is not protected
against concurrent access.

We could update the status of a newly created `Download` since, by definition,
no one has a pointer to it yet.
But it is better not to do that either:
- For consistency
- Because the first call to update the status may take long on Windows (because
  of file preallocation). It is better not to block the downloader for that.
2023-02-08 15:42:17 +01:00
Matthieu Gautier 1aa8521e15 Remove the lock.
As we now build a new request handle for every request, we don't need
a lock.

libcurl itself is thread safe as long as we don't share a handle.
2023-02-08 15:42:17 +01:00
Matthieu Gautier 95ebb6a492 Build a new curl "handle" at every request instead of reusing the same one. 2023-02-08 15:42:17 +01:00
Veloman Yunkan ca079a72cc Some clean-up 2023-01-25 19:15:12 +04:00
Veloman Yunkan 471c5b89f4 Dropped the 2nd param of urlEncode()
`urlEncode(str)` is now equivalent to the previous `urlEncode(str, true)`.
2023-01-25 19:15:12 +04:00
Veloman Yunkan 3bf8211b70 Made 2nd param of urlEncode() mandatory
This is a precautionary step before dropping the said parameter.
2023-01-25 19:15:12 +04:00
Veloman Yunkan ec81d5904d Proper URI-encoding in kiwix::getSearchUrl() 2023-01-25 19:15:12 +04:00
Veloman Yunkan 63e0d5c7c2 RequestContext::get_query() is fully URI-encoded 2023-01-25 19:15:12 +04:00
Veloman Yunkan 772243e832 Category name is fully URI-encoded 2023-01-25 19:15:12 +04:00
Veloman Yunkan bad13d76b4 Removed unused code 2023-01-25 19:15:12 +04:00
Veloman Yunkan 0bde4d9412 Properly URI-encoded links in search results
Special URI symbols occurring in the item path part of the search result
link were NOT encoded, because that would also encode the path separator (/)
symbol. Now that `urlEncode()` never encodes the / symbol, it is safe to
encode all other URI-special symbols in the path.
2023-01-25 19:15:12 +04:00
Veloman Yunkan 239b108fa7 / is no longer a reserved char for urlEncode()
This change is a quick hack solving known issues with URI-encoding in
libkiwix.

This change removes the slash character from the list of URL separator
symbols in URL encoding/decoding utilities, and makes it a symbol that
is safe to leave unencoded.

Effects:

- `urlEncode()` never encodes the '/' symbol (even when it is requested
  to encode the URL separator symbols too).

- `urlDecode(str)`/`urlDecode(..., false)` will now decode %2F to '/';
  other encoded URL separator symbols are NOT decoded when the second
  argument of `urlDecode()` is set to false (which is the default).
2023-01-25 19:15:12 +04:00
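The effect of commit 239b108fa7 on encoding can be illustrated with a small Python sketch. It is not the libkiwix implementation: the set of characters left unencoded (the RFC 3986 unreserved characters plus '/') is an assumption, and the name `url_encode_sketch` is hypothetical.

```python
import string

# Assumption: the RFC 3986 unreserved characters plus '/' are left as-is.
# The actual set used by libkiwix's isHarmlessUriChar() may differ.
HARMLESS = set((string.ascii_letters + string.digits + "-_.~/").encode("ascii"))


def url_encode_sketch(text: str) -> str:
    """Percent-encode every byte of the UTF-8 form of `text` except HARMLESS ones."""
    out = []
    for byte in text.encode("utf-8"):
        if byte in HARMLESS:
            out.append(chr(byte))
        else:
            # Two zero-padded hex digits, so bytes below 0x10 encode correctly.
            out.append(f"%{byte:02X}")
    return "".join(out)
```

Because '/' is never encoded, a whole path can be passed through the function in one call without splitting it on the path separator first.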
Veloman Yunkan c5ccbd37e2 Extracted isHarmlessUriChar() 2023-01-25 19:15:12 +04:00
Veloman Yunkan aa2e443eb8 Fixed indentation
Replaced tabs with spaces.
2023-01-25 19:15:11 +04:00
Veloman Yunkan 82d477009d '#' is a URI delimiter symbol 2023-01-25 19:15:11 +04:00
Veloman Yunkan e49081da80 Fixed urlEncode() for chars below 0x10 2023-01-25 19:15:11 +04:00
Veloman Yunkan e35e7585e0 Server sets userlang cookie as global and permanent
Without specifying the "Path" attribute of the cookie in the "Set-Cookie" header
we end up with multiple instances of the cookie for different URLs. We
want a single "global" cookie for kiwix-serve. Besides we want it to be
"permanent" rather than a session cookie, hence the large (1-year-long)
TTL value for the "Max-Age" attribute.
2023-01-24 19:01:32 +01:00
Veloman Yunkan fcb97c3c06 Sparing use of "Set-Cookie: userlang=..." header
Server adds the "Set-Cookie: userlang=..." header to the response only
if the "userlang" cookie is not already present with the same value.
2023-01-24 19:01:32 +01:00
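Commits e35e7585e0 and fcb97c3c06 together can be sketched in Python as shown below. The function name, its parameters and the choice of the root location as the cookie path are assumptions made for illustration; only the cookie name, the "Path" and "Max-Age" attributes and the one-year lifetime come from the commit messages.

```python
from typing import Dict, Optional

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def userlang_set_cookie(request_cookies: Dict[str, str], lang: str,
                        root: str = "") -> Optional[str]:
    """Return the Set-Cookie header value, or None if no header is needed."""
    if request_cookies.get("userlang") == lang:
        # The client already has the cookie with the same value.
        return None
    # A single Path makes the cookie "global" for the whole server;
    # Max-Age makes it persistent instead of a session cookie.
    return f"userlang={lang}; Path={root}/; Max-Age={ONE_YEAR_SECONDS}"
```

A request that already carries `userlang=fr` thus gets no `Set-Cookie` header when the selected language is still `fr`.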
Veloman Yunkan 8eb527389e URI-encoding of redirections to URLs with special symbols 2023-01-10 17:41:59 +04:00
Veloman Yunkan 28e9fb48b6 Properly implemented parseUserLanguagePreferences() 2022-12-14 15:34:46 +01:00
Veloman Yunkan 634f3fcf14 Properly implemented selectMostSuitableLanguage() 2022-12-14 15:34:46 +01:00
Veloman Yunkan 88597e1834 Enter selectMostSuitableLanguage() 2022-12-14 15:34:46 +01:00
Veloman Yunkan 69b3e1f8a7 Moved user language preferences into i18n.{h,cpp} 2022-12-14 15:34:46 +01:00
Veloman Yunkan 669d8898ac Enter UserLangPreferences 2022-12-14 15:34:46 +01:00
Veloman Yunkan 14f0f79061 User language control via userlang cookie 2022-12-14 15:34:46 +01:00
Veloman Yunkan 1d74b5e311 Server sets the userlang cookie on every response 2022-12-14 15:34:46 +01:00
Emmanuel Engelhart 2d42d6dc60 Gzip compress HTTP response for Web fonts 2022-12-07 19:21:27 +01:00
Veloman Yunkan 4966f4155d Fixed handling of backslashes in suggestions 2022-11-17 11:51:53 +04:00
Veloman Yunkan 0f0ae1cfed A small refactoring 2022-11-17 11:51:53 +04:00
Veloman Yunkan da78aae62b kiwix::Suggestions gives up its temporary pedigree 2022-11-17 11:51:53 +04:00
Veloman Yunkan abcd4ade99 kiwix::Suggestions::getJSON() 2022-11-17 11:51:53 +04:00
Veloman Yunkan 7a9780eb90 kiwix::Suggestions::addFTSearchSuggestion() 2022-11-17 11:51:53 +04:00
Veloman Yunkan 51bd881211 kiwix::Suggestions::add() 2022-11-17 11:51:53 +04:00
Veloman Yunkan f36f1661d5 Got rid of result count tracker variable 2022-11-17 11:51:53 +04:00
Veloman Yunkan 18f4a58237 Conception of kiwix::Suggestions 2022-11-17 11:51:53 +04:00
Matthieu Gautier 8cec014691 Use new `zim::Archive::getMediaCount` from libzim.
As libzim also changed the behavior of `zim::Archive::getArticleCount`,
we don't need the hack, and we don't need the code to parse `M/Counter`.
2022-11-02 13:15:47 +01:00
Veloman Yunkan 7d69ece27d OPDS can be filtered using more than one language
From now on, the `lang` parameter of the /catalog/search,
/catalog/v2/entries, and /catalog/v2/partial_entries endpoints is
interpreted as a comma-separated list of languages.
2022-11-01 19:16:30 +01:00
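The interpretation of the `lang` parameter introduced by commit 7d69ece27d can be sketched in Python. The data model (a dictionary mapping book names to language codes) and both function names are hypothetical; the sketch only shows how a comma-separated value selects books in any of the listed languages.

```python
from typing import Dict, List


def parse_lang_param(value: str) -> List[str]:
    """Split a comma-separated `lang` value, dropping empty items."""
    return [code for code in value.split(",") if code]


def filter_books_by_lang(books: Dict[str, str], lang_param: str) -> List[str]:
    """Return the names of the books whose language is one of the requested ones.

    `books` maps a book name to its language code (hypothetical data model).
    """
    wanted = set(parse_lang_param(lang_param))
    return [name for name, lang in books.items() if lang in wanted]
```

For example, `lang=eng,fra` would keep the English and French books and drop all others.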
Veloman Yunkan c87add1419 Removed an unused variable 2022-11-01 19:16:30 +01:00
Veloman Yunkan cb02dbd92a RequestContext preserves the exact query string
Before this change RequestContext::get_query() returned a reordered
query string (alphabetically sorted by the parameter names).

This fix facilitates testing of responses where the request URL appears
in the response.
2022-10-31 13:28:21 +04:00
Veloman Yunkan 9409e8bd91 Preventing confusion of tongues in multizim search
Multizim search requires that all selected books be in the same
language.

No new URL query parameter was introduced for specifying the intended
search language - `books.filter.lang` can be used for that purpose.

The server_search unit-test was updated to use a slightly cheating
library xml file where the language of example.zim was tweaked from "en"
to "eng" in order to match that of zimfile.zim. Note that this change
drops from the tested server two other goofy ZIM files corner_cases.zim
and poor.zim that have been/are included in ServerTest.
2022-10-31 13:27:57 +04:00
Veloman Yunkan cd62b5dd91 Some clean-up 2022-10-31 13:22:15 +04:00
Veloman Yunkan 414d7ae4fe Fixed indentation 2022-10-31 13:22:15 +04:00
Veloman Yunkan 9d2cc35447 Extracted InternalServer::handle_search_request() 2022-10-31 13:22:15 +04:00
Veloman Yunkan 7167ca1e6a Adios kiwix::getArchiveId() 2022-10-31 13:22:15 +04:00
Matthieu Gautier e5b94fa1bb Make the opds_dumper respect the provided nameMapper used in the server.
Fix #828
2022-10-30 19:21:01 +01:00
Veloman Yunkan b9f60ecfe9 Handling of cacheid when serving static resources
During static resource preprocessing and compilation their cacheid
values are embedded into libkiwix and can be accessed at runtime.

If a static resource is requested without specifying any cacheid
it is served as dynamic content (with short TTL and the library id
used for the ETag, though using the cacheid for the ETag would
be better).

If a cacheid is supplied in the request it must match the cacheid of the
resource (otherwise a 404 Not Found error is returned) whereupon the
resource is served as immutable content.

Known issues:

- One issue is caused by the fact that some static resources don't get a
  cacheid; this is resolved in the next commit.

- Interaction of this change with the support for dynamically customizing
  static resources (via KIWIX_SERVE_CUSTOMIZED_RESOURCES env var) was
  not addressed.
2022-10-19 19:26:04 +04:00
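The three cases described in commit b9f60ecfe9 can be summarized by a short Python sketch. The function name and the returned labels are hypothetical; the decision logic follows the commit message (no cacheid: dynamic content, matching cacheid: immutable content, mismatching cacheid: 404).

```python
from typing import Optional, Tuple


def static_resource_policy(resource_cacheid: str,
                           requested_cacheid: Optional[str]) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, caching kind) for a static resource request."""
    if requested_cacheid is None:
        # No cacheid in the request: serve as dynamic content (short TTL).
        return (200, "dynamic")
    if requested_cacheid == resource_cacheid:
        # The cacheid matches: the response can be cached as immutable.
        return (200, "immutable")
    # A wrong cacheid is treated as a request for a non-existent resource.
    return (404, None)
```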
Veloman Yunkan ce8b2bf9d9 Library::removeBookById() updates the revision 2022-10-19 19:26:04 +04:00
Veloman Yunkan 9fd1423100 Small clean-up 2022-10-19 19:26:04 +04:00
Veloman Yunkan 6b8d6232f0 InternalServer::getLibraryId() 2022-10-19 19:26:02 +04:00
Veloman Yunkan c91df1cb26 Two private funcs of InternalServer became free 2022-10-19 19:21:28 +04:00
Veloman Yunkan b249edee60 ETags for ZIM content use the ZIM file UUID 2022-10-19 19:21:28 +04:00
Veloman Yunkan a31ccb6588 Decoupling ETags from the server id 2022-10-19 19:21:28 +04:00
Veloman Yunkan 190156e095 Setting Cache-Control: for three types of content
At this point the ETag value for ZIM content is still generated from the
timestamp of the server start-up time.
2022-10-19 19:21:28 +04:00
Veloman Yunkan 73191fb8f8 Made the /suggest endpoint concurrency-safe 2022-10-13 13:39:25 +04:00
Veloman Yunkan f13ca55ef6 Eliminated the endpointRoot parameter 2022-10-06 14:02:50 +04:00
Veloman Yunkan dc194683bb Split XML generation code for full & partial entries 2022-10-06 13:48:58 +04:00
Veloman Yunkan 0841472004 Separate templates for full & partial OPDS entries 2022-10-06 13:44:39 +04:00
Veloman Yunkan ebb713cb85 Got rid of an unjustified parameter
The XML header is injected in a more straightforward way in the single
location where it is needed.
2022-10-06 12:49:51 +04:00
Veloman Yunkan 582c8d868a New logic for generating HTTP-redirects
Before this fix the root URL for a book was assumed to resolve to the
main page.  This was not true for ZIM files containing an entry at an
empty path or with a path equal to "/", resulting in issue #826. The
logic behind this behaviour is found in `kiwix::getEntryFromPath()`.

The fix to that issue is a little more general and will result in an
HTTP redirect in any case where `kiwix::getEntryFromPath(zim, path)`
returns an entry with a real path different from the requested one. In
particular, this will affect the behaviour on ZIM files with the old
namespace scheme, where the requested resource - if not found - is also
looked up in the 'A', 'I', 'J', and/or '-' namespaces. Now instead of
returning the contents of that other resource an HTTP redirect response
will be sent.
2022-10-04 14:18:08 +04:00
Veloman Yunkan 60148717e1 Fixed search results for kiwix-desktop 2022-09-26 13:11:25 +04:00
Veloman Yunkan fa67b45f50 Got rid of unused *pendToFirstOccurence() funcs 2022-09-21 15:52:26 +04:00
Veloman Yunkan cac2d212c6 Respecting the --nosearchbar option of kiwix-serve
If `kiwix-serve` is run with the `--nosearchbar` option the toolbar is
disabled (hidden) in its viewer.

Note however that certain actions performed by the viewer merely with
the purpose of keeping the toolbar up-to-date are still carried out.
2022-09-21 15:41:40 +04:00
Veloman Yunkan da23e4eca4 Revert "Partly respecting the kiwix-serve --nosearchbar option"
This reverts commit 436d890893713c5eb98df6893d0e0b41b22e2472.
2022-09-21 15:41:40 +04:00
Veloman Yunkan 2be9ac342f Partly respecting the kiwix-serve --nosearchbar option
`--nosearchbar` option of `kiwix-serve` (despite its misleading name)
was used to disable the entire taskbar. This commit accounts for the
existence of that option only partially:

1. Links to books on the welcome/library page are affected - by default
   books are displayed in the viewer, but in a kiwix-serve instance run
   with --nosearchbar books are loaded in the top window.

2. The `/viewer` endpoint is enabled unconditionally, so if anyone
   enters the viewer URL in the address bar they will see books in the
   viewer.
2022-09-21 15:41:40 +04:00
Veloman Yunkan 369406fb5d Viewer settings
Made the viewer respect the `--blockexternal` and `--nolibrarybutton`
options of `kiwix-serve`. Those options are passed to the viewer
via the dynamically generated resource `/viewer_settings.js`.
2022-09-21 15:41:40 +04:00
Veloman Yunkan b81cb3a8e9 Got rid of raw mode in response generation 2022-09-21 15:41:40 +04:00
Veloman Yunkan 6cc677b8ad Dropped ContentResponse::contentDecorationAllowed() 2022-09-21 15:41:40 +04:00
Veloman Yunkan a674561110 Dropped root link injection
The only place where the root link is now used is /skin/index.js,
so it was added in static/templates/index.html. But it seems that nothing
prevents us from switching from absolute paths to relative paths
in /skin/index.js, which will eliminate the need for the root link
altogether.

As a result of this change content is never decorated by kiwix-serve.
2022-09-21 15:41:40 +04:00
Veloman Yunkan 685e7f8ad4 Unconditional blocking of external links 2022-09-21 15:41:40 +04:00
Veloman Yunkan 0ce36e6246 Got rid of isHomePage in ContentResponse::build() 2022-09-21 15:41:40 +04:00
Veloman Yunkan eb0a45b13e Undefaulted bool params of ContentResponse::build()
This resulted in compiler aided discovery of all call sites where the
default values were used. For OPDS/catalog requests now passing true for the
`raw` parameter, since XML content isn't supposed to undergo any
transformations.
2022-09-21 15:41:40 +04:00
Veloman Yunkan c988511561 Removed unused param from ContentResponse::build()
Removed the isHomePage param from one of the variants of
`ContentResponse::build()`. The other overload is dangerous since
failing to review and update all of its call sites may result in changed
semantics. Will do it in a couple of separate commits.
2022-09-21 15:41:40 +04:00
Veloman Yunkan c73e6f9a81 Dropped unused params from ContentResponse ctor 2022-09-21 15:41:40 +04:00
Veloman Yunkan 0cf4850a9b Dropped TaskbarInfo 2022-09-21 15:41:40 +04:00
Veloman Yunkan 40c496d401 Removed old-style taskbar injection
The double toolbar in the viewer is gone.

Some clean-up has to be performed after this change.
2022-09-21 15:41:40 +04:00
Veloman Yunkan 4db443eca6 Embryo of iframe-based viewer 2022-09-21 15:41:40 +04:00
Emmanuel Engelhart 1062bd73a3 It's libkiwix, not kiwixlib 2022-09-11 16:05:25 +02:00
Veloman Yunkan e323dcf6c9 Redirecting /nonendpoint URLs to /content/nonendpoint 2022-08-11 18:04:05 +04:00
Veloman Yunkan 3b98987cb3 More robust handling of endpoint URLs
The next goal is to redirect old-style /book/path/to/entry URLs to
/content/book/path/to/entry, which seemed pretty trivial.

However, given the current handling of some endpoint URLs, more work was
required to ensure that invalid endpoint URLs (e.g.  "/random/number" or
"/suggest/fr") are not interpreted as content URLs. Previously, that was
not a user-observable issue, since the result would be an immediate 404
error (except in certain edge cases, like handling the request for
"/random/number" when there is a book with name "random" containing an
article at path "/number"). With redirection of URLs that were assumed
to refer to content a 404 error would be issued for the
transformed URL ("/content/random/number") which may be confusing.

Therefore this change is to ensure the correct routing of endpoint URL
handling.
2022-08-11 18:04:05 +04:00
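The routing described in commits 3b98987cb3 and e323dcf6c9 can be sketched in Python. The set of endpoint names below is an illustrative subset chosen for this example, and the function `route` is hypothetical; the root location is ignored for simplicity.

```python
from typing import Tuple

# Illustrative subset of endpoint names (assumption, not the full libkiwix list).
ENDPOINTS = {"content", "random", "suggest", "search", "catalog", "skin"}


def route(path: str) -> Tuple[str, str]:
    """Return ("endpoint", name) or ("redirect", new_path) for a rootless URL path."""
    first_segment = path.lstrip("/").split("/", 1)[0]
    if first_segment == "":
        return ("endpoint", "welcome")  # the "/" URL is handled separately
    if first_segment in ENDPOINTS:
        # Invalid endpoint URLs such as "/random/number" stay with their
        # endpoint handler instead of being mistaken for content URLs.
        return ("endpoint", first_segment)
    # Anything else is assumed to be an old-style content URL.
    return ("redirect", "/content" + path)
```

Under this scheme "/random/number" produces a 404 from the `random` endpoint itself, rather than a confusing 404 for "/content/random/number".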
Veloman Yunkan fd36d11ccf Search results now use the /content URL scheme 2022-08-11 18:04:05 +04:00
Veloman Yunkan 1b1c1e352e Introduced /content endpoint
Book content is now served under /content/book/...

The old access to book content via a top-level URL /book/... is so far
preserved for backward compatibility.

Redirects were changed to use the new URL scheme. Links in the search results
still use the old scheme.
2022-08-11 18:04:05 +04:00
Veloman Yunkan a4b18893aa Moved handling of the "/" URL 2022-08-11 18:04:05 +04:00
Veloman Yunkan cff143b4ec Included tags in free text catalog search 2022-08-06 07:39:45 +02:00
Veloman Yunkan 111aab0c23 Illustration URL uses the book UUID
If the server is initialized with a library.xml file, then the id
specified in the XML file is used (rather than the UUID recorded in the
ZIM file).

Note that in test/data/library.xml the book ids are fake and
different from the real ZIM IDs; that file was created for testing
of the /catalog endpoint which doesn't access ZIM content, so the
same ZIM file zimfile.zim was added to library.xml three times as
three different books (with unique human-friendly ids). This explains
the diff in test/library_server.cpp.
2022-08-03 16:13:21 +02:00
Veloman Yunkan 28f8dbcf20 New unit-test stringTools.ICULanguageInfo 2022-07-07 16:13:49 +04:00
Matthieu Gautier 71e2df7406 Explicit std
The removed headers contained `using namespace std`,
so we have to be explicit everywhere.
2022-07-02 16:33:32 +02:00
Matthieu Gautier 69931fb347 Remove libzim's wrapper.
It is time to remove them. They have been deprecated since 10.0.0.
2022-07-02 16:33:32 +02:00
Veloman Yunkan e3e4bfa533 Support for serving customized resources
During work on the kiwix-serve front-end, the edit-save-test cycle is
a multistep procedure:

1. build and install libkiwix
2. build kiwix-tools
3. run kiwix-serve
4. reload the web-page in the browser

When making changes in static resources that are served by kiwix-serve
unmodified, the steps 1-3 can be eliminated if kiwix-serve is capable of
serving resources from the file-system. This commit adds such a
functionality to kiwix-serve. Now, if during startup of kiwix-serve the
environment variable `KIWIX_SERVE_CUSTOMIZED_RESOURCES` is defined it is
assumed to point to a file where every line has the following format:

URL MIMETYPE RESOURCE_FILE_PATH

When a request is received by kiwix-serve and its URL matches any of the
URLs read from the customized resource file, then the resource data is
read from the respective file RESOURCE_FILE_PATH and served with
mime-type MIMETYPE.

Though this feature was introduced in order to facilitate the
development of the iframe-based content viewer, it can also be useful to
users who would like to customize the kiwix-serve front-end on their own
(without re-building all of kiwix-serve).

There is some overlap with a feature of the kiwix-compile-resources
script that also allows overriding resources. The differences are:

1. The new way of customizing front-end resources has all such resources
   listed in a text file and there is a single environment variable
   from which the path of that file is read. kiwix-compile-resources
   associates a separate environment variable with each resource.

2. The new way uses regular paths to identify a resource. The
   kiwix-compile-resources method encodes the resource path by replacing
   any non-alphanumeric characters (including the path separator) with
   underscores (so that the resulting resource identifier can be used
   to construct the name of the environment variable controlling that
   resource).

3. The new method allows adding new front-end resources. The old method
   only allows modifying existing resources.

4. The new method allows (actually requires) specifying the URL at which
   the overridden resource should be served (similarly, the MIME-type can/must
   be specified, too). The old method only allows overriding the contents of
   a resource.

5. The new method only allows overriding front-end resources that are
   served without any preprocessing by kiwix-serve at runtime. The old
   method allows overriding template resources as well (note that
   internationalization/translation resources cannot be overridden using the
   old method, either).
2022-06-22 10:59:41 +02:00
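The file format described in commit e3e4bfa533 (one `URL MIMETYPE RESOURCE_FILE_PATH` entry per line) could be parsed along the following lines. This is a Python sketch, not the libkiwix parser: the function name is hypothetical, and skipping blank lines as well as allowing spaces inside the file path are assumptions.

```python
from typing import Dict, Tuple


def parse_customized_resources(text: str) -> Dict[str, Tuple[str, str]]:
    """Map each URL to its (mimetype, resource file path) pair."""
    resources = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        # Split into exactly three fields; the rest of the line is the path.
        # A line with fewer than three fields raises ValueError.
        url, mimetype, path = line.split(None, 2)
        resources[url] = (mimetype, path)
    return resources
```

A server using such a table would look up the request URL in the dictionary and, on a match, read the data from the listed file and serve it with the listed MIME-type.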
Matthieu Gautier b442e2371e Do not use deprecated constructor for Reader.
We have a specific private non-deprecated constructor especially for that,
let's use it.
2022-06-10 10:41:31 +02:00
Matthieu Gautier 70382d15e2 Windows compiler complains about the implicit cast from double to size_t. 2022-06-09 15:21:06 +02:00
Matthieu Gautier 01c384bb64 Remove the java wrapper.
- The meson's `wrapper` option is removed.
- New meson's option `static-linkage` is added to tell meson to link
  with static library.
2022-06-09 10:23:02 +02:00
Matthieu Gautier bfcf317f09 Properly set "language" parameter in `opensearch::Query` tag. 2022-06-03 15:46:41 +02:00
Matthieu Gautier 7cb98f7f4e Make opensearch start parameter 1 indexed. 2022-06-03 15:46:41 +02:00
Matthieu Gautier cadd2a5cbb Make the HTTPErrorHtmlResponse not Html only. 2022-06-03 15:46:41 +02:00
Matthieu Gautier e51a5b9ebc Introduce `get_requested_format` helper 2022-06-03 15:46:41 +02:00
Matthieu Gautier 5d6b0ea96a Add searchdescription.xml endpoint 2022-06-03 15:46:41 +02:00
Matthieu Gautier e5df5e936f Render the search result using (opensearch/atom) xml format. 2022-06-03 15:46:41 +02:00
Matthieu Gautier fbc7656b3f Use proper argument order when building the SearchRenderer from a Searcher 2022-06-02 17:08:50 +02:00
Matthieu Gautier d196496802 Make the Searcher own the stored Reader
If we keep a reference to a `Reader`, it is better to (share) own
the reference. Otherwise the reader may be deleted after we create the searcher.

This is especially the case now that we are creating the `Reader` on demand
and we don't store it in the library's cache.
2022-06-02 17:08:17 +02:00
Matthieu Gautier a7651d0e9b Check early that provided bookIds are valid 2022-06-02 12:37:52 +02:00
Matthieu Gautier 3bca43344f Correctly url encode querystring
Fix tests with a querystring needing URL encoding
(pattern=jazz&books.query.title=Ray%20Charles)
2022-06-02 12:37:52 +02:00
Matthieu Gautier b857293cfd Build the bookSelection query string when we parse the query.
We have to reuse the query the user gives us to generate the
pagination links.
At the search result rendering step we don't have access to the query object.
The best place to know which arguments are used to select books
(and so which arguments to keep in the pagination links) is when we
parse the query to select books.

Fix tests (pagination links) with book selector other than "books.id="
(pattern=jazz&books.query.lang=eng)
2022-06-02 12:37:52 +02:00
Matthieu Gautier b483a8e4e4 Make the request_context be able to generate a querystring for a subset.
The request_context can now take a filter to select arguments to
keep in the query string.
2022-06-02 12:37:52 +02:00
Matthieu Gautier e2ab7fd62e Add some more testing.
Note that some tests are failing and will be fixed in next commits.
2022-06-02 12:37:52 +02:00
Matthieu Gautier 1514661c26 Protect search from multi threading race condition.
libzim's search is not thread safe (mainly because xapian is not).
So we must protect our search objects from multi thread calls.

The best way to do this is to associate a mutex with the `zim::Searcher`
and lock the searcher each time we access an object derived from the
searcher (search, results, iterator, ...)
2022-06-02 12:37:52 +02:00
Matthieu Gautier e5ea210d2c Add a template specialization for ConcurrentCache storing shared_ptr
When ConcurrentCache stores a shared_ptr, we may have a shared_ptr still in use
after the ConcurrentCache has dropped it.
When we "recreate" a value to put in the cache, we don't want to actually recreate
it, but to copy the shared_ptr that is in use.

To do so we use an (unlimited) store of weak_ptr (aka `WeakStore`).
Every created shared_ptr added to the cache has a weak_ptr ref also stored
in the WeakStore, and we check the WeakStore before creating the value.
2022-06-02 12:37:52 +02:00
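The `WeakStore` idea from commit e5ea210d2c can be sketched in Python, with `weakref.WeakValueDictionary` playing the role of the store of weak_ptr. The class names are hypothetical, and unlike `ConcurrentCache` this sketch does no locking, so it is not thread safe.

```python
import weakref
from collections import OrderedDict


class Value:
    """Stand-in for an expensive object (it must support weak references)."""

    def __init__(self, key):
        self.key = key


class CacheWithWeakStore:
    def __init__(self, capacity, factory):
        self._capacity = capacity
        self._factory = factory
        self._lru = OrderedDict()                    # strong refs, bounded
        self._weak = weakref.WeakValueDictionary()   # weak refs, unbounded

    def get(self, key):
        if key in self._lru:
            self._lru.move_to_end(key)
            return self._lru[key]
        # Dropped from the cache, but maybe still in use somewhere else.
        value = self._weak.get(key)
        if value is None:
            value = self._factory(key)
            self._weak[key] = value
        self._lru[key] = value
        if len(self._lru) > self._capacity:
            self._lru.popitem(last=False)  # evict the least recently used entry
        return value
```

If a caller still holds a value that has been evicted from the bounded cache, the next `get()` for the same key returns that very object instead of building a second one.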
Matthieu Gautier 2b38d2cf1b Copy the lrucache test from libzim.
- Adapt lrucache.cpp for the right include path
  and use `kiwix::lru_cache` instead of `zim::lru_cache`.
- Add missing `#include <set>` in lrucache.h
2022-06-02 12:37:52 +02:00
Matthieu Gautier 0081b4d8e7 Make the limit of zim files per search configurable.
The default value is 0, which means no limit.
2022-06-02 12:37:52 +02:00
Matthieu Gautier b74910b2af Limit the number of zim in multizim fulltext search.
We are currently limiting to 5 but it will be changed in next commit.
2022-06-02 12:37:50 +02:00
Matthieu Gautier cf30233358 Prefix env variable name with `KIWIX_` 2022-06-02 12:23:43 +02:00
Matthieu Gautier f0065fdd6f Introduce Error exception to do i18n 2022-06-02 12:23:42 +02:00
Matthieu Gautier c72132054d Move i18n helper functions 2022-06-02 12:22:28 +02:00
Matthieu Gautier 077ceac5a5 Make the search_rendered handle multizim search.
This introduces an intermediate mustache object to store information
about the request made by the user.
2022-06-02 12:22:28 +02:00
Matthieu Gautier 39d0a56be8 Use selectBooks in handle_search 2022-06-02 12:22:28 +02:00
Matthieu Gautier 76d5fafb72 Introduce `selectBooks`
`selectBooks` allows us to parse a query in a "standard" way to get
the book(s) on which the user wants to work.
2022-06-02 12:22:28 +02:00
Matthieu Gautier 4438106c2f Add a prefix in get_search_filter
The prefix will be used to parse a "query to select books" in different contexts.
For now we have only one context: selecting books for the catalog search.
But we will want to select books to do fulltext search on them
(will be done in a later commit).
2022-06-02 12:22:28 +02:00
Matthieu Gautier 76ebfd7ea4 Move get_search_filter and subrange. 2022-06-02 12:22:27 +02:00
Matthieu Gautier 22996e4a6b Allow user to select multiple books when doing search. 2022-06-02 12:22:27 +02:00
Matthieu Gautier 98c54b2279 Handle multiple arguments in RequestContext. 2022-06-02 12:22:27 +02:00
Matthieu Gautier 854623618c Use the newly introduced searcherCache for multizim searcher. 2022-06-02 12:22:25 +02:00
Matthieu Gautier fd0edbba80 Use a set of ids as key for the searcher cache.
It will allow us to cache searchers for multiple zim files.
2022-05-24 14:55:48 +02:00
Matthieu Gautier f5af0633ec Move the searcher cache into the Library 2022-05-24 14:55:48 +02:00