Commit Graph

1147 Commits

Author SHA1 Message Date
Matthieu Gautier e108fb0e47 Add `/catalog/v2/illustration` endpoint 2022-01-04 14:16:46 +01:00
Matthieu Gautier 9482bfb95b Add a method to get a book illustration for a specific size. 2022-01-04 14:16:46 +01:00
Matthieu Gautier 66c40817ee Fix the OPDS stream to handle custom ROOT prefix
As we render the entry's xml in separate steps, we need to pass the
rootLocation to all the internal rendering.

Testing with and without root is not so easy.
I've simply made all server tests using a ROOT prefix.
We can assume that if the ROOT is present everywhere we need it, it will not
appear where we don't need it. (As long as we don't hardcode "ROOT" in the server.)
2022-01-04 11:15:18 +01:00
Matthieu Gautier 22e5327dcf Do not create a dummy illustration if library.xml doesn't contain one.
Fix #644
2022-01-04 11:12:32 +01:00
Nikhil Tanwar 8bdcb90818 Make aria2 secret a random value
Apps using this service will not have a default aria secret (previously 'kiwixariarpc')
2022-01-03 09:35:04 +01:00
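
A minimal sketch of generating a random secret instead of shipping a fixed default. The helper name `makeRandomSecret`, the alphabet and the length are assumptions made for illustration; this is not the code from the commit, and `std::mt19937` is not a cryptographically secure generator.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper (not the libkiwix implementation): builds a random
// alphanumeric token of the requested length.
std::string makeRandomSecret(std::size_t length)
{
  static const std::string alphabet =
      "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
  std::random_device rd;   // seed source; quality depends on the platform
  std::mt19937 gen(rd());  // general-purpose PRNG, not a cryptographic one
  std::uniform_int_distribution<std::size_t> pick(0, alphabet.size() - 1);

  std::string secret;
  secret.reserve(length);
  for (std::size_t i = 0; i < length; ++i) {
    secret += alphabet[pick(gen)];
  }
  assert(secret.size() == length);
  return secret;
}
```

Calling `makeRandomSecret(16)` once at start-up and passing the result to the RPC client would give each run its own secret.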
Emmanuel Engelhart f36d8e9851 New kiwix::getVersions() and printVersions() 2022-01-02 12:22:11 +01:00
Matthieu Gautier f1035fa472 Fix win32 compilation.
WSASocket returns `INVALID_SOCKET` if something goes wrong,
not SOCKET_ERROR.
2021-12-23 18:32:43 +01:00
Nikhil Tanwar 9554ab5db0 Make getNetworkInterfaces() and getBestPublicIp() available via tools.h
Remove HTTP URL helper line - should be done in kiwix-serve
Add getters at server level - getAddress and getPort
2021-12-22 22:38:16 +05:30
Nikhil Tanwar 4b563e567e Provide HTTP URL for the server
Added a line to display the IP (use best if nothing is provided) along with port.
2021-12-22 22:08:25 +05:30
Veloman Yunkan ed2f914e10 Minor cleanup
The code for obtaining the archive now looks the same for the /meta,
/suggest, /search and /random endpoints.
2021-12-22 17:12:34 +01:00
Veloman Yunkan 872ddd9cb3 Cleaned up InternalServer::handle_suggest()
As a result of this clean-up the /suggest endpoint too stopped
generating confusing 404 Not Found errors (which, like in /meta's case
is not that important). Another functional change is that the "term"
parameter became optional.
2021-12-22 17:12:34 +01:00
Veloman Yunkan 20b5a2b971 Less confusing 404 errors from /meta endpoint
Before this fix the /meta endpoint could return a 404 Not Found page
saying

  The requested URL "/meta" was not found on this server.

Error cases producing such a result were:

- `/meta?content=NON-EXISTING-BOOK&name=metaname`

- `/meta?content=book&name=BAD-META-NAME`

Now a proper message is shown for each of those cases.

This fix is being done just for consistency (the /meta endpoint is not
a user-facing one and the scripts don't bother about error texts).
2021-12-22 17:12:34 +01:00
Veloman Yunkan d8c525289b Changed the signature of Response::build_404()
Now Response::build_404() takes the URL instead of the entire
RequestContext object. An empty url suppresses the

 The requested URL "url" was not found on this server.

part of the error text.
2021-12-22 17:12:34 +01:00
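
The behaviour described above (an empty URL suppresses the "requested URL" sentence) could look roughly like this. `build404Text` and its `details` parameter are hypothetical stand-ins that only assemble the error text; the real `Response::build_404()` builds a full response object.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the text-building part of a 404 response.
// An empty url drops the "The requested URL ..." sentence entirely.
std::string build404Text(const std::string& url, const std::string& details)
{
  std::string text = "Not Found\n";
  if (!url.empty()) {
    text += "The requested URL \"" + url + "\" was not found on this server.\n";
  }
  text += details;
  return text;
}
```

An endpoint such as /random can then pass an empty URL together with a specific message, while generic misses keep the familiar wording.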
Veloman Yunkan f7b853373c Less confusing 404 errors from /random endpoint
Before this fix the /random endpoint could return a 404 Not Found page
saying

  The requested URL "/random" was not found on this server.

Error cases producing such a result were:

- `/random?content=NON-EXISTING-BOOK` (can happen when a server is
restarted or the library is reloaded and the current book is no longer
available).

- Failure of the libkiwix routine for picking a random article.

Now a proper message is shown for each of those cases.
2021-12-22 17:12:34 +01:00
Veloman Yunkan 250f46c7f9 fixup! Searcher::add_reader() rejects duplicate readers 2021-12-16 16:51:03 +01:00
Veloman Yunkan 0be00b791f Searcher::add_reader() rejects duplicate readers
An O(N) linear search was added to `Searcher::add_reader()` deliberately.
This doesn't seem to be an operation that may lead to performance
problems.
2021-12-16 16:51:03 +01:00
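
A sketch of duplicate rejection through a linear scan. The types `MiniReader` and `MiniSearcher`, and the choice to identify a reader by a string id, are assumptions for illustration and may differ from what `Searcher::add_reader()` actually compares.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical reader, identified here by a string id (an assumption).
struct MiniReader {
  std::string id;
};

class MiniSearcher {
public:
  // Returns false and leaves the list untouched if a reader with the same
  // id was already added. The scan is O(N) in the number of readers.
  bool add_reader(const MiniReader& reader)
  {
    const auto sameId = [&reader](const MiniReader& r) { return r.id == reader.id; };
    if (std::any_of(m_readers.begin(), m_readers.end(), sameId)) {
      return false;
    }
    m_readers.push_back(reader);
    return true;
  }

  std::size_t reader_count() const { return m_readers.size(); }

private:
  std::vector<MiniReader> m_readers;
};
```

With only a handful of readers per searcher, the scan is cheap, which matches the reasoning in the commit message.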
Emmanuel Engelhart 9f3459f3f3 Better libkiwix version variable name 2021-12-13 18:22:40 +01:00
Veloman Yunkan e1db9164c8 Fixed deadlock in Library::writeBookmarksToFile() 2021-12-05 20:31:21 +04:00
Veloman Yunkan 7161db8e2a Manager::reload() also removes books from Library 2021-11-30 18:20:27 +04:00
Veloman Yunkan 262e13845c Enter Library::removeBooksNotUpdatedSince() 2021-11-30 18:20:27 +04:00
Veloman Yunkan 1d5383435d Noted a potential bug in Library::addBook() 2021-11-30 18:20:27 +04:00
Veloman Yunkan ad2eb52553 Thread safe dumping of the OPDS feed 2021-11-30 18:20:27 +04:00
Veloman Yunkan 473d2d2a69 Introduced Library::getBookByIdThreadSafe() 2021-11-30 18:20:27 +04:00
Veloman Yunkan 02b9e32d18 Library became almost thread-safe
Library became thread-safe with the exception of `getBookById()`
and `getBookByPath()` methods - thread safety in those accessors is
rendered meaningless by their return type (they return a reference
to a book which can be removed any time later by another thread).
2021-11-30 18:20:27 +04:00
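
Why a reference-returning accessor cannot be made thread-safe, and how a copy-returning one avoids the problem, can be sketched as follows. `BookSketch` and `SharedShelf` are hypothetical simplifications, not the libkiwix classes.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

struct BookSketch {
  std::string id;
  std::string title;
};

class SharedShelf {
public:
  void addBook(const BookSketch& book)
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_books[book.id] = book;
  }

  bool removeBook(const std::string& id)
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_books.erase(id) == 1;
  }

  // The lock is released when this function returns, so the reference may
  // dangle if another thread removes the book afterwards.
  const BookSketch& getBookById(const std::string& id) const
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_books.at(id);  // throws std::out_of_range for unknown ids
  }

  // The copy is made while the lock is held, so the caller owns data that
  // later removals cannot invalidate.
  BookSketch getBookByIdThreadSafe(const std::string& id) const
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_books.at(id);
  }

private:
  mutable std::mutex m_mutex;
  std::map<std::string, BookSketch> m_books;
};
```

The cost of the safe variant is one copy of the book per call.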
Veloman Yunkan c2927ce6f7 Library got a yet unused mutex
Introducing a mutex in `Library` necessitates manually implementing the
move constructor and assignment operator. It's better to still delegate
that work to the compiler to eliminate any possibility of bugs when new
data members are added to `Library`. The trick is to move the data into
an auxiliary class `LibraryBase` and derive `Library` from it.
2021-11-30 18:20:27 +04:00
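
The trick described above can be sketched like this: the movable data lives in a base class whose move operations are compiler-generated, and the derived class, which owns the non-movable mutex, simply forwards to them. The names `LibraryBaseSketch` and `LibrarySketch` and the single `books` member are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// All data members live here; the compiler generates the move operations,
// so new members are picked up automatically.
struct LibraryBaseSketch {
  std::map<std::string, std::string> books;  // id -> title
};

class LibrarySketch : private LibraryBaseSketch {
public:
  LibrarySketch() = default;

  // std::mutex is not movable, so these must be written by hand, but they
  // only delegate to the base class and never list members one by one.
  LibrarySketch(LibrarySketch&& other)
    : LibraryBaseSketch(std::move(other))
  {}

  LibrarySketch& operator=(LibrarySketch&& other)
  {
    LibraryBaseSketch::operator=(std::move(other));
    return *this;
  }

  void addBook(const std::string& id, const std::string& title)
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    books[id] = title;
  }

  std::size_t bookCount() const
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    return books.size();
  }

private:
  mutable std::mutex m_mutex;  // each object keeps its own mutex across moves
};
```

Note that this sketch does not lock either mutex during a move; it assumes no other thread uses the two objects at that moment.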
Veloman Yunkan b712c732f2 Dropped Library::getBookBy*() non-const functions 2021-11-30 18:20:27 +04:00
Veloman Yunkan 298247ca9b Renamed NameMapperProxy -> UpdatableNameMapper 2021-11-30 18:20:27 +04:00
Veloman Yunkan 3aeeeeee76 Manager::reload() 2021-11-30 18:20:27 +04:00
Veloman Yunkan 226dac2604 LibraryManipulator is now merely a notifier
Originally `LibraryManipulator` was an abstract class completely decoupled
from `Library`. Its `addBookToLibrary()` and `addBookmarkToLibrary()`
methods could be defined in an arbitrary way. Now `LibraryManipulator` has to be
bound to a library object, those methods are no longer virtual, they always
update the library and allow for some additional actions via virtual
functions `bookWasAddedToLibrary()` and `bookmarkWasAddedToLibrary()`.
2021-11-30 18:20:27 +04:00
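
The shape described above, a non-virtual method that always updates the library and then calls a virtual notification hook, might look like the sketch below. The library is reduced to a vector of book ids, and `ManipulatorSketch` / `CountingManipulator` are hypothetical names; only `addBookToLibrary()` and `bookWasAddedToLibrary()` come from the commit message.

```cpp
#include <cassert>
#include <string>
#include <vector>

class ManipulatorSketch {
public:
  // The manipulator is bound to a library object (here: a vector of ids).
  explicit ManipulatorSketch(std::vector<std::string>* library)
    : m_library(library)
  {
    assert(library != nullptr);
  }

  virtual ~ManipulatorSketch() = default;

  // Non-virtual: the library is always updated, then the hook is invoked.
  bool addBookToLibrary(const std::string& bookId)
  {
    m_library->push_back(bookId);
    bookWasAddedToLibrary(bookId);
    return true;
  }

protected:
  // Subclasses override this to react to the addition; default is a no-op.
  virtual void bookWasAddedToLibrary(const std::string& /*bookId*/) {}

private:
  std::vector<std::string>* m_library;
};

// Example subclass that merely counts notifications.
class CountingManipulator : public ManipulatorSketch {
public:
  using ManipulatorSketch::ManipulatorSketch;
  int notifications = 0;

protected:
  void bookWasAddedToLibrary(const std::string& /*bookId*/) override
  {
    ++notifications;
  }
};
```

A GUI client could use such a hook to refresh its view, without being able to skip the library update itself.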
Veloman Yunkan 76a5e3a877 Library::addBook() updates the reader cache 2021-11-30 18:20:27 +04:00
Veloman Yunkan 6199c11505 NameMapperProxy respects the withAlias flag 2021-11-30 18:18:16 +04:00
Veloman Yunkan 8fffa59974 Added NameMapperProxy from kiwix/kiwix-desktop#714
The right place for NameMapperProxy introduced by kiwix/kiwix-desktop#714 is in
libkiwix (so that it can be reused in kiwix-serve).
2021-11-30 18:18:16 +04:00
Veloman Yunkan 5f3c34ed93 NameMapper's API is now const 2021-11-22 21:06:27 +04:00
Veloman Yunkan 339f845fb0 Bugfix in Book::getHumanReadableIdFromPath() 2021-11-22 20:54:44 +04:00
Veloman Yunkan 571e417d1e Manager is now safe to copy 2021-11-20 20:38:39 +04:00
Veloman Yunkan 0e48baf9f9 Simplified Library::getReaderById()
Reused `Library::getArchiveById()` in `Library::getReaderById()`.
2021-11-19 20:17:12 +04:00
Veloman Yunkan 4a01081e83 Thread-safe Book::Illustration::getData() 2021-11-19 16:44:25 +04:00
Veloman Yunkan eb6a0d6456 Enter Book::getIllustrations() 2021-11-18 14:39:00 +04:00
Veloman Yunkan e2544799a1 Shorter Book::update() 2021-11-18 14:39:00 +04:00
Veloman Yunkan 9f42884507 Book's illustrations are now immutable 2021-11-18 14:39:00 +04:00
Veloman Yunkan 8a6adddc16 Non-throwing Book::getDefaultIllustration() 2021-11-18 14:39:00 +04:00
Veloman Yunkan c8da5eea2b Dropped Book::getMutableDefaultIllustration()
Now a Book is created without a default illustration.
2021-11-18 14:38:00 +04:00
Veloman Yunkan bd29c4c7ef Book::updateFromOpds() resets Book::m_illustrations 2021-11-18 14:37:12 +04:00
Veloman Yunkan e52a4a646b Book::updateFromXml() resets Book::m_illustrations 2021-11-18 14:36:42 +04:00
Veloman Yunkan 537ba7e6b9 Book::update() reads illustrations from ZIM file 2021-11-18 14:35:49 +04:00
Veloman Yunkan f4bc3c8ced Book::Illustration got dimensions 2021-11-18 14:34:51 +04:00
Veloman Yunkan 5263f6880c Internally Book supports multiple illustrations 2021-11-18 14:34:51 +04:00
Veloman Yunkan c129952605 Added a couple of notes on data consistency 2021-11-18 14:34:48 +04:00
Veloman Yunkan 9f0db6b7fa Book::Illustration::getData() 2021-11-18 14:33:50 +04:00
Veloman Yunkan 7d8a83cc97 Encapsulated access to Book::m_illustration 2021-11-18 14:32:52 +04:00
Veloman Yunkan ec5a423924 Enter Book::Illustration
`Book::m_favicon` and its 2 friends are replaced with a single
`Book::m_illustration` data member.
2021-11-18 13:31:08 +04:00
Veloman Yunkan 811b73a4f1 Moved 2 small method definitions to cpp 2021-11-18 13:27:27 +04:00
Manan Jethwani 30e4c549e4 exposed fileExist, getMimeTypeForFile and getFileContent functions 2021-10-12 19:44:38 +05:30
Manan Jethwani b7b385d87b added custom index template 2021-10-12 19:44:05 +05:30
Matthieu Gautier cd9fb541fc Fix method call for new libzim API.
`add_archive` is now `addArchive`.
2021-09-29 11:55:22 +02:00
Veloman Yunkan c0bda426b4 Removed duplication across two mustache templates
Deduplicated the mustache templates static/templates/catalog_v2_entries.xml
and static/templates/catalog_v2_complete_entry.xml (the latter was
renamed to static/templates/catalog_v2_entry.xml).
2021-09-09 12:19:22 +04:00
Veloman Yunkan b3f7556096 Added partial entries feed to the OPDS root feed 2021-09-09 12:19:22 +04:00
Veloman Yunkan 4c657c082e /catalog/v2/partial_entries OPDS API endpoint 2021-09-09 12:19:22 +04:00
Veloman Yunkan e15a0f4338 /catalog/v2/entry/<entry_id> OPDS API endpoint 2021-09-09 12:19:22 +04:00
Veloman Yunkan 12d9b69806 OPDSDumper::dumpOPDSCompleteEntry() 2021-09-09 12:19:22 +04:00
Veloman Yunkan 027854e4f4 Extracted getSingleBookData() in opds_dumper.cpp 2021-09-09 12:19:22 +04:00
Maneesh P M 61209ea0d7 Allow kiwix-serve to get suggestions of custom range
This will allow handle_suggest API to accept two arguments `start` and
`suggestionLength` that will allow handle_suggest to retrieve
suggestions in the given range rather than the default 0-10 range.
2021-08-19 21:05:39 +05:30
Maneesh P M 8a4080baba Update libkiwix with new libzim api 2021-08-14 22:26:39 +05:30
Veloman Yunkan 452283cfe6 Handling of /meta?name=Illustration_WxH@1 requests 2021-08-05 22:28:09 +04:00
Veloman Yunkan e5168d8b3d Support for multiple illustrations in OPDS entry 2021-08-05 22:21:13 +04:00
Maneesh P M 9addd82d2d Fix usage of zim::Searcher::getResults() in libkiwix
The correct usage does not require the user to calculate an `end` using
the `pageLength`. We can directly use getResults(start, pageLength)
2021-08-04 19:20:50 +05:30
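
The difference between passing an `end` and passing a page length can be illustrated with a stand-in over a plain vector. `getResultsPage` is a hypothetical function written for this note; it is not the `zim::Searcher` API, and the clamping at the end of the data is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in: returns at most pageLength items beginning at
// index start. The caller never computes an end index.
std::vector<std::string> getResultsPage(const std::vector<std::string>& all,
                                        std::size_t start,
                                        std::size_t pageLength)
{
  if (start >= all.size()) {
    return {};  // out-of-bounds start yields an empty page
  }
  const std::size_t count = std::min(pageLength, all.size() - start);
  return std::vector<std::string>(all.begin() + start,
                                  all.begin() + start + count);
}
```

With five items in total, `getResultsPage(items, 3, 10)` yields the last two items rather than failing.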
Maneesh P M 19afe9442f Remove OriginId functions since they are not useful right now 2021-08-03 11:42:58 +02:00
Maneesh P M a3ba7619df Update Manager to use Archive instead of Reader
kiwix::Manager uses Reader to import a zim file, it should be using
zim::Archive directly.
2021-08-03 11:42:58 +02:00
Maneesh P M 8b12434ff2 Update kiwix::book to use libzim structure
Some methods in kiwix::Book use the wrapper structure Reader. They should
also accept the native libzim structure zim::Archive
2021-08-03 11:42:58 +02:00
Veloman Yunkan ab3095745e Languages OPDS feed includes book counts 2021-08-03 11:32:38 +02:00
Veloman Yunkan 45adda44b3 Got rid of <content> node in languages OPDS entry 2021-08-03 11:32:38 +02:00
Veloman Yunkan 96cf7e78a5 OPDSDumper::categoriesOPDSFeed() with no args 2021-08-03 11:32:38 +02:00
Veloman Yunkan dd118df612 Got rid of langMap in opds_dumper.cpp
Language code to human friendly name translation is now done with the
help of the ICU library. It works if the line

```
-include $(LANGSRCDIR)/resfiles.mk
```

in the file `source/data/Makefile.in` of the icu4c dependency is not
commented out. Currently, the said line is commented out (along with
some other include's) by the `icu4c_custom_data.patch` patch of the
`kiwix-build` tool.
2021-08-03 11:32:38 +02:00
Veloman Yunkan 5f90f5ee2a Preliminary version of /catalog/v2/languages 2021-08-03 11:32:38 +02:00
Veloman Yunkan 18871b4b15 Helper function Library::getBookPropValueSet()
Introduced a helper function `Library::getBookPropValueSet()` and
deduplicated Library::getBooks{Languages,Creators,Publishers}() methods.
2021-08-03 11:32:38 +02:00
Veloman Yunkan b2027b397c List of languages entry in /catalog/v2/root.xml
Added a new entry in /catalog/v2/root.xml that points to a
not-yet-existing list of languages navigation feed.
2021-08-03 11:32:38 +02:00
Matthieu Gautier 0b6b6716de Rename split argument from `trimEmpty` to `dropEmpty`. 2021-07-07 14:43:13 +02:00
Matthieu Gautier b70c92cade Move back used helper functions to the public API.
- Add docstring
- Move the declaration in kiwix namespace.
- Adapt our include to include the right headers.
2021-07-07 14:43:13 +02:00
Matthieu Gautier fa83a61a54 Move all public *Tools.h in src.
This by definition removes all the tool functions from the public API.
2021-07-07 14:43:13 +02:00
Maneesh P M a94a03cd22 Remove unwanted reader functions
Removing the functions in InternalServer that are no longer needed.
2021-07-03 14:07:14 +05:30
Maneesh P M bc821638da Drop wrapper structures from handle_search
Since we now have SearchRenderer that can work with native libzim
structure, we will drop the wrapper and use them instead.
2021-07-03 14:07:12 +05:30
Maneesh P M bcece66960 Add SearchRenderer handles for libzim structures
Introduces a new member mp_search that houses the zim::Search object,
adds a new constructor for this purpose. This commit also adds an
overload for getHtml that takes start and end integers as arguments
since they are not part of the search object we include.
2021-07-03 14:05:50 +05:30
Maneesh P M c046f64d83 Drop Reader and Entry wrappers from handle_content 2021-07-03 14:05:50 +05:30
Maneesh P M 75b4d311d7 Drop Reader from InternalServer::handle_random 2021-07-03 14:04:04 +05:30
Maneesh P M a236751c74 Drop usage of Reader from InternalServer::handle_suggest 2021-07-03 14:04:04 +05:30
Maneesh P M 7d68926539 Drop usage of Reader from InternalServer::handle_meta
This is essentially a code move of meta handlers from using Reader
functions to directly using Archive.
2021-07-03 14:04:02 +05:30
Maneesh P M 940368b8ac Add m_archives and getArchiveById to Library
These members will mirror the functionality offered by equivalent usage
of Reader class.
2021-07-03 14:02:31 +05:30
Veloman Yunkan b5c1b26761 OpdsCatalog::getSearchUrl() 2021-06-30 18:27:00 +02:00
Maneesh P M f3c96b23fd Use getIllustrationItem instead of getFaviconEntry method
With openzim/libzim#540 we now have a new function to get
illustration (previously favicon in 48x48 size and unity scale) in
multiple sizes. We need to replace getFaviconEntry with this new
getIllustrationItem method.
2021-06-19 10:23:24 +05:30
Vertigo 8d39b2c4c1 Added content ZIM home button on 404 2021-06-17 12:51:27 +05:30
Veloman Yunkan 78083f1f4a Moved OPDS templates under static/templates 2021-06-08 20:37:00 +04:00
Veloman Yunkan dd60235010 Fixed the self link in the output of /catalog/v2/entries 2021-06-08 20:37:00 +04:00
Veloman Yunkan e799f2ff1e OPDSDumper::dumpOPDSFeed() works via mustache
This changes the output of `/catalog/search` as follows:

- Entire search query (rather than only the value of the `q` parameter)
  is put in the <title> node.

- Search performed with an empty query presents itself as "All zims".

- The feed id remains stable for identical searches on the same
  library.
2021-06-08 20:37:00 +04:00
Veloman Yunkan 312f2cb560 Moved handle_catalog_v2*() methods into a new file 2021-06-08 20:37:00 +04:00
Veloman Yunkan fa42cbc48f Pagination info in /catalog/v2/entries 2021-06-08 20:37:00 +04:00
Veloman Yunkan f1797993af Reused InternalServer::search_catalog() 2021-06-08 20:37:00 +04:00
Veloman Yunkan f886c8c07b Root url is normalized once in the constructor 2021-06-08 20:37:00 +04:00
Veloman Yunkan 9ca6bd006f /catalog/v2/categories goes through OPDSDumper too 2021-06-08 20:37:00 +04:00
Veloman Yunkan cdacc0caf1 /catalog/v2/entries going through OPDSDumper
OPDSDumper sensed threats to its job security, so it lobbied to be
involved in handling the /catalog/v2 endpoints, too.
2021-06-08 20:37:00 +04:00
Veloman Yunkan dfad1c3815 /catalog/v2/searchdescription.xml 2021-06-08 20:37:00 +04:00
Veloman Yunkan 07252a127a /catalog/v2/entries is also a search endpoint 2021-06-08 20:37:00 +04:00
Veloman Yunkan b60e3ffb26 RequestContext::get_optional_param() 2021-06-08 20:37:00 +04:00
Veloman Yunkan 70d42aec98 A small simplification 2021-06-08 20:37:00 +04:00
Veloman Yunkan 4aa3c792aa Extracted get_search_filter() 2021-06-08 20:37:00 +04:00
Veloman Yunkan 208dece7e3 Reordered several statements
Reordered several statements so that the next couple of commits are a
little simpler.
2021-06-08 20:37:00 +04:00
Veloman Yunkan 19b59fd72f Serving /catalog/v2/entries
/catalog/v2/entries is intended to play the combined role of
/catalog/root.xml and /catalog/search of the old OPDS API. Currently,
the latter role is not yet implemented.

Implementation note: instead of tweaking and reusing
`OPDSDumper::dumpOPDSFeed()`, the generation of the OPDS feed is done via `mustache`
and a new template `static/catalog_v2_entries.xml`.
2021-06-08 20:37:00 +04:00
Veloman Yunkan 92c2de8d46 Enter InternalServer::m_library_id
The new field is intended to serve as a seed for generating semi-stable
OPDS feed ids that only need to change when the library is updated.
2021-06-08 20:37:00 +04:00
Veloman Yunkan 2e53b51696 Serving /catalog/v2/categories 2021-06-08 20:37:00 +04:00
Veloman Yunkan b259afa408 Library::getBooksCategories()
Note: no unit test added
2021-06-08 20:37:00 +04:00
Veloman Yunkan 3c3cf08a1a Serving /catalog/v2/root.xml
Note: This commit somewhat relaxes validation of non variable
`<updated>` elements in the OPDS feed - the contents of any `<updated>`
element is replaced with the YYYY-MM-DDThh:mm:ssZ placeholder.
2021-06-08 16:03:43 +04:00
Veloman Yunkan 54b78eaf56 Moved gen_date_str() to tools/otherTools.cpp 2021-06-08 16:03:43 +04:00
Veloman Yunkan 1e0ff1fbb0 Fixed the double colon in OPDS date string 2021-06-08 16:03:43 +04:00
Veloman Yunkan 5b272ac49c Fixed handling of /catalogBLABLA/root.xml & alike
Also removed an unneeded namespace qualifier.
2021-06-08 16:03:43 +04:00
Manan Jethwani bb92f26b60 added filter functionality 2021-06-07 15:37:20 +02:00
Manan Jethwani 063bb8cd65 added dynamic and subset loading of zim-files in kiwix-serve 2021-06-01 19:33:42 +05:30
Maneesh P M e2f6d91d51 Remove get_readerIndex in favor of get_zimId
The function get_readerIndex was used to get the zimId using an ordered
vector of readers. Now we can use get_zimId directly.
2021-05-26 14:45:25 +02:00
Maneesh P M c35f6f9142 Add `get_zimId` method to Result
get_zimId method allows the user to get the uuid of the archive from
which a result is retrieved directly from the search result itself.
2021-05-26 14:45:25 +02:00
Maneesh P M 5567d8ca49 Replace std::vector<std::string> with SuggestionItem
Each suggestion used to be stored as a vector of strings to hold various values
such as title, path etc inside them. With this commit, we use the new
dedicated class `SuggestionItem` to do the same.
2021-05-26 10:53:39 +02:00
Maneesh P M 56434de79e Set label to title snippet if present
With openzim/libzim#545 we now support snippet generation of titles
which can be used as the display label on the ui for highlighted titles
via the "label" field.
The old version used plain title which is still available in the value
field.
2021-05-26 10:52:58 +02:00
Maneesh P M e5fac30cee Update libkiwix with search iterator rename in libzim
Search iterator API in libzim has been shifted to use camel case naming.
This has to be accommodated in libkiwix as well.
2021-05-26 08:39:13 +02:00
Matthieu Gautier 2736a46cfe Revert "Kiwix Serve welcome page dynamic and subset loading (OPDS based)" 2021-05-25 17:30:05 +02:00
Manan Jethwani 012973d14a added dynamic and subset loading of zim-files in kiwix-serve 2021-05-25 02:41:12 +05:30
Emmanuel Engelhart d4e35c7067 Rename kiwix-lib to libkiwix 2021-05-23 21:46:52 +02:00
Veloman Yunkan cd02b4de3b Dummy application of new libzim search API
Didn't take any advantage of the new libzim search API. Just fixed the
libkiwix build in the most straightforward way.
2021-05-15 23:34:51 +04:00
Emmanuel Engelhart 05cc3d015f Insert root link only if html content 2021-05-14 14:49:28 +02:00
Veloman Yunkan 68189de162 /catalog/search handles out-of-bounds pagination 2021-05-10 11:25:06 +02:00
Veloman Yunkan 41276341d0 Empty query acts as a match-all query
After switching to Xapian-based search in the library/catalog, an empty
query stopped acting as a match-all query. This commit restores the old
behaviour in that regard.
2021-05-09 15:14:43 +02:00
Maneesh P M be6b58c6ad Revert "added 204 code for empty return of search"
Returning status code 204 in case of empty results doesn't show the
empty results page as described in #466. Reverting the changes in #396
fixes the issue.
2021-05-09 10:47:18 +05:30
Emmanuel Engelhart 950e742116 No metalink file on fs 2021-05-04 13:15:43 +02:00
Veloman Yunkan 3879b82112 const-correct kiwix::Library
- Made most methods of kiwix::Library const.
- Also added const versions of getBookById() and getBookByPath()
  methods.
2021-04-28 11:42:55 +04:00
Veloman Yunkan 63e9a09259 Cleaned up/beautified Library::updateBookDB() 2021-04-27 16:59:21 +04:00
Veloman Yunkan 4178c169dd Xapian documents in book DB store only the book id 2021-04-27 16:59:21 +04:00
Veloman Yunkan f751aff2fb Full case/diacritics insensitivity in catalog filtering
Catalog filtering should now be case/diacritics insensitive for all
fields. However it is not validated for language, name and category
fields, and is validated for tags, creator & publisher only for text
supplied in the filter (but not for values read from the book).
2021-04-27 16:59:21 +04:00
Veloman Yunkan 87dc9d2723 Made catalog filtering by query diacritics insensitive
Catalog filtering by titles/description was sensitive to diacritics
present in the query string. Fixed that.

Also enhanced the unit test to validate the insensitivity to diacritics
present in either the title/description or the query string.
2021-04-27 16:59:21 +04:00
Veloman Yunkan 9c7366890d Catalog filtering by tags works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 19e195cb7d Filter::Tags typedef 2021-04-27 16:59:21 +04:00
Veloman Yunkan 3d5fd8f585 Catalog filtering by creator works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan d3d5abe14d Handling of non-words in publisher query
This change fixes the failure of the LibraryTest.filterByPublisher
unit-test broken by the previous commit.

The previous approach used in `publisherQuery()` for building a phrase
query enforcing the specified prefix for all terms fails if

1. the input phrase contains a non-word term that Xapian's query parser
   doesn't like (e.g. a standalone ampersand character, 1/2, a#1, etc);
2. the input phrase contains at least three terms that Xapian's query
   parser has no issue with.

Using the `quest` tool (coming with xapian-tools under Ubuntu) the
issue can be demonstrated as follows:

```
$ quest -o phrase -d some_xapian_db "Energy & security"
Parsed Query: Query((energy@1 PHRASE 11 Zsecur@2))
Exactly 0 matches
MSet:

$ quest -o phrase -d some_xapian_db "Energy & security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

$ quest -o phrase -d some_xapian_db 'Energy 1/2 security act'
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

$ quest -o phrase -d some_xapian_db "Energy a#1 security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries
```

The problem comes from parsing the query with the default operation set
to `OP_PHRASE` (exemplified by the `-o phrase` option in above
invocations of `quest`). A workaround is to parse the phrase with a
default operation of `OP_OR` and then combine all the terms with
`OP_PHRASE`.

Besides stemming should be disabled in order to target an exact phrase
match (save for the non-word terms, if any, that are ignored by the
query parser).
2021-04-27 16:59:21 +04:00
Veloman Yunkan a759ab989f Catalog filtering by publisher works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 7ccd9ffcce Catalog filtering by language works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 0c0a37073b Catalog filtering by category works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 415c65cf03 Catalog filtering by book name works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 8287f351e7 Final logic of Library::filterViaBookDB()
Moved the `filter.hasQuery()` check inside `buildXapianQuery()`.
`Library::filterViaBookDB()` only cares if the query that is going to be
run on the book DB would match all documents. The rest of changes
related to enhancing the usage of Xapian for the catalog search will
happen inside `buildXapianQuery()` and `updateBookDB()`.
2021-04-27 16:59:21 +04:00
Veloman Yunkan ea779ac200 Extracted buildXapianQuery() 2021-04-27 16:59:21 +04:00
Veloman Yunkan 80cd1fc989 Renamed 2 functions in Filter and Library 2021-04-27 16:59:21 +04:00
Veloman Yunkan 2d76f8395e Dropped unused functions from Filter's private API
This should have been done back in PR #460
2021-04-27 16:59:21 +04:00
Manan Jethwani 965b9622c2 removed redirect to articles in search 2021-04-20 20:23:42 +05:30
Veloman Yunkan 9d4370403b get_url() was renamed in zim::search_iterator 2021-04-16 13:30:36 +04:00
Vertigo 611146aa37 Added Search Link for bad bookName/articleName on 404 2021-04-12 21:31:47 +05:30
Veloman Yunkan b54215f146 Manager::readOpds() doesn't modify its input 2021-04-12 15:14:12 +02:00
Veloman Yunkan 9033f2f28e Manager::readXml() doesn't modify its input 2021-04-12 15:14:12 +02:00
Veloman Yunkan ec9186b174 Library::removeBookById() updates the search DB
This fix makes the `XmlLibraryTest.removeBookByIdUpdatesTheSearchDB`
unit-test pass.
2021-04-09 17:06:45 +04:00
Veloman Yunkan aaaa5a637e Library::filter() doesn't create empty books
This changes how the `XmlLibraryTest.removeBookByIdUpdatesTheSearchDB`
unit-test fails.
2021-04-09 17:06:45 +04:00
Veloman Yunkan 24ed96a38c Library.removeBookById() drops the reader too
This fix makes the `XmlLibraryTest.removeBookByIdDropsTheReader`
unit-test pass.
2021-04-09 17:05:56 +04:00
Manan Jethwani 5cb276a933 adding kind and path attributes to suggest response object and using it in autocomplete 2021-04-07 21:04:33 +05:30
Veloman Yunkan aa2a031ba4 Xapian headers are not exposed through libkiwix 2021-04-07 18:24:33 +04:00
Manan Jethwani 7872734f44 changed method of injecting root link 2021-03-24 14:17:58 +05:30
Manan Jethwani c557bb271b injecting root link directly and renamed head_part to head_taskbar 2021-03-24 02:10:16 +05:30
Manan Jethwani 93264f7409 added root functionality for block external link feature 2021-03-23 03:17:14 +05:30
Veloman Yunkan e214efecd4 Language code conversion via ICU
Language code is converted from ISO 639-3 to ISO 639 (which is
understood by Xapian) via ICU. The previous approach via an explicit
map had its advantages since Xapian has more than one stemmer
implementation for some languages (selectable via Xapian-specific
identifiers). This commit relies on the defaults associated with the
ISO 639 language codes.
2021-03-17 14:32:03 +01:00
Veloman Yunkan 09233bf4f3 Support for partial queries in catalog search
The search text in the catalog query is interpreted as partial by
default, but partial query mode can be disabled in C++. The latter
possibility is not exposed via the /catalog/search kiwix-serve endpoint,
though.
2021-03-17 14:32:03 +01:00
Veloman Yunkan a599fb3892 Initial version of Xapian-based catalog search 2021-03-17 14:32:03 +01:00
Veloman Yunkan a17fc0ef2d Library::getBooksByTitleOrDescription() 2021-03-17 14:32:03 +01:00
Veloman Yunkan db06b2c7ca Library::BookIdCollection typedef 2021-03-17 14:32:03 +01:00
Veloman Yunkan a20f9e2ce1 Library::filter() works in two stages
1. Get the subset of books matching the q (title/description) parameter
   of the search

2. Filter out books not matching the other parameters of the search.

Stage 1. currently works in the old way, but will be replaced by Xapian
based search in subsequent commits.
2021-03-17 14:32:03 +01:00
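
The two stages listed above can be sketched as follows. `BookRecord`, `booksMatchingText` and `filterBooks` are hypothetical names, plain substring matching stands in for the text search of stage 1, and only a language parameter is shown for stage 2.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct BookRecord {
  std::string id;
  std::string title;
  std::string language;
};

// Stage 1: ids of the books whose title contains the query text.
// An empty query matches every book.
std::vector<std::string> booksMatchingText(const std::vector<BookRecord>& books,
                                           const std::string& query)
{
  std::vector<std::string> ids;
  for (const auto& book : books) {
    if (query.empty() || book.title.find(query) != std::string::npos) {
      ids.push_back(book.id);
    }
  }
  return ids;
}

// Stage 2: drop the books from stage 1 that fail the remaining parameters
// (here only the language; an empty language means "any").
std::vector<std::string> filterBooks(const std::vector<BookRecord>& books,
                                     const std::string& query,
                                     const std::string& language)
{
  std::vector<std::string> result;
  for (const auto& id : booksMatchingText(books, query)) {
    const auto it = std::find_if(books.begin(), books.end(),
        [&id](const BookRecord& b) { return b.id == id; });
    assert(it != books.end());  // stage 1 only returns ids of known books
    if (language.empty() || it->language == language) {
      result.push_back(id);
    }
  }
  return result;
}
```

Keeping stage 1 behind its own function means its body can later be swapped for an index-based search without touching stage 2.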
Veloman Yunkan b7b0bdbdd8 Both Book::update() methods update the category 2021-03-17 14:10:57 +04:00
Veloman Yunkan 4abc4f8518 Support for book category attribute in library.xml 2021-03-17 14:10:57 +04:00
Veloman Yunkan 6b2067c236 Reading category element from OPDS stream 2021-03-17 14:10:57 +04:00
Veloman Yunkan e55bf514e8 Dedicated 'category' parameter in catalog search 2021-03-17 14:10:57 +04:00
Veloman Yunkan 80d4f7e349 Extracted InternalServer::search_catalog() 2021-03-17 14:10:57 +04:00
Veloman Yunkan 58186ffb26 kiwix::Book::getCategory() 2021-03-17 14:09:48 +04:00
Veloman Yunkan ae32ff40c0 Dropped an extra colon from book <updated> dates 2021-03-17 14:02:27 +04:00
Veloman Yunkan 26331b401e Fixed the month in OPDS feed <updated> date
`tm::tm_mon` varies in the [0, 11] range.
2021-03-17 14:02:27 +04:00
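
Because `tm::tm_mon` counts months from 0 and `tm::tm_year` counts years from 1900, a date formatter has to adjust both. The function `formatOpdsDate` below is a hypothetical sketch of such a formatter, not the project's `gen_date_str()`.

```cpp
#include <cassert>
#include <cstdio>
#include <ctime>
#include <string>

// Formats a broken-down UTC time as YYYY-MM-DDThh:mm:ssZ.
std::string formatOpdsDate(const std::tm& t)
{
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%04d-%02d-%02dT%02d:%02d:%02dZ",
                t.tm_year + 1900,  // tm_year is years since 1900
                t.tm_mon + 1,      // tm_mon is in [0, 11]
                t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec);
  return buf;
}
```

Omitting the `+ 1` would silently shift every date back by one month, which is the kind of bug this commit fixed.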
Matthieu Gautier 67caae6c32 Use the new libzim's getRandomEntry instead of implementing it ourselves. 2021-03-02 14:16:09 +01:00
Veloman Yunkan 839fc10a4f Fixed the Windows build
Opening ZIM archives by file descriptor (as well as embedded
ZIM archives) is not supported under Windows.
2021-02-10 14:19:47 +01:00
Veloman Yunkan 5a8b825c70 Testing of JNIKiwixReader.getDirectAccessInformation() 2021-02-10 14:19:47 +01:00
Veloman Yunkan 7a465e66d7 Renamed org.kiwix.kiwixlib.{Pair->DirectAccessInfo} 2021-02-10 14:19:47 +01:00
Veloman Yunkan 5a99634dfd Java wrapper test checks favicon.png too 2021-02-10 14:19:47 +01:00
Veloman Yunkan e028bcbb04 Android's java.io.FileDescriptor is different 2021-02-10 14:19:47 +01:00
Veloman Yunkan 9cdf7a44c0 JNIKiwixReader can open an embedded ZIM archive 2021-02-10 14:19:47 +01:00
Veloman Yunkan 4d23e44de7 JNIKiwixReader ctor taking a file descriptor
... and a corresponding unit test
2021-02-10 14:19:47 +01:00
Veloman Yunkan 98d69ef59b Added testReader unit-test for the java wrapper 2021-02-10 14:19:47 +01:00
Veloman Yunkan e40827fbac Renamed the java wrapper unit test runner script 2021-02-10 14:19:47 +01:00
Veloman Yunkan a798e0c0a1 Made the java wrapper unit test run & pass
The kiwixlib java wrapper unit test can be run manually via the
src/wrapper/java/org/kiwix/testing/compile_test.sh script.

The test ZIM files in src/wrapper/java/org/kiwix/testing were created
using the create_test_zimfiles script. They must be updated/re-generated and
committed in git whenever their source data or the create_test_zimfiles
script changes. Note: small.zim.embedded is not used at this point, it
was created for testing the enhancement coming in a few commits.
2021-02-10 14:19:47 +01:00
Matthieu Gautier 24b2e6e585 Remove unnecessary include. 2021-01-26 17:53:25 +01:00
Matthieu Gautier 3fd1310008 Use c++11 std::thread instead of pthread. 2021-01-26 17:53:25 +01:00
Matthieu Gautier 4749656828 Do not crash if zim file has no `Counter` metadata. 2021-01-26 15:15:27 +01:00
Emmanuel Engelhart 84895c4036 Better </head> detection regex 2021-01-18 13:16:56 +01:00
Emmanuel Engelhart a8bf9dd5b4 Better Kiwix Serve Taskbar insertion (after charset definition) 2021-01-18 11:18:53 +01:00
Emmanuel Engelhart a61c94ef10 Add GPLv3 header 2021-01-18 10:54:33 +01:00
Emmanuel Engelhart 8c43fd8d36 Fix taskbar insertion in case of '<head>' attributes 2021-01-11 14:37:19 +01:00
Emmanuel Engelhart 3e2810dff4 Support 'video/*' and 'audio/*' mimetypes in getMediaCount() 2021-01-07 12:32:32 +01:00
Emmanuel Engelhart 44c4aa931a Better use kiwix::startsWith() 2021-01-03 15:17:03 +01:00
Emmanuel Engelhart 95b32b168d More robust getMediaCount() 2021-01-01 17:05:32 +01:00
Matthieu Gautier 1002c15e0d Remove unnecessary checks.
`Reader` cannot be created with a null `zimArchive`.
We don't have to check for zimArchive being not null.
2020-12-09 14:25:02 +01:00
Matthieu Gautier d51000c4a9 Use new libzim method `hasFulltextIndex` to check for fulltext index. 2020-12-09 14:25:02 +01:00
Matthieu Gautier ba302bed33 Use new libzim method `getFaviconEntry` to get the favicon. 2020-12-09 14:25:02 +01:00
Steve Wills 6900b4e506 fix build on FreeBSD
Without this header, sockaddr_in and INADDR_ANY are not defined
2020-12-07 09:38:46 -05:00
Matthieu Gautier 1a5a2e7a8e Adapt kiwix-lib to the new libzim api. 2020-12-02 12:16:48 +01:00
Matthieu Gautier d87079ec13 Remove deprecated method in the reader. 2020-11-24 19:00:52 +01:00