1339 Commits

Author SHA1 Message Date
Nikhil Tanwar 4b563e567e Provide HTTP URL for the server
Added a line displaying the IP address (the best available one if none is provided) along with the port.
2021-12-22 22:08:25 +05:30
Veloman Yunkan ed2f914e10 Minor cleanup
The code for obtaining the archive now looks the same for the /meta,
/suggest, /search and /random endpoints.
2021-12-22 17:12:34 +01:00
Veloman Yunkan 872ddd9cb3 Cleaned up InternalServer::handle_suggest()
As a result of this clean-up the /suggest endpoint, too, stopped
generating confusing 404 Not Found errors (which, as in /meta's case,
is not that important). Another functional change is that the "term"
parameter became optional.
2021-12-22 17:12:34 +01:00
Veloman Yunkan 20b5a2b971 Less confusing 404 errors from /meta endpoint
Before this fix the /meta endpoint could return a 404 Not Found page
saying

  The requested URL "/meta" was not found on this server.

Error cases producing such a result were:

- `/meta?content=NON-EXISTING-BOOK&name=metaname`

- `/meta?content=book&name=BAD-META-NAME`

Now a proper message is shown for each of those cases.

This fix is being done just for consistency (the /meta endpoint is not
a user-facing one and the scripts don't bother about error texts).
2021-12-22 17:12:34 +01:00
Veloman Yunkan d8c525289b Changed the signature of Response::build_404()
Now Response::build_404() takes the URL instead of the entire
RequestContext object. An empty url suppresses the

 The requested URL "url" was not found on this server.

part of the error text.
2021-12-22 17:12:34 +01:00
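The behaviour described for `Response::build_404()` (an empty URL suppresses the "requested URL" sentence) can be illustrated with a minimal sketch. The function name `make404Text`, its `details` parameter and the exact text layout are assumptions made for illustration; this is not the libkiwix implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the 404 text builder (not the libkiwix code).
// `url` may be empty; `details` is the endpoint-specific explanation.
std::string make404Text(const std::string& url, const std::string& details)
{
  // Assumption of this sketch: a non-empty URL is an absolute path.
  assert(url.empty() || url[0] == '/');

  std::string text = "Not Found\n";
  if (!url.empty()) {
    // Only mention the URL when the caller actually supplied one.
    text += "The requested URL \"" + url + "\" was not found on this server.\n";
  }
  text += details;
  return text;
}
```

With this shape, an endpoint such as /random can pass an empty URL and a specific explanation, so the generic sentence no longer hides the real cause.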
Veloman Yunkan f7b853373c Less confusing 404 errors from /random endpoint
Before this fix the /random endpoint could return a 404 Not Found page
saying

  The requested URL "/random" was not found on this server.

Error cases producing such a result were:

- `/random?content=NON-EXISTING-BOOK` (can happen when a server is
restarted or the library is reloaded and the current book is no longer
available).

- Failure of the libkiwix routine for picking a random article.

Now a proper message is shown for each of those cases.
2021-12-22 17:12:34 +01:00
Veloman Yunkan 250f46c7f9 fixup! Searcher::add_reader() rejects duplicate readers 2021-12-16 16:51:03 +01:00
Veloman Yunkan 0be00b791f Searcher::add_reader() rejects duplicate readers
An O(N) linear search was added to `Searcher::add_reader()` deliberately.
This doesn't seem to be an operation that may lead to performance
problems.
2021-12-16 16:51:03 +01:00
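A duplicate check based on a linear search, as mentioned for `Searcher::add_reader()`, might look roughly like the sketch below. The class `SearcherSketch` and the use of string ids to identify readers are assumptions; the real method works on reader objects.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in: readers are identified by a string id here.
class SearcherSketch
{
  std::vector<std::string> m_readerIds;

public:
  // Returns false (and changes nothing) if the reader was already added.
  bool addReader(const std::string& id)
  {
    assert(!id.empty());
    // O(N) scan; acceptable while the number of readers stays small.
    if (std::find(m_readerIds.begin(), m_readerIds.end(), id) != m_readerIds.end()) {
      return false;
    }
    m_readerIds.push_back(id);
    return true;
  }

  std::size_t readerCount() const { return m_readerIds.size(); }
};
```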
Emmanuel Engelhart 9f3459f3f3 Better libkiwix version variable name 2021-12-13 18:22:40 +01:00
Veloman Yunkan e1db9164c8 Fixed deadlock in Library::writeBookmarksToFile() 2021-12-05 20:31:21 +04:00
Veloman Yunkan 7161db8e2a Manager::reload() also removes books from Library 2021-11-30 18:20:27 +04:00
Veloman Yunkan 262e13845c Enter Library::removeBooksNotUpdatedSince() 2021-11-30 18:20:27 +04:00
Veloman Yunkan 1d5383435d Noted a potential bug in Library::addBook() 2021-11-30 18:20:27 +04:00
Veloman Yunkan ad2eb52553 Thread safe dumping of the OPDS feed 2021-11-30 18:20:27 +04:00
Veloman Yunkan 473d2d2a69 Introduced Library::getBookByIdThreadSafe() 2021-11-30 18:20:27 +04:00
Veloman Yunkan 02b9e32d18 Library became almost thread-safe
Library became thread-safe with the exception of `getBookById()`
and `getBookByPath()` methods - thread safety in those accessors is
rendered meaningless by their return type (they return a reference
to a book which can be removed any time later by another thread).
2021-11-30 18:20:27 +04:00
Veloman Yunkan c2927ce6f7 Library got a yet unused mutex
Introducing a mutex in `Library` necessitates manually implementing the
move constructor and assignment operator. It's better to still delegate
that work to the compiler to eliminate any possibility of bugs when new
data members are added to `Library`. The trick is to move the data into
an auxiliary class `LibraryBase` and derive `Library` from it.
2021-11-30 18:20:27 +04:00
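The `LibraryBase` trick from commit c2927ce6f7 can be sketched as follows. The names `LibraryBaseSketch` and `LibrarySketch` and the single data member are assumptions; the point is only that the compiler-generated move operations of the base keep working when members are added, while the derived class adds the non-movable mutex.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// All movable data lives here; its move operations are compiler-generated.
struct LibraryBaseSketch
{
  std::vector<std::string> m_bookIds;
};

class LibrarySketch : private LibraryBaseSketch
{
  mutable std::mutex m_mutex;  // std::mutex is neither copyable nor movable

public:
  LibrarySketch() = default;

  // Only the base part is moved; the new object gets a fresh mutex.
  // Note: this sketch does not lock `other`, so the caller must make sure
  // no other thread uses it during the move.
  LibrarySketch(LibrarySketch&& other) : LibraryBaseSketch(std::move(other)) {}

  LibrarySketch& operator=(LibrarySketch&& other)
  {
    assert(this != &other);  // self-move is not supported in this sketch
    LibraryBaseSketch::operator=(std::move(other));
    return *this;
  }

  void addBook(const std::string& id)
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_bookIds.push_back(id);
  }

  std::size_t bookCount() const
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_bookIds.size();
  }
};
```

Adding a new data member to `LibraryBaseSketch` requires no change to the hand-written move operations, which is the maintenance benefit the commit message describes.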
Veloman Yunkan b712c732f2 Dropped Library::getBookBy*() non-const functions 2021-11-30 18:20:27 +04:00
Veloman Yunkan 298247ca9b Renamed NameMapperProxy -> UpdatableNameMapper 2021-11-30 18:20:27 +04:00
Veloman Yunkan 3aeeeeee76 Manager::reload() 2021-11-30 18:20:27 +04:00
Veloman Yunkan 226dac2604 LibraryManipulator is now merely a notifier
Originally `LibraryManipulator` was an abstract class completely decoupled
from `Library`. Its `addBookToLibrary()` and `addBookmarkToLibrary()`
methods could be defined in an arbitrary way. Now `LibraryManipulator` has to be
bound to a library object, those methods are no longer virtual, they always
update the library and allow for some additional actions via virtual
functions `bookWasAddedToLibrary()` and `bookmarkWasAddedToLibrary()`.
2021-11-30 18:20:27 +04:00
Veloman Yunkan 76a5e3a877 Library::addBook() updates the reader cache 2021-11-30 18:20:27 +04:00
Veloman Yunkan 6199c11505 NameMapperProxy respects the withAlias flag 2021-11-30 18:18:16 +04:00
Veloman Yunkan 8fffa59974 Added NameMapperProxy from kiwix/kiwix-desktop#714
The right place for NameMapperProxy introduced by kiwix/kiwix-desktop#714 is in
libkiwix (so that it can be reused in kiwix-serve).
2021-11-30 18:18:16 +04:00
Veloman Yunkan 5f3c34ed93 NameMapper's API is now const 2021-11-22 21:06:27 +04:00
Veloman Yunkan 339f845fb0 Bugfix in Book::getHumanReadableIdFromPath() 2021-11-22 20:54:44 +04:00
Veloman Yunkan 571e417d1e Manager is now safe to copy 2021-11-20 20:38:39 +04:00
Veloman Yunkan 0e48baf9f9 Simplified Library::getReaderById()
Reused `Library::getArchiveById()` in `Library::getReaderById()`.
2021-11-19 20:17:12 +04:00
Veloman Yunkan 4a01081e83 Thread-safe Book::Illustration::getData() 2021-11-19 16:44:25 +04:00
Veloman Yunkan eb6a0d6456 Enter Book::getIllustrations() 2021-11-18 14:39:00 +04:00
Veloman Yunkan e2544799a1 Shorter Book::update() 2021-11-18 14:39:00 +04:00
Veloman Yunkan 9f42884507 Book's illustrations are now immutable 2021-11-18 14:39:00 +04:00
Veloman Yunkan 8a6adddc16 Non-throwing Book::getDefaultIllustration() 2021-11-18 14:39:00 +04:00
Veloman Yunkan c8da5eea2b Dropped Book::getMutableDefaultIllustration()
Now a Book is created without a default illustration.
2021-11-18 14:38:00 +04:00
Veloman Yunkan bd29c4c7ef Book::updateFromOpds() resets Book::m_illustrations 2021-11-18 14:37:12 +04:00
Veloman Yunkan e52a4a646b Book::updateFromXml() resets Book::m_illustrations 2021-11-18 14:36:42 +04:00
Veloman Yunkan 537ba7e6b9 Book::update() reads illustrations from ZIM file 2021-11-18 14:35:49 +04:00
Veloman Yunkan f4bc3c8ced Book::Illustration got dimensions 2021-11-18 14:34:51 +04:00
Veloman Yunkan 5263f6880c Internally Book supports multiple illustrations 2021-11-18 14:34:51 +04:00
Veloman Yunkan c129952605 Added a couple of notes on data consistency 2021-11-18 14:34:48 +04:00
Veloman Yunkan 9f0db6b7fa Book::Illustration::getData() 2021-11-18 14:33:50 +04:00
Veloman Yunkan 7d8a83cc97 Encapsulated access to Book::m_illustration 2021-11-18 14:32:52 +04:00
Veloman Yunkan ec5a423924 Enter Book::Illustration
`Book::m_favicon` and its 2 friends are replaced with a single
`Book::m_illustration` data member.
2021-11-18 13:31:08 +04:00
Veloman Yunkan 811b73a4f1 Moved 2 small method definitions to cpp 2021-11-18 13:27:27 +04:00
Manan Jethwani 30e4c549e4 exposed fileExist, getMimeTypeForFile and getFileContent functions 2021-10-12 19:44:38 +05:30
Manan Jethwani b7b385d87b added custom index template 2021-10-12 19:44:05 +05:30
Matthieu Gautier cd9fb541fc Fix method call for new libzim API.
`add_archive` is now `addArchive`.
2021-09-29 11:55:22 +02:00
Veloman Yunkan c0bda426b4 Removed duplication across two mustache templates
Deduplicated the mustache templates static/templates/catalog_v2_entries.xml
and static/templates/catalog_v2_complete_entry.xml (the latter was
renamed to static/templates/catalog_v2_entry.xml).
2021-09-09 12:19:22 +04:00
Veloman Yunkan b3f7556096 Added partial entries feed to the OPDS root feed 2021-09-09 12:19:22 +04:00
Veloman Yunkan 4c657c082e /catalog/v2/partial_entries OPDS API endpoint 2021-09-09 12:19:22 +04:00
Veloman Yunkan e15a0f4338 /catalog/v2/entry/<entry_id> OPDS API endpoint 2021-09-09 12:19:22 +04:00
Veloman Yunkan 12d9b69806 OPDSDumper::dumpOPDSCompleteEntry() 2021-09-09 12:19:22 +04:00
Veloman Yunkan 027854e4f4 Extracted getSingleBookData() in opds_dumper.cpp 2021-09-09 12:19:22 +04:00
Maneesh P M 61209ea0d7 Allow kiwix-serve to get suggestions of custom range
This allows the handle_suggest API to accept two arguments, `start` and
`suggestionLength`, so that suggestions can be retrieved in the given
range rather than the default 0-10 range.
2021-08-19 21:05:39 +05:30
Maneesh P M 8a4080baba Update libkiwix with new libzim api 2021-08-14 22:26:39 +05:30
Veloman Yunkan 452283cfe6 Handling of /meta?name=Illustration_WxH@1 requests 2021-08-05 22:28:09 +04:00
Veloman Yunkan e5168d8b3d Support for multiple illustrations in OPDS entry 2021-08-05 22:21:13 +04:00
Maneesh P M 9addd82d2d Fix usage of zim::Searcher::getResults() in libkiwix
The correct usage does not require the user to calculate an `end` using
the `pageLength`. We can directly use getResults(start, pageLength)
2021-08-04 19:20:50 +05:30
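The difference between passing an `end` index and passing a page length can be shown with a small stand-in. The function `getResultsSketch` below operates on a plain vector and is purely illustrative; it is not the libzim `getResults()` implementation, only a model of a `(start, maxResults)` style interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a (start, maxResults) interface over a result list.
std::vector<int> getResultsSketch(const std::vector<int>& all,
                                  std::size_t start,
                                  std::size_t maxResults)
{
  // Clamp the window so that out-of-range requests yield fewer (or no) items.
  const std::size_t first = std::min(start, all.size());
  const std::size_t count = std::min(maxResults, all.size() - first);
  assert(first + count <= all.size());

  const auto begin = all.begin() + static_cast<std::ptrdiff_t>(first);
  return std::vector<int>(begin, begin + static_cast<std::ptrdiff_t>(count));
}
```

A caller paging through results passes `pageLength` directly as the second argument; computing `start + pageLength` and passing it would request too many items on every page after the first.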
Maneesh P M 19afe9442f Remove OriginId functions since they are not useful right now 2021-08-03 11:42:58 +02:00
Maneesh P M a3ba7619df Update Manager to use Archive instead of Reader
kiwix::Manager uses Reader to import a zim file, it should be using
zim::Archive directly.
2021-08-03 11:42:58 +02:00
Maneesh P M 8b12434ff2 Update kiwix::book to use libzim structure
Some methods in kiwix::Book use the Reader wrapper structure. They should
use the native libzim structure zim::Archive instead.
2021-08-03 11:42:58 +02:00
Veloman Yunkan ab3095745e Languages OPDS feed includes book counts 2021-08-03 11:32:38 +02:00
Veloman Yunkan 45adda44b3 Got rid of <content> node in languages OPDS entry 2021-08-03 11:32:38 +02:00
Veloman Yunkan 96cf7e78a5 OPDSDumper::categoriesOPDSFeed() with no args 2021-08-03 11:32:38 +02:00
Veloman Yunkan dd118df612 Got rid of langMap in opds_dumper.cpp
Language code to human friendly name translation is now done with the
help of the ICU library. It works if the line

```
-include $(LANGSRCDIR)/resfiles.mk
```

in the file `source/data/Makefile.in` of the icu4c dependency is not
commented out. Currently, the said line is commented out (along with
some other include's) by the `icu4c_custom_data.patch` patch of the
`kiwix-build` tool.
2021-08-03 11:32:38 +02:00
Veloman Yunkan 5f90f5ee2a Preliminary version of /catalog/v2/languages 2021-08-03 11:32:38 +02:00
Veloman Yunkan 18871b4b15 Helper function Library::getBookPropValueSet()
Introduced a helper function `Library::getBookPropValueSet()` and
deduplicated Library::getBooks{Languages,Creators,Publishers}() methods.
2021-08-03 11:32:38 +02:00
Veloman Yunkan b2027b397c List of languages entry in /catalog/v2/root.xml
Added a new entry in /catalog/v2/root.xml that points to a
not-yet-existing list of languages navigation feed.
2021-08-03 11:32:38 +02:00
Matthieu Gautier 0b6b6716de Rename split argument from `trimEmpty` to `dropEmpty`. 2021-07-07 14:43:13 +02:00
Matthieu Gautier b70c92cade Move back used helper functions to the public API.
- Add docstring
- Move the declaration in kiwix namespace.
- Adapt our include to include the right headers.
2021-07-07 14:43:13 +02:00
Matthieu Gautier fa83a61a54 Move all public *Tools.h in src.
This by definition removes all the tool functions from the public API.
2021-07-07 14:43:13 +02:00
Maneesh P M a94a03cd22 Remove unwanted reader functions
Removing the functions in InternalServer that are no longer needed.
2021-07-03 14:07:14 +05:30
Maneesh P M bc821638da Drop wrapper structures from handle_search
Since we now have a SearchRenderer that can work with native libzim
structures, we drop the wrappers and use those structures instead.
2021-07-03 14:07:12 +05:30
Maneesh P M bcece66960 Add SearchRenderer handles for libzim structures
Introduces a new member mp_search that houses the zim::Search object and
adds a new constructor for this purpose. This commit also adds an
overload of getHtml that takes start and end integers as arguments,
since they are not part of the search object we include.
2021-07-03 14:05:50 +05:30
Maneesh P M c046f64d83 Drop Reader and Entry wrappers from handle_content 2021-07-03 14:05:50 +05:30
Maneesh P M 75b4d311d7 Drop Reader from InternalServer::handle_random 2021-07-03 14:04:04 +05:30
Maneesh P M a236751c74 Drop usage of Reader from InternalServer::handle_suggest 2021-07-03 14:04:04 +05:30
Maneesh P M 7d68926539 Drop usage of Reader from InternalServer::handle_meta
This is essentially a code move of meta handlers from using Reader
functions to directly using Archive.
2021-07-03 14:04:02 +05:30
Maneesh P M 940368b8ac Add m_archives and getArchiveById to Library
These members will mirror the functionality offered by equivalent usage
of Reader class.
2021-07-03 14:02:31 +05:30
Veloman Yunkan b5c1b26761 OpdsCatalog::getSearchUrl() 2021-06-30 18:27:00 +02:00
Maneesh P M f3c96b23fd Use getIllustrationItem instead of getFaviconEntry method
With openzim/libzim#540 we now have a new function to get the
illustration (previously the favicon, in 48x48 size and unity scale) in
multiple sizes. We need to replace getFaviconEntry with this new
getIllustrationItem method.
2021-06-19 10:23:24 +05:30
Vertigo 8d39b2c4c1 Added content ZIM home button on 404 2021-06-17 12:51:27 +05:30
Veloman Yunkan 78083f1f4a Moved OPDS templates under static/templates 2021-06-08 20:37:00 +04:00
Veloman Yunkan dd60235010 Fixed the self link in the output of /catalog/v2/entries 2021-06-08 20:37:00 +04:00
Veloman Yunkan e799f2ff1e OPDSDumper::dumpOPDSFeed() works via mustache
This changes the output of `/catalog/search` as follows:

- Entire search query (rather than only the value of the `q` parameter)
  is put in the <title> node.

- Search performed with an empty query presents itself as "All zims".

- The feed id remains stable for identical searches on the same
  library.
2021-06-08 20:37:00 +04:00
Veloman Yunkan 312f2cb560 Moved handle_catalog_v2*() methods into a new file 2021-06-08 20:37:00 +04:00
Veloman Yunkan fa42cbc48f Pagination info in /catalog/v2/entries 2021-06-08 20:37:00 +04:00
Veloman Yunkan f1797993af Reused InternalServer::search_catalog() 2021-06-08 20:37:00 +04:00
Veloman Yunkan f886c8c07b Root url is normalized once in the constructor 2021-06-08 20:37:00 +04:00
Veloman Yunkan 9ca6bd006f /catalog/v2/categories goes through OPDSDumper too 2021-06-08 20:37:00 +04:00
Veloman Yunkan cdacc0caf1 /catalog/v2/entries going through OPDSDumper
OPDSDumper sensed threats to its job security, so it lobbied to be
involved in handling the /catalog/v2 endpoints, too.
2021-06-08 20:37:00 +04:00
Veloman Yunkan dfad1c3815 /catalog/v2/searchdescription.xml 2021-06-08 20:37:00 +04:00
Veloman Yunkan 07252a127a /catalog/v2/entries is also a search endpoint 2021-06-08 20:37:00 +04:00
Veloman Yunkan b60e3ffb26 RequestContext::get_optional_param() 2021-06-08 20:37:00 +04:00
Veloman Yunkan 70d42aec98 A small simplification 2021-06-08 20:37:00 +04:00
Veloman Yunkan 4aa3c792aa Extracted get_search_filter() 2021-06-08 20:37:00 +04:00
Veloman Yunkan 208dece7e3 Reordered several statements
Reordered several statements so that the next couple of commits are a
little simpler.
2021-06-08 20:37:00 +04:00
Veloman Yunkan 19b59fd72f Serving /catalog/v2/entries
/catalog/v2/entries is intended to play the combined role of
/catalog/root.xml and /catalog/search of the old OPDS API. Currently,
the latter role is not yet implemented.

Implementation note: instead of tweaking and reusing
`OPDSDumper::dumpOPDSFeed()`, the generation of the OPDS feed is done via `mustache`
and a new template `static/catalog_v2_entries.xml`.
2021-06-08 20:37:00 +04:00
Veloman Yunkan 92c2de8d46 Enter InternalServer::m_library_id
The new field is intended to serve as a seed for generating semi-stable
OPDS feed ids that only need to change when the library is updated.
2021-06-08 20:37:00 +04:00
Veloman Yunkan 2e53b51696 Serving /catalog/v2/categories 2021-06-08 20:37:00 +04:00
Veloman Yunkan b259afa408 Library::getBooksCategories()
Note: no unit test added
2021-06-08 20:37:00 +04:00
Veloman Yunkan 3c3cf08a1a Serving /catalog/v2/root.xml
Note: This commit somewhat relaxes validation of non variable
`<updated>` elements in the OPDS feed - the contents of any `<updated>`
element is replaced with the YYYY-MM-DDThh:mm:ssZ placeholder.
2021-06-08 16:03:43 +04:00
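Replacing the contents of every `<updated>` element with a fixed placeholder, as the note above describes for the tests, could be done along these lines. The helper name `maskUpdatedDates` is an assumption and the actual test code may differ.

```cpp
#include <regex>
#include <string>

// Hypothetical test helper: hides variable timestamps so that OPDS output
// can be compared against a fixed expected string.
std::string maskUpdatedDates(const std::string& xml)
{
  // Matches an <updated> element with any text content (no nested tags).
  static const std::regex updated("<updated>[^<]*</updated>");
  return std::regex_replace(xml, updated,
                            "<updated>YYYY-MM-DDThh:mm:ssZ</updated>");
}
```

The trade-off mentioned in the commit follows directly: once every `<updated>` value is masked, the test can no longer detect a wrong value in an element whose date was supposed to be constant.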
Veloman Yunkan 54b78eaf56 Moved gen_date_str() to tools/otherTools.cpp 2021-06-08 16:03:43 +04:00
Veloman Yunkan 1e0ff1fbb0 Fixed the double colon in OPDS date string 2021-06-08 16:03:43 +04:00
Veloman Yunkan 5b272ac49c Fixed handling of /catalogBLABLA/root.xml & alike
Also removed an unneeded namespace qualifier.
2021-06-08 16:03:43 +04:00
Manan Jethwani bb92f26b60 added filter functionality 2021-06-07 15:37:20 +02:00
Manan Jethwani 063bb8cd65 added dynamic and subset loading of zim-files in kiwix-serve 2021-06-01 19:33:42 +05:30
Maneesh P M e2f6d91d51 Remove get_readerIndex in favor of get_zimId
The function get_readerIndex was used to get the zimId using an ordered
vector of readers. Now we can use get_zimId directly.
2021-05-26 14:45:25 +02:00
Maneesh P M c35f6f9142 Add `get_zimId` method to Result
get_zimId method allows the user to get the uuid of the archive from
which a result is retrieved directly from the search result itself.
2021-05-26 14:45:25 +02:00
Maneesh P M 5567d8ca49 Replace std::vector<std::string> with SuggestionItem
Each suggestion used to be stored as a vector of strings holding various values
such as title, path, etc. With this commit, we use the new
dedicated class `SuggestionItem` to do the same.
2021-05-26 10:53:39 +02:00
Maneesh P M 56434de79e Set label to title snippet if present
With openzim/libzim#545 we now support snippet generation of titles
which can be used as the display label on the ui for highlighted titles
via the "label" field.
The old version used plain title which is still available in the value
field.
2021-05-26 10:52:58 +02:00
Maneesh P M e5fac30cee Update libkiwix with search iterator rename in libzim
Search iterator API in libzim has been shifted to use camel case naming.
This has to be accommodated in libkiwix as well.
2021-05-26 08:39:13 +02:00
Matthieu Gautier 2736a46cfe Revert "Kiwix Serve welcome page dynamic and subset loading (OPDS based)" 2021-05-25 17:30:05 +02:00
Manan Jethwani 012973d14a added dynamic and subset loading of zim-files in kiwix-serve 2021-05-25 02:41:12 +05:30
Emmanuel Engelhart d4e35c7067 Rename kiwix-lib in libkiwix 2021-05-23 21:46:52 +02:00
Veloman Yunkan cd02b4de3b Dummy application of new libzim search API
Didn't take any advantage of the new libzim search API. Just fixed the
libkiwix build in the most straightforward way.
2021-05-15 23:34:51 +04:00
Emmanuel Engelhart 05cc3d015f Insert root link only if html content 2021-05-14 14:49:28 +02:00
Veloman Yunkan 68189de162 /catalog/search handles out-of-bounds pagination 2021-05-10 11:25:06 +02:00
Veloman Yunkan 41276341d0 Empty query acts as a match-all query
After switching to Xapian-based search in the library/catalog, an empty
query stopped acting as a match-all query. This commit restores the old
behaviour in that regard.
2021-05-09 15:14:43 +02:00
Maneesh P M be6b58c6ad Revert "added 204 code for empty return of search"
Returning status code 204 in case of empty results doesn't show the
empty results page as described in #466. Reverting the changes in #396
fixes the issue.
2021-05-09 10:47:18 +05:30
Emmanuel Engelhart 950e742116 No metalink file on fs 2021-05-04 13:15:43 +02:00
Veloman Yunkan 3879b82112 const-correct kiwix::Library
- Made most methods of kiwix::Library const.
- Also added const versions of getBookById() and getBookByPath()
  methods.
2021-04-28 11:42:55 +04:00
Veloman Yunkan 63e9a09259 Cleaned up/beautified Library::updateBookDB() 2021-04-27 16:59:21 +04:00
Veloman Yunkan 4178c169dd Xapian documents in book DB store only the book id 2021-04-27 16:59:21 +04:00
Veloman Yunkan f751aff2fb Full case/diacritics insensitivity in catalog filtering
Catalog filtering should now be case/diacritics insensitive for all
fields. However it is not validated for language, name and category
fields, and is validated for tags, creator & publisher only for text
supplied in the filter (but not for values read from the book).
2021-04-27 16:59:21 +04:00
Veloman Yunkan 87dc9d2723 Made catalog filtering by query diacritics insensitive
Catalog filtering by titles/description was sensitive to diacritics
present in the query string. Fixed that.

Also enhanced the unit test to validate the insensitivity to diacritics
present in either the title/description or the query string.
2021-04-27 16:59:21 +04:00
Veloman Yunkan 9c7366890d Catalog filtering by tags works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 19e195cb7d Filter::Tags typedef 2021-04-27 16:59:21 +04:00
Veloman Yunkan 3d5fd8f585 Catalog filtering by creator works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan d3d5abe14d Handling of non-words in publisher query
This change fixes the failure of the LibraryTest.filterByPublisher
unit-test broken by the previous commit.

The previous approach used in `publisherQuery()` for building a phrase
query enforcing the specified prefix for all terms fails if

1. the input phrase contains a non-word term that Xapian's query parser
   doesn't like (e.g. a standalone ampersand character, 1/2, a#1, etc);
2. the input phrase contains at least three terms that Xapian's query
   parser has no issue with.

Using the `quest` tool (coming with xapian-tools under Ubuntu) the
issue can be demonstrated as follows:

```
$ quest -o phrase -d some_xapian_db "Energy & security"
Parsed Query: Query((energy@1 PHRASE 11 Zsecur@2))
Exactly 0 matches
MSet:

$ quest -o phrase -d some_xapian_db "Energy & security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

$ quest -o phrase -d some_xapian_db 'Energy 1/2 security act'
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

$ quest -o phrase -d some_xapian_db "Energy a#1 security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries
```

The problem comes from parsing the query with the default operation set
to `OP_PHRASE` (exemplified by the `-o phrase` option in above
invocations of `quest`). A workaround is to parse the phrase with a
default operation of `OP_OR` and then combine all the terms with
`OP_PHRASE`.

Besides, stemming should be disabled in order to target an exact phrase
match (save for the non-word terms, if any, that are ignored by the
query parser).
2021-04-27 16:59:21 +04:00
Veloman Yunkan a759ab989f Catalog filtering by publisher works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 7ccd9ffcce Catalog filtering by language works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 0c0a37073b Catalog filtering by category works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 415c65cf03 Catalog filtering by book name works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 8287f351e7 Final logic of Library::filterViaBookDB()
Moved the `filter.hasQuery()` check inside `buildXapianQuery()`.
`Library::filterViaBookDB()` only cares if the query that is going to be
run on the book DB would match all documents. The rest of changes
related to enhancing the usage of Xapian for the catalog search will
happen inside `buildXapianQuery()` and `updateBookDB()`.
2021-04-27 16:59:21 +04:00
Veloman Yunkan ea779ac200 Extracted buildXapianQuery() 2021-04-27 16:59:21 +04:00
Veloman Yunkan 80cd1fc989 Renamed 2 functions in Filter and Library 2021-04-27 16:59:21 +04:00
Veloman Yunkan 2d76f8395e Dropped unused functions from Filter's private API
This should have been done back in PR #460
2021-04-27 16:59:21 +04:00
Manan Jethwani 965b9622c2 removed redirect to articles in search 2021-04-20 20:23:42 +05:30
Veloman Yunkan 9d4370403b get_url() was renamed in zim::search_iterator 2021-04-16 13:30:36 +04:00
Vertigo 611146aa37 Added Search Link for bad bookName/articleName on 404 2021-04-12 21:31:47 +05:30
Veloman Yunkan b54215f146 Manager::readOpds() doesn't modify its input 2021-04-12 15:14:12 +02:00
Veloman Yunkan 9033f2f28e Manager::readXml() doesn't modify its input 2021-04-12 15:14:12 +02:00
Veloman Yunkan ec9186b174 Library::removeBookById() updates the search DB
This fix makes the `XmlLibraryTest.removeBookByIdUpdatesTheSearchDB`
unit-test pass.
2021-04-09 17:06:45 +04:00
Veloman Yunkan aaaa5a637e Library::filter() doesn't create empty books
This changes how the `XmlLibraryTest.removeBookByIdUpdatesTheSearchDB`
unit-test fails.
2021-04-09 17:06:45 +04:00
Veloman Yunkan 24ed96a38c Library.removeBookById() drops the reader too
This fix makes the `XmlLibraryTest.removeBookByIdDropsTheReader`
unit-test pass.
2021-04-09 17:05:56 +04:00
Manan Jethwani 5cb276a933 adding kind and path attributes to suggest response object and using it in autocomplete 2021-04-07 21:04:33 +05:30
Veloman Yunkan aa2a031ba4 Xapian headers are not exposed through libkiwix 2021-04-07 18:24:33 +04:00
Manan Jethwani 7872734f44 changed method of injecting root link 2021-03-24 14:17:58 +05:30
Manan Jethwani c557bb271b injecting root link directly and renamed head_part to head_taskbar 2021-03-24 02:10:16 +05:30
Manan Jethwani 93264f7409 added root functionality for block external link feature 2021-03-23 03:17:14 +05:30
Veloman Yunkan e214efecd4 Language code conversion via ICU
Language code is converted from ISO 639-3 to ISO 639 (which is
understood by Xapian) via ICU. The previous approach via an explicit
map had its advantages since Xapian has more than one stemmer
implementation for some languages (selectable via Xapian-specific
identifiers). This commit relies on the defaults associated with the
ISO 639 language codes.
2021-03-17 14:32:03 +01:00
Veloman Yunkan 09233bf4f3 Support for partial queries in catalog search
The search text in the catalog query is interpreted as partial by
default, but partial query mode can be disabled in C++. The latter
possibility is not exposed via the /catalog/search kiwix-serve endpoint,
though.
2021-03-17 14:32:03 +01:00
Veloman Yunkan a599fb3892 Initial version of Xapian-based catalog search 2021-03-17 14:32:03 +01:00
Veloman Yunkan a17fc0ef2d Library::getBooksByTitleOrDescription() 2021-03-17 14:32:03 +01:00
Veloman Yunkan db06b2c7ca Library::BookIdCollection typedef 2021-03-17 14:32:03 +01:00
Veloman Yunkan a20f9e2ce1 Library::filter() works in two stages
1. Get the subset of books matching the q (title/description) parameter
   of the search

2. Filter out books not matching the other parameters of the search.

Stage 1. currently works in the old way, but will be replaced by Xapian
based search in subsequent commits.
2021-03-17 14:32:03 +01:00
Veloman Yunkan b7b0bdbdd8 Both Book::update() methods update the category 2021-03-17 14:10:57 +04:00
Veloman Yunkan 4abc4f8518 Support for book category attribute in library.xml 2021-03-17 14:10:57 +04:00
Veloman Yunkan 6b2067c236 Reading category element from OPDS stream 2021-03-17 14:10:57 +04:00
Veloman Yunkan e55bf514e8 Dedicated 'category' parameter in catalog search 2021-03-17 14:10:57 +04:00
Veloman Yunkan 80d4f7e349 Extracted InternalServer::search_catalog() 2021-03-17 14:10:57 +04:00
Veloman Yunkan 58186ffb26 kiwix::Book::getCategory() 2021-03-17 14:09:48 +04:00
Veloman Yunkan ae32ff40c0 Dropped an extra colon from book <updated> dates 2021-03-17 14:02:27 +04:00
Veloman Yunkan 26331b401e Fixed the month in OPDS feed <updated> date
`tm::tm_mon` varies in the [0, 11] range.
2021-03-17 14:02:27 +04:00
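The month bug fixed here comes from `tm::tm_mon` counting from zero. A minimal sketch of a date formatter that accounts for this is shown below; the function name `formatOpdsDate` is an assumption and this is not the libkiwix `gen_date_str()` code.

```cpp
#include <cassert>
#include <cstdio>
#include <ctime>
#include <string>

// Formats a broken-down UTC time as YYYY-MM-DDThh:mm:ssZ.
std::string formatOpdsDate(const std::tm& t)
{
  assert(t.tm_mon >= 0 && t.tm_mon <= 11);  // months since January

  char buf[32];
  std::snprintf(buf, sizeof(buf), "%04d-%02d-%02dT%02d:%02d:%02dZ",
                t.tm_year + 1900,  // tm_year counts years since 1900
                t.tm_mon + 1,      // convert [0, 11] to [1, 12]
                t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec);
  return buf;
}
```

Forgetting the `+ 1` would print March as `02`, which is the kind of off-by-one the commit corrects.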
Matthieu Gautier 67caae6c32 Use the new libzim's getRandomEntry instead of implementing it ourselves. 2021-03-02 14:16:09 +01:00
Veloman Yunkan 839fc10a4f Fixed the Windows build
Opening ZIM archives by file descriptor (as well as embedded
ZIM archives) is not supported under Windows.
2021-02-10 14:19:47 +01:00
Veloman Yunkan 5a8b825c70 Testing of JNIKiwixReader.getDirectAccessInformation() 2021-02-10 14:19:47 +01:00
Veloman Yunkan 7a465e66d7 Renamed org.kiwix.kiwixlib.{Pair->DirectAccessInfo} 2021-02-10 14:19:47 +01:00
Veloman Yunkan 5a99634dfd Java wrapper test checks favicon.png too 2021-02-10 14:19:47 +01:00
Veloman Yunkan e028bcbb04 Android's java.io.FileDescriptor is different 2021-02-10 14:19:47 +01:00
Veloman Yunkan 9cdf7a44c0 JNIKiwixReader can open an embedded ZIM archive 2021-02-10 14:19:47 +01:00
Veloman Yunkan 4d23e44de7 JNIKiwixReader ctor taking a file descriptor
... and a corresponding unit test
2021-02-10 14:19:47 +01:00
Veloman Yunkan 98d69ef59b Added testReader unit-test for the java wrapper 2021-02-10 14:19:47 +01:00
Veloman Yunkan e40827fbac Renamed the java wrapper unit test runner script 2021-02-10 14:19:47 +01:00
Veloman Yunkan a798e0c0a1 Made the java wrapper unit test run & pass
The kiwixlib java wrapper unit test can be run manually via the
src/wrapper/java/org/kiwix/testing/compile_test.sh script.

The test ZIM files in src/wrapper/java/org/kiwix/testing were created
using the create_test_zimfiles script. They must be updated/re-generated and
committed in git whenever their source data or the create_test_zimfiles
script changes. Note: small.zim.embedded is not used at this point, it
was created for testing the enhancement coming in a few commits.
2021-02-10 14:19:47 +01:00
Matthieu Gautier 24b2e6e585 Remove unnecessary include. 2021-01-26 17:53:25 +01:00
Matthieu Gautier 3fd1310008 Use c++11 std::thread instead of pthread. 2021-01-26 17:53:25 +01:00
Matthieu Gautier 4749656828 Do not crash if zim file has no `Counter` metadata. 2021-01-26 15:15:27 +01:00
Emmanuel Engelhart 84895c4036 Better </head> detection regex 2021-01-18 13:16:56 +01:00
Emmanuel Engelhart a8bf9dd5b4 Better Kiwix Serve Taskbar insertion (after charset definition) 2021-01-18 11:18:53 +01:00
Emmanuel Engelhart a61c94ef10 Add GPLv3 header 2021-01-18 10:54:33 +01:00
Emmanuel Engelhart 8c43fd8d36 Fix taskbar insertion in case of '<head>' attributes 2021-01-11 14:37:19 +01:00
Emmanuel Engelhart 3e2810dff4 Support 'video/*' and 'audio/*' mimetypes in getMediaCount() 2021-01-07 12:32:32 +01:00
Emmanuel Engelhart 44c4aa931a Better use kiwix::startsWith() 2021-01-03 15:17:03 +01:00
Emmanuel Engelhart 95b32b168d More robust getMediaCount() 2021-01-01 17:05:32 +01:00
Matthieu Gautier 1002c15e0d Remove unnecessary checks.
`Reader` cannot be created with a null `zimArchive`.
We don't have to check for zimArchive being not null.
2020-12-09 14:25:02 +01:00
Matthieu Gautier d51000c4a9 Use new libzim method `hasFulltextIndex` to check for fulltext index. 2020-12-09 14:25:02 +01:00
Matthieu Gautier ba302bed33 Use new libzim method `getFaviconEntry` to get the favicon. 2020-12-09 14:25:02 +01:00
Steve Wills 6900b4e506 fix build on FreeBSD
Without this header, sockaddr_in and INADDR_ANY are not defined
2020-12-07 09:38:46 -05:00
Matthieu Gautier 1a5a2e7a8e Adapt kiwix-lib to the new libzim api. 2020-12-02 12:16:48 +01:00
Matthieu Gautier d87079ec13 Remove deprecated method in the reader. 2020-11-24 19:00:52 +01:00
Veloman Yunkan 0f8fe1f63f Alternative implementation of parseMimetypeCounter() 2020-10-29 14:11:27 +04:00
Matthieu Gautier 08464f23bc Better parsing of `M/Counter`
A mimetype may contain parameters.
Then, the mimetype would be something like "text/html;foo=bar;foz=baz"

It will contain `;` and `=`, which conflict with the same characters
we use to separate the items in our list.

We have to use a more advanced algorithm which takes the context into
account.

Fix #416
2020-10-28 16:03:18 +01:00
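One context-aware way to parse such a counter string is sketched below. It relies on a heuristic that is an assumption of this sketch, not necessarily what `parseMimetypeCounter()` does: a `;`-separated token starts a new item only if the part before its first `=` contains a `/` (that is, looks like `type/subtype`); otherwise it is treated as a parameter of the previous mimetype. The count is then taken after the last `=` of each item.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using MimeCounter = std::map<std::string, unsigned>;

// Heuristic: "type/subtype..." starts a new item, "name=value" continues one.
static bool startsNewItem(const std::string& token)
{
  const std::string head = token.substr(0, token.find('='));
  return head.find('/') != std::string::npos;
}

MimeCounter parseCounterSketch(const std::string& text)
{
  // Step 1: split on ';' and glue parameter tokens back onto their mimetype.
  std::vector<std::string> items;
  std::size_t pos = 0;
  while (pos <= text.size()) {
    std::size_t next = text.find(';', pos);
    if (next == std::string::npos) next = text.size();
    const std::string token = text.substr(pos, next - pos);
    if (!token.empty()) {
      if (items.empty() || startsNewItem(token)) {
        items.push_back(token);
      } else {
        items.back() += ";" + token;
      }
    }
    pos = next + 1;
  }

  // Step 2: the count is whatever follows the last '=' of each item.
  MimeCounter result;
  for (const std::string& item : items) {
    const std::size_t eq = item.rfind('=');
    if (eq == std::string::npos || eq + 1 == item.size()) continue;
    const std::string digits = item.substr(eq + 1);
    if (digits.find_first_not_of("0123456789") != std::string::npos) continue;
    if (digits.size() > 9) continue;  // skip values that could overflow
    result[item.substr(0, eq)] = static_cast<unsigned>(std::stoul(digits));
  }
  return result;
}
```

For the input `text/html;foo=bar;foz=baz=12;image/png=3` this yields two entries, keyed by `text/html;foo=bar;foz=baz` and `image/png`. Malformed items are silently skipped in this sketch.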
Matthieu Gautier ef42abea4b Add some tests of `parseMimetypeCounter` 2020-10-28 14:44:23 +01:00
Matthieu Gautier 4407dd12bd Move mimetypeCounter parsing in its own function. 2020-10-28 14:08:06 +01:00
Matthieu Gautier 632583ede2 Add missing include 2020-10-07 18:43:57 +02:00
Matthieu Gautier 61f9d4ab3a Stop the internal server only if it exists. 2020-10-07 14:36:45 +02:00
Matthieu Gautier 470bfc3f1f Better variable name for outStream. 2020-08-28 15:27:03 +02:00
Matthieu Gautier ea3180cb8c Better error printing. 2020-08-28 15:27:03 +02:00
Matthieu Gautier 72d3f8f8e2 Fix segmentation fault with curl requests.
Use a heap allocated buffer (with lifetime of Aria2 class) instead of
a stack allocated one.

Original fix made by @ZaWertun. Kudos to him.

Fix kiwix/kiwix-desktop#123, kiwix/kiwix-desktop#513
and kiwix/kiwix-desktop#423
2020-08-26 12:42:16 +02:00
Matthieu Gautier af9e03904c Use std::mutex and std::unique_lock instead of pthread mutex/lock.
It simplifies the code a bit and ensures that the mutex is correctly unlocked
even in case of an exception.
2020-08-26 12:30:56 +02:00
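The exception-safety argument made in af9e03904c can be illustrated with a minimal sketch. The names `g_mutex`, `g_counter` and `incrementOrThrow` are hypothetical and do not come from kiwix-lib; the point is only that `std::unique_lock` releases the mutex in its destructor, including during stack unwinding.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical shared state, for illustration only.
std::mutex g_mutex;
int g_counter = 0;

// Locks the mutex for the duration of the call. If an exception is thrown,
// the unique_lock destructor still unlocks the mutex while unwinding.
void incrementOrThrow(bool fail)
{
  std::unique_lock<std::mutex> lock(g_mutex);
  if (fail) {
    throw std::runtime_error("simulated failure");
  }
  ++g_counter;
}
```

A later call to `incrementOrThrow(false)` can therefore acquire the mutex even if an earlier call threw. With a manually paired pthread lock/unlock, the unlock would have been skipped on the exceptional path.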
Matthieu Gautier 39611cbd60 Wait for waitingThread to exit before destroying the subprocess memory.
WaitingThread reads some memory shared with the SubProcess
(`mutex`, `m_running`).
When we destroy the SubProcess, we must be sure that WaitingThread has
finished; otherwise we may have invalid reads/writes on freed memory.
2020-08-26 12:26:04 +02:00
Matthieu Gautier 6f0d3003ac Remove `m_compress` member. 2020-08-13 11:16:41 +02:00
Matthieu Gautier ee17b0739a Fix compilation on CI native dyn.
On the CI, the native_dyn docker image is set up with a packaged version
of libmicrohttpd for which `MHD_HTTP_RANGE_NOT_SATISFIABLE` is not
defined.

Once the CI is fixed, we can revert this commit.
2020-08-13 11:16:41 +02:00
Matthieu Gautier 47436f7bdd Move some header setting in response's constructors.
It makes it easier to understand what is more or less constant and what depends
on the context.
2020-08-13 11:16:41 +02:00
Matthieu Gautier 3352c95314 Remove the `RedirectResponse` and use a basic `Response` with header. 2020-08-13 11:16:41 +02:00
Matthieu Gautier 77123ac74c Move the adding of 304 headers in 304 factory.
This saves us from creating a ContentResponse just to have some correct
headers.
2020-08-13 11:16:41 +02:00
Matthieu Gautier 9078f0ac6e Remove `ResponseMode`. 2020-08-13 11:16:41 +02:00
Matthieu Gautier 8d6567d067 Create a utility builder for 416 response.
Also add a map in the response to store specific headers.
2020-08-13 11:16:41 +02:00
Matthieu Gautier 6d5cddca12 Fix android compilation
Android clang complains that it cannot move the
`std::unique_ptr<ContentResponse>` into a `std::unique_ptr<Response>&&`
(for the implicit `std::unique_ptr<Response>` constructor).
Let's help it a bit.
2020-08-13 11:16:41 +02:00
Matthieu Gautier a3939e9a05 Move all the content code in the ContentResponse. 2020-08-13 11:16:41 +02:00
Matthieu Gautier eee621d15b Move small utilities method to create response in Response class. 2020-08-13 11:16:41 +02:00
Matthieu Gautier 7b2ee37437 Move the entry response to its own class. 2020-08-13 11:16:41 +02:00
Matthieu Gautier f014fb2895 Introduce a ContentResponse.
This is only an "interface" for now, as other types of response (entry) may
be "transformed" into a ContentResponse.
We cannot move all the code into the class.
2020-08-13 11:16:41 +02:00
Matthieu Gautier 1011d1ff0b Move the redirection response in its own class.
The redirection is the easiest to move, let's start with this one.
2020-08-13 11:16:41 +02:00
Matthieu Gautier 9e351b279e Remove `get_default_response` in favor of a static Response method.
We want to build different kinds of response depending on the context.
2020-08-13 11:16:41 +02:00
Matthieu Gautier a0bdc0821c Move internalServer code into its own source files. 2020-08-13 11:16:41 +02:00
Matthieu Gautier a819d9e3e0 Make the server handle pointer to response instead of plain response.
This is preparatory work.
We will specialize the response, so we need a pointer to the response
instead of a plain response.
2020-08-13 11:16:41 +02:00
manan jethwani c74b935a9b added pageLength for search_pagination 2020-08-12 02:08:02 +05:30
Matthieu Gautier a55d504017 Fix getArticleCount.
With #403, the article mimetype may be different than "text/html".
It can also be "text/html; raw=true".
(And in fact it already could have any kind of optional argument).
2020-08-11 18:27:54 +02:00
Matthieu Gautier 87b5adcaf4 Make the response responsible to detect if we must introduce taskbar.
The response detects whether the taskbar must be added depending on the mimetype.

Now, `set_taskbar` can be called unconditionally
(no need to check the mimetype).

And we don't need to call set_taskbar if we have no information to set.
2020-08-11 18:27:54 +02:00
Veloman Yunkan c4e6313c90 x in a --> a.contains(x) in meson.build files 2020-08-11 18:17:18 +02:00
renaud gaudin 3f25a3d005 Fixed #391: prevent taskbar and blocker at article level
Some HTML articles are meant to be displayed through a viewer. In this case,
we know we don't want the server to inject the taskbar nor the link blocker
because the content is not a user-ready web page but a partial element of it.

Such articles still need to be `text/html` to be parsed properly by browsers.

This changes the way we decide whether to display the taskbar.
Previously, we were adding it to every article with a MIME __starting with__ `text/html`.
Now, we additionally prevent it on `text/html` MIME if there is a `;raw=true` string inside.

This leaves articles with MIME `text/html;raw=true` (warc2zim convention) outside
of the taskbar target.

For similar reasons, the external-link blocker is set to apply to the same set of articles.
Previously, it was applied to all articles, which was an (unnoticeable) mistake.
2020-08-07 09:26:24 +02:00
MananJethwani 599aaa4c1b added code for status code 204 for empty return of search. 2020-08-01 01:45:42 +05:30
Veloman Yunkan 3d425f44de Request header case is ignored
Originally reported against case sensitivity of the Range header
(see issue #387), this fix applies to all request headers (since
according to RFC 7230 all header fields are case-insensitive, see
https://tools.ietf.org/html/rfc7230#section-3.2). However, a
corresponding unit-test was added only for the Range header.
2020-07-30 16:01:51 +02:00
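One common way to get the case-insensitive behaviour described in 3d425f44de is to normalize header names once, when they are stored, and again on lookup. The sketch below assumes that approach; `lowercaseHeaderName` and `HeaderMapSketch` are hypothetical names and are not taken from the kiwix-lib sources.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper: ASCII-lowercase a header field name.
std::string lowercaseHeaderName(std::string name)
{
  std::transform(name.begin(), name.end(), name.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return name;
}

// Hypothetical container: header names are normalized on insertion and on
// lookup, so "Range", "range" and "RANGE" all refer to the same entry.
class HeaderMapSketch
{
 public:
  void set(const std::string& name, const std::string& value)
  {
    headers_[lowercaseHeaderName(name)] = value;
  }

  // Returns the stored value, or an empty string if the header is absent.
  std::string get(const std::string& name) const
  {
    const auto it = headers_.find(lowercaseHeaderName(name));
    return it == headers_.end() ? std::string() : it->second;
  }

 private:
  std::map<std::string, std::string> headers_;
};
```

Only the field names are normalized here; the values are stored untouched.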
Matthieu Gautier 7ece383004 Add support for samba path on windows.
Fix kiwix/kiwix-desktop#429
2020-07-15 11:40:02 +02:00
Kelson cf8e8b94eb Fix compilation with libmicrohttpd v0.97.1 2020-07-08 14:42:46 +02:00
Matthieu Gautier 4d307e18eb Add new thread safe suggestion API.
The previous API used an internal vector to store the suggestion search
results.

The new API takes a vector as an out argument, so users can call the functions
without having to protect the search.

We should change the Android API to reflect the change, but it is a bit
more complex to do at the JNI level. As Android does not call it from multiple
threads, we are safe for now. And we need the new API asap for kiwix-desktop.

So we keep the same API on Android for now; the new API will be added
in the next version.
2020-07-01 17:16:13 +02:00
Kunal Mehta fb79cde729 Pass -latomic for architectures that need it
Some architectures, specifically armel, mipsel, m68k & powerpc in
Debian, need to explicitly link to atomic.

Use meson to see if the target's CPU family is one of those, and if so,
pass -latomic to the linker.

Tested on armel and mipsel machines to verify passing -latomic works, and
on armhf and amd64 to ensure normal builds aren't broken.

Fixes #371.
2020-06-29 00:18:13 -07:00
Matthieu Gautier ff605873ed Include missing `algorithm` header.
`min` and `max` functions are defined here.
2020-06-10 15:27:51 +02:00
Veloman Yunkan 05ef5d5f51 Assertion in ByteRange allows 0-sized content
The assertion in the ByteRange constructor was written under the assumption that the content must have non-zero size. Now it allows that corner case.
2020-06-02 21:53:47 +04:00
Veloman Yunkan f52b220d01 Dropped RequestContext::has_range() 2020-05-26 14:10:26 +04:00
Veloman Yunkan 50a850f3a9 Fixed a comment 2020-05-26 14:04:18 +04:00
Veloman Yunkan 886ae17274 Fixed a CodeFactor issue 2020-05-26 13:59:47 +04:00
Veloman Yunkan 85d6daabac Rolled back minor unneeded changes 2020-05-26 13:10:50 +04:00
Veloman Yunkan 5f1918d005 Split a long line 2020-05-26 13:04:03 +04:00
Veloman Yunkan 16bd79fa1b Final clean-up of byte_range.{h,cpp} 2020-05-26 12:50:08 +04:00
Veloman Yunkan c2ebdefe8d Handling of unsatisfiable ranges 2020-05-26 02:11:26 +04:00
Veloman Yunkan 37032892a4 Fixed compilation error under win32_*
ERROR is a macro under Windows
2020-05-26 01:58:17 +04:00
Veloman Yunkan 6b43438b74 Fixed compilation error under native_dyn
MHD_HTTP_RANGE_NOT_SATISFIABLE is not defined in the older version of
libmicrohttpd (that is used under CI/Linux native_dyn).
2020-05-26 01:54:36 +04:00
Veloman Yunkan 7301bf89bb Some refactoring of byte-range parsing 2020-05-26 01:50:29 +04:00
Veloman Yunkan ff23b28e7c Removed unnecessary qualifier 2020-05-26 01:41:37 +04:00
Veloman Yunkan 931e95f391 Invalid byte ranges result in 416 responses 2020-05-26 01:40:07 +04:00
Veloman Yunkan f7571b5b69 Content-Range header is set only for partial content 2020-05-25 17:42:18 +04:00
Veloman Yunkan 801ad18a89 ByteRange::resolve() 2020-05-25 17:27:35 +04:00
Veloman Yunkan 67a347c0c4 Moved byte-range parsing to byte_range.cpp 2020-05-25 17:21:10 +04:00
Veloman Yunkan 693905eb68 Default constructed ByteRange is a full range 2020-05-25 17:17:56 +04:00
Veloman Yunkan f3e79c6b4c Introduced src/server/byte_range.cpp 2020-05-25 16:43:44 +04:00
Veloman Yunkan 52f207eaa6 Support for single-ended byte ranges 2020-05-25 16:37:01 +04:00
Veloman Yunkan 67294217a8 ByteRange::Kind 2020-05-25 16:23:44 +04:00
Veloman Yunkan d111a40ce8 Response::m_byteRange 2020-05-23 20:35:22 +04:00
Veloman Yunkan 0c5bb3fcfe Moved ByteRange to a header file of its own 2020-05-23 20:08:53 +04:00
Veloman Yunkan 3fba8c20a0 Converted RequestContext::ByteRange to a class
Also renamed the `range_pair` data member of `RequestContext` to `byteRange_`
2020-05-23 19:59:47 +04:00
Veloman Yunkan 54db6049b7 Byte-range parsing not exposed in the header file 2020-05-23 18:58:19 +04:00
Veloman Yunkan 81c38d6b2b parse_byte_range() without side-effects 2020-05-23 18:53:16 +04:00
Veloman Yunkan e6a86c02ae Got rid of RequestContext::accept_range 2020-05-23 17:15:42 +04:00
Veloman Yunkan a0f7f32570 Re-ordered function definitions 2020-05-23 17:11:26 +04:00
Veloman Yunkan c39fce8839 RequestContext::parse_byte_range() 2020-05-23 17:09:51 +04:00
Veloman Yunkan de37489c53 Range header starts with a unit spec
After this commit valid ranges of the form "bytes=firstbyte-lastbyte" should
be handled correctly.
2020-05-22 17:17:31 +04:00
Veloman Yunkan 2a35a86de6 Fixed the size value used creating a response
In case of a partial response the size of the response is different
from the served entry size.
2020-05-22 16:49:35 +04:00
Veloman Yunkan 0a30a77c08 Handling of out of bound byte ranges 2020-05-22 16:46:38 +04:00
Veloman Yunkan 1a99bacfe3 Byte ranges are inclusive
The second component of a byte range, if present, designates the
index of the last byte to be included in the partial response.
2020-05-22 16:30:43 +04:00
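Because both ends of a byte range are inclusive, as 1a99bacfe3 states, the length of a partial response is one more than the difference between the two indices. A minimal sketch, where `inclusiveRangeLength` is a hypothetical name and `first <= last` is assumed:

```cpp
#include <cstdint>

// Number of bytes in the inclusive range [first, last].
// Assumes first <= last; validation is left out of this sketch.
constexpr std::int64_t inclusiveRangeLength(std::int64_t first, std::int64_t last)
{
  return last - first + 1;
}
```

For instance, a `bytes=0-499` request covers 500 bytes, and `bytes=0-0` covers exactly one byte.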
Kelson 94c2ab4395 Add two OPDS related mime-types to compress for HTTP 2020-05-18 08:19:51 +02:00
Kelson 0f07cab920 Small HTTP header beautification 2020-05-17 20:19:19 +02:00
Veloman Yunkan 5f0a9d0b08 Added a comment clarifying a non-obvious case 2020-05-15 15:17:04 +04:00
Veloman Yunkan 54f5dbbd35 Handling of If-None-Match conditional requests 2020-05-14 17:01:22 +04:00
Veloman Yunkan 95a5cde359 ETags are set in the response as needed
Also added server-unit tests related to ETags in the response.
2020-05-14 17:01:22 +04:00
Veloman Yunkan 3d08ef43f2 HEAD request is not rejected
libmicrohttpd handles HEAD requests by dropping the body of the response
(if any). Hence letting a HEAD request through into the code that
processes GET requests is safe.

Also added server unit-tests related to the handling of HEAD requests.
2020-05-14 17:01:22 +04:00
Veloman Yunkan bfa51c2d87 Refactoring: got rid of duplicate get_mime_type() 2020-04-29 18:33:25 +04:00
Veloman Yunkan 81e781133d Refactoring: utilized is_compressible_mime_type() 2020-04-29 18:33:01 +04:00
Veloman Yunkan 9ec7757efe Refactoring: smart Response::set_entry()
Response::set_entry() was upgraded from a simple setter to a method
performing certain business logic that was previously taken care of by
InternalServer::handle_content().
2020-04-29 18:22:15 +04:00
Veloman Yunkan 7bd7ec4937 Refactoring: preparing to move some code 2020-04-29 18:22:15 +04:00
Veloman Yunkan 14d8583c83 Refactoring in InternalServer::handle_content()
Deduplicated common code found in the two branches of the last
if(){}else{} statement in InternalServer::handle_content().
2020-04-29 18:22:15 +04:00
Veloman Yunkan a004d96cd7 Refactoring: extracted get_range_len() 2020-04-29 18:22:15 +04:00
Veloman Yunkan 21c6de2f80 Refactoring: split Response::create_mhd_response()
The changes are easier to understand in ignore-white-space mode
(git diff -w, git show -w).
2020-04-29 18:22:15 +04:00
Veloman Yunkan a8e78f27e1 Refactoring: extracted Response::create_mhd_response() 2020-04-29 18:22:15 +04:00
Veloman Yunkan 6c7ab6ff54 Refactoring: moved local variable declarations 2020-04-29 18:21:40 +04:00
Veloman Yunkan 659ee6ba71 Refactoring: extracted InternalServer::build_redirect() 2020-04-29 16:08:10 +04:00
Veloman Yunkan 83ee8dec15 Made InternalServer::get_default_response() const 2020-04-29 16:08:10 +04:00
Veloman Yunkan 87cbbed9e3 Refactoring: extracted is_compressible_mime_type() 2020-04-29 16:08:10 +04:00
Veloman Yunkan a058520628 Refactoring: extracted get_mime_type() 2020-04-29 16:08:10 +04:00
Veloman Yunkan 1ef5ebfb52 Refactoring: extracted InternalServer::get_reader() 2020-04-29 16:08:10 +04:00
Veloman Yunkan bbc06931ad Refactoring: extracted get_book_name() 2020-04-29 16:08:10 +04:00
Veloman Yunkan 2d3bf9b981 Refactoring: extracted InternalServer::homepage_data()
Also typedef'ed kainjow::mustache::data as MustacheData
2020-04-29 16:08:10 +04:00
Veloman Yunkan fd80f2a89f Refactoring: extracted fullURL2LocalURL()
Also dropped RequestContex::valid_url
2020-04-29 16:08:10 +04:00
Veloman Yunkan abb3dec700 Refactoring: extracted str2RequestMethod() 2020-04-29 16:08:10 +04:00
luddens 0586ef6d41 fix open external zim
Check whether the parameter `pathToSave` is empty before using it; otherwise the
book path is empty too, which causes a crash when opening external zim files
2020-04-20 15:22:36 +02:00
Matthieu Gautier 9d8bf8ddcb Create the dataDirectory before returning its path. 2020-04-15 08:24:55 +02:00
Matthieu Gautier 4c8aad0e68 Do not use std::fstream as it doesn't support wchar path.
This is surprising, but C++11 fstream doesn't have a constructor
that takes a wchar path.
So, on Windows, we cannot open a stream on a path containing non-ASCII
chars. VC++ provides an extension for that, but it is not standard and
g++ mingwin doesn't provide it.

So move all our write/read tool functions to the plain old C versions,
using _wopen to open wide paths on Windows.
2020-04-14 18:13:35 +02:00
Matthieu Gautier eb6f0f710c Correctly detect the dataDir on windows.
We must use the wide version of getenv to correctly handle the case
where we have accents in the user directory.

This also changes the default dataDirectory on Windows from $APPDATA to
$APPDATA/kiwix.
2020-04-14 12:12:34 +02:00
Matthieu Gautier cbf5bd57a8 Adapt to new libzim api.
It is not possible to create an iterator without argument anymore.
2020-04-13 16:06:17 +02:00
Matthieu Gautier 533541cf45 Write the articleCount and mediaCount metadata in the OPDS stream.
This is not standard OPDS. But clients need this information.
2020-04-07 12:22:44 +02:00
renaud gaudin 7155c788e2 attach taskbar to `<head>` instead of `<head>\n`
Fixed a regression introduced in the block-external-links feature.

For cleaner source, the taskbar (and the block-external JS file) were both
attached to `<head>\n`.
Unfortunately, this isn't safe enough as ZIM files might contain all kinds of HTML
syntax. Sotoki, for instance, has no CR after head, making the attachment impossible.

Note: this method is still somewhat fragile, as any HTML content with an extra attribute
on the `<head>` tag or without a `<head>` tag would break the taskbar and the block-external feature.
2020-04-03 16:53:43 +02:00
renaud gaudin 4709a42f4f disable external links blocking on 500 handler 2020-03-30 14:42:37 +00:00
renaud gaudin d04d9bf7f3 Unblock external link in catch page in JS code
Instead of disabling the blocking for the handler, the JS code detects it is
displaying the handler and allows external links to go through
2020-03-27 12:26:22 +00:00
renaud gaudin 412f0d9c61 moved blockExternalLink outside of taskbar
- `setBlockExternalLinks()` on server
- zero-dependency JS code
- JS script added in `inject_externallinks_blocker()`
- changed URL to `/catch/external?source=<source>`
2020-03-27 11:25:39 +00:00
renaud gaudin 0ad8bf45fc Add external links blocking in serve
In many use cases, it is not desirable to have users accidentally click on external links
and leave the served ZIM content.
This could be because the result is unpredictable (reader not implementing this properly),
because the serve user knows there's no backup internet connection, or because there is
an induced cost behind external links that doesn't affect served content.

Using a new flag (`blockExternalLinks`) on `Response`/`setTaskBar`, a piece of JS code
is injected into the taskbar code.
This code adds a JS handler on all link click events and verifies the destination.
If the destination appears to be an external link (1), the link target is changed to
a specific URL:

```
/external?source=<original_uri>
```

(1) external is a link that's not on the same origin and starts with either `http:` `https:` or `//`.

Server implements a new handler on `/external` that displays a new page (`captured_external.html`)
which returns a generic message explaining the situation and offering to click on the link
again should the user really want to.
This is done by specifically asking `set_taskbar` to not block external requests on that page.

This approach allows integrators using a reverse proxy to handle that endpoint differently (rebrand it)

1. `Server` now has an `m_blockExternalLinks` defaulting to `false`
1. `Server.setTaskbar` is extended to support an additional bool to set the variable.
1. `Response` now has an `m_blockExternalLinks`
1. `Response` constructor expects an additional bool for `blockExternalLinks`.
1. `Response.set_taskbar` is extended to support an additional bool to set the variable.
1. JNI/Java Wrapper reflects the extensions.
1. New resource file `templates/block_external.js` (included in head_part). Should it be in skin?
1. New resource file `templates/captured_external.html` for `handle_captured_external()`
1. Added a comment on `head_part.html` to help with JS insertion at the right place
1. `introduce_taskbar()` conditionally inserts the JS inside the taskbar
2020-03-26 12:06:36 +00:00
Matthieu Gautier 064d5f3fa6 Make the search argument constant. 2020-03-06 12:08:05 +01:00
Matthieu Gautier 46626a3f98 Add the method get bookByPath in library. 2020-03-06 12:08:05 +01:00
Matthieu Gautier 76c293e403 [JAVA] Add a method to get the size of an article.
Fix #327
2020-03-04 16:40:03 +01:00
Matthieu Gautier 2e60a088ab [JAVA] Use a long to store the offset of a article in the zim file.
Fixes kiwix/kiwix-android#1769
2020-02-19 14:27:51 +01:00
Matthieu Gautier ea29557a33 [Java] Add a wrapper on method to update book from another book or reader. 2020-02-11 17:44:14 +01:00
Matthieu Gautier b53f531f2b Fix typo getTagStr in the wrapper 2020-02-10 17:25:31 +01:00
Matthieu Gautier 1632e9c55b Fix launching command with path containing spaces.
We must correctly quote paths containing spaces on Windows.
This is needed because on Windows we can't launch a command using an array of
strings, but only by giving one string using space as separator.

Fix kiwix/kiwix-desktop#268
2020-02-06 11:46:33 +01:00
Kelson 6a975994cc Include stdexcept to fix GCC v10 compilation 2020-02-01 13:52:27 +01:00
Matthieu Gautier ce6e956434 [OPDS] Add the url argument to filter by size and name.
Fix kiwix/kiwix-tools#231
2020-01-30 19:02:33 +01:00
Matthieu Gautier f560a1f815 Be able to filter the books by name. 2020-01-30 19:02:33 +01:00
Matthieu Gautier 34257cfc1f Trust the library.xml information by default.
Do not try to read the zim file and update the book when parsing a
library.xml.
Needed by kiwix/kiwix-tools#319
2020-01-30 18:22:07 +01:00
Matthieu Gautier a756e7f8f3 Add flavour attribute to book.
Fix #259
Fix kiwix/kiwix-tools#316
2020-01-30 17:48:56 +01:00
Matthieu Gautier bc257d2d6d Add method to get value of tag from a book.
Only the `getTagStr` method is available on android because we need a
proper exception handling on wrapping side.

Fix #298
2020-01-30 17:48:56 +01:00
Matthieu Gautier 7275f9b8e3 Move function to convert and use tags inside otherTools. 2020-01-30 17:48:56 +01:00
Matthieu Gautier 2881face70 Add missing setting of attribute. 2020-01-30 15:42:54 +01:00
Matthieu Gautier 77ba09c310 Reorder setting and dumper of book attribute.
No real change. Reordering the setting and dumping of attributes in (mostly) the
same order they are declared in book.h makes it easier to detect missing
attributes.
2020-01-30 15:42:54 +01:00
Matthieu Gautier 49aa0fbb9f Use a macro to write the filters. 2020-01-30 15:42:54 +01:00
Matthieu Gautier 7846b45bef Fix opds filtering by tag. 2020-01-30 15:42:54 +01:00
Matthieu Gautier 7e26f3502d Fix opds dumper/parser.
Add missing attributes.
2020-01-30 15:42:54 +01:00
luddens 4b6c26bd0b add parameter option to start a download with aria2
Downloader::startDownload has a new parameter, `option`, which is a vector of
pairs representing the options that can be set when adding a uri with aria2
through the function Aria2::addUri.

Aria2::addUri uses this parameter to set the struct of parameters for the
aria2 command
2020-01-29 15:38:09 +01:00
luddens 6bcecc2677 remove useless error handling functions
Some of the changes from https://github.com/kiwix/kiwix-lib/pull/292
that don't work properly are removed
2020-01-28 16:58:55 +01:00
Matthieu Gautier 82afb804e1 Add a small java test on the kiwix-lib.
While the compilation should be done by meson,
it seems it is not possible to specify an existing jar
to link with. Use a custom script for now.
2020-01-28 12:08:18 +01:00
Matthieu Gautier 0951546356 Create the jar library when creating the java wrapper. 2020-01-28 12:08:18 +01:00
Matthieu Gautier f09c739c1f Be able to create a wrapper for java.
Android is a specific wrapper.
Java is another one.
2020-01-28 12:08:18 +01:00
Matthieu Gautier df9ddd5451 Use correct mutex on android and java. 2020-01-28 12:08:18 +01:00
Matthieu Gautier fe513951d3 Use a macro to print error log.
This allows us to compile the JNI wrapper for targets other than
Android.
2020-01-28 12:08:18 +01:00
Matthieu Gautier 7f0d509a88 Add the filter functionality. 2020-01-28 12:08:18 +01:00
Matthieu Gautier 54f671b2f1 Add some methods to get information from the library.
Else, the library is useless.
2020-01-28 12:08:18 +01:00
Matthieu Gautier c2c89c6c86 Rename the JNIKiwixLibrary class to Library.
This mainly uses the "new memory system". No need to call the dispose function.
Rename the class to Library to conform to the naming semantics
(JNIKiwix* classes use the old memory system)
2020-01-28 12:08:18 +01:00
Matthieu Gautier 75652d0e9f WIP Add a wrapper around the kiwix::manager class.
The JNIKiwixManager is used to manage (insertion of book in) the library.

It is created, as needed, using an existing Library as input.
It is then used to add books, parse library.xml or opds content.
Then it can (and must) be destroyed with the `dispose` method.

```java
library = JNILibrary(...);
manager = JNIManager(library);
manager.parseOpds(opdscontent);
manager.dispose();
// library contains the books declared in the opds content.
// Use the library methods to get the books' info.
```
2020-01-28 12:08:18 +01:00
Matthieu Gautier 6535dc2e38 Add a wrapper for the Book class. 2020-01-28 12:08:18 +01:00
Matthieu Gautier 6b2f768c8f Use template function c2jni to convert c++ type to jni. 2020-01-28 12:08:18 +01:00
luddens ff21a095cb update book even if the members aren't empty
remove the conditions to always update the book
2020-01-27 17:50:44 +01:00
Matthieu Gautier 91db055d86 Remove function to read file using a native path.
All paths must be utf8. This is already the case in all our projects.
(If this is not the case, it is a bug.)

So we don't need to have both a native and a utf8 path version.
2020-01-13 16:59:58 +01:00
Matthieu Gautier 5540149e2b Correctly open the library path on windows.
We need to convert the path to wstring on Windows to handle directory
with accented characters.

Fix kiwix-desktop#269
2020-01-13 16:54:09 +01:00
Matthieu Gautier 071e9e3fec Correctly filter the catalog when we don't want to filter.
Set the different filter fields only when we are requested to filter
on them. Otherwise, we end up requesting that some fields be empty.

If the request has no argument, an exception is raised (and caught), so
we don't set the corresponding field in the filter.

Fix #303
2020-01-07 17:29:43 +01:00
Matthieu Gautier 4a01303438 Correctly set the id of the opds stream.
Set the id of the stream *after* the uuid is generated.
2020-01-07 17:29:29 +01:00
Kelson b7c5e5f339 Remove trailing spaces 2019-12-08 11:52:16 +01:00
Kelson 52e165cf78 Reintroduce kiwix-serve taskbar 2019-11-26 11:54:00 +01:00
Kelson 9c4867a95a Update Changelog 2019-11-20 13:06:24 +01:00
Emmanuel Engelhart de7b7c34b5 Remove absolute internal URL support 2019-11-07 18:05:58 +01:00
luddens 20a2c78733 add get aria2 launch cmd method 2019-11-01 15:27:21 +01:00
luddens 9850be7267 add Curl error message 2019-11-01 15:27:21 +01:00
luddens 0dd996c6a3 add try catch around aria2 first commands 2019-11-01 15:27:21 +01:00
luddens c9a15c9961 Add a parameter to getBookmarks fct to get valid bookmarks only
The default value of this parameter is false, in which case all the bookmarks
are returned; otherwise only those related to books of the library are returned.
2019-10-31 14:05:21 +02:00
luddens 9975e0b369 add setPort() method 2019-10-28 15:56:49 +01:00
Aditya-Sood 2af9ba4eab Readd original getNextSuggestion() 2019-10-01 13:30:35 +05:30
Aditya-Sood c007373b46 Re-add comment 2019-10-01 13:30:34 +05:30
Aditya-Sood e1acf9acff Code & local repository cleanup 2019-10-01 13:30:34 +05:30
Aditya-Sood daaadf3e1c Comment out previous definitions 2019-10-01 13:30:34 +05:30
Aditya-Sood 74bd482335 Preliminary work 2019-10-01 13:30:34 +05:30
Matthieu Gautier 67170709bb Convert paths obtained from the Windows environment to utf8.
Fix kiwix/kiwix-desktop#203
2019-09-25 18:07:42 +02:00
Matthieu Gautier 0db06d98a8 Add missing implementation of android's getArticleCount and getMediaCount.
Fix #281
2019-09-24 11:47:05 +02:00
Matthieu Gautier 598dd3c175 [API Break] Fix pathTools (and a bit stringTools).
API changes:
 - `removeLastPathElement` no longer takes the extra arguments
   `removePreSeparator` and `removePostSeparator`.
   They are not needed, as paths do not need a special trailing separator.
 - Only one `split` function. Arguments can be implicitly converted to
   string. No need for overloaded functions to explicitly cast them.
 - The `split` function takes another argument, `trimEmpty`. If true, empty
   elements are removed.

Path manipulation now almost always goes through a vector<string> storing each
part of the path.

Most of the complex work is now done in the normalizeParts function.
2019-09-19 18:16:06 +02:00
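The `trimEmpty` behaviour introduced in 598dd3c175 can be sketched as follows. This is an assumption-laden illustration, not the kiwix-lib code: `splitSketch` is a hypothetical name, and the sketch assumes that every character of `delims` acts as a separator.

```cpp
#include <string>
#include <vector>

// Sketch only: split `str` on any character of `delims`.
// When `trimEmpty` is true, empty elements are dropped from the result.
std::vector<std::string> splitSketch(const std::string& str,
                                     const std::string& delims,
                                     bool trimEmpty)
{
  std::vector<std::string> parts;
  std::string::size_type start = 0;
  while (true) {
    const auto end = str.find_first_of(delims, start);
    const auto len = (end == std::string::npos) ? std::string::npos : end - start;
    const std::string token = str.substr(start, len);
    if (!trimEmpty || !token.empty()) {
      parts.push_back(token);
    }
    if (end == std::string::npos) break;
    start = end + 1;
  }
  return parts;
}
```

Splitting `/a//b/` on `/` gives `a` and `b` with `trimEmpty` set, and five elements (three of them empty) without it. Dropping the empty elements is convenient when the parts are later normalized and joined back into a path.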
Matthieu Gautier 2f4636e2df Fix stringTools join function. 2019-09-17 16:22:28 +02:00
Matthieu Gautier 9b4419f3fc [ABI Break] Correctly detect the executable path in appimage.
There are two executable path :
- The user one (the appimage path)
- The real one (in the appimage archive)

When we search for `library.xml` we need the user one.
But when we search for `aria2c` or `kiwix-serve` we need the real one.

Fix kiwix/kiwix-desktop#256
2019-09-17 11:23:16 +02:00
Matthieu Gautier 6ee174b546 Add a method to get the value of a specific tag.
Fix #258
2019-09-17 10:37:53 +02:00
Matthieu Gautier 2a6772b76d [API Change] Convert tags to the new convention.
Use the new convention described here: https://wiki.openzim.org/wiki/Tags
2019-09-17 10:30:24 +02:00
Matthieu Gautier 660d5d7fb7 [API Change] Rename getMetatag to getMetadata. 2019-09-16 10:36:04 +02:00
Matthieu Gautier 157c1c939c Add a string tool to join a list of strings together. 2019-09-16 09:42:10 +02:00
Matthieu Gautier bd91e89785 Add missing method to get the zim metadata.
According to https://wiki.openzim.org/wiki/Metadata
2019-09-12 15:33:07 +02:00
Matthieu Gautier 1245d4e467 Use a macro to get the content of the metadata. 2019-09-12 15:26:53 +02:00
Matthieu Gautier 420be55bfa Reorder methods to get metadata.
Use the same order as https://wiki.openzim.org/wiki/Metadata
2019-09-12 15:24:17 +02:00
Matthieu Gautier e42e061d45 Add a way to specify a library to use with kiwix-serve.
If kiwix-desktop uses a `library.xml` in the same directory as the
executable, we need to use it instead of the default one.

Instead of detecting again which `library.xml` to use, let `kiwix-desktop` set
the library to use.

This also fixes an issue when `/` is not a valid path separator on Windows.
2019-09-11 14:04:21 +02:00
Matthieu Gautier 3294508d87 Correctly cast double to int.
Ms cl compiler complains about the implicit conversion.
2019-09-10 14:10:40 +02:00
Matthieu Gautier a32363e6a2 Correctly detect the executable path if we use a AppImage.
AppImage works by decompressing the "program" in a temporary directory.
So the executable path is not the path of the AppImage file.

By using the environment variables set by appimage we can find the correct
"path" of the executable.

Fix kiwix/kiwix-desktop#46
2019-09-09 18:27:53 +02:00
Matthieu Gautier 87dc145dc7 Correctly set searcher information even if resultStart equals resultEnd. 2019-09-09 14:43:51 +02:00
Matthieu Gautier a13244dc0e Rename `hasResult` to `hasResults` 2019-09-09 14:43:51 +02:00
Matthieu Gautier 78dbd66522 [HTML Rendering] Do not render page navigation buttons if only one page. 2019-09-09 14:43:51 +02:00
Matthieu Gautier fdc291b7c2 [HTML Rendering] Do not do division by zero.
We must correctly handle the case if resultStart is equal to resultEnd.
2019-09-09 14:43:51 +02:00
Kelson d0833bdcd4 Fix fulltext search link in kiwix-serve suggestions 2019-09-04 17:07:05 +02:00
Matthieu Gautier 12a93c3e29 Fix use of strtok on windows.
On Windows, strtok_r is called strtok_s.
2019-08-19 17:35:22 +02:00
Emmanuel Engelhart cea201b394 Specs says ZIM favicon should be at '-/favicon', should be tried first 2019-08-19 16:27:55 +02:00
mhutti1 8672aede97 Add JNIKiwixString constructors 2019-08-13 17:55:48 +02:00
Matthieu Gautier 6ab52e2b9e Add a JNI method to check if an url exists and is a redirection. 2019-08-13 11:19:39 +02:00
Matthieu Gautier d90a27af11 Fix nameMapper initialization. 2019-08-12 16:16:26 +02:00
Matthieu Gautier a65e192f0f [JNI] Allow android to know that an article is a redirect.
Android needs to handle the redirection by doing a redirection in the web
view, not by providing the content of the targeted article.

This is already what we do in kiwix-serve or iOS.

The API would be far better if it returned an Entry, but for now
we just change the given url if the article is a redirection.
2019-08-12 13:03:20 +02:00
Matthieu Gautier 513cc9c90f [JNI] Fix log typo. 2019-08-12 12:43:52 +02:00
Matthieu Gautier 231ae095f6 Correctly set that book's path is valid when updating it from a reader. 2019-08-12 12:43:52 +02:00
Matthieu Gautier 9a0c6da018 Library construction doesn't take argument 2019-08-12 12:43:52 +02:00
Matthieu Gautier 52299ef767 Fix computeAbsolutePath.
Correctly delete the duplicated string.
Use strtok_r to be thread safe.
2019-08-12 12:43:52 +02:00
Matthieu Gautier c4963268ba Fix regexTools.
The buildMatcher must not take an rvalue as it will keep a reference
to it.
2019-08-12 12:05:51 +02:00
Matthieu Gautier 73a29ccb24 [JNI] Fix implementation of setDataDirectory.
Now that setDataDirectory is a static method, we need to take a
jclass instead of a jobject.
2019-08-11 15:09:26 +02:00
Matthieu Gautier 7060afae66 [SERVER] Catch any error and return a 500 response instead of crashing.
The server will be running some code on behalf of the calling code.
We really don't want to crash the library (and the binary) because
of a wrong request.
2019-08-11 11:30:43 +02:00
Matthieu Gautier 4d3df4e889 Add JNI wrapper around the library and the server. 2019-08-11 11:30:43 +02:00
Matthieu Gautier cd050ddcc8 Use camelCase. 2019-08-11 11:30:43 +02:00
Matthieu Gautier 635d4438e5 Make the server take a pointer to the library instead of a reference. 2019-08-11 11:30:43 +02:00
Matthieu Gautier ce09375c6c Reduce complexity of handle_search. 2019-08-11 11:30:43 +02:00
Matthieu Gautier fae0918f49 Reduce complexity of handle_catalog. 2019-08-11 11:30:43 +02:00
Matthieu Gautier d90f8b0f05 Add a name_mapper mapping the HumanReadable name to the id. 2019-08-11 11:30:43 +02:00
Matthieu Gautier abb5db0193 Clean up includes in server.cpp 2019-08-11 11:30:43 +02:00
Matthieu Gautier e452f5cf36 Ensure that the root is correctly formatted. 2019-08-11 11:30:43 +02:00
Matthieu Gautier c890e1c87e Add support of a binding to a specific ip address. 2019-08-11 11:30:43 +02:00
Matthieu Gautier e5ef3780db Rename humanReadableBookId to bookName.
`humanReadableBookId` is a bit long and doesn't represent what it is
(this is not an id).
`bookName` is far better.
2019-08-11 11:30:43 +02:00
Matthieu Gautier 2aeed65205 Better handling of invalid request.
Do not crash if we cannot get a book or a reader for the requested content.
2019-08-11 11:30:43 +02:00
Matthieu Gautier c1faf55ae8 Introduce the server functionality in the kiwix-lib.
This code is mainly copied from kiwix-tools.

But:
- Move all the response handling into a new class Response.
- This Response class is responsible for handling all the MHD_response
  configuration. This way the server handles a global object and makes
  no calls to MHD_response*
- Server uses the mustache templating system a lot more.
  There are still a few regex operations (because we need to
  change already existing content).
- By default, the server serves the content using the id as name.
- Server creates a new Searcher per request. This way, we don't have
  to protect the search for multi-threading and we can run several searches
  at the same time.
- Search results are not cached; this will allow future improvements in the
  search algorithm.
- The home page is not cached.
- A bit more verbose information (number of requests served, time spent
  responding to a request).

TODO:
 - Re-add interface selection.
 - Do Android wrapper.
 - Remove KiwixServer (which uses an external process).
2019-08-11 11:30:43 +02:00
Matthieu Gautier 64dfea2547 Move the search html renderer in a different class than the searcher.
These are two different functionalities; we don't need to pollute the searcher
api with things to render the html.
2019-08-11 10:19:48 +02:00
Matthieu Gautier cca5980b27 Remove limitation of the search len in the Searcher.
The limitation should be made elsewhere (the code using the searcher).
2019-08-11 10:19:48 +02:00
Matthieu Gautier ce8fff0b42 Make the library create the reader. 2019-08-11 10:19:48 +02:00
Matthieu Gautier e56335109c Make appendToFirstOccurence take argument by reference. 2019-08-11 10:19:48 +02:00
Matthieu Gautier 656bf183b7 Make getHumanReadableFromPath method const. 2019-08-11 10:19:48 +02:00
Matthieu Gautier cbe8e20118 Fix include in otherTools.h 2019-08-11 10:19:48 +02:00