Commit Graph

1043 Commits

Author SHA1 Message Date
Veloman Yunkan 208dece7e3 Reordered several statements
Reordered several statements so that the next couple of commits are a
little simpler.
2021-06-08 20:37:00 +04:00
Veloman Yunkan 19b59fd72f Serving /catalog/v2/entries
/catalog/v2/entries is intended to play the combined role of
/catalog/root.xml and /catalog/search of the old OPDS API. Currently,
the latter role is not yet implemented.

Implementation note: instead of tweaking and reusing
`OPDSDumper::dumpOPDSFeed()`, the generation of the OPDS feed is done via `mustache`
and a new template `static/catalog_v2_entries.xml`.
2021-06-08 20:37:00 +04:00
Veloman Yunkan 92c2de8d46 Enter InternalServer::m_library_id
The new field is intended to serve as a seed for generating semi-stable
OPDS feed ids that only need to change when the library is updated.
2021-06-08 20:37:00 +04:00
Veloman Yunkan 2e53b51696 Serving /catalog/v2/categories 2021-06-08 20:37:00 +04:00
Veloman Yunkan b259afa408 Library::getBooksCategories()
Note: no unit test added
2021-06-08 20:37:00 +04:00
Veloman Yunkan 3c3cf08a1a Serving /catalog/v2/root.xml
Note: This commit somewhat relaxes validation of non-variable
`<updated>` elements in the OPDS feed - the contents of any `<updated>`
element are replaced with the YYYY-MM-DDThh:mm:ssZ placeholder.
2021-06-08 16:03:43 +04:00
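The relaxed validation described in commit 3c3cf08a1a can be pictured as a normalization step applied to a feed before it is compared with the expected text. This is a minimal sketch, not the project's test code: the function name `maskUpdatedElements` is hypothetical, and it assumes that `<updated>` elements carry no attributes.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces the contents of every <updated> element
// with a fixed placeholder, so that feeds generated at different times
// compare equal.
std::string maskUpdatedElements(const std::string& feed) {
  static const std::regex updatedRe("<updated>[^<]*</updated>");
  return std::regex_replace(feed, updatedRe,
                            "<updated>YYYY-MM-DDThh:mm:ssZ</updated>");
}
```

Text outside `<updated>` elements passes through unchanged, so the rest of the feed is still validated strictly.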
Veloman Yunkan 54b78eaf56 Moved gen_date_str() to tools/otherTools.cpp 2021-06-08 16:03:43 +04:00
Veloman Yunkan 1e0ff1fbb0 Fixed the double colon in OPDS date string 2021-06-08 16:03:43 +04:00
Veloman Yunkan 5b272ac49c Fixed handling of /catalogBLABLA/root.xml & alike
Also removed an unneeded namespace qualifier.
2021-06-08 16:03:43 +04:00
Manan Jethwani bb92f26b60 added filter functionality 2021-06-07 15:37:20 +02:00
Manan Jethwani 063bb8cd65 added dynamic and subset loading of zim-files in kiwix-serve 2021-06-01 19:33:42 +05:30
Maneesh P M e2f6d91d51 Remove get_readerIndex in favor of get_zimId
The function get_readerIndex was used to get the zimId using an ordered
vector of readers. Now we can use get_zimId directly.
2021-05-26 14:45:25 +02:00
Maneesh P M c35f6f9142 Add `get_zimId` method to Result
get_zimId method allows the user to get the uuid of the archive from
which a result is retrieved directly from the search result itself.
2021-05-26 14:45:25 +02:00
Maneesh P M 5567d8ca49 Replace std::vector<std::string> with SuggestionItem
Each suggestion used to be stored as a vector of strings holding various values
such as title, path, etc. With this commit, we use the new
dedicated class `SuggestionItem` to do the same.
2021-05-26 10:53:39 +02:00
Maneesh P M 56434de79e Set label to title snippet if present
With openzim/libzim#545 we now support snippet generation of titles
which can be used as the display label on the ui for highlighted titles
via the "label" field.
The old version used plain title which is still available in the value
field.
2021-05-26 10:52:58 +02:00
Maneesh P M e5fac30cee Update libkiwix with search iterator rename in libzim
The search iterator API in libzim has been shifted to camel case naming.
This has to be accommodated in libkiwix as well.
2021-05-26 08:39:13 +02:00
Matthieu Gautier 2736a46cfe Revert "Kiwix Serve welcome page dynamic and subset loading (OPDS based)" 2021-05-25 17:30:05 +02:00
Manan Jethwani 012973d14a added dynamic and subset loading of zim-files in kiwix-serve 2021-05-25 02:41:12 +05:30
Emmanuel Engelhart d4e35c7067 Rename kiwix-lib in libkiwix 2021-05-23 21:46:52 +02:00
Veloman Yunkan cd02b4de3b Dummy application of new libzim search API
Didn't take any advantage of the new libzim search API. Just fixed the
libkiwix build in the most straightforward way.
2021-05-15 23:34:51 +04:00
Emmanuel Engelhart 05cc3d015f Insert root link only if html content 2021-05-14 14:49:28 +02:00
Veloman Yunkan 68189de162 /catalog/search handles out-of-bounds pagination 2021-05-10 11:25:06 +02:00
Veloman Yunkan 41276341d0 Empty query acts as a match-all query
After switching to Xapian-based search in the library/catalog, an empty
query stopped acting as a match-all query. This commit restores the old
behaviour in that regard.
2021-05-09 15:14:43 +02:00
Maneesh P M be6b58c6ad Revert "added 204 code for empty return of search"
Returning status code 204 in case of empty results doesn't show the
empty results page as described in #466. Reverting the changes in #396
fixes the issue.
2021-05-09 10:47:18 +05:30
Emmanuel Engelhart 950e742116 No metalink file on fs 2021-05-04 13:15:43 +02:00
Veloman Yunkan 3879b82112 const-correct kiwix::Library
- Made most methods of kiwix::Library const.
- Also added const versions of getBookById() and getBookByPath()
  methods.
2021-04-28 11:42:55 +04:00
Veloman Yunkan 63e9a09259 Cleaned up/beautified Library::updateBookDB() 2021-04-27 16:59:21 +04:00
Veloman Yunkan 4178c169dd Xapian documents in book DB store only the book id 2021-04-27 16:59:21 +04:00
Veloman Yunkan f751aff2fb Full case/diacritics insensitivity in catalog filtering
Catalog filtering should now be case/diacritics insensitive for all
fields. However it is not validated for language, name and category
fields, and is validated for tags, creator & publisher only for text
supplied in the filter (but not for values read from the book).
2021-04-27 16:59:21 +04:00
Veloman Yunkan 87dc9d2723 Made catalog filtering by query diacritics insensitive
Catalog filtering by titles/description was sensitive to diacritics
present in the query string. Fixed that.

Also enhanced the unit test to validate the insensitivity to diacritics
present in either the title/description or the query string.
2021-04-27 16:59:21 +04:00
Veloman Yunkan 9c7366890d Catalog filtering by tags works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 19e195cb7d Filter::Tags typedef 2021-04-27 16:59:21 +04:00
Veloman Yunkan 3d5fd8f585 Catalog filtering by creator works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan d3d5abe14d Handling of non-words in publisher query
This change fixes the failure of the LibraryTest.filterByPublisher
unit-test broken by the previous commit.

The previous approach used in `publisherQuery()` for building a phrase
query enforcing the specified prefix for all terms fails if

1. the input phrase contains a non-word term that Xapian's query parser
   doesn't like (e.g. a standalone ampersand character, 1/2, a#1, etc);
2. the input phrase contains at least three terms that Xapian's query
   parser has no issue with.

Using the `quest` tool (coming with xapian-tools under Ubuntu) the
issue can be demonstrated as follows:

```
$ quest -o phrase -d some_xapian_db "Energy & security"
Parsed Query: Query((energy@1 PHRASE 11 Zsecur@2))
Exactly 0 matches
MSet:

$ quest -o phrase -d some_xapian_db "Energy & security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

$ quest -o phrase -d some_xapian_db 'Energy 1/2 security act'
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

$ quest -o phrase -d some_xapian_db "Energy a#1 security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries
```

The problem comes from parsing the query with the default operation set
to `OP_PHRASE` (exemplified by the `-o phrase` option in the above
invocations of `quest`). A workaround is to parse the phrase with a
default operation of `OP_OR` and then combine all the terms with
`OP_PHRASE`.

Besides, stemming should be disabled in order to target an exact phrase
match (save for the non-word terms, if any, that are ignored by the
query parser).
2021-04-27 16:59:21 +04:00
Veloman Yunkan a759ab989f Catalog filtering by publisher works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 7ccd9ffcce Catalog filtering by language works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 0c0a37073b Catalog filtering by category works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 415c65cf03 Catalog filtering by book name works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 8287f351e7 Final logic of Library::filterViaBookDB()
Moved the `filter.hasQuery()` check inside `buildXapianQuery()`.
`Library::filterViaBookDB()` only cares if the query that is going to be
run on the book DB would match all documents. The rest of changes
related to enhancing the usage of Xapian for the catalog search will
happen inside `buildXapianQuery()` and `updateBookDB()`.
2021-04-27 16:59:21 +04:00
Veloman Yunkan ea779ac200 Extracted buildXapianQuery() 2021-04-27 16:59:21 +04:00
Veloman Yunkan 80cd1fc989 Renamed 2 functions in Filter and Library 2021-04-27 16:59:21 +04:00
Veloman Yunkan 2d76f8395e Dropped unused functions from Filter's private API
This should have been done back in PR #460
2021-04-27 16:59:21 +04:00
Manan Jethwani 965b9622c2 removed redirect to articles in search 2021-04-20 20:23:42 +05:30
Veloman Yunkan 9d4370403b get_url() was renamed in zim::search_iterator 2021-04-16 13:30:36 +04:00
Vertigo 611146aa37 Added Search Link for bad bookName/articleName on 404 2021-04-12 21:31:47 +05:30
Veloman Yunkan b54215f146 Manager::readOpds() doesn't modify its input 2021-04-12 15:14:12 +02:00
Veloman Yunkan 9033f2f28e Manager::readXml() doesn't modify its input 2021-04-12 15:14:12 +02:00
Veloman Yunkan ec9186b174 Library::removeBookById() updates the search DB
This fix makes the `XmlLibraryTest.removeBookByIdUpdatesTheSearchDB`
unit-test pass.
2021-04-09 17:06:45 +04:00
Veloman Yunkan aaaa5a637e Library::filter() doesn't create empty books
This changes how the `XmlLibraryTest.removeBookByIdUpdatesTheSearchDB`
unit-test fails.
2021-04-09 17:06:45 +04:00
Veloman Yunkan 24ed96a38c Library.removeBookById() drops the reader too
This fix makes the `XmlLibraryTest.removeBookByIdDropsTheReader`
unit-test pass.
2021-04-09 17:05:56 +04:00
Manan Jethwani 5cb276a933 adding kind and path attributes to suggest response object and using it in autocomplete 2021-04-07 21:04:33 +05:30
Veloman Yunkan aa2a031ba4 Xapian headers are not exposed through libkiwix 2021-04-07 18:24:33 +04:00
Manan Jethwani 7872734f44 changed method of injecting root link 2021-03-24 14:17:58 +05:30
Manan Jethwani c557bb271b injecting root link directly and renamed head_part to head_taskbar 2021-03-24 02:10:16 +05:30
Manan Jethwani 93264f7409 added root functionality for block external link feature 2021-03-23 03:17:14 +05:30
Veloman Yunkan e214efecd4 Language code conversion via ICU
Language code is converted from ISO 639-3 to ISO 639 (which is
understood by Xapian) via ICU. The previous approach via an explicit
map had its advantages since Xapian has more than one stemmer
implementation for some languages (selectable via Xapian-specific
identifiers). This commit relies on the defaults associated with the
ISO 639 language codes.
2021-03-17 14:32:03 +01:00
Veloman Yunkan 09233bf4f3 Support for partial queries in catalog search
The search text in the catalog query is interpreted as partial by
default, but partial query mode can be disabled in C++. The latter
possibility is not exposed via the /catalog/search kiwix-serve endpoint,
though.
2021-03-17 14:32:03 +01:00
Veloman Yunkan a599fb3892 Initial version of Xapian-based catalog search 2021-03-17 14:32:03 +01:00
Veloman Yunkan a17fc0ef2d Library::getBooksByTitleOrDescription() 2021-03-17 14:32:03 +01:00
Veloman Yunkan db06b2c7ca Library::BookIdCollection typedef 2021-03-17 14:32:03 +01:00
Veloman Yunkan a20f9e2ce1 Library::filter() works in two stages
1. Get the subset of books matching the q (title/description) parameter
   of the search

2. Filter out books not matching the other parameters of the search.

Stage 1 currently works in the old way, but will be replaced by
Xapian-based search in subsequent commits.
2021-03-17 14:32:03 +01:00
Veloman Yunkan b7b0bdbdd8 Both Book::update() methods update the category 2021-03-17 14:10:57 +04:00
Veloman Yunkan 4abc4f8518 Support for book category attribute in library.xml 2021-03-17 14:10:57 +04:00
Veloman Yunkan 6b2067c236 Reading category element from OPDS stream 2021-03-17 14:10:57 +04:00
Veloman Yunkan e55bf514e8 Dedicated 'category' parameter in catalog search 2021-03-17 14:10:57 +04:00
Veloman Yunkan 80d4f7e349 Extracted InternalServer::search_catalog() 2021-03-17 14:10:57 +04:00
Veloman Yunkan 58186ffb26 kiwix::Book::getCategory() 2021-03-17 14:09:48 +04:00
Veloman Yunkan ae32ff40c0 Dropped an extra colon from book <updated> dates 2021-03-17 14:02:27 +04:00
Veloman Yunkan 26331b401e Fixed the month in OPDS feed <updated> date
`tm::tm_mon` varies in the [0, 11] range.
2021-03-17 14:02:27 +04:00
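The off-by-one behind commit 26331b401e is easy to reproduce, since `tm_mon` counts months from 0 and `tm_year` counts years from 1900. A minimal sketch of a formatter that accounts for both follows; the name `formatOpdsDate` is hypothetical and the project's own date helper may be written differently.

```cpp
#include <cstdio>
#include <ctime>
#include <string>

// Formats a broken-down UTC time as YYYY-MM-DDThh:mm:ssZ.
std::string formatOpdsDate(const std::tm& t) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%04d-%02d-%02dT%02d:%02d:%02dZ",
                t.tm_year + 1900,  // tm_year counts years since 1900
                t.tm_mon + 1,      // tm_mon is 0-based: [0, 11]
                t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec);
  return buf;
}
```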
Matthieu Gautier 67caae6c32 Use the new libzim's getRandomEntry instead of implementing it ourselves. 2021-03-02 14:16:09 +01:00
Veloman Yunkan 839fc10a4f Fixed the Windows build
Opening ZIM archives by file descriptor (as well as embedded
ZIM archives) is not supported under Windows.
2021-02-10 14:19:47 +01:00
Veloman Yunkan 5a8b825c70 Testing of JNIKiwixReader.getDirectAccessInformation() 2021-02-10 14:19:47 +01:00
Veloman Yunkan 7a465e66d7 Renamed org.kiwix.kiwixlib.{Pair->DirectAccessInfo} 2021-02-10 14:19:47 +01:00
Veloman Yunkan 5a99634dfd Java wrapper test checks favicon.png too 2021-02-10 14:19:47 +01:00
Veloman Yunkan e028bcbb04 Android's java.io.FileDescriptor is different 2021-02-10 14:19:47 +01:00
Veloman Yunkan 9cdf7a44c0 JNIKiwixReader can open an embedded ZIM archive 2021-02-10 14:19:47 +01:00
Veloman Yunkan 4d23e44de7 JNIKiwixReader ctor taking a file descriptor
... and a corresponding unit test
2021-02-10 14:19:47 +01:00
Veloman Yunkan 98d69ef59b Added testReader unit-test for the java wrapper 2021-02-10 14:19:47 +01:00
Veloman Yunkan e40827fbac Renamed the java wrapper unit test runner script 2021-02-10 14:19:47 +01:00
Veloman Yunkan a798e0c0a1 Made the java wrapper unit test run & pass
The kiwixlib java wrapper unit test can be run manually via the
src/wrapper/java/org/kiwix/testing/compile_test.sh script.

The test ZIM files in src/wrapper/java/org/kiwix/testing were created
using the create_test_zimfiles script. They must be updated/re-generated and
committed in git whenever their source data or the create_test_zimfiles
script changes. Note: small.zim.embedded is not used at this point; it
was created for testing the enhancement coming in a few commits.
2021-02-10 14:19:47 +01:00
Matthieu Gautier 24b2e6e585 Remove unnecessary include. 2021-01-26 17:53:25 +01:00
Matthieu Gautier 3fd1310008 Use c++11 std::thread instead of pthread. 2021-01-26 17:53:25 +01:00
Matthieu Gautier 4749656828 Do not crash if zim file has no `Counter` metadata. 2021-01-26 15:15:27 +01:00
Emmanuel Engelhart 84895c4036 Better </head> detection regex 2021-01-18 13:16:56 +01:00
Emmanuel Engelhart a8bf9dd5b4 Better Kiwix Serve Taskbar insertion (after charset definition) 2021-01-18 11:18:53 +01:00
Emmanuel Engelhart a61c94ef10 Add GPLv3 header 2021-01-18 10:54:33 +01:00
Emmanuel Engelhart 8c43fd8d36 Fix taskbar insertion in case of '<head>' attributes 2021-01-11 14:37:19 +01:00
Emmanuel Engelhart 3e2810dff4 Support 'video/*' & 'audio/*' mimetypes in getMediaCount() 2021-01-07 12:32:32 +01:00
Emmanuel Engelhart 44c4aa931a Better use kiwix::startsWith() 2021-01-03 15:17:03 +01:00
Emmanuel Engelhart 95b32b168d More robust getMediaCount() 2021-01-01 17:05:32 +01:00
Matthieu Gautier 1002c15e0d Remove unnecessary checks.
`Reader` cannot be created with a null `zimArchive`.
We don't have to check for zimArchive being not null.
2020-12-09 14:25:02 +01:00
Matthieu Gautier d51000c4a9 Use new libzim method `hasFulltextIndex` to check for fulltext index. 2020-12-09 14:25:02 +01:00
Matthieu Gautier ba302bed33 Use new libzim method `getFaviconEntry` to get the favicon. 2020-12-09 14:25:02 +01:00
Steve Wills 6900b4e506 fix build on FreeBSD
With this header, sockaddr_in and INADDR_ANY are not defined
2020-12-07 09:38:46 -05:00
Matthieu Gautier 1a5a2e7a8e Adapt kiwix-lib to the new libzim api. 2020-12-02 12:16:48 +01:00
Matthieu Gautier d87079ec13 Remove deprecated method in the reader. 2020-11-24 19:00:52 +01:00
Veloman Yunkan 0f8fe1f63f Alternative implementation of parseMimetypeCounter() 2020-10-29 14:11:27 +04:00
Matthieu Gautier 08464f23bc Better parsing of `M/Counter`
A mimetype may contain parameters.
Then, the mimetype would be something like "text/html;foo=bar;foz=baz"

It then contains `;` and `=`, which conflict with the same operators
we use to separate the items in our list.

We have to use a more advanced algorithm which takes the context into
account.

Fix #416
2020-10-28 16:03:18 +01:00
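The context-aware parsing that commit 08464f23bc calls for could look like the sketch below. It is not the project's `parseMimetypeCounter`: the function name, the `mimetype=count` item layout joined by `;`, and the rule used to tell a parameter from a new item (a new item starts with a token whose part before the first `=` contains a `/`) are assumptions made for illustration.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

using MimeCounterSketch = std::map<std::string, unsigned>;

MimeCounterSketch parseMimetypeCounterSketch(const std::string& data) {
  // Pass 1: split on ';' and glue mimetype parameters back onto the item
  // they belong to. Assumption: parameter names never contain '/'.
  std::vector<std::string> items;
  std::istringstream in(data);
  std::string token;
  while (std::getline(in, token, ';')) {
    const std::string head = token.substr(0, token.find('='));
    const bool startsNewItem = head.find('/') != std::string::npos;
    if (startsNewItem || items.empty()) {
      items.push_back(token);
    } else {
      items.back() += ";" + token;
    }
  }

  // Pass 2: the counter follows the last '=' of each item.
  MimeCounterSketch counters;
  for (const auto& item : items) {
    const auto eq = item.rfind('=');
    if (eq == std::string::npos || eq == 0 || eq + 1 == item.size()) continue;
    const std::string countStr = item.substr(eq + 1);
    if (countStr.size() > 9 ||
        countStr.find_first_not_of("0123456789") != std::string::npos) {
      continue;  // not a plain counter that fits in an unsigned value
    }
    counters[item.substr(0, eq)] = static_cast<unsigned>(std::stoul(countStr));
  }
  return counters;
}
```

Under these assumptions, `text/html;raw=true=12;image/png=3` yields two entries, with the parameter kept as part of the first mimetype.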
Matthieu Gautier ef42abea4b Add some tests of `parseMimetypeCounter` 2020-10-28 14:44:23 +01:00
Matthieu Gautier 4407dd12bd Move mimetypeCounter parsing in its own function. 2020-10-28 14:08:06 +01:00
Matthieu Gautier 632583ede2 Add missing include 2020-10-07 18:43:57 +02:00
Matthieu Gautier 61f9d4ab3a Stop the internal server only if it exists. 2020-10-07 14:36:45 +02:00
Matthieu Gautier 470bfc3f1f Better variable name for outStream. 2020-08-28 15:27:03 +02:00
Matthieu Gautier ea3180cb8c Better error printing. 2020-08-28 15:27:03 +02:00
Matthieu Gautier 72d3f8f8e2 Fix segmentation fault with curl requests.
Use a heap allocated buffer (with lifetime of Aria2 class) instead of
a stack allocated one.

Original fix made by @ZaWertun. Kudos to him.

Fix kiwix/kiwix-desktop#123, kiwix/kiwix-desktop#513
and kiwix/kiwix-desktop#423
2020-08-26 12:42:16 +02:00
Matthieu Gautier af9e03904c Use std::mutex and std::unique_lock instead of pthread mutex/lock.
It simplifies the code a bit and ensures that the mutex is correctly unlocked
even in case of an exception.
2020-08-26 12:30:56 +02:00
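The exception-safety argument in commit af9e03904c comes from `std::unique_lock` releasing the mutex in its destructor. A minimal sketch follows; the globals and the function name are hypothetical and unrelated to the project's classes.

```cpp
#include <mutex>
#include <stdexcept>

std::mutex g_mutex;  // hypothetical shared lock
int g_counter = 0;   // hypothetical shared state

// Increments the counter under the lock; throws on request to show that
// the lock is released during stack unwinding.
void incrementOrThrow(bool fail) {
  std::unique_lock<std::mutex> lock(g_mutex);
  if (fail) {
    throw std::runtime_error("simulated failure");
  }
  ++g_counter;
}  // the destructor of `lock` unlocks g_mutex on both paths
```

With a raw pthread mutex, the throwing path would need an explicit unlock before the exception leaves the function.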
Matthieu Gautier 39611cbd60 Wait for waitingThread to exit before destroying the subprocess memory.
WaitingThread reads some memory shared with the SubProcess
(`mutex`, `m_running`).
When we destroy the SubProcess, we must be sure that WaitingThread has
correctly finished; otherwise we may have invalid reads/writes on freed memory.
2020-08-26 12:26:04 +02:00
Matthieu Gautier 6f0d3003ac Remove `m_compress` member. 2020-08-13 11:16:41 +02:00
Matthieu Gautier ee17b0739a Fix compilation on CI native dyn.
On the CI, the native_dyn docker image is set up with a packaged version
of libmicrohttpd for which `MHD_HTTP_RANGE_NOT_SATISFIABLE` is not
defined.

Once the CI is fixed, we can revert this commit.
2020-08-13 11:16:41 +02:00
Matthieu Gautier 47436f7bdd Move some header setting in response's constructors.
It makes it easier to understand what is more or less constant and what depends
on the context.
2020-08-13 11:16:41 +02:00
Matthieu Gautier 3352c95314 Remove the `RedirectResponse` and use a basic `Response` with header. 2020-08-13 11:16:41 +02:00
Matthieu Gautier 77123ac74c Move the adding of 304 headers in 304 factory.
This saves us from creating a ContentResponse just to have some correct
headers.
2020-08-13 11:16:41 +02:00
Matthieu Gautier 9078f0ac6e Remove `ResponseMode`. 2020-08-13 11:16:41 +02:00
Matthieu Gautier 8d6567d067 Create a utility builder for 416 response.
Also add a map in the response to store specific headers.
2020-08-13 11:16:41 +02:00
Matthieu Gautier 6d5cddca12 Fix android compilation
Android clang complains about the fact it cannot move the
`std::unique_ptr<ContentResponse>` into a `std::unique_ptr<Response>&&`
(for the implicit `std::unique_ptr<Response>` constructor).
Let's help him a bit.
2020-08-13 11:16:41 +02:00
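One common way to "help" a compiler in the situation described in commit 6d5cddca12 is to spell out the derived-to-base conversion of the `std::unique_ptr`. The sketch below uses simplified stand-ins for `Response` and `ContentResponse` and a hypothetical factory; whether the commit uses this exact form is an assumption.

```cpp
#include <memory>
#include <string>
#include <utility>

// Simplified stand-ins for the server's response classes.
struct Response {
  virtual ~Response() = default;
  virtual std::string kind() const { return "response"; }
};

struct ContentResponse : Response {
  std::string kind() const override { return "content"; }
};

// Hypothetical factory returning the base-class pointer type.
std::unique_ptr<Response> makeContentResponse() {
  std::unique_ptr<ContentResponse> content(new ContentResponse());
  // Spell out the conversion instead of relying on the implicit move
  // in the return statement.
  return std::unique_ptr<Response>(std::move(content));
}
```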
Matthieu Gautier a3939e9a05 Move all the content code in the ContentResponse. 2020-08-13 11:16:41 +02:00
Matthieu Gautier eee621d15b Move small utilities method to create response in Response class. 2020-08-13 11:16:41 +02:00
Matthieu Gautier 7b2ee37437 Move the entry response to its own class. 2020-08-13 11:16:41 +02:00
Matthieu Gautier f014fb2895 Introduce a ContentResponse.
This is only an "interface" for now, as other types of response (entry) may
be "transformed" into a ContentResponse.
We cannot move all the code into the class yet.
2020-08-13 11:16:41 +02:00
Matthieu Gautier 1011d1ff0b Move the redirection response in its own class.
The redirection is the easiest to move, let's start with this one.
2020-08-13 11:16:41 +02:00
Matthieu Gautier 9e351b279e Remove `get_default_response` in favor of a static Response method.
We want to build different kinds of response depending on the context.
2020-08-13 11:16:41 +02:00
Matthieu Gautier a0bdc0821c Move internalServer code into its own source files. 2020-08-13 11:16:41 +02:00
Matthieu Gautier a819d9e3e0 Make the server handle pointer to response instead of plain response.
This is preparatory work.
We will specialize the response and so we need a pointer to a response
instead of a plain response.
2020-08-13 11:16:41 +02:00
manan jethwani c74b935a9b added pageLength for search_pagination 2020-08-12 02:08:02 +05:30
Matthieu Gautier a55d504017 Fix getArticleCount.
With #403, the article mimetype may be different than "text/html".
It can also be "text/html; raw=true".
(And in fact it already could have any kind of optional argument).
2020-08-11 18:27:54 +02:00
Matthieu Gautier 87b5adcaf4 Make the response responsible to detect if we must introduce taskbar.
The response detects whether the taskbar must be added depending on the mimetype.

Now, `set_taskbar` can be called unconditionally
(no need to check for the mimetype)

And we don't need to call set_taskbar if we have no information to set.
2020-08-11 18:27:54 +02:00
Veloman Yunkan c4e6313c90 x in a --> a.contains(x) in meson.build files 2020-08-11 18:17:18 +02:00
renaud gaudin 3f25a3d005 Fixed #391: prevent taskbar and blocker at article level
Some HTML articles are meant to be displayed through a viewer. In this case,
we know we don't want the server to inject the taskbar or the link blocker
because the content is not a user-ready web page but a partial element of it.

Such articles still need to be `text/html` to be parsed properly by browsers.

This changes the way we decide whether to display the taskbar.
Previously, we were adding it to every article with a MIME type __starting with__ `text/html`.
Now, we additionally skip it for a `text/html` MIME type if it contains a `;raw=true` string.

This leaves articles with MIME type `text/html;raw=true` (warc2zim convention) outside
of the taskbar target.

For similar reasons, the external-link blocker is set to apply to the same set of articles.
Previously, it was applied to all articles, which was an (unnoticeable) mistake.
2020-08-07 09:26:24 +02:00
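The rule stated in commit 3f25a3d005 reduces to a small predicate. This is a sketch under the commit's own wording (prefix `text/html`, substring `;raw=true`); the function name `isTaskbarTarget` is hypothetical.

```cpp
#include <string>

// Returns true when the taskbar (and the external-link blocker) should
// be injected for an article with the given MIME type.
bool isTaskbarTarget(const std::string& mimeType) {
  const std::string htmlPrefix = "text/html";
  const bool isHtml = mimeType.compare(0, htmlPrefix.size(), htmlPrefix) == 0;
  // Only the exact ";raw=true" spelling is checked; variants with extra
  // whitespace are not handled by this sketch.
  const bool isRaw = mimeType.find(";raw=true") != std::string::npos;
  return isHtml && !isRaw;
}
```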
MananJethwani 599aaa4c1b added code for status code 204 for empty return of search. 2020-08-01 01:45:42 +05:30
Veloman Yunkan 3d425f44de Request header case is ignored
Originally reported against case sensitivity of the Range header
(see issue #387), this fix applies to all request headers (since
according to RFC 7230 all header fields are case-insensitive, see
https://tools.ietf.org/html/rfc7230#section-3.2). However, a
corresponding unit-test was added only for the Range header.
2020-07-30 16:01:51 +02:00
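Case-insensitive header lookup, as described in commit 3d425f44de, is often obtained by normalizing header names when they are stored and again when they are queried. The sketch below shows that approach; `HeaderMapSketch` and `toLowerAscii` are hypothetical names and the project may implement the fix differently.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Lowercases ASCII letters; header field names are ASCII tokens.
std::string toLowerAscii(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

class HeaderMapSketch {
 public:
  void set(const std::string& name, const std::string& value) {
    headers_[toLowerAscii(name)] = value;
  }

  // Returns the header value, or an empty string when the header is absent.
  std::string get(const std::string& name) const {
    const auto it = headers_.find(toLowerAscii(name));
    return it == headers_.end() ? std::string() : it->second;
  }

 private:
  std::map<std::string, std::string> headers_;
};
```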
Matthieu Gautier 7ece383004 Add support for samba path on windows.
Fix kiwix/kiwix-desktop#429
2020-07-15 11:40:02 +02:00
Kelson cf8e8b94eb Fix compilation with libmicrohttpd v0.97.1 2020-07-08 14:42:46 +02:00
Matthieu Gautier 4d307e18eb Add new thread safe suggestion API.
The previous API used an internal vector to store the suggestion search
results.

The new API takes a vector as an out argument, so users can call the functions
without having to protect the search.

We should change the Android API to reflect the change, but it is a bit
more complex to do at the JNI level. As Android does not call it from multiple
threads, we are safe for now. And we need the new API asap for kiwix-desktop.

So we keep the same API on Android for now; the new API will be made
in the next version.
2020-07-01 17:16:13 +02:00
Kunal Mehta fb79cde729 Pass -latomic for architectures that need it
Some architectures, specifically armel, mipsel, m68k & powerpc in
Debian, need to explicitly link to atomic.

Use meson to see if the target's CPU family is one of those, and if so,
pass -latomic to the linker.

Tested on armel and mipsel machines to verify passing -latomic works, and
on armhf and amd64 to ensure normal builds aren't broken.

Fixes #371.
2020-06-29 00:18:13 -07:00
Matthieu Gautier ff605873ed Include missing `algorithm` header.
`min` and `max` functions are defined here.
2020-06-10 15:27:51 +02:00
Veloman Yunkan 05ef5d5f51 Assertion in ByteRange allows 0-sized content
The assertion in the ByteRange constructor was written under the assumption that the content must have non-zero size. Now it allows that corner case.
2020-06-02 21:53:47 +04:00
Veloman Yunkan f52b220d01 Dropped RequestContext::has_range() 2020-05-26 14:10:26 +04:00
Veloman Yunkan 50a850f3a9 Fixed a comment 2020-05-26 14:04:18 +04:00
Veloman Yunkan 886ae17274 Fixed a CodeFactor issue 2020-05-26 13:59:47 +04:00
Veloman Yunkan 85d6daabac Rolled back minor unneeded changes 2020-05-26 13:10:50 +04:00
Veloman Yunkan 5f1918d005 Split a long line 2020-05-26 13:04:03 +04:00
Veloman Yunkan 16bd79fa1b Final clean-up of byte_range.{h,cpp} 2020-05-26 12:50:08 +04:00
Veloman Yunkan c2ebdefe8d Handling of unsatisfiable ranges 2020-05-26 02:11:26 +04:00
Veloman Yunkan 37032892a4 Fixed compilation error under win32_*
ERROR is a macro under Windows
2020-05-26 01:58:17 +04:00
Veloman Yunkan 6b43438b74 Fixed compilation error under native_dyn
MHD_HTTP_RANGE_NOT_SATISFIABLE is not defined in the older version of
libmicrohttpd (that is used under CI/Linux native_dyn).
2020-05-26 01:54:36 +04:00
Veloman Yunkan 7301bf89bb Some refactoring of byte-range parsing 2020-05-26 01:50:29 +04:00
Veloman Yunkan ff23b28e7c Removed unnecessary qualifier 2020-05-26 01:41:37 +04:00
Veloman Yunkan 931e95f391 Invalid byte ranges result in 416 responses 2020-05-26 01:40:07 +04:00
Veloman Yunkan f7571b5b69 Content-Range header is set only for partial content 2020-05-25 17:42:18 +04:00
Veloman Yunkan 801ad18a89 ByteRange::resolve() 2020-05-25 17:27:35 +04:00
Veloman Yunkan 67a347c0c4 Moved byte-range parsing to byte_range.cpp 2020-05-25 17:21:10 +04:00
Veloman Yunkan 693905eb68 Default constructed ByteRange is a full range 2020-05-25 17:17:56 +04:00
Veloman Yunkan f3e79c6b4c Introduced src/server/byte_range.cpp 2020-05-25 16:43:44 +04:00
Veloman Yunkan 52f207eaa6 Support for single-ended byte ranges 2020-05-25 16:37:01 +04:00
Veloman Yunkan 67294217a8 ByteRange::Kind 2020-05-25 16:23:44 +04:00
Veloman Yunkan d111a40ce8 Response::m_byteRange 2020-05-23 20:35:22 +04:00
Veloman Yunkan 0c5bb3fcfe Moved ByteRange to a header file of its own 2020-05-23 20:08:53 +04:00
Veloman Yunkan 3fba8c20a0 Converted RequestContext::ByteRange to a class
Also renamed the `range_pair` data member of `RequestContext` to `byteRange_`
2020-05-23 19:59:47 +04:00
Veloman Yunkan 54db6049b7 Byte-range parsing not exposed in the header file 2020-05-23 18:58:19 +04:00
Veloman Yunkan 81c38d6b2b parse_byte_range() without side-effects 2020-05-23 18:53:16 +04:00
Veloman Yunkan e6a86c02ae Got rid of RequestContext::accept_range 2020-05-23 17:15:42 +04:00
Veloman Yunkan a0f7f32570 Re-ordered function definitions 2020-05-23 17:11:26 +04:00
Veloman Yunkan c39fce8839 RequestContext::parse_byte_range() 2020-05-23 17:09:51 +04:00
Veloman Yunkan de37489c53 Range header starts with a unit spec
After this commit valid ranges of the form "bytes=firstbyte-lastbyte" should
be handled correctly.
2020-05-22 17:17:31 +04:00
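For the header form named in commit de37489c53, `bytes=firstbyte-lastbyte`, a parser might look like the sketch below. `ParsedRange` and `parseByteRange` are hypothetical names (the project's own parsing lives in `RequestContext`), and single-ended ranges such as `bytes=5-` are deliberately rejected here.

```cpp
#include <cstdint>
#include <string>

struct ParsedRange {
  bool valid;
  std::int64_t first;
  std::int64_t last;
};

// Parses a Range header value of the form "bytes=firstbyte-lastbyte".
ParsedRange parseByteRange(const std::string& headerValue) {
  const ParsedRange invalid = {false, 0, 0};
  const std::string unit = "bytes=";
  if (headerValue.compare(0, unit.size(), unit) != 0) return invalid;

  const std::string spec = headerValue.substr(unit.size());
  const auto dash = spec.find('-');
  if (dash == std::string::npos || dash == 0 || dash + 1 == spec.size()) {
    return invalid;  // both ends are required in this sketch
  }

  const std::string firstStr = spec.substr(0, dash);
  const std::string lastStr = spec.substr(dash + 1);
  const char* digits = "0123456789";
  if (firstStr.find_first_not_of(digits) != std::string::npos ||
      lastStr.find_first_not_of(digits) != std::string::npos ||
      firstStr.size() > 18 || lastStr.size() > 18) {
    return invalid;  // non-numeric, or too long to convert safely
  }

  const std::int64_t first = std::stoll(firstStr);
  const std::int64_t last = std::stoll(lastStr);
  if (first > last) return invalid;
  return {true, first, last};
}
```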
Veloman Yunkan 2a35a86de6 Fixed the size value used creating a response
In case of a partial response the size of the response is different
from the served entry size.
2020-05-22 16:49:35 +04:00
Veloman Yunkan 0a30a77c08 Handling of out of bound byte ranges 2020-05-22 16:46:38 +04:00
Veloman Yunkan 1a99bacfe3 Byte ranges are inclusive
The second component of a byte range, if present, designates the
index of the last byte to be included in the partial response.
2020-05-22 16:30:43 +04:00
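Because both ends of a byte range are included, as commit 1a99bacfe3 points out, the length of a partial response is `last - first + 1`. The sketch below shows the arithmetic; the function name is hypothetical, and clamping `last` to the final byte follows the usual HTTP range semantics rather than anything stated in this log.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of bytes to send for the inclusive range [first, last] of a
// content of contentSize bytes. `last` is clamped to the final byte.
std::int64_t partialResponseLength(std::int64_t first, std::int64_t last,
                                   std::int64_t contentSize) {
  assert(0 <= first && first <= last && first < contentSize);
  const std::int64_t clampedLast = std::min(last, contentSize - 1);
  return clampedLast - first + 1;  // +1 because both ends are included
}
```

For example, the range 0-0 designates exactly one byte, not zero.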
Kelson 94c2ab4395 Add two OPDS related mime-types to compress for HTTP 2020-05-18 08:19:51 +02:00
Kelson 0f07cab920 Small HTTP header beautification 2020-05-17 20:19:19 +02:00
Veloman Yunkan 5f0a9d0b08 Added a comment clarifying a non-obvious case 2020-05-15 15:17:04 +04:00
Veloman Yunkan 54f5dbbd35 Handling of If-None-Match conditional requests 2020-05-14 17:01:22 +04:00
Veloman Yunkan 95a5cde359 ETags are set in the response as needed
Also added server-unit tests related to ETags in the response.
2020-05-14 17:01:22 +04:00
Veloman Yunkan 3d08ef43f2 HEAD request is not rejected
libmicrohttpd handles HEAD requests by dropping the body of the response
(if any). Hence letting a HEAD request through into the code that
processes GET requests is safe.

Also added server unit-tests related to the handling of HEAD requests.
2020-05-14 17:01:22 +04:00
Veloman Yunkan bfa51c2d87 Refactoring: got rid of duplicate get_mime_type() 2020-04-29 18:33:25 +04:00
Veloman Yunkan 81e781133d Refactoring: utilized is_compressible_mime_type() 2020-04-29 18:33:01 +04:00
Veloman Yunkan 9ec7757efe Refactoring: smart Response::set_entry()
Response::set_entry() was upgraded from a simple setter to a method
performing certain business logic that was previously taken care of by
InternalServer::handle_content().
2020-04-29 18:22:15 +04:00
Veloman Yunkan 7bd7ec4937 Refactoring: preparing to move some code 2020-04-29 18:22:15 +04:00
Veloman Yunkan 14d8583c83 Refactoring in InternalServer::handle_content()
Deduplicated common code found in the two branches of the last
if(){}else{} statement in InternalServer::handle_content().
2020-04-29 18:22:15 +04:00
Veloman Yunkan a004d96cd7 Refactoring: extracted get_range_len() 2020-04-29 18:22:15 +04:00
Veloman Yunkan 21c6de2f80 Refactoring: split Response::create_mhd_response()
The changes are easier to understand in ignore-white-space mode
(git diff -w, git show -w).
2020-04-29 18:22:15 +04:00
Veloman Yunkan a8e78f27e1 Refactoring: extracted Response::create_mhd_response() 2020-04-29 18:22:15 +04:00
Veloman Yunkan 6c7ab6ff54 Refactoring: moved local variable declarations 2020-04-29 18:21:40 +04:00
Veloman Yunkan 659ee6ba71 Refactoring: extracted InternalServer::build_redirect() 2020-04-29 16:08:10 +04:00
Veloman Yunkan 83ee8dec15 Made InternalServer::get_default_response() const 2020-04-29 16:08:10 +04:00
Veloman Yunkan 87cbbed9e3 Refactoring: extracted is_compressible_mime_type() 2020-04-29 16:08:10 +04:00
Veloman Yunkan a058520628 Refactoring: extracted get_mime_type() 2020-04-29 16:08:10 +04:00
Veloman Yunkan 1ef5ebfb52 Refactoring: extracted InternalServer::get_reader() 2020-04-29 16:08:10 +04:00
Veloman Yunkan bbc06931ad Refactoring: extracted get_book_name() 2020-04-29 16:08:10 +04:00
Veloman Yunkan 2d3bf9b981 Refactoring: extracted InternalServer::homepage_data()
Also typedef'ed kainjow::mustache::data as MustacheData
2020-04-29 16:08:10 +04:00
Veloman Yunkan fd80f2a89f Refactoring: extracted fullURL2LocalURL()
Also dropped RequestContext::valid_url
2020-04-29 16:08:10 +04:00
Veloman Yunkan abb3dec700 Refactoring: extracted str2RequestMethod() 2020-04-29 16:08:10 +04:00
luddens 0586ef6d41 fix open external zim
Check whether the parameter `pathToSave` is empty before using it; otherwise the
book path is empty too, which causes a crash when opening external zim files
2020-04-20 15:22:36 +02:00
Matthieu Gautier 9d8bf8ddcb Create the dataDirectory before returning its path. 2020-04-15 08:24:55 +02:00
Matthieu Gautier 4c8aad0e68 Do not use std::fstream as it doesn't support wchar path.
This is surprising, but C++11 fstream doesn't have a constructor
that takes a wchar path.
So, on Windows, we cannot open a stream on a path containing non-ASCII
characters. VC++ provides an extension for that, but it is not standard and
MinGW g++ doesn't provide it.

So move all our write/read tool functions to the plain old C versions,
using _wopen to open wide paths on Windows.
2020-04-14 18:13:35 +02:00
Matthieu Gautier eb6f0f710c Correctly detect the dataDir on windows.
We must use the wide version of getenv to correctly handle the case
where the user directory contains accents.

This also changes the default dataDirectory on Windows from $APPDATA to
$APPDATA/kiwix.
2020-04-14 12:12:34 +02:00
Matthieu Gautier cbf5bd57a8 Adapt to new libzim api.
It is not possible to create an iterator without an argument anymore.
2020-04-13 16:06:17 +02:00
Matthieu Gautier 533541cf45 Write the articleCount and mediaCount metadata in the OPDS stream.
This is not standard OPDS. But clients need this information.
2020-04-07 12:22:44 +02:00
renaud gaudin 7155c788e2 attach taskbar to `<head>` instead of `<head>\n`
Fixed a regression introduced in the block-external-links feature.

For cleaner source, the taskbar and the block-external JS file were both
attached to `<head>\n`.
Unfortunately, this isn't safe enough as ZIM files might use all kinds of HTML
syntax. Sotoki, for instance, has no CR after head, rendering the attachment impossible.

Note: this method is still somewhat fragile, as any HTML content with an extra attribute
on the `<head>` tag or without a `<head>` tag would break the taskbar and the block-external feature.
2020-04-03 16:53:43 +02:00
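Commit 7155c788e2 attaches to the literal `<head>` string and notes that attributes on the tag would still break the insertion. A regular expression is one way to tolerate both a missing line break and attributes. The sketch below is a swapped-in technique for illustration, not the commit's code, and `insertAfterHeadTag` is a hypothetical name.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Inserts `snippet` right after the first opening <head> tag, whether or
// not the tag has attributes or is followed by a newline. Returns the
// input unchanged when no <head> tag is found.
std::string insertAfterHeadTag(const std::string& html,
                               const std::string& snippet) {
  // "head" must be followed by '>' or by whitespace, so <header> is skipped.
  static const std::regex headRe("<head(\\s[^>]*)?>", std::regex::icase);
  std::smatch match;
  if (!std::regex_search(html, match, headRe)) return html;

  std::string result = html;
  const auto insertAt =
      static_cast<std::size_t>(match.position(0) + match.length(0));
  result.insert(insertAt, snippet);
  return result;
}
```

Content without any `<head>` tag is left untouched, so that case still needs separate handling.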
renaud gaudin 4709a42f4f disable external links blocking on 500 handler 2020-03-30 14:42:37 +00:00
renaud gaudin d04d9bf7f3 Unblock external link in catch page in JS code
Instead of disabling the blocking for the handler, the JS code detects it is
displaying the handler and allows external links to go through
2020-03-27 12:26:22 +00:00