Commit Graph

948 Commits

Author SHA1 Message Date
dfad1c3815 /catalog/v2/searchdescription.xml 2021-06-08 20:37:00 +04:00
07252a127a /catalog/v2/entries is also a search endpoint 2021-06-08 20:37:00 +04:00
b60e3ffb26 RequestContext::get_optional_param() 2021-06-08 20:37:00 +04:00
70d42aec98 A small simplification 2021-06-08 20:37:00 +04:00
4aa3c792aa Extracted get_search_filter() 2021-06-08 20:37:00 +04:00
208dece7e3 Reordered several statements
Reordered several statements so that the next couple of commits are a
little simpler.
2021-06-08 20:37:00 +04:00
19b59fd72f Serving /catalog/v2/entries
/catalog/v2/entries is intended to play the combined role of
/catalog/root.xml and /catalog/search of the old OPDS API. Currently,
the latter role is not yet implemented.

Implementation note: instead of tweaking and reusing
`OPDSDumper::dumpOPDSFeed()`, the generation of the OPDS feed is done via `mustache`
and a new template `static/catalog_v2_entries.xml`.
2021-06-08 20:37:00 +04:00
92c2de8d46 Enter InternalServer::m_library_id
The new field is intended to serve as a seed for generating semi-stable
OPDS feed ids that only need to change when the library is updated.
2021-06-08 20:37:00 +04:00
2e53b51696 Serving /catalog/v2/categories 2021-06-08 20:37:00 +04:00
b259afa408 Library::getBooksCategories()
Note: no unit test added
2021-06-08 20:37:00 +04:00
3c3cf08a1a Serving /catalog/v2/root.xml
Note: This commit somewhat relaxes validation of non variable
`<updated>` elements in the OPDS feed - the contents of any `<updated>`
element is replaced with the YYYY-MM-DDThh:mm:ssZ placeholder.
2021-06-08 16:03:43 +04:00
54b78eaf56 Moved gen_date_str() to tools/otherTools.cpp 2021-06-08 16:03:43 +04:00
1e0ff1fbb0 Fixed the double colon in OPDS date string 2021-06-08 16:03:43 +04:00
5b272ac49c Fixed handling of /catalogBLABLA/root.xml & alike
Also removed an unneeded namespace qualifier.
2021-06-08 16:03:43 +04:00
bb92f26b60 added filter functionality 2021-06-07 15:37:20 +02:00
063bb8cd65 added dynamic and subset loading of zim-files in kiwix-serve 2021-06-01 19:33:42 +05:30
e2f6d91d51 Remove get_readerIndex in favor of get_zimId
The function get_readerIndex was used to get the zimId using an ordered
vector of readers. Now we can use get_zimId directly.
2021-05-26 14:45:25 +02:00
c35f6f9142 Add get_zimId method to Result
get_zimId method allows the user to get the uuid of the archive from
which a result is retrieved directly from the search result itself.
2021-05-26 14:45:25 +02:00
5567d8ca49 Replace std::vector<std::string> with SuggestionItem
Each sugestions used to be stored as vector of strings to hold various values
such as title, path etc inside them. With this commit, we use the new
dedicated class `SuggestionItem` to do the same.
2021-05-26 10:53:39 +02:00
56434de79e Set label to title snippet if present
With openzim/libzim#545 we now support snippet generation of titles
which can be used as the display label on the ui for highlighted titles
via the "label" field.
The old version used plain title which is still available in the value
field.
2021-05-26 10:52:58 +02:00
e5fac30cee Update libkiwix with search iterator rename in libzim
Search iterator API in libzim has been shifted to use camel case naming.
This has to be accomodated in libkiwix as well.
2021-05-26 08:39:13 +02:00
2736a46cfe Revert "Kiwix Serve welcome page dynamic and subset loading (OPDS based)" 2021-05-25 17:30:05 +02:00
012973d14a added dynamic and subset loading of zim-files in kiwix-serve 2021-05-25 02:41:12 +05:30
d4e35c7067 Rename kiwix-lib in libkiwix 2021-05-23 21:46:52 +02:00
cd02b4de3b Dummy application of new libzim search API
Didn't take any advantage of the new libzim search API. Just fixed the
libkiwix build in the most straightforward way.
2021-05-15 23:34:51 +04:00
05cc3d015f Insert root link only if html content 2021-05-14 14:49:28 +02:00
68189de162 /catalog/search handles out-of-bounds pagination 2021-05-10 11:25:06 +02:00
41276341d0 Empty query acts as a match-all query
After switching to Xapian-based search in the library/catalog, an empty
query stopped acting as a match-all query. This commit restores the old
behaviour in that regard.
2021-05-09 15:14:43 +02:00
be6b58c6ad Revert "added 204 code for empty return of search"
Returning status code 204 in case of an empty results doesn't show the
empty results page as described in #466. Reverting the changes in #396
fixes the issue.
2021-05-09 10:47:18 +05:30
950e742116 No metalink file on fs 2021-05-04 13:15:43 +02:00
3879b82112 const-correct kiwix::Library
- Made most methods of kiwix::Library const.
- Also added const versions of getBookById() and getBookByPath()
  methods.
2021-04-28 11:42:55 +04:00
63e9a09259 Cleaned up/beautified Library::updateBookDB() 2021-04-27 16:59:21 +04:00
4178c169dd Xapian documents in book DB store only the book id 2021-04-27 16:59:21 +04:00
f751aff2fb Full case/diacritics insensitivity in catalog filtering
Catalog filtering should now be case/diacritics insensitive for all
fields. However it is not validated for language, name and category
fields, and is validated for tags, creator & publisher only for text
supplied in the filter (but not for values read from the book).
2021-04-27 16:59:21 +04:00
87dc9d2723 Made catalog filtering by query diacritics insensitive
Catalog filtering by titles/description was sensitive to diacritics
present in the query string. Fixed that.

Also enhanced the unit test to validate the insensitivity to diacritics
present in either the title/description or the query string.
2021-04-27 16:59:21 +04:00
9c7366890d Catalog filtering by tags works via Xapian 2021-04-27 16:59:21 +04:00
19e195cb7d Filter::Tags typedef 2021-04-27 16:59:21 +04:00
3d5fd8f585 Catalog filtering by creator works via Xapian 2021-04-27 16:59:21 +04:00
d3d5abe14d Handling of non-words in publisher query
This change fixes the failure of the LibraryTest.filterByPublisher
unit-test broken by the previous commit.

The previous approach used in `publisherQuery()` for building a phrase
query enforcing the specified prefix for all terms fails if

1. the input phrase contains a non-word term that Xapian's query parser
   doesn't like (e.g. a standalone ampersand character, 1/2, a#1, etc);
2. the input phrase contains at least three terms that Xapian's query
   parser has no issue with.

Using the `quest` tool (coming with xapian-tools under Ubuntu) the
issue can be demonstrated as follows:

```
$ quest -o phrase -d some_xapian_db "Energy & security"
Parsed Query: Query((energy@1 PHRASE 11 Zsecur@2))
Exactly 0 matches
MSet:

$ quest -o phrase -d some_xapian_db "Energy & security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

$ quest -o phrase -d some_xapian_db 'Energy 1/2 security act'
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

$ quest -o phrase -d some_xapian_db "Energy a#1 security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries
```

The problem comes from parsing the query with the default operation set
to `OP_PHRASE` (exemplified by the `-o phrase` option in above
invocations of `quest`). A workaround is to parse the phrase with a
default operation of `OP_OR` and then combine all the terms with
`OP_PHRASE`.

Besides stemming should be disabled in order to target an exact phrase
match (save for the non-word terms, if any, that are ignored by the
query parser).
2021-04-27 16:59:21 +04:00
a759ab989f Catalog filtering by publisher works via Xapian 2021-04-27 16:59:21 +04:00
7ccd9ffcce Catalog filtering by language works via Xapian 2021-04-27 16:59:21 +04:00
0c0a37073b Catalog filtering by category works via Xapian 2021-04-27 16:59:21 +04:00
415c65cf03 Catalog filtering by book name works via Xapian 2021-04-27 16:59:21 +04:00
8287f351e7 Final logic of Library::filterViaBookDB()
Moved the `filter.hasQuery()` check inside `buildXapianQuery()`.
`Library::filterViaBookDB()` only cares if the query that is going to be
run on the book DB would match all documents. The rest of changes
related to enhancing the usage of Xapian for the catalog search will
happen inside `buildXapianQuery()` and `updateBookDB()`.
2021-04-27 16:59:21 +04:00
ea779ac200 Extracted buildXapianQuery() 2021-04-27 16:59:21 +04:00
80cd1fc989 Renamed 2 functions in Filter and Library 2021-04-27 16:59:21 +04:00
2d76f8395e Dropped unused functions from Filter's private API
This should have been done back in PR #460
2021-04-27 16:59:21 +04:00
965b9622c2 removed redirect to articles in search 2021-04-20 20:23:42 +05:30
9d4370403b get_url() was renamed in zim::search_iterator 2021-04-16 13:30:36 +04:00
611146aa37 Added Search Link for bad bookName/articleName on 404 2021-04-12 21:31:47 +05:30