libkiwix

Commit Graph

Author	SHA1	Message	Date
Veloman Yunkan	f751aff2fb	Full case/diacritics insensitivity in catalog filtering Catalog filtering should now be case/diacritics insensitive for all fields. However it is not validated for language, name and category fields, and is validated for tags, creator & publisher only for text supplied in the filter (but not for values read from the book).	2021-04-27 16:59:21 +04:00
Veloman Yunkan	87dc9d2723	Made catalog filtering by query diacritics insensitive Catalog filtering by titles/description was sensitive to diacritics present in the query string. Fixed that. Also enhanced the unit test to validate the insensitivity to diacritics present in either the title/description or the query string.	2021-04-27 16:59:21 +04:00
Veloman Yunkan	9c7366890d	Catalog filtering by tags works via Xapian	2021-04-27 16:59:21 +04:00
Veloman Yunkan	19e195cb7d	Filter::Tags typedef	2021-04-27 16:59:21 +04:00
Veloman Yunkan	3d5fd8f585	Catalog filtering by creator works via Xapian	2021-04-27 16:59:21 +04:00
Veloman Yunkan	d3d5abe14d	Handling of non-words in publisher query This change fixes the failure of the LibraryTest.filterByPublisher unit-test broken by the previous commit. The previous approach used in `publisherQuery()` for building a phrase query enforcing the specified prefix for all terms fails if 1. the input phrase contains a non-word term that Xapian's query parser doesn't like (e.g. a standalone ampersand character, 1/2, a#1, etc); 2. the input phrase contains at least three terms that Xapian's query parser has no issue with. Using the `quest` tool (coming with xapian-tools under Ubuntu) the issue can be demonstrated as follows: ``` $ quest -o phrase -d some_xapian_db "Energy & security" Parsed Query: Query((energy@1 PHRASE 11 Zsecur@2)) Exactly 0 matches MSet: $ quest -o phrase -d some_xapian_db "Energy & security act" UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries $ quest -o phrase -d some_xapian_db 'Energy 1/2 security act' UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries $ quest -o phrase -d some_xapian_db "Energy a#1 security act" UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries ``` The problem comes from parsing the query with the default operation set to `OP_PHRASE` (exemplified by the `-o phrase` option in the above invocations of `quest`). A workaround is to parse the phrase with a default operation of `OP_OR` and then combine all the terms with `OP_PHRASE`. In addition, stemming should be disabled in order to target an exact phrase match (save for the non-word terms, if any, that are ignored by the query parser).	2021-04-27 16:59:21 +04:00
Veloman Yunkan	a759ab989f	Catalog filtering by publisher works via Xapian	2021-04-27 16:59:21 +04:00
Veloman Yunkan	7ccd9ffcce	Catalog filtering by language works via Xapian	2021-04-27 16:59:21 +04:00
Veloman Yunkan	0c0a37073b	Catalog filtering by category works via Xapian	2021-04-27 16:59:21 +04:00
Veloman Yunkan	415c65cf03	Catalog filtering by book name works via Xapian	2021-04-27 16:59:21 +04:00
Veloman Yunkan	8287f351e7	Final logic of Library::filterViaBookDB() Moved the `filter.hasQuery()` check inside `buildXapianQuery()`. `Library::filterViaBookDB()` only cares if the query that is going to be run on the book DB would match all documents. The rest of changes related to enhancing the usage of Xapian for the catalog search will happen inside `buildXapianQuery()` and `updateBookDB()`.	2021-04-27 16:59:21 +04:00
Veloman Yunkan	ea779ac200	Extracted buildXapianQuery()	2021-04-27 16:59:21 +04:00
Veloman Yunkan	80cd1fc989	Renamed 2 functions in Filter and Library	2021-04-27 16:59:21 +04:00
Veloman Yunkan	2d76f8395e	Dropped unused functions from Filter's private API This should have been done back in PR #460	2021-04-27 16:59:21 +04:00
Veloman Yunkan	ec9186b174	Library::removeBookById() updates the search DB This fix makes the `XmlLibraryTest.removeBookByIdUpdatesTheSearchDB` unit-test pass.	2021-04-09 17:06:45 +04:00
Veloman Yunkan	aaaa5a637e	Library::filter() doesn't create empty books This changes how the `XmlLibraryTest.removeBookByIdUpdatesTheSearchDB` unit-test fails.	2021-04-09 17:06:45 +04:00
Veloman Yunkan	24ed96a38c	Library.removeBookById() drops the reader too This fix makes the `XmlLibraryTest.removeBookByIdDropsTheReader` unit-test pass.	2021-04-09 17:05:56 +04:00
Veloman Yunkan	aa2a031ba4	Xapian headers are not exposed through libkiwix	2021-04-07 18:24:33 +04:00
Veloman Yunkan	e214efecd4	Language code conversion via ICU Language code is converted from ISO 639-3 to ISO 639 (which is understood by Xapian) via ICU. The previous approach via an explicit map had its advantages, since Xapian has more than one stemmer implementation for some languages (selectable via Xapian-specific identifiers). This commit relies on the defaults associated with the ISO 639 language codes.	2021-03-17 14:32:03 +01:00
Veloman Yunkan	09233bf4f3	Support for partial queries in catalog search The search text in the catalog query is interpreted as partial by default, but partial query mode can be disabled in C++. The latter possibility is not exposed via the /catalog/search kiwix-serve endpoint, though.	2021-03-17 14:32:03 +01:00
Veloman Yunkan	a599fb3892	Initial version of Xapian-based catalog search	2021-03-17 14:32:03 +01:00
Veloman Yunkan	a17fc0ef2d	Library::getBooksByTitleOrDescription()	2021-03-17 14:32:03 +01:00
Veloman Yunkan	db06b2c7ca	Library::BookIdCollection typedef	2021-03-17 14:32:03 +01:00
Veloman Yunkan	a20f9e2ce1	Library::filter() works in two stages 1. Get the subset of books matching the q (title/description) parameter of the search 2. Filter out books not matching the other parameters of the search. Stage 1. currently works in the old way, but will be replaced by Xapian based search in subsequent commits.	2021-03-17 14:32:03 +01:00
Veloman Yunkan	e55bf514e8	Dedicated 'category' parameter in catalog search	2021-03-17 14:10:57 +04:00
Matthieu Gautier	46626a3f98	Add the method get bookByPath in library.	2020-03-06 12:08:05 +01:00
Matthieu Gautier	f560a1f815	Be able to filter the books by name.	2020-01-30 19:02:33 +01:00
Matthieu Gautier	49aa0fbb9f	Use a macro to write the filters.	2020-01-30 15:42:54 +01:00
luddens	c9a15c9961	Add a parameter to the getBookmarks function to get valid bookmarks only The default value of this parameter is false, in which case all the bookmarks are returned; otherwise only those related to books in the library are returned.	2019-10-31 14:05:21 +02:00
Matthieu Gautier	598dd3c175	[API Break] Fix pathTools (and a bit of stringTools). API changes: - `removeLastPathElement` no longer takes the extra arguments `removePreSeparator` and `removePostSeparator`. They are not needed, as a path does not need a special trailing separator. - Only one `split` function. Arguments can be implicitly converted to string, so there is no need for overloads that explicitly cast them. - The `split` function takes another argument, `trimEmpty`. If true, empty elements are removed. Path manipulation now mostly passes through a vector<string> storing each part of the path. Most of the complex work is now done in the normalizeParts function.	2019-09-19 18:16:06 +02:00
Matthieu Gautier	ce8fff0b42	Make the library create the reader.	2019-08-11 10:19:48 +02:00
Matthieu Gautier	72223d69fe	Fix include in pathTools.h	2019-08-10 11:02:23 +02:00
Matthieu Gautier	31c9375a3a	Better API to filter books in a library. Instead of having a single method `listBooksIds` that tries to be exhaustive about all the filter and sort options, split the method into two separate methods, `filter` and `sort`. The `filter` method takes a `Filter` object that represents what we are filtering on. This object has to be constructed before calling `filter`. ```cpp Filter filter; filter.query("Astring"); filter.acceptTags({"nopic"}); // return all book in eng and with "Astring" in the title or description. library.filter(filter); //equivalent to library.listBooksIds(ALL, UNSORTED, "Astring", "", "", "", {"nopic"}); // or better library.filter(Filter().query("Astring").acceptTags({"nopic"})); ``` The method `listBooksIds` has been marked as deprecated. Add a small test on the library.	2019-06-26 16:41:01 +02:00
Matthieu Gautier	c6254d9504	Allow the library to be filtered by tags. This adds an argument to `listBooksIds` to filter by tags. So, this is an API break.	2019-03-07 17:08:39 +01:00
Matthieu Gautier	af7689e3e8	[API break] Move all the tools in the tools directory instead of common. The `common` name is from the time when kiwix was a single repository for the whole project (android, desktop, server...). Now that we have split the repositories and kiwix-lib is the "common" repo, the "common" directory no longer makes sense.	2019-01-23 15:31:38 +01:00
Matthieu Gautier	12498e2cfe	Add bookmarks support. The library now contains (simple) methods to handle bookmarks. The bookmarks are stored in a separate xml file. A bookmark is mainly a pair (`zimId`, `articleUrl`). However, in the xml we store a bit more data: - The article's title (for display) - The book's title, lang and date (for potential update of zim files)	2018-12-02 15:47:29 +01:00
Matthieu Gautier	b5ce60a627	Move the dump of the library into library.xml into a specific class, in the same way that the dump into an OPDS feed is in a specific class.	2018-11-28 12:09:28 +01:00
Matthieu Gautier	cf1cfe774e	Correctly check for ArticleCount and MediaCount before writing them.	2018-11-12 10:58:10 +01:00
Matthieu Gautier	2682fa8f9c	Remove unnecessary variable or output.	2018-10-26 14:19:10 +02:00
Matthieu Gautier	b1508c0b98	Better listBooksIds supported mode. Having only REMOTE or LOCAL is a bit restrictive. By using flags, a user can specify more complex requests.	2018-10-24 11:50:11 +02:00
Matthieu Gautier	a73ef23f6e	Keep the book size in bytes in memory (instead of in kb) We keep the size in kb in library.xml for compatibility.	2018-10-24 10:47:12 +02:00
Matthieu Gautier	fe6d5fa93e	Store the downloadId in the book (and in the library).	2018-10-24 10:47:12 +02:00
Matthieu Gautier	7804bf2276	Reimplement listBooksIds. No real improvement.	2018-10-24 10:47:12 +02:00
Matthieu Gautier	839320d5e7	Move the `Book` class in its own source file.	2018-10-24 10:47:12 +02:00
Matthieu Gautier	1e8f85eaff	Rename methods `title()` into `getTitle()`. Same for all attributes.	2018-10-24 10:47:12 +02:00
Matthieu Gautier	e0704b3b21	Move the initialization code of a book from xml\|opds into Book.	2018-10-24 10:47:12 +02:00
Matthieu Gautier	57fbb98bca	Do not store the favicon base64 encoded in the book. The fact that the favicon is base64 encoded is a storage detail.	2018-10-24 10:47:12 +02:00
Matthieu Gautier	66a9a69480	Move the code updating a book from a reader in the Book class.	2018-09-06 18:30:37 +02:00
Matthieu Gautier	aa6772b345	Remove the "last" book functionality. - This is not used by any application. - This is application specific and should not be stored in the library (which is a list of books).	2018-09-06 18:30:37 +02:00
Matthieu Gautier	bba3c252e4	Make the members of the book protected. It is up to the book to manage its attributes. Also remove the `absolutePath` (and `indexAbsolutePath`). The `Book::path` is always stored absolute. The fact that the path can be stored absolute or relative in the `library.xml` is not relevant for the book.	2018-09-06 18:30:37 +02:00
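
Commit 598dd3c175 describes a single `split` function with a `trimEmpty` argument that drops empty elements. The following is only a minimal sketch of that behaviour using the standard library; the signature, the whole-string delimiter handling and the default value are assumptions made for illustration, not libkiwix's actual implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the `split` described in commit 598dd3c175.
// Splits `text` on every occurrence of `delimiter` (treated here as a whole
// string, which is an assumption). When `trimEmpty` is true, empty elements
// are left out of the result.
std::vector<std::string> split(const std::string& text,
                               const std::string& delimiter,
                               bool trimEmpty = true)
{
    std::vector<std::string> parts;
    if (delimiter.empty()) {
        // Nothing to split on: return the input as a single element.
        if (!text.empty() || !trimEmpty) {
            parts.push_back(text);
        }
        return parts;
    }

    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type end = text.find(delimiter, start);
        const std::string part =
            (end == std::string::npos) ? text.substr(start)
                                       : text.substr(start, end - start);
        if (!part.empty() || !trimEmpty) {
            parts.push_back(part);
        }
        if (end == std::string::npos) {
            break;  // last element handled
        }
        start = end + delimiter.size();
    }
    return parts;
}
```

With this sketch, splitting `"/usr//lib/"` on `"/"` keeps only `usr` and `lib` when `trimEmpty` is true, and also keeps the empty elements produced by the leading, doubled and trailing separators when it is false.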


55 Commits