Commit Graph

1749 Commits

Author SHA1 Message Date
Veloman Yunkan 4178c169dd Xapian documents in book DB store only the book id 2021-04-27 16:59:21 +04:00
Veloman Yunkan 59e9a0cd77 Merged XmlLibraryTest with LibraryTest
The library set up by LibraryTest now contains two valid books
initialized via XML. Therefore XmlLibraryTest is not needed as a
separate test suite.
2021-04-27 16:59:21 +04:00
Veloman Yunkan f751aff2fb Full case/diacritics insensitivity in catalog filtering
Catalog filtering should now be case/diacritics insensitive for all
fields. However, this is not validated for the language, name and
category fields, and is validated for tags, creator & publisher only for
text supplied in the filter (not for values read from the book).
2021-04-27 16:59:21 +04:00
Veloman Yunkan 87dc9d2723 Made catalog filtering by query diacritics insensitive
Catalog filtering by titles/description was sensitive to diacritics
present in the query string. Fixed that.

Also enhanced the unit test to validate the insensitivity to diacritics
present in either the title/description or the query string.
2021-04-27 16:59:21 +04:00
Veloman Yunkan 9c7366890d Catalog filtering by tags works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 19e195cb7d Filter::Tags typedef 2021-04-27 16:59:21 +04:00
Veloman Yunkan 3d5fd8f585 Catalog filtering by creator works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan d3d5abe14d Handling of non-words in publisher query
This change fixes the failure of the LibraryTest.filterByPublisher
unit-test broken by the previous commit.

The previous approach used in `publisherQuery()` for building a phrase
query enforcing the specified prefix for all terms fails if both of the
following hold:

1. the input phrase contains a non-word term that Xapian's query parser
   doesn't like (e.g. a standalone ampersand character, 1/2, a#1, etc);
2. the input phrase contains at least three terms that Xapian's query
   parser has no issue with.

Using the `quest` tool (shipped with xapian-tools on Ubuntu), the
issue can be demonstrated as follows:

```
$ quest -o phrase -d some_xapian_db "Energy & security"
Parsed Query: Query((energy@1 PHRASE 11 Zsecur@2))
Exactly 0 matches
MSet:

$ quest -o phrase -d some_xapian_db "Energy & security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

$ quest -o phrase -d some_xapian_db 'Energy 1/2 security act'
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

$ quest -o phrase -d some_xapian_db "Energy a#1 security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries
```

The problem comes from parsing the query with the default operation set
to `OP_PHRASE` (exemplified by the `-o phrase` option in the above
invocations of `quest`). A workaround is to parse the phrase with a
default operation of `OP_OR` and then combine all the resulting terms
with `OP_PHRASE`.

Besides, stemming should be disabled in order to target an exact phrase
match (save for the non-word terms, if any, which are ignored by the
query parser).
2021-04-27 16:59:21 +04:00
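The matching behaviour that commit d3d5abe14d aims for (non-word terms ignored, no stemming, remaining terms matched as an exact phrase) can be illustrated without Xapian. The sketch below is a simplified, ASCII-only stand-in and not the code from the commit: `wordTerms()` and `matchesPhrase()` are hypothetical helpers, and the tokenizer only approximates what Xapian's query parser does.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: split text into lowercase alphanumeric terms.
// Non-word tokens (e.g. a standalone '&') produce no term at all.
// Assumption: ASCII input only; this is not Xapian's tokenizer.
std::vector<std::string> wordTerms(const std::string& text)
{
  std::vector<std::string> terms;
  std::string current;
  for (const unsigned char c : text) {
    if (std::isalnum(c)) {
      current += static_cast<char>(std::tolower(c));
    } else if (!current.empty()) {
      terms.push_back(current);
      current.clear();
    }
  }
  if (!current.empty())
    terms.push_back(current);
  return terms;
}

// Hypothetical helper: true if the terms of `phrase` occur consecutively
// and in order among the terms of `fieldValue`. Terms are compared as-is,
// i.e. without stemming.
bool matchesPhrase(const std::string& fieldValue, const std::string& phrase)
{
  const auto haystack = wordTerms(fieldValue);
  const auto needle = wordTerms(phrase);
  if (needle.empty())
    return true;  // assumption: an empty phrase matches everything
  return std::search(haystack.begin(), haystack.end(),
                     needle.begin(), needle.end()) != haystack.end();
}
```

With these helpers, the phrase "Energy & security act" reduces to the three terms `energy`, `security`, `act`, so it matches a field containing "Energy security act", while "energy security" does not match "Energy securities act" because nothing is stemmed.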
Veloman Yunkan e805f68994 Enhanced & broke LibraryTest.filterByPublisher 2021-04-27 16:59:21 +04:00
Veloman Yunkan a759ab989f Catalog filtering by publisher works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 7ccd9ffcce Catalog filtering by language works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 0c0a37073b Catalog filtering by category works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 415c65cf03 Catalog filtering by book name works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan 8287f351e7 Final logic of Library::filterViaBookDB()
Moved the `filter.hasQuery()` check inside `buildXapianQuery()`.
`Library::filterViaBookDB()` only cares if the query that is going to be
run on the book DB would match all documents. The rest of changes
related to enhancing the usage of Xapian for the catalog search will
happen inside `buildXapianQuery()` and `updateBookDB()`.
2021-04-27 16:59:21 +04:00
Veloman Yunkan ea779ac200 Extracted buildXapianQuery() 2021-04-27 16:59:21 +04:00
Veloman Yunkan 80cd1fc989 Renamed 2 functions in Filter and Library 2021-04-27 16:59:21 +04:00
Veloman Yunkan 2d76f8395e Dropped unused functions from Filter's private API
This should have been done back in PR #460.
2021-04-27 16:59:21 +04:00
Veloman Yunkan 29a6a34ecf Delimited kiwix::Filter's public and private APIs 2021-04-27 16:59:21 +04:00
Veloman Yunkan 2f3f1a4859 Improved LibraryTest.filterByMultipleCriteria 2021-04-27 16:59:21 +04:00
Veloman Yunkan b9be742085 LibraryTest.filterByMaxSize 2021-04-27 16:59:21 +04:00
Veloman Yunkan 95c354a5fa LibraryTest.filterByCategory 2021-04-27 16:59:21 +04:00
Veloman Yunkan cdd272fc5a LibraryTest.filterByName 2021-04-27 16:59:21 +04:00
Veloman Yunkan ef962a9174 LibraryTest.filterByPublisher 2021-04-27 16:59:21 +04:00
Veloman Yunkan f063d350c6 LibraryTest.{filterLocal,filterRemote} 2021-04-27 16:59:21 +04:00
Veloman Yunkan d8fe593f59 Extended the unit-test library with 2 XML entries 2021-04-27 16:59:21 +04:00
Veloman Yunkan 22b8625033 Enter EXPECT_FILTER_RESULTS()
This diff is easier to view if whitespace changes are ignored.
2021-04-27 16:59:21 +04:00
Veloman Yunkan 0f277ffa34 Enhanced the LibraryTest.filterByTags unit-test 2021-04-27 16:59:21 +04:00
Veloman Yunkan 068f7e5e95 New unit-test LibraryTest.filterByCreator 2021-04-27 16:59:21 +04:00
Veloman Yunkan 8c810d2d2f Enhanced the LibraryTest.filterByQuery unit-test 2021-04-27 16:59:21 +04:00
Veloman Yunkan 8c18a37961 Split LibraryTest.filterCheck into several tests 2021-04-27 16:59:21 +04:00
Veloman Yunkan db3e0d7f72 Enhanced the LibraryTest.filterCheck unit-test
Now the LibraryTest.filterCheck unit-test validates the actual entries
returned by `Library::filter` (previously only the count of the results
was checked).
2021-04-27 16:59:21 +04:00
Matthieu Gautier d134ad417f
Merge pull request #497 from MananJethwani/issue/481
removed redirect to articles in search
2021-04-20 17:13:01 +02:00
Manan Jethwani 965b9622c2 removed redirect to articles in search 2021-04-20 20:23:42 +05:30
Kelson 11db5dec4e
Merge pull request #494 from kiwix/ripple_effect_of_libzim_pr524
get_url() was renamed in zim::search_iterator
2021-04-16 12:38:15 +02:00
Veloman Yunkan 9d4370403b get_url() was renamed in zim::search_iterator 2021-04-16 13:30:36 +04:00
Kelson cb57178c23
Merge pull request #491 from kiwix/fix_macos_ci
Update brew before installing packages
2021-04-12 18:41:00 +02:00
Matthieu Gautier 9ba5ab4678 Update brew before installing packages
brew changed its backend repository, so we must update brew itself first.
2021-04-12 18:31:29 +02:00
Matthieu Gautier a597870025
Merge pull request #465 from soumyankar/master 2021-04-12 18:17:57 +02:00
Vertigo 611146aa37 Added Search Link for bad bookName/articleName on 404 2021-04-12 21:31:47 +05:30
Matthieu Gautier 6d2f227c42
Merge pull request #486 from kiwix/fix_for_issue462 2021-04-12 15:18:57 +02:00
Veloman Yunkan 0c7d19ab45 Testing of Manager.readXml() 2021-04-12 15:14:12 +02:00
Veloman Yunkan b54215f146 Manager::readOpds() doesn't modify its input 2021-04-12 15:14:12 +02:00
Veloman Yunkan 9033f2f28e Manager::readXml() doesn't modify its input 2021-04-12 15:14:12 +02:00
Matthieu Gautier 5c289abd0e
Merge pull request #485 from kiwix/fix_for_issue478 2021-04-12 15:05:13 +02:00
Veloman Yunkan ec9186b174 Library::removeBookById() updates the search DB
This fix makes the `XmlLibraryTest.removeBookByIdUpdatesTheSearchDB`
unit-test pass.
2021-04-09 17:06:45 +04:00
Veloman Yunkan aaaa5a637e Library::filter() doesn't create empty books
This changes how the `XmlLibraryTest.removeBookByIdUpdatesTheSearchDB`
unit-test fails.
2021-04-09 17:06:45 +04:00
Veloman Yunkan 49940a30d0 XmlLibraryTest.removeBookByIdUpdatesTheSearchDB
The new unit-test fails for a reason that was not anticipated when it
was written. The `Library::filter()` operation returns a correct result
after the call to `removeBookById()` (this was a surprise!), but it has
the side-effect of re-adding an empty book under the id that still
survives in the search DB (the emptiness of this re-created book doesn't
allow it to pass the other filtering criteria, which explains why the
result of `Library::filter()` is correct). A special check had to be
added to the new unit-test to guard against that hidden side-effect of
the `Library::removeBookById()` + `Library::filter()` combination.
2021-04-09 17:06:45 +04:00
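The hidden side-effect described in commit 49940a30d0 is the kind of behaviour that `operator[]` on a standard associative container produces, since it inserts a default-constructed value for a missing key. Whether that was the actual cause in `Library::filter()` is an assumption here; the commit log does not say. The sketch below uses hypothetical names (`Book`, `BookMap`, `titlesLeaky()`, `titlesSafe()`) purely to show how a lookup can return a correct result and still leave an empty entry behind.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins, not the kiwix types.
struct Book { std::string title; };
using BookMap = std::map<std::string, Book>;  // book id -> book

// Leaky lookup: operator[] default-constructs a Book for an unknown id,
// so merely reading a stale id adds an empty book to the map.
std::vector<std::string> titlesLeaky(BookMap& books,
                                     const std::vector<std::string>& ids)
{
  std::vector<std::string> result;
  for (const auto& id : ids) {
    const Book& book = books[id];
    if (!book.title.empty())  // the empty book fails this criterion
      result.push_back(book.title);
  }
  return result;
}

// Side-effect-free lookup: find() never inserts, so stale ids are skipped.
std::vector<std::string> titlesSafe(const BookMap& books,
                                    const std::vector<std::string>& ids)
{
  std::vector<std::string> result;
  for (const auto& id : ids) {
    const auto it = books.find(id);
    if (it != books.end() && !it->second.title.empty())
      result.push_back(it->second.title);
  }
  return result;
}
```

Both functions return the same titles for a list containing a stale id, which is why a test that checks only the returned result would not notice the difference; checking the size of the container afterwards does.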
Veloman Yunkan 24ed96a38c Library.removeBookById() drops the reader too
This fix makes the `XmlLibraryTest.removeBookByIdDropsTheReader`
unit-test pass.
2021-04-09 17:05:56 +04:00
Veloman Yunkan ccdc316217 Two unit-tests for Library::removeBookById
The `XmlLibraryTest.removeBookByIdDropsTheReader` unit-test fails,
demonstrating a bug in `kiwix::Library::removeBookById()`.
2021-04-09 16:59:55 +04:00
Matthieu Gautier ba44033273
Merge pull request #464 from MananJethwani/issue/kiwix-tools/205
adding kind and path attributes to suggest response object and using it in autocomplete
2021-04-07 18:08:29 +02:00