- Throw an exception if we cannot extract from string.
(We throw the same exception as `std::sto*`)
- Add a specialization to extract string from string
- Add some unit tests
Xapian version 1.4.18 contains a bug in snippet generation caused by
incorrect handling of stemming.
The test-point with a search pattern "beatles" produced snippets with no
highlights of the search term. Debugging showed that the search pattern
"beatles" was transformed to a search term "beatl" which then didn't
match the word "beatles" in the text from which a snippet had to be
extracted.
The test case passed on my development machine as well as for most CI
configurations. However, the "Packages / build-deb (ubuntu-bionic)"
variant failed because of a slightly different handling of punctuation
at the snippet boundaries:
Test context:
url: /ROOT/search?pattern=beatles&content=zimfile
actual snippet: ...side "Yellow Submarine" ...........
expected snippet: ...-side "Yellow Submarine" ...........
The above mismatch prompted a looser comparison of the snippet contents,
which then failed the requirement that the snippet MUST contain highlights
(this is how the said bug in Xapian was discovered).
An attempt to change the search pattern to "field" didn't eliminate the
problem. Despite the search pattern itself being in singular form (i.e.
identical to its stemmed version) the plural form "fields" in the
snippet was still not highlighted.
Using an adjective instead of a noun as the search pattern achieved the
desired outcome.
The "expected" snippets in the test data must be a union of all possible
snippets produced at runtime for a given (document, search terms) pair
on all platforms of interest:
- Overlapping snippets must be properly merged
- Non-overlapping snippets can be joined with a " ... " in between.
This is a preliminary implementation checking only the following
cases:
- no search results
- all search results fitting on a single page
The second test-case fails because of a bug in the search renderer (leading
to the pagination footer being pointlessly enabled). Will fix it in the
next commit.
The taskbar injected by the server is a distraction in unit-tests focusing
on the HTML contents of the returned pages. The new test-suite
TaskbarlessServerTest will have the taskbar disabled.
In #727 inline CSS [was extracted](e4a4b2f961)
from `static/templates/no_search_result.html` into a separate stylesheet
resource. The purpose was to later
1. get rid of the custom `static/templates/no_search_result.html` error
template and use a general purpose error template instead (this was
accomplished by PR #744).
2. deduplicate the CSS code between `static/templates/no_search_result.html` and
`static/templates/search_result.html` by making the latter also refer to
an internal CSS resource rather than containing inline stylesheet code.
While preparing to implement the 2nd point, I figured out that
`kiwix::SearchRenderer` is used as a component in `kiwix-desktop` too,
which would probably be upset by a link to an internal libkiwix CSS resource.
This commit documents that finding.
Revert to the plain old 'i18n_resources_list.txt' file.
Auto-discovery of i18n files has a main flaw (and a small bug):
- The main flaw is that rerunning the configure step will not detect new
translation files. It means that if we use a cache in our CI,
new translations will not be included.
- The bug is that on Windows, meson fails with an error about a non-existent
`` (empty) file name. I suppose it is because Python replaces
`\n` by `\r\n` on Windows, and then `.strip().split('\n')` keeps empty
lines.
The small bug could be fixed, but given the main flaw, the whole setup is
better off if we use a script to generate the listing.
This commit is in effect a partial revert of 2eff5b55a6.
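The cause of the Windows failure is only a supposition in this message. As a
minimal sketch, assuming the listing arrives as one newline-separated string,
splitting it in a way that tolerates both `\n` and `\r\n` and never yields an
empty file name could look as follows (`parse_file_listing` is a hypothetical
helper, not part of the build scripts):

```python
def parse_file_listing(raw_output):
    """Split a newline-separated listing into non-empty file names."""
    # str.splitlines() treats "\n", "\r\n" and "\r" as line boundaries;
    # the filter drops entries that are empty or whitespace-only.
    return [line.strip() for line in raw_output.splitlines() if line.strip()]


if __name__ == "__main__":
    print(parse_file_listing("en.json\r\nfr.json\r\n"))
```

This does not address the main flaw: a cached configure step would still miss
newly added translation files.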
Previously, on clicking Magnet, we were redirecting to a different site:
https://download.kiwix.org/zim/other/xyzBookWithDate.zim.magnet
This had the real magnet link as page content.
Now we use the real magnet link in the href, thus not redirecting and starting the download right away.
Fix #767
Excluding qqq.json, any .json file under static/i18n is now considered to
be an i18n resource. This eliminates the need to update the
i18n_resources_list.txt file every time a new language json file is
added. Thus Translatewiki PRs will not require extra work.
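The discovery rule can be sketched as below. This is an illustration only:
`list_i18n_resources` is a hypothetical name, and the sketch assumes the json
files sit directly in the i18n directory (no recursion into subdirectories).

```python
from pathlib import Path


def list_i18n_resources(i18n_dir):
    """Return the names of all .json files in i18n_dir except qqq.json."""
    return sorted(
        path.name
        for path in Path(i18n_dir).glob("*.json")
        if path.name != "qqq.json"  # message documentation, not a translation
    )


if __name__ == "__main__":
    print(list_i18n_resources("static/i18n"))
```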
Now the whole content of a resource is preprocessed with a single
invocation of `re.sub()` rather than line-by-line.
Also, the function `get_preprocessed_resource()` returns a single value
rather than a (preprocessed_content, modification_count) pair; the
situation when the preprocessed resource is identical to the source
version is signalled by a return value of None.
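A minimal sketch of that contract follows. Only the function name, the single
`re.sub()` pass and the `None` return value come from this message; the
`?KIWIXCACHEID` placeholder syntax, the regular expression and the
`cache_id_of` callback are assumptions made for illustration.

```python
import re

# Hypothetical placeholder syntax: a skin/ link followed by "?KIWIXCACHEID".
LINK_PATTERN = re.compile(r'(?P<path>skin/[\w./-]+)\?KIWIXCACHEID')


def get_preprocessed_resource(content, cache_id_of):
    """Return the preprocessed content, or None if nothing was changed."""
    def replace(match):
        path = match.group("path")
        return f"{path}?cacheid={cache_id_of(path)}"

    # One substitution pass over the whole resource instead of one per line.
    preprocessed = LINK_PATTERN.sub(replace, content)
    return None if preprocessed == content else preprocessed


if __name__ == "__main__":
    html = '<script src="skin/index.js?KIWIXCACHEID"></script>\n'
    print(get_preprocessed_resource(html, lambda path: "abc123"))
```

Returning `None` lets the caller tell "unchanged" apart from "changed" without
carrying a separate modification count.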
The cache-id of resources now includes dependency information. This commit
illustrates that property with the changed cache-id of skin/index.js which
depends on skin/{download,hash,magnet,bittorent}.png.
The implementation is not fool-proof - cyclic dependency between
resources is not detected and will lead to infinite recursion.
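The idea can be sketched as follows, assuming resources are held in an
in-memory dict and links use a hypothetical `?KIWIXCACHEID` placeholder. The
hash function and the 8-character id length are likewise assumptions. As in
the described implementation, there is no cycle detection, so mutually
dependent resources would recurse without end.

```python
import hashlib
import re

# Hypothetical placeholder syntax: a skin/ link followed by "?KIWIXCACHEID".
LINK_PATTERN = re.compile(r'(?P<path>skin/[\w./-]+)\?KIWIXCACHEID')


def preprocess(path, sources):
    """Replace every placeholder in a resource with its dependency's cache-id."""
    def replace(match):
        dep = match.group("path")
        return f"{dep}?cacheid={cache_id(dep, sources)}"

    return LINK_PATTERN.sub(replace, sources[path])


def cache_id(path, sources):
    """Derive the cache-id from the preprocessed content, not from the source."""
    data = preprocess(path, sources).encode("utf-8")
    return hashlib.sha1(data).hexdigest()[:8]


if __name__ == "__main__":
    sources = {
        "skin/index.js": 'img("skin/magnet.png?KIWIXCACHEID")',
        "skin/magnet.png": "image bytes, version 1",
    }
    print(cache_id("skin/index.js", sources))
```

Because the id of a resource is computed from content that already embeds the
ids of its dependencies, a change in a dependency propagates to every resource
that refers to it, directly or indirectly.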
The current implementation of resource preprocessing contains a bug
(with respect to the problem that it tries to solve): it doesn't take
into account the dependence of static resources on each other. If
resource A refers to B and B refers to C, then a change in C would
result in its cache id being updated in the preprocessed version of B.
However the cache id of B won't change since the cache id is derived
from the source rather than from the preprocessed output.
This commit is the first step towards addressing the described issue.
Now the cache-id of a resource is computed on demand rather than precomputed
for all resources. The only thing remaining is to compute the cache-id
from the preprocessed content.
If during an earlier build a resource was symlinked in the build
directory (because it wasn't modified by preprocessing) and later
changes are made to the resource that result in its preprocessing no
longer being a no-op, then the preprocessing is performed (in place) on
the original resource directly (via the symlink). Therefore any symlinks
must be removed before preprocessing a resource.
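A minimal sketch of the guard, assuming the preprocessed output is written to
a path in the build directory (`write_preprocessed` and its parameters are
hypothetical names):

```python
import os


def write_preprocessed(out_path, content):
    """Write preprocessed content without clobbering a symlinked source file."""
    # A symlink left over from an earlier build would make open() write
    # through to the original resource, so remove the link itself first.
    if os.path.islink(out_path):
        os.unlink(out_path)
    with open(out_path, "w", encoding="utf-8") as out_file:
        out_file.write(content)
```

`os.unlink()` on a symlink removes only the link, so the original resource is
left untouched.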