Before this change cacheids were computed only for those static
resources that were referenced from other resources via KIWIXCACHEID.
A few static resources without such references existed.
Now all resources under skin/ have their cacheids computed.
During static resource preprocessing and compilation their cacheid
values are embedded into libkiwix and can be accessed at runtime.
If a static resource is requested without specifying any cacheid
it is served as dynamic content (with a short TTL and the library id
used for the ETag, though using the cacheid for the ETag would
be better).
If a cacheid is supplied in the request it must match the cacheid of the
resource (otherwise a 404 Not Found error is returned) whereupon the
resource is served as immutable content.
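The three cases above can be sketched as follows. This is a minimal
illustration, not the libkiwix implementation: the Response class, the
serve_static() function and the concrete Cache-Control values are
assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    """Hypothetical, simplified HTTP response."""
    status: int
    cache_control: Optional[str] = None
    etag: Optional[str] = None


def serve_static(resource_cacheid: str,
                 requested_cacheid: Optional[str],
                 library_id: str) -> Response:
    if requested_cacheid is None:
        # No cacheid in the request: dynamic content with a short TTL,
        # ETag derived from the library id (header value is an assumption).
        return Response(200, "max-age=0, must-revalidate", etag=library_id)
    if requested_cacheid != resource_cacheid:
        # A cacheid was supplied but does not match the resource.
        return Response(404)
    # Matching cacheid: the URL identifies this exact content forever.
    return Response(200, "max-age=31536000, immutable")
```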
Known issues:
- One issue is caused by the fact that some static resources don't get a
cacheid; this is resolved in the next commit.
- Interaction of this change with the support for dynamically customizing
static resources (via KIWIX_SERVE_CUSTOMIZED_RESOURCES env var) was
not addressed.
Excluding qqq.json, any .json file under static/i18n is now considered to
be an i18n resource. This eliminates the need to update the
i18n_resources_list.txt file every time a new language json file is
added. Thus Translatewiki PRs will not require extra work.
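A possible form of that rule, as a sketch only: the function name
is_i18n_resource() is hypothetical, and treating files in nested
subdirectories of static/i18n as matches is an assumption.

```python
from pathlib import PurePosixPath

I18N_DIR = PurePosixPath("static/i18n")


def is_i18n_resource(path: str) -> bool:
    """True for any .json file under static/i18n except qqq.json."""
    p = PurePosixPath(path)
    return (I18N_DIR in p.parents
            and p.suffix == ".json"
            and p.name != "qqq.json")
```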
Now the whole content of a resource is preprocessed with a single
invocation of `re.sub()` rather than line-by-line.
Also, the function `get_preprocessed_resource()` returns a single value
rather than a (preprocessed_content, modification_count) pair; the
situation when the preprocessed resource is identical to the source
version is signalled by a return value of None.
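A rough sketch of that shape, under stated assumptions: the function
name preprocess_text(), the placeholder regex and the cacheid_of
callback are invented for the illustration and are not taken from the
actual script.

```python
import re

# Assumed placeholder syntax: a path immediately followed by ?KIWIXCACHEID.
PLACEHOLDER = re.compile(r'([\w./-]+)\?KIWIXCACHEID')


def preprocess_text(content, cacheid_of):
    """Expand all placeholders with one re.sub() call over the whole text.

    Returns None when preprocessing leaves the content unchanged.
    """
    def expand(match):
        path = match.group(1)
        return f"{path}?cacheid={cacheid_of(path)}"

    result = PLACEHOLDER.sub(expand, content)
    return None if result == content else result
```

A caller can then test the result against None to decide whether the
source file can be used as-is.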
The cache-id of resources now includes dependency information. This commit
illustrates that property with the changed cache-id of skin/index.js which
depends on skin/{download,hash,magnet,bittorent}.png.
The implementation is not fool-proof - cyclic dependency between
resources is not detected and will lead to infinite recursion.
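The idea can be sketched with in-memory resources. Everything named
here (make_cacheid_getter(), the regex, SHA-1 truncated to 8 hex digits)
is an assumption for illustration. Like the implementation described
above, the sketch does not detect cycles and would recurse forever on
them.

```python
import hashlib
import re

# Assumed placeholder syntax: a path immediately followed by ?KIWIXCACHEID.
PLACEHOLDER = re.compile(r'([\w./-]+)\?KIWIXCACHEID')


def make_cacheid_getter(sources):
    """sources maps a resource path to its unprocessed text."""
    cache = {}

    def preprocessed(path):
        # Expanding a placeholder requires the cacheid of the dependency,
        # which is computed (recursively) on demand.
        return PLACEHOLDER.sub(
            lambda m: f"{m.group(1)}?cacheid={cacheid(m.group(1))}",
            sources[path])

    def cacheid(path):
        if path not in cache:
            # Hash the preprocessed content, so a change in a dependency
            # propagates to every resource that refers to it.
            data = preprocessed(path).encode("utf-8")
            cache[path] = hashlib.sha1(data).hexdigest()[:8]
        return cache[path]

    return cacheid
```

With a.html referring to b.js and b.js referring to c.png, changing
c.png alters the preprocessed text of b.js and therefore the cacheids of
both b.js and a.html.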
The current implementation of resource preprocessing contains a bug
(with respect to the problem that it tries to solve): it doesn't take
into account the dependence of static resources on each other. If
resource A refers to B and B refers to C, then a change in C would
result in its cache id being updated in the preprocessed version of B.
However the cache id of B won't change since the cache id is derived
from the source rather than from the preprocessed output.
This commit is the first step towards addressing the described issue.
Now the cache-id of a resource is computed on demand rather than precomputed
for all resources. The only thing remaining is to compute the cache-id
from the preprocessed content.
If during an earlier build a resource was symlinked in the build
directory (because it wasn't modified by preprocessing) and later
changes are made to the resource that result in its preprocessing no
longer being a no-op, then the preprocessing is performed (in place) on
the original resource directly (via the symlink). Therefore any symlinks
must be removed before preprocessing a resource.
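A minimal sketch of the safeguard, assuming a hypothetical helper
write_preprocessed() that writes the output file in the build directory:

```python
import os


def write_preprocessed(out_path, content):
    """Write preprocessed content without following a stale symlink."""
    # A symlink left over from an earlier build points at the original
    # resource; opening it for writing would modify the source file.
    if os.path.islink(out_path):
        os.remove(out_path)
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(content)
```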
kiwix-resources preprocesses all resources rather than only templates. At
this point this doesn't change anything since only (some) template resources
contain KIWIXCACHEID placeholders. But this enhancement opens the door
to the preprocessing of static/skin/index.js (after preprocessing is
able to handle relative links, which comes in the next commit).
The story of search_results.css
static/skin/search_results.css was extracted from
static/templates/no_search_result.html before the latter was dropped.
static/templates/no_search_result.html in turn seems to be a copied and
edited version of static/templates/search_result.html.
In the context of exploratory work on the internationalization of
kiwix-serve (PR #679) I noticed duplication of inline CSS across those
two templates and intended to eliminate it. That goal was not fully
accomplished (static/templates/search_result.html remained untouched)
because by that time PR #679 grew too big and the efforts were diverted
into splitting it into smaller ones. Thus search_results.css slipped
into one of those small PRs, without making much sense because nothing
really justifies preserving custom CSS in the "Fulltext search unavailable"
error page.
At the same time, it served as the only case where a link to a cacheable
resource is generated in C++ code (rather than found in a template).
This poses certain problems to the handling of cache-ids. A workaround
is to expel the URL into a template so that it is processed by
`kiwix-resources`. This commit merely demonstrates that solution. But
whether it should be preserved (or rather the "Fulltext search
unavailable" page should be deprived of CSS) is questionable.
In template resources (found under static/templates), strings of the
form "PATH/TO/STATIC/RESOURCE?KIWIXCACHEID" are expanded into
"PATH/TO/STATIC/RESOURCE?cacheid=CACHEIDVAL" where CACHEIDVAL is an
8-digit hexadecimal hash digest of the file at
static/PATH/TO/STATIC/RESOURCE.
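For illustration, the expansion of a single link might look like this.
The helper names are hypothetical, the choice of SHA-1 is an assumption
(the text above only specifies an 8-digit hexadecimal digest), and the
path is assumed to be relative to the static directory.

```python
import hashlib
import os

SUFFIX = "?KIWIXCACHEID"


def file_cacheid(static_dir, resource_path):
    """First 8 hex digits of a digest of the file's bytes (SHA-1 assumed)."""
    with open(os.path.join(static_dir, resource_path), "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()[:8]


def expand_placeholder(link, static_dir):
    """Turn PATH?KIWIXCACHEID into PATH?cacheid=CACHEIDVAL."""
    if not link.endswith(SUFFIX):
        return link
    path = link[:-len(SUFFIX)]
    return f"{path}?cacheid={file_cacheid(static_dir, path)}"
```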
Introduced a new resource compiler script kiwix-compile-i18n that
processes i18n string data stored in JSON files and generates sorted C++
tables of string keys and values for all languages.
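A sketch of what such a generator could do for one language. The
emitted C++ names (I18nString, strings_LANG), the naive escaping, and
the skipping of a "@metadata" entry are assumptions made for the
example, not the actual output format of kiwix-compile-i18n.

```python
import json


def cpp_literal(s: str) -> str:
    """Naive C++ string literal escaping (backslash, quote, newline only)."""
    escaped = s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def make_cpp_table(lang: str, json_text: str) -> str:
    strings = json.loads(json_text)
    # Assumption: a "@metadata" entry, if present, is not a translatable string.
    strings.pop("@metadata", None)
    # Sorting by key allows a binary search over the table at runtime.
    rows = [f"  {{ {cpp_literal(k)}, {cpp_literal(v)} }}"
            for k, v in sorted(strings.items())]
    body = ",\n".join(rows)
    return f"const I18nString strings_{lang}[] = {{\n{body}\n}};\n"
```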
The Mustache templating system is a bit simpler than ctpp2, and ctpp2 is
no longer maintained (see #189).
We are moving to kainjow's Mustache project
(https://github.com/kainjow/Mustache).
It simplifies our system a lot, as it is header-only and we don't have to
precompile the templates.
Fix #21
std::runtime_error is defined in <stdexcept> not <exception>.
Recent gcc versions compiled the resource file without complaint, but
older versions need the correct include.
If the compile_resource script is used several times, there will be
several definitions of the getResource function.
So, define a custom getResource function per resource "pack" and add
a define to keep a nice API.
A developer must take care not to include two generated .h files in the
same compilation unit, as this causes a redefinition error.
The best way to avoid this is to always include the generated .h in the
c(pp) file and never in a header.
If a compilation unit needs to use two packs at the same time, we have to
undef 'getResource' and use the real getResource_* functions.
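As a rough illustration of the scheme, a generator might emit a header
like the one produced below. The function make_header(), the include
guard name and the C++ signature are assumptions. Only the
getResource_* naming and the getResource define come from the
description above.

```python
def make_header(pack: str) -> str:
    """Return the text of a hypothetical generated header for one pack."""
    guard = f"RESOURCE_PACK_{pack.upper()}_H"
    return "\n".join([
        f"#ifndef {guard}",
        f"#define {guard}",
        "#include <string>",
        # One real function per pack, so several packs can be linked together.
        f"const std::string& getResource_{pack}(const std::string& name);",
        # Convenience alias; two such headers in one compilation unit
        # would redefine it, hence the include-in-.cpp-only advice.
        f"#define getResource(a) getResource_{pack}(a)",
        "#endif",
        "",
    ])
```

A compilation unit that needs two packs would include both headers, add
`#undef getResource`, and call getResource_* directly.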
- No more dependency on the reswrap binary (everything is done in Python).
- Resource strings can be accessed directly.
As a side effect, this adds a compile-time check that the resource is
declared and compiled into the binary.
- The resource content can be overridden at runtime with an env variable.
There is also some cleanup in the static directory, as some files should
be in the tools directory.
The compile_resource script is installed so that other projects can use it.