Fix regression in parsing nested wikilinks in file captions
This regression seems more severe than the bug the commit was
attempting to fix (incorrect parsing of nested wikilinks in normal
links), so that bug is reintroduced until localization-aware parsing
that allows us to detect file links is added.

This commit partially reverts fac60de.
earwig committed Feb 14, 2022
1 parent 492fc22 commit 2155638
Showing 5 changed files with 34 additions and 23 deletions.
5 changes: 4 additions & 1 deletion CHANGELOG
@@ -1,7 +1,10 @@
-v0.7 (unreleased):
+v0.6.4 (unreleased):

 - Dropped support for end-of-life Python 3.5.
 - Added support for Python 3.10. (#278)
+- Fixed a regression in v0.6.2 that broke parsing of nested wikilinks in file
+  captions. For now, the parser will interpret nested wikilinks in normal links
+  as well, even though this differs from MediaWiki. (#270)

 v0.6.3 (released September 2, 2021):

9 changes: 7 additions & 2 deletions docs/changelog.rst
@@ -1,14 +1,19 @@
Changelog
=========

v0.7
----
v0.6.4
------

Unreleased
(`changes <https://github.com/earwig/mwparserfromhell/compare/v0.6.3...develop>`__):

- Dropped support for end-of-life Python 3.5.
- Added support for Python 3.10.
(`#278 <https://github.com/earwig/mwparserfromhell/issues/278>`_)
- Fixed a regression in v0.6.2 that broke parsing of nested wikilinks in file
captions. For now, the parser will interpret nested wikilinks in normal links
as well, even though this differs from MediaWiki.
(`#270 <https://github.com/earwig/mwparserfromhell/issues/270>`_)

v0.6.3
------
10 changes: 6 additions & 4 deletions src/mwparserfromhell/parser/ctokenizer/tok_parse.c
@@ -51,7 +51,8 @@ static int Tokenizer_parse_tag(Tokenizer *);
 /*
     Determine whether the given code point is a marker.
 */
-static int is_marker(Py_UCS4 this)
+static int
+is_marker(Py_UCS4 this)
 {
     int i;

@@ -2929,9 +2930,10 @@ Tokenizer_parse(Tokenizer *self, uint64_t context, int push)
                 return NULL;
             }
         } else if (this == next && next == '[' && Tokenizer_CAN_RECURSE(self)) {
-            if (this_context & LC_WIKILINK_TEXT) {
-                return Tokenizer_fail_route(self);
-            }
+            // TODO: Only do this if not in a file context:
+            // if (this_context & LC_WIKILINK_TEXT) {
+            //     return Tokenizer_fail_route(self);
+            // }
             if (!(this_context & AGG_NO_WIKILINKS)) {
                 if (Tokenizer_parse_wikilink(self)) {
                     return NULL;
5 changes: 3 additions & 2 deletions src/mwparserfromhell/parser/tokenizer.py
@@ -1406,8 +1406,9 @@ def _parse(self, context=0, push=True):
                     return self._handle_argument_end()
                 self._emit_text("}")
             elif this == nxt == "[" and self._can_recurse():
-                if self._context & contexts.WIKILINK_TEXT:
-                    self._fail_route()
+                # TODO: Only do this if not in a file context:
+                # if self._context & contexts.WIKILINK_TEXT:
+                #     self._fail_route()
                 if not self._context & contexts.NO_WIKILINKS:
                     self._parse_wikilink()
                 else:
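
The TODO above needs some way to tell whether the enclosing link is a file link. As a rough sketch only: the helper below is hypothetical and not part of mwparserfromhell, and it assumes the caller can supply the wiki's file-namespace names. The defaults cover only the English names, which is exactly why the commit message calls for localization-aware parsing.

```python
# Hypothetical helper, not part of mwparserfromhell.
# Assumption: the caller knows the wiki's file-namespace names; the
# defaults below are the English names only.
DEFAULT_FILE_NAMESPACES = frozenset({"file", "image"})


def is_file_title(title, namespaces=DEFAULT_FILE_NAMESPACES):
    """Return True if a wikilink title looks like a file embed."""
    prefix, sep, rest = title.partition(":")
    if not sep or not rest.strip():
        # No namespace prefix, or nothing after the colon.
        return False
    # A leading colon ("[[:File:foo]]") yields an empty prefix, so a plain
    # link to the file page is not treated as an embed.
    return prefix.strip().replace("_", " ").casefold() in namespaces
```

On a wiki with localized namespace names, the same check would need that wiki's names passed in, for example `is_file_title("Datei:Foo.png", namespaces={"datei", "bild"})`.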
28 changes: 14 additions & 14 deletions tests/tokenizer/wikilinks.mwtest
@@ -54,6 +54,20 @@ output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="bar[b

 ---

+name: nested
+label: a wikilink nested within another
+input: "[[file:foo|[[bar]]]]"
+output: [WikilinkOpen(), Text(text="file:foo"), WikilinkSeparator(), WikilinkOpen(), Text(text="bar"), WikilinkClose(), WikilinkClose()]
+
+---
+
+name: nested_padding
+label: a wikilink nested within another, separated by other data
+input: "[[file:foo|a[[b]]c]]"
+output: [WikilinkOpen(), Text(text="file:foo"), WikilinkSeparator(), Text(text="a"), WikilinkOpen(), Text(text="b"), WikilinkClose(), Text(text="c"), WikilinkClose()]
+
+---
+
 name: invalid_newline
 label: invalid wikilink: newline as only content
 input: "[[\n]]"
@@ -89,20 +103,6 @@ output: [Text(text="[[foo[bar]]")]

 ---

-name: invalid_nested_text
-label: invalid wikilink: nested within the text of another
-input: "[[foo|[[bar]]]]"
-output: [Text(text="[[foo|"), WikilinkOpen(), Text(text="bar"), WikilinkClose(), Text(text="]]")]
-
----
-
-name: invalid_nested_text_2
-label: invalid wikilink: a wikilink nested within the text of another, with additional content
-input: "[[foo|a[[b]]c]]"
-output: [Text(text="[[foo|a"), WikilinkOpen(), Text(text="b"), WikilinkClose(), Text(text="c]]")]
-
----
-
 name: invalid_nested_title
 label: invalid wikilink: nested within the title of another
 input: "[[foo[[bar]]]]"
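
The two behaviours exercised by the tests in this file can be illustrated with a toy tokenizer. This is a simplified sketch, not the library's tokenizer: the function names, the tuple token format, and the `allow_nested` flag are all assumptions made for illustration, and it does no title validation. With `allow_nested=True` it mimics the behaviour restored by this commit; with `allow_nested=False` it mimics the "fail the route" behaviour that the commented-out check used to provide.

```python
def _parse_link(text, pos, allow_nested):
    """Parse a link whose '[[' starts at pos; return (tokens, end) or None."""
    tokens = [("WikilinkOpen",)]
    buf = []
    seen_separator = False
    pos += 2  # skip the opening "[["

    def flush():
        if buf:
            tokens.append(("Text", "".join(buf)))
            buf.clear()

    while pos < len(text):
        if text.startswith("]]", pos):
            flush()
            tokens.append(("WikilinkClose",))
            return tokens, pos + 2
        if seen_separator and text.startswith("[[", pos):
            if not allow_nested:
                return None  # fail the route: the outer link becomes text
            nested = _parse_link(text, pos, allow_nested)
            if nested is not None:
                flush()
                tokens.extend(nested[0])
                pos = nested[1]
                continue
        if text[pos] == "|" and not seen_separator:
            flush()
            tokens.append(("WikilinkSeparator",))
            seen_separator = True
        else:
            buf.append(text[pos])
        pos += 1
    return None  # unterminated link


def tokenize(text, allow_nested=True):
    """Tokenize text into Text and Wikilink* tuples (toy model only)."""
    tokens, buf, pos = [], [], 0
    while pos < len(text):
        link = None
        if text.startswith("[[", pos):
            link = _parse_link(text, pos, allow_nested)
        if link is None:
            buf.append(text[pos])  # not a valid link here: keep as text
            pos += 1
            continue
        if buf:
            tokens.append(("Text", "".join(buf)))
            buf.clear()
        tokens.extend(link[0])
        pos = link[1]
    if buf:
        tokens.append(("Text", "".join(buf)))
    return tokens
```

Calling `tokenize("[[file:foo|[[bar]]]]")` yields the nested token sequence from the `nested` test, while `tokenize("[[foo|[[bar]]]]", allow_nested=False)` yields the text-wrapped sequence from the removed `invalid_nested_text` test.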
