ws.parser_helpers.wikicode module
- ws.parser_helpers.wikicode.strip_markup(text, normalize=True, collapse=True)
Parses the given text and returns the text after stripping all MediaWiki markup, leaving only the plain text.
- Parameters
normalize – passed to mwparserfromhell.wikicode.Wikicode.strip_code()
collapse – passed to mwparserfromhell.wikicode.Wikicode.strip_code()
- Returns
the plain text with all MediaWiki markup stripped
- ws.parser_helpers.wikicode.get_adjacent_node(wikicode, node, ignore_whitespace=False)
Get the node immediately following node in wikicode.
- Parameters
wikicode – a mwparserfromhell.wikicode.Wikicode object
node – a mwparserfromhell.nodes.Node object
ignore_whitespace – when True, the whitespace between node and the node being returned is ignored, i.e. the returned object is guaranteed not to be a whitespace-only text node, although it may still be a text node with leading whitespace
- Returns
a mwparserfromhell.nodes.Node object or None if node is the last object in wikicode
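The effect of ignore_whitespace can be illustrated with a minimal sketch. It is not the library's implementation: plain strings stand in for mwparserfromhell nodes, and the name adjacent_item is hypothetical.

```python
from typing import Optional, Sequence


def adjacent_item(items: Sequence[str], index: int,
                  ignore_whitespace: bool = False) -> Optional[str]:
    """Return the item following items[index], or None if there is none.

    Assumption for this sketch: plain strings play the role of nodes.
    """
    for candidate in items[index + 1:]:
        # Skip items that consist only of whitespace when requested.
        if ignore_whitespace and candidate.strip() == "":
            continue
        return candidate
    return None


nodes = ["[[Link]]", "  ", " text", "{{Template}}"]
print(repr(adjacent_item(nodes, 0)))                          # '  '
print(repr(adjacent_item(nodes, 0, ignore_whitespace=True)))  # ' text'
print(repr(adjacent_item(nodes, 3)))                          # None
```

Note that the second call still returns a string with a leading space: only whitespace-only items are skipped.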
- ws.parser_helpers.wikicode.get_parent_wikicode(wikicode, node)
Returns the parent of node as a wikicode object. Raises ValueError if node is not a descendant of wikicode.
- ws.parser_helpers.wikicode.remove_and_squash(wikicode, obj)
Remove obj from wikicode and fix whitespace in the place it was removed from.
- ws.parser_helpers.wikicode.get_section_headings(text)
Extracts section headings from the given text. A custom regular expression is used instead of mwparserfromhell for performance reasons.
Note
Known issues:
templates are not handled (call ws.parser_helpers.template_expansion.expand_templates() before calling this function)
- Parameters
text (str) – content of the wiki page
- Returns
list of section headings (without the = marks)
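A regular-expression approach to heading extraction might look like the following sketch. The pattern and the name sketch_section_headings are assumptions made for illustration; the expression actually used by the library may differ, and templates are not handled here either.

```python
import re
from typing import List

# Hypothetical pattern: a line that starts with a run of "=" marks and
# ends with the same run, with the heading text in between.
HEADING_RE = re.compile(r"^(=+)[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def sketch_section_headings(text: str) -> List[str]:
    """Return the heading texts found in text, without the "=" marks."""
    return [match.group(2) for match in HEADING_RE.finditer(text)]


page = "Intro\n== Installation ==\nBody\n=== From source ===\n==Usage==\n"
print(sketch_section_headings(page))
# ['Installation', 'From source', 'Usage']
```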
- ws.parser_helpers.wikicode.get_anchors(headings, pretty=False, suffix_sep='_')
Converts section headings to anchors.
Note
Known issues:
templates are not handled (call ws.parser_helpers.template_expansion.expand_templates() on the wikitext before extracting section headings)
all tags are always stripped, even invalid tags (mwparserfromhell is not that configurable)
if pretty is True, tags escaped with <nowiki> in the input are not encoded in the output
- Parameters
headings (list) – section headings (obtained e.g. with get_section_headings())
pretty (bool) – if True, the anchors will be as pretty as possible (e.g. for use in wikilinks), otherwise they are fully dot-encoded
suffix_sep (str) – the separator between the base anchor and the numeric suffix for duplicate section names
- Returns
list of section anchors
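The role of suffix_sep for duplicate section names can be sketched as follows. This is a simplified illustration, not the library's code: it neither strips markup nor dot-encodes anything, the name sketch_anchors is hypothetical, and numbering the second occurrence with 2 is an assumption modeled on MediaWiki's usual behaviour.

```python
from collections import Counter
from typing import List


def sketch_anchors(headings: List[str], suffix_sep: str = "_") -> List[str]:
    """Turn headings into anchors, numbering repeated headings."""
    seen: Counter = Counter()
    anchors = []
    for heading in headings:
        # Collapse runs of whitespace into single underscores.
        base = "_".join(heading.split())
        seen[base] += 1
        if seen[base] == 1:
            anchors.append(base)
        else:
            # Assumed convention: the second occurrence gets suffix 2.
            anchors.append(f"{base}{suffix_sep}{seen[base]}")
    return anchors


print(sketch_anchors(["Usage", "See also", "Usage", "Usage"]))
# ['Usage', 'See_also', 'Usage_2', 'Usage_3']
```

The sketch does not guard against a literal heading such as "Usage_2" colliding with a generated anchor.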
- ws.parser_helpers.wikicode.ensure_flagged_by_template(wikicode, node, template_name, *template_parameters, overwrite_parameters=True)
Makes sure that node in wikicode is immediately (except for whitespace) followed by a template with template_name and optional template_parameters.
- Parameters
wikicode – a mwparserfromhell.wikicode.Wikicode object
node – a mwparserfromhell.nodes.Node object
template_name (str) – the name of the template flag
template_parameters – optional template parameters
- Returns
the template flag, as a mwparserfromhell.nodes.template.Template object
- ws.parser_helpers.wikicode.ensure_unflagged_by_template(wikicode, node, template_name, *, match_only_prefix=False)
Makes sure that node in wikicode is not immediately (except for whitespace) followed by a template with template_name.
- Parameters
wikicode – a mwparserfromhell.wikicode.Wikicode object
node – a mwparserfromhell.nodes.Node object
template_name (str) – the name of the template flag
match_only_prefix (bool) – if True, only the prefix of the adjacent template must match template_name
- ws.parser_helpers.wikicode.is_redirect(text, *, full_match=False)
Checks if the text represents a MediaWiki redirect page.
- Parameters
full_match (bool) – if True, the function returns True only for pages that contain nothing but the redirect line
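The difference between the default behaviour and full_match can be sketched with a regular expression. This is an assumption-laden illustration rather than the library's implementation: the name sketch_is_redirect is hypothetical, and only the English magic word #REDIRECT is recognized, whereas a real wiki may define localized aliases.

```python
import re

# Assumption: the redirect line is "#REDIRECT [[Target]]" (case-insensitive),
# optionally preceded by whitespace.
REDIRECT_RE = re.compile(r"\s*#redirect\s*:?\s*\[\[[^\[\]\n]+\]\]", re.IGNORECASE)


def sketch_is_redirect(text: str, *, full_match: bool = False) -> bool:
    """Check whether text starts with a redirect line."""
    match = REDIRECT_RE.match(text)
    if match is None:
        return False
    if full_match:
        # Nothing but whitespace may follow the redirect line.
        return text[match.end():].strip() == ""
    return True


page = "#REDIRECT [[Main page]]\n[[Category:Redirects]]"
print(sketch_is_redirect(page))                   # True
print(sketch_is_redirect(page, full_match=True))  # False
```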
- ws.parser_helpers.wikicode.parented_ifilter(wikicode, recursive=True, matches=None, flags=re.IGNORECASE | re.UNICODE | re.DOTALL, forcetype=None)
Iterate over nodes and their corresponding parents.
The arguments are interpreted as for ifilter(). For each tuple (parent, node) yielded by this method, parent is the direct parent wikicode of node.
The method is intended for performance optimization by avoiding expensive searches, e.g. in the replace method. See the mwparserfromhell issue for details: https://github.com/earwig/mwparserfromhell/issues/195
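The idea of yielding each node together with its direct parent can be sketched on a toy tree. The classes Code and Node and the function parented_iter are hypothetical stand-ins for this illustration; they are not mwparserfromhell types, and filtering arguments such as matches or forcetype are left out.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    value: str
    children: List["Code"] = field(default_factory=list)


@dataclass
class Code:
    nodes: List[Node] = field(default_factory=list)


def parented_iter(code: Code) -> Iterator[Tuple[Code, Node]]:
    """Yield (parent, node) pairs, descending into nested containers."""
    for node in code.nodes:
        yield code, node
        for child in node.children:
            yield from parented_iter(child)


inner = Code([Node("b")])
outer = Code([Node("a", [inner]), Node("c")])

# Materialize the pairs first so that the tree can be modified safely.
for parent, node in list(parented_iter(outer)):
    if node.value == "b":
        # The direct parent is already known, so no search from the root is needed.
        parent.nodes.remove(node)

print(inner.nodes)  # []
```

Because the direct parent arrives with each node, a modification touches only that container instead of searching the whole tree for the node first.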