wiki-scripts documentation¶
The wiki-scripts project is a general framework for writing bots and maintenance
scripts or for performing data analysis using the MediaWiki API. The
repository includes several scripts automating common maintenance tasks on the
ArchWiki, but most of the functionality implemented in the underlying ws
module is general and reusable on any MediaWiki-powered wiki.
Notable features¶
General features of the ws module:
Custom interface for connecting to the MediaWiki API (ws.client).
Automatic handling of the CSRF token.
Automatic conversion of MediaWiki timestamp strings into Python's datetime.datetime objects and back.
Automatic handling of API query-continuation.
Class for easy handling of MediaWiki page titles (Title).
Custom SQL database capable of mirroring (almost) all data stored on the wiki (useful e.g. for caching expensive queries involving revision content).
Recursive template expansion using mwparserfromhell and the SQL database. See ws.parser_helpers.template_expansion.
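To make two of the features above more concrete, here is a minimal standard-library sketch of timestamp conversion and query continuation. It is not the ws implementation: the function names, the choice of timezone-aware UTC datetimes, and the send_request callable (anything that takes a dict of API parameters and returns the decoded JSON response) are assumptions made for illustration only.

```python
from datetime import datetime, timezone

# Timestamp format used in MediaWiki API responses, e.g. "2001-01-15T14:56:00Z"
MW_TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_timestamp(value):
    """Convert a MediaWiki API timestamp string into an aware UTC datetime."""
    return datetime.strptime(value, MW_TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def format_timestamp(dt):
    """Convert a datetime back into a MediaWiki API timestamp string.

    Note: a naive datetime is interpreted as local time by astimezone().
    """
    return dt.astimezone(timezone.utc).strftime(MW_TIMESTAMP_FORMAT)


def query_continue(send_request, params):
    """Yield the "query" part of each response until the API stops
    returning a "continue" element.

    send_request is a hypothetical callable: dict of parameters in,
    decoded JSON response (dict) out.
    """
    request = {**params, "action": "query"}
    while True:
        response = send_request(request)
        if "query" in response:
            yield response["query"]
        if "continue" not in response:
            break
        # Repeat the original query with the continuation values merged in.
        request = {**params, "action": "query", **response["continue"]}
```

A caller would iterate over query_continue(send_request, {"list": "allpages"}) and process each chunk as it arrives, without having to track the continuation parameters manually.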
Featured scripts¶
interlanguage.py updates the interlanguage links based on the ArchWiki’s interlanguage map and fixes categories of local pages.
link-checker.py parses all pages on the wiki and tries to fix various functional and stylistic issues with wikilinks, external links and manual page links.
url-replace.py parses all pages on the wiki and performs various replacements on external link URLs. This functionality is also included in link-checker.py.
extlink-checker.py parses all pages and checks if external links are accessible and marks them with the Dead link template if they are clearly broken.
statistics.py generates automatic updates to the ArchWiki:Statistics page.
toc.py generates the Table of contents page and its localized versions.
update-package-templates.py finds broken links using the AUR/Grp/Pkg templates and tries to update them (for example when a package has been moved from the AUR to the official repositories).
For a full list of available scripts, see the root directory of the git
repository. The examples directory contains less notable scripts
showing various ways of using the core ws module.
Installation¶
Get the latest development version by cloning the git repository:
git clone git@github.com:lahwaacz/wiki-scripts.git
cd wiki-scripts
Alternatively, download a tarball of the latest stable release.
There is no package on PyPI or any other repository yet, so all dependencies have to be installed manually.
Requirements¶
Python version 3
The following are required only by some scripts:
WikEdDiff (for highlighting differences between revisions in interactive mode)
Pygments (alternative highlighter when WikEdDiff is not available)
pyalpm (for update-package-templates.py)
NumPy and matplotlib (for statistics_histograms.py)
hstspreload (for link-checker.py and url-replace.py)
Optional dependencies:
PostgreSQL server, SQLAlchemy, Alembic and a driver such as Psycopg2 (for local database caching)
Tk/Tcl (for copying the output of statistics.py to the clipboard)
colorlog (for colorized logging output)
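Since the dependencies are installed manually, it can help to check which of the Python packages listed above are importable in the current environment. The following is a rough sketch, not part of wiki-scripts; the import names in the table are assumptions that may differ from the project names (for example, Tk/Tcl is checked through the tkinter module), and the PostgreSQL server itself is not checked.

```python
from importlib.util import find_spec

# Project name -> assumed import name. These import names are assumptions
# and should be adjusted if a package uses a different module name.
OPTIONAL_MODULES = {
    "WikEdDiff": "WikEdDiff",
    "Pygments": "pygments",
    "pyalpm": "pyalpm",
    "NumPy": "numpy",
    "matplotlib": "matplotlib",
    "hstspreload": "hstspreload",
    "SQLAlchemy": "sqlalchemy",
    "Alembic": "alembic",
    "Psycopg2": "psycopg2",
    "Tk/Tcl": "tkinter",
    "colorlog": "colorlog",
}


def check_modules(modules):
    """Return a dict mapping each project name to True if its module can be found."""
    return {name: find_spec(module) is not None for name, module in modules.items()}


if __name__ == "__main__":
    for name, available in check_modules(OPTIONAL_MODULES).items():
        print(f"{name}: {'found' if available else 'missing'}")
```

The check only locates the modules without importing them, so it runs quickly and does not trigger side effects of the packages themselves.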
Dependencies for running the tests:
Necessary Python packages are installed automatically in the virtual environments.
Other tools used for development:
Acknowledgement¶
A list of client software is maintained on mediawiki.org; many of the listed projects are quite inspirational.
simplemediawiki is the original inspiration for the core ws.client.connection and (partially) ws.client.api modules.
Some scripts are inspired by Wiki Monkey’s plugins, but (obviously) were written from scratch.