ws.db.schema module
Known incompatibilities with the MediaWiki schema:
Not binary compatible, but stores the same data. Thus compatibility can be achieved via wiki-scripts <-> MWAPI interface, but wiki-scripts can’t read a MediaWiki database directly. This wouldn’t be possible even theoretically, since the database can contain serialized PHP objects etc.
Added some custom tables.
Enforced foreign key constraints, including namespaces stored in custom tables, and some check constraints.
Columns not available via the API (e.g. user passwords) are nullable, since they are not part of the mirroring process. Likewise revision.rev_text_id is nullable so that we can sync metadata and text separately.
- Removed columns that were deprecated even in MediaWiki:
page.page_restrictions
archive.ar_text
archive.ar_flags
Reordered columns in archive table to match the revision table.
Revamped the protected_titles table: removed the unnecessary columns pt_user, pt_reason and pt_timestamp, since this information can be found in the logging table. See https://phabricator.wikimedia.org/T65318#2654217 for reference.
Boolean columns use Boolean type instead of SmallInteger as in MediaWiki.
Unknown/invalid IDs are represented by NULL instead of 0, except for user_id, where we add a dummy user with id = 0 to represent anonymous users.
Removed default values from all timestamp columns.
Removed silly default values: if we don’t know a value, we store NULL.
- Revamped the tags tables:
Besides the tag name, we need to store everything that MediaWiki generates or stores elsewhere.
The change_tag table was split into tagged_recentchange, tagged_logevent, tagged_revision and tagged_archived_revision. Foreign keys on the other tables are enforced.
There is no equivalent of the tag_summary table; we can live with GROUP BY queries.
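As a rough illustration of the GROUP BY approach mentioned for the tags tables, the sketch below rebuilds a per-revision tag list on demand. It uses the standard-library sqlite3 module only so that it runs anywhere. The table layout and the column names (tag_id, tag_name, tgrev_rev_id, tgrev_tag_id) are assumptions made for the example and are not taken from the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified layout: column names are assumptions.
    CREATE TABLE tag (
        tag_id   INTEGER PRIMARY KEY,
        tag_name TEXT NOT NULL UNIQUE
    );
    -- One row per (revision, tag) pair.
    CREATE TABLE tagged_revision (
        tgrev_rev_id INTEGER NOT NULL,
        tgrev_tag_id INTEGER NOT NULL REFERENCES tag (tag_id),
        PRIMARY KEY (tgrev_rev_id, tgrev_tag_id)
    );
    INSERT INTO tag VALUES (1, 'mobile edit'), (2, 'visualeditor');
    INSERT INTO tagged_revision VALUES (100, 1), (100, 2), (101, 2);
""")

# Aggregate the tags of each revision in a single query instead of
# maintaining a separate summary table.
rows = conn.execute("""
    SELECT tgrev_rev_id, group_concat(tag_name, ',') AS tags
    FROM tagged_revision
    JOIN tag ON tag_id = tgrev_tag_id
    GROUP BY tgrev_rev_id
""").fetchall()

# group_concat does not guarantee an order, so sort in Python.
# Splitting on ',' assumes that tag names contain no commas.
tags_by_revision = {rev_id: sorted(tags.split(",")) for rev_id, tags in rows}
print(tags_by_revision)
```

The trade-off is a join and an aggregation at query time in exchange for not having to keep a denormalized summary in sync.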
- Various notes on tables used by MediaWiki, but not wiki-scripts:
site_stats: we don’t sync the site stats because the values are inconsistent even in MediaWiki
sites, site_identifiers: as of MW 1.28, they are not visible via the API
job, objectcache, querycache*, transcache, updatelog: not needed for wiki-scripts operation
user_former_groups: used only to prevent user auto-promotion into groups from which they were already removed; not visible through the API
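The interplay between enforced foreign keys, NULL for unknown IDs and the dummy user with id = 0 can be sketched with a small, self-contained example. It uses sqlite3 from the standard library rather than the real database backend, and the two tables are heavily simplified. The column set and the dummy user's name are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless asked to.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Hypothetical, simplified tables: not the real column set.
    CREATE TABLE user (
        user_id   INTEGER PRIMARY KEY,
        user_name TEXT NOT NULL
    );
    CREATE TABLE revision (
        rev_id        INTEGER PRIMARY KEY,
        rev_user      INTEGER NOT NULL REFERENCES user (user_id),
        rev_parent_id INTEGER REFERENCES revision (rev_id)
    );
    -- Dummy user standing in for anonymous users (the name is assumed).
    INSERT INTO user VALUES (0, 'Anonymous');
    -- Anonymous edit without a parent: rev_user = 0, rev_parent_id = NULL.
    INSERT INTO revision VALUES (1, 0, NULL);
""")

# A MediaWiki-style "0 means unknown" parent ID violates the foreign key,
# because no revision with rev_id = 0 exists.
try:
    conn.execute("INSERT INTO revision VALUES (2, 0, 0)")
    zero_parent_accepted = True
except sqlite3.IntegrityError:
    zero_parent_accepted = False

rows = conn.execute(
    "SELECT rev_id, rev_user, rev_parent_id FROM revision"
).fetchall()
print(zero_parent_accepted, rows)
```

With the constraints enforced, 0 can only be used where a matching row really exists, which is why user_id gets a dummy row while other unknown references are stored as NULL.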
- ws.db.schema.create_custom_tables(metadata)
- ws.db.schema.create_site_tables(metadata)
- ws.db.schema.create_recentchanges_tables(metadata)
- ws.db.schema.create_users_tables(metadata)
- ws.db.schema.create_revisions_tables(metadata)
- ws.db.schema.create_pages_tables(metadata)
- ws.db.schema.create_recomputable_tables(metadata)
- ws.db.schema.create_multimedia_tables(metadata)
- ws.db.schema.create_tables(metadata)