ws.db.sql_types module

Custom types with automatic converters.

Advantages of textual types:
  • natural for the representation of textual (Unicode) data

Disadvantages of textual types:
  • subject to encoding and collation

Advantages of binary types:
  • complete control over the data representation and conversion to Python objects

  • length is in bytes, so there is no storage overhead for ASCII-only data

In MySQL, textual types (*TEXT, *CHAR) represent Unicode strings, but nearly all utf8 collations are case-insensitive (the exception is the binary collation utf8_bin): http://stackoverflow.com/a/4558736/4180822 Binary types (*BLOB, *BINARY), on the other hand, are treated as “byte strings”, i.e. ASCII text with binary collation.

In PostgreSQL, the binary type (bytea) does not represent “strings”, i.e. there are far fewer operations and functions defined on bytea than there are in MySQL for binary strings. By using the textual types, we also benefit from the native conversion functions of the driver used by SQLAlchemy (e.g. psycopg2).

MediaWiki’s PostgreSQL schema uses TEXT for just about everything, i.e. no VARCHAR. The PostgreSQL manual says that there is no performance difference between char(n), varchar(n) and text.

Also note that the MySQL limits are expressed in bytes, whereas the lengths of textual types are measured in characters. Therefore we follow the PostgreSQL schema and use TEXT instead of VARCHAR.
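The difference between a length in characters and a length in bytes can be illustrated with a short sketch. The helper name utf8_lengths and the sample titles are hypothetical and used only for illustration; they are not part of this module.

```python
def utf8_lengths(text):
    """Return a tuple (number of characters, number of UTF-8 bytes)."""
    return len(text), len(text.encode("utf-8"))


# ASCII-only text: one byte per character, so both counts are equal.
print(utf8_lengths("Main page"))

# Non-ASCII text: "í" takes two bytes in UTF-8, so the byte count is larger.
print(utf8_lengths("Hlavní strana"))
```

A limit expressed in bytes therefore admits fewer characters as soon as the data contains non-ASCII text, while a limit expressed in characters does not depend on the content.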

class ws.db.sql_types.MWTimestamp(*args, **kwargs)

Bases: sqlalchemy.sql.type_api.TypeDecorator

Converter for TIMESTAMP columns that handles infinite values.

Construct a TypeDecorator.

Arguments sent here are passed to the constructor of the class assigned to the impl class level attribute, assuming the impl is a callable, and the resulting object is assigned to the self.impl instance attribute (thus overriding the class attribute of the same name).

If the class level impl is not a callable (the unusual case), it will be assigned to the same instance attribute ‘as-is’, ignoring those arguments passed to the constructor.

Subclasses can override this to customize the generation of self.impl entirely.

impl = DateTime()

process_bind_param(value, dialect)

Converts a Python value to its database representation.

process_result_value(value, dialect)

Converts a database value back to its Python representation.
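The following is a minimal sketch of how a pair of conversions for infinite timestamps might look. It is not the actual implementation of MWTimestamp. It assumes that the Python side marks an infinite timestamp with the string "infinity" and that the database side stores it as datetime.datetime.max; both choices, as well as the function names, are assumptions made for illustration.

```python
import datetime

# Assumed marker for an infinite timestamp on the Python side (hypothetical).
INFINITY = "infinity"


def timestamp_to_db(value):
    """Python -> database: map the assumed marker to the largest datetime."""
    if value == INFINITY:
        return datetime.datetime.max
    # None and ordinary datetime objects pass through unchanged.
    return value


def timestamp_from_db(value):
    """Database -> Python: map the largest datetime back to the marker."""
    if value == datetime.datetime.max:
        return INFINITY
    return value
```

In a TypeDecorator subclass, logic of this kind would sit in the bodies of process_bind_param and process_result_value, so that each method reverses the other.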

class ws.db.sql_types.SHA1(*args, **kwargs)

Bases: sqlalchemy.sql.type_api.TypeDecorator

Converter for SHA-1 hashes.

In MediaWiki they are represented as a base36-encoded number in the database and as a hexadecimal string in the API.

In both forms the encoded string has to be padded with zeros to a fixed length: 31 digits in base36 and 40 digits in hex.

impl = LargeBinary(length=31)

process_bind_param(value, dialect)

Converts a Python value to its database representation.

process_result_value(value, dialect)

Converts a database value back to its Python representation.
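The conversion between the two zero-padded forms can be sketched with the standard library only. The function names below are hypothetical, and the sketch works on str values; because impl is a binary type, the actual class presumably also converts to and from bytes, which is omitted here.

```python
import string

# Digits used for base36: 0-9 followed by a-z.
_BASE36_DIGITS = string.digits + string.ascii_lowercase


def base36_encode(number):
    """Encode a non-negative integer as a lowercase base36 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return "0"
    digits = []
    while number:
        number, remainder = divmod(number, 36)
        digits.append(_BASE36_DIGITS[remainder])
    return "".join(reversed(digits))


def sha1_hex_to_base36(hex_digest):
    """Hexadecimal (API form) -> base36 (database form), padded to 31 digits."""
    return base36_encode(int(hex_digest, 16)).zfill(31)


def sha1_base36_to_hex(base36_digest):
    """Base36 (database form) -> hexadecimal (API form), padded to 40 digits."""
    return format(int(base36_digest, 36), "x").zfill(40)
```

The padding matters because the integer conversion drops leading zeros: without zfill, a hash that starts with zeros would come back shorter than the fixed length.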

class ws.db.sql_types.JSONEncodedDict(*args, **kwargs)

Bases: sqlalchemy.sql.type_api.TypeDecorator

Represents an immutable structure as a JSON-encoded string.

impl

alias of sqlalchemy.sql.sqltypes.UnicodeText

process_bind_param(value, dialect)

Receive a bound parameter value to be converted.

Subclasses override this method to return the value that should be passed along to the underlying TypeEngine object, and from there to the DBAPI execute() method.

The operation could be anything desired to perform custom behavior, such as transforming or serializing data. This could also be used as a hook for validating logic.

This operation should be designed with the reverse operation in mind, which would be the process_result_value method of this class.

Parameters
  • value – Data to operate upon, of any type expected by this method in the subclass. Can be None.

  • dialect – the Dialect in use.

process_result_value(value, dialect)

Receive a result-row column value to be converted.

Subclasses should implement this method to operate on data fetched from the database.

Subclasses override this method to return the value that should be passed back to the application, given a value that is already processed by the underlying TypeEngine object, originally from the DBAPI cursor method fetchone() or similar.

The operation could be anything desired to perform custom behavior, such as transforming or serializing data. This could also be used as a hook for validating logic.

Parameters
  • value – Data to operate upon, of any type expected by this method in the subclass. Can be None.

  • dialect – the Dialect in use.

This operation should be designed to be reversible by the “process_bind_param” method of this class.
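For a JSON-encoded column, a reversible pair of conversions might look like the following minimal sketch. It uses only the standard json module; the function names are hypothetical and the actual implementation of JSONEncodedDict may differ, for example in how it treats None.

```python
import json


def dict_to_json(value):
    """Python -> database: serialize the structure to a JSON string."""
    if value is None:
        # Assumption: None is kept as None so that it is stored as NULL.
        return None
    return json.dumps(value)


def json_to_dict(value):
    """Database -> Python: parse the stored JSON string."""
    if value is None:
        return None
    return json.loads(value)
```

Because the stored form is a plain string, a change made inside the decoded structure is not tracked automatically, which is consistent with the class describing the structure as immutable.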