User:Pathoschild/ToyEngine

From Meta, a Wikimedia project coordination wiki

These are notes for a work in progress.


ToyEngine is a small, modular framework which simplifies creating tools on the Wikimedia Toolserver. It provides logging, cached wiki data, simple caching with expiry, and a database provider built on top of PDO. The database provider enables easy parameterized SQL, simplifies connecting and querying, and optimizes connections when iterating through multiple wiki databases.

Examples

Get a user's edit count on every wiki

// get the framework
require_once('toyengine/ToyEngine.php');
$engine = new ToyEngine();

// start the database provider
$db = $engine->getDatabase();

// iterate over each wiki ($userName is assumed to be set by the caller)
$wikis = $engine->getWikis();
foreach($wikis->getWikis() as $wiki) {
	$db->connect($wiki->name());
	$count = $db->query('SELECT user_editcount FROM user WHERE user_name = ?', $userName)->fetchValue();
	echo "The user has {$count} edits on {$wiki->domain()}.\n";
}

Build a drop-down list of open wikis (example)

// get the framework
require_once('toyengine/ToyEngine.php');
$engine = new ToyEngine();

// build list
echo "<select name='wiki'>";
$wikis = $engine->getWikis();
foreach($wikis->getDomainHash(false/*no locked wikis*/) as $name => $domain)
	echo "<option value='{$name}'>{$domain}</option>";
echo "</select>";

Cache an expensive operation

// get the framework
require_once('toyengine/ToyEngine.php');
$engine = new ToyEngine();

// get the result from the cache
// if it's not cached yet, this calls the function, caches the result, and returns it
$cache = $engine->getCache();
$result = $cache->getOrSave('expensive-operation', function() {
	return expensiveQuery();
});

Code interfaces

ToyEngine

/**
 * A framework which simplifies development on the Wikimedia Toolserver. It uses interfaces with a factory to
 * construct the core objects, which lets reusers override behaviour or implementations.
 */
class ToyEngine {
	/**
	 * Construct the ToyEngine framework instance.
	 * @param IFactory $factory Provides implementation instances for the ToyEngine framework (or null for the default
	 * implementation).
	 */
	public function __construct(IFactory $factory = null);

	/**
	 * Get an object which reads and writes temporary data to an underlying key=>value storage with expiry.
	 * @return ICache
	 */
	public function getCache();

	/**
	 * Get an object which provides data about the Wikimedia wikis and their replicated Toolserver databases.
	 * @return IWikis
	 */
	public function getWikis();

	/**
	 * Get an object which manages connections and reads or writes data to the database.
	 * @return IDatabase
	 */
	public function getDatabase();

	/**
	 * Get an object which writes debug messages to the output.
	 * @return ITracer
	 */
	public function getTracer();

	/**
	 * Format a string as a wiki username. (This method is multibyte-aware.)
	 * @param string $str The string to format
	 * @return string The formatted string: trimmed, with an uppercase first letter, and with underscores replaced with
	 * spaces.
	 */
	public function formatUsername($str);

	/**
	 * Enforce a schema defining valid arguments and default values on a key=>value array.
	 * If the arguments contain keys not found in the schema, an exception is thrown. Missing keys are added using the
	 * default values specified in the schema.
	 * @param array $arguments An associative array to apply the schema to.
	 * @param array $schema An associative array whose keys are the allowed keys in $arguments, and whose values are the
	 * default values to apply to missing keys.
	 * @return array The modified argument array.
	 * @throws \UnexpectedValueException The argument array contains keys not found in the schema.
	 */
	public function applyArgumentSchema($arguments, $schema);
}
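
The two helper methods above can be illustrated with standalone functions. This is only a sketch of the documented behaviour, not the framework's code: the function names `sketchFormatUsername` and `sketchApplyArgumentSchema` are made up for illustration, and the first one assumes UTF-8 input and the mbstring extension.

```php
<?php
// Hypothetical standalone versions of the two helpers; the real methods live on ToyEngine.

/**
 * Replace underscores with spaces, trim, and uppercase the first character.
 * Assumes UTF-8 input and the mbstring extension.
 */
function sketchFormatUsername($str) {
	$str = trim(str_replace('_', ' ', $str));
	if ($str === '')
		return $str;
	$first = mb_substr($str, 0, 1, 'UTF-8');
	$rest = mb_substr($str, 1, null, 'UTF-8');
	return mb_strtoupper($first, 'UTF-8') . $rest;
}

/**
 * Reject keys that are not in the schema, then fill in missing keys with the schema defaults.
 */
function sketchApplyArgumentSchema(array $arguments, array $schema) {
	$unknown = array_diff_key($arguments, $schema);
	if (count($unknown) > 0)
		throw new UnexpectedValueException('Unknown argument(s): ' . implode(', ', array_keys($unknown)));

	// the + operator keeps values already in $arguments and adds missing keys from $schema
	return $arguments + $schema;
}
```

The array union operator is used here because it never overwrites a key that the caller supplied, which matches the "defaults for missing keys" rule described in the docblock.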

Cache

/**
 * Reads and writes data to storage with expiry dates.
 */
interface ICache {
	/**
	 * Set a date before which to purge data. Cached items stored before this date will be deleted on retrieval, as if
	 * they had expired.
	 * @param \DateTime $purgeDate The purge date.
	 */
	public function setPurgeDate($purgeDate);

	/**
	 * Get cached data by its key.
	 * @param string $key The key with which the data was cached.
	 * @return mixed|null The cached item's value if found, else null.
	 */
	public function get($key);

	/**
	 * Get a cached item by its key.
	 * @param string $key The key with which the cache item was cached.
	 * @return ICacheItem|null The cache item if found, else null.
	 */
	public function getWithMetadata($key);

	/**
	 * Get cached data by its key, or save a new value if there is no cached value.
	 * @param string $key The key by which the data can be retrieved.
	 * @param callback $getValue Generates the value to cache if the cache does not contain the value.
	 * @param \DateInterval|null $expiry The duration for which to cache the item (or null for one day).
	 * @return mixed
	 */
	public function getOrSave($key, $getValue, $expiry = null);


	/**
	 * Save an object to the cache.
	 * @param string $key The key by which the data can be retrieved.
	 * @param mixed $value The data to cache.
	 * @param \DateInterval|null $expiry The duration for which to cache the item (or null for one day).
	 */
	public function save($key, $value, $expiry = null);
}
/**
 * Represents a cached item with expiry.
 */
interface ICacheItem {
	/**
	 * The date on which the item was cached.
	 * @return \DateTime
	 */
	public function date();

	/**
	 * The date on which the item will expire.
	 * @return \DateTime
	 */
	public function expiry();

	/**
	 * Whether the item should be purged.
	 * @param \DateTime|null $minDate The date before which to purge cache items.
	 * @return boolean
	 */
	public function isPurged($minDate = null);

	/**
	 * Whether the item has exceeded its expiry date.
	 * @return boolean
	 */
	public function isExpired();

	/**
	 * The data that was cached.
	 * @return mixed
	 */
	public function value();
}
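
The get-or-save behaviour described above can be sketched with an in-memory stand-in. The class name `SketchMemoryCache` and its array-based storage are assumptions made for illustration; the framework's own cache may store data differently. Note that this sketch cannot distinguish a cached null from a miss.

```php
<?php
// Hypothetical in-memory stand-in for an ICache implementation; names are illustrative.
class SketchMemoryCache {
	private $items = array();

	public function get($key) {
		if (!isset($this->items[$key]))
			return null;
		$item = $this->items[$key];
		if ($item['expiry'] <= new DateTime()) {
			unset($this->items[$key]); // expired: drop it and report a miss
			return null;
		}
		return $item['value'];
	}

	public function save($key, $value, $expiry = null) {
		if ($expiry === null)
			$expiry = new DateInterval('P1D'); // one day, matching the documented default
		$expires = new DateTime();
		$expires->add($expiry);
		$this->items[$key] = array('value' => $value, 'expiry' => $expires);
	}

	public function getOrSave($key, $getValue, $expiry = null) {
		$value = $this->get($key);
		if ($value === null) {
			// miss: generate the value once, then store it for later calls
			$value = call_user_func($getValue);
			$this->save($key, $value, $expiry);
		}
		return $value;
	}
}
```

With this shape, the callback runs only on a miss, so calling getOrSave() twice with the same key runs the expensive operation once until the item expires.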

Database

/**
 * Manages connections and reads or writes data to the database.
 *
 * This database implementation wraps PHP Data Objects (PDO) with optimization and convenience methods.
 *
 * It stores open database connections internally (to avoid opening new connections unnecessarily) and wraps many of
 * PDO's methods for error handling (e.g. when Database::ERROR_PRINT is set). When an error occurs, it ignores all
 * further calls until the next connect() or resetException() call.
 *
 * The db field exposes the PDO object for the currently active database, but should typically not be used directly.
 */
interface IDatabase {
	/**
	 * Get an open connection to a database. If there is an open connection pointing to the same server, that connection
	 * will be flipped to this database and returned.
	 * @param string $database The name of the database to connect to, like 'enwiki' or 'enwiki_p'.
	 * @param string|null $host The server host on which the database resides (or null to determine this automatically for a
	 * replicated wiki database).
	 * @param string|null $username The username used to connect to the server (or null to get this from ~/.my.cnf).
	 * @param string|null $password The password used to connect to the server (or null to get this from ~/.my.cnf).
	 */
	public function connect($database, $host = null, $username = null, $password = null);

	/**
	 * Return to the previous connection.
	 * (For example, this lets you temporarily connect to another database without affecting code in the containing scope.)
	 */
	public function connectPrevious();

	/**
	 * Execute a SQL query against the connected database.
	 * @param string $sql The SQL statement to execute, with parameterized values replaced with placeholders.
	 * @param array|mixed $values The set of values to inject into the query string. This can be a single value, an array
	 * of values, or a variadic set of values.
	 * @return IDatabaseQuery The database query ready to be executed.
	 * @example query('SELECT * FROM table')
	 * @example query('SELECT * FROM table WHERE field IN (?, ?)', array($value, $anotherValue))
	 * @example query('SELECT * FROM table WHERE field IN (?, ?)', $value, $anotherValue)
	 */
	public function query($sql, $values = null);

	/**
	 * Disconnect from the current database. This should generally never be called, because it prevents the provider
	 * from optimising and recycling connections. Connections will be closed automatically when the script ends.
	 */
	public function disconnect();
}
/**
 * Represents the result of a database query.
 */
interface IDatabaseQuery {
	/**
	 * Get the number of rows returned by the query.
	 * @return int
	 */
	public function countRows();

	/**
	 * Fetch a single value from the next row.
	 * @param int $columnNumber The number of the column to retrieve.
	 * @return string
	 */
	public function fetchValue($columnNumber = 0);

	/**
	 * Fetch the next row in the result as an associative array of field names and values.
	 * @return array
	 * @example: array('id' => 42, 'name' => 'someValue', 'anotherField' => 'anotherValue')
	 */
	public function fetchAssoc();

	/**
	 * Fetch all result rows as a multidimensional associative array of field names and values.
	 * @return array
	 * @example: array(
	 *      0 => array('id' => 42, 'name' => 'someValue', 'anotherField' => 'anotherValue'),
	 *      1 => array('id' => 43, 'name' => 'something', 'anotherField' => 'some value')
	 * )
	 */
	public function fetchAllAssoc();

	/**
	 * Fetch every value in a column as a flat array.
	 * @param int $columnNumber The number of the column to retrieve.
	 * @return array
	 */
	public function fetchAllColumn($columnNumber = 0);
}
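
The three call styles documented for query() (no values, one array, or a variadic list) can be collapsed into one flat list before binding. The helper below is a sketch of one way to do that; `sketchNormalizeQueryValues` is a hypothetical name, and treating a lone null as "no values" is an assumption based on the `$values = null` default in the signature.

```php
<?php
/**
 * Hypothetical helper: collapse the arguments that follow $sql into one flat list,
 * which could then be passed to PDOStatement::execute().
 * @param array $args Everything passed to query() after the SQL string.
 * @return array
 */
function sketchNormalizeQueryValues(array $args) {
	// no values: query('SELECT ...')
	if (count($args) === 0 || (count($args) === 1 && $args[0] === null))
		return array();

	// a single array: query($sql, array($a, $b))
	if (count($args) === 1 && is_array($args[0]))
		return array_values($args[0]);

	// a single value or a variadic list: query($sql, $a) or query($sql, $a, $b)
	return $args;
}
```

Inside a query() implementation, the extra arguments could be collected with `array_slice(func_get_args(), 1)` and passed through this helper.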

Logging

/**
 * Writes debug messages to the output.
 */
interface ITracer {
	//########
	// Constants
	//########
	/**
	 * Minimal severity for trivial details used in debugging.
	 */
	const DEBUG = 1;

	/**
	 * Normal severity for informational messages about application state.
	 */
	const INFO = 2;

	/**
	 * Moderate severity for potential issues.
	 */
	const WARN = 3;

	/**
	 * High severity for an error condition or failure.
	 */
	const ERROR = 4;

	/**
	 * Critical severity for a problem that requires immediate attention.
	 */
	const CRITICAL = 5;

	//########
	// Methods
	//########
	/**
	 * Write a message to the output with the given severity.
	 * @param string $msg The message to write.
	 * @param int $severity The severity of the message, matching an ITracer constant.
	 */
	public function log($msg, $severity);

	/**
	 * Write a message to the output with a DEBUG severity.
	 * @param string $msg The message to write.
	 */
	public function debug($msg);

	/**
	 * Write a message to the output with an INFO severity.
	 * @param string $msg The message to write.
	 */
	public function info($msg);

	/**
	 * Write a message to the output with a WARN severity.
	 * @param string $msg The message to write.
	 */
	public function warn($msg);

	/**
	 * Write a message to the output with an ERROR severity.
	 * @param string $msg The message to write.
	 */
	public function error($msg);

	/**
	 * Write a message to the output with a CRITICAL severity.
	 * @param string $msg The message to write.
	 */
	public function critical($msg);
}
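
As a rough illustration of how the shortcut methods relate to log(), the sketch below routes every shortcut through log() and collects the lines in memory. The class name `SketchTracer`, the `$lines` property, and the minimum-severity filter are all assumptions added for the example; the interface above does not specify them.

```php
<?php
// Hypothetical tracer that collects lines in memory instead of printing them.
class SketchTracer {
	const DEBUG = 1;
	const INFO = 2;
	const WARN = 3;
	const ERROR = 4;
	const CRITICAL = 5;

	public $lines = array();
	private $minSeverity;
	private static $labels = array(1 => 'DEBUG', 2 => 'INFO', 3 => 'WARN', 4 => 'ERROR', 5 => 'CRITICAL');

	// The minimum severity is an assumption of this sketch, not part of ITracer.
	public function __construct($minSeverity = self::DEBUG) {
		$this->minSeverity = $minSeverity;
	}

	public function log($msg, $severity) {
		if ($severity < $this->minSeverity)
			return; // below the threshold: ignore
		$this->lines[] = '[' . self::$labels[$severity] . '] ' . $msg;
	}

	// each shortcut delegates to log() with the matching constant
	public function debug($msg) { $this->log($msg, self::DEBUG); }
	public function info($msg) { $this->log($msg, self::INFO); }
	public function warn($msg) { $this->log($msg, self::WARN); }
	public function error($msg) { $this->log($msg, self::ERROR); }
	public function critical($msg) { $this->log($msg, self::CRITICAL); }
}
```

Keeping all output in log() means a replacement tracer only needs to change one method to redirect messages elsewhere.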

Wikis

/**
 * Provides data about wikis and their replicated Toolserver databases.
 */
interface IWikis {
	/**
	 * Get metadata about known wikis and their replicated databases.
	 * @return IWiki[]
	 */
	public function getWikis();

	/**
	 * Get a dbname=>domain hash of wikis.
	 * @param bool $includeClosed Whether to include closed wikis in the hash.
	 * @return array
	 */
	public function getDomainHash($includeClosed = false);

	/**
	 * Get metadata about a known wiki and its replicated database.
	 * @param string $name The database name of the wiki, matching {IWiki->name()} or {IWiki->dbName()}.
	 * @return IWiki|null Returns the matching wiki, or null if none was found.
	 */
	public function getWiki($name);

	/**
	 * Get the replicated database server host for a wiki.
	 * @param string $name The database name of the wiki, matching {IWiki->name()} or {IWiki->dbName()}.
	 * @return string|null Returns the matching wiki's database host, or null if none was found.
	 */
	public function getHost($name);

	/**
	 * Get the wiki domain for a wiki.
	 * @param string $name The database name of the wiki, matching {IWiki->name()} or {IWiki->dbName()}.
	 * @return string|null Returns the matching wiki's domain, or null if none was found.
	 */
	public function getDomain($name);
}
/**
 * Provides metadata about a wiki and its replicated Toolserver database.
 */
interface IWiki {
	/**
	 * The simplified database name (like 'enwiki').
	 * @return string
	 */
	public function name();

	/**
	 * The database name (like 'enwiki_p').
	 * @return string
	 */
	public function dbName();

	/**
	 * The ISO 639 language code associated with the wiki (like 'fr').
	 * Note: a few wikis have invalid codes (like 'zh-classical').
	 * @return string
	 */
	public function language();

	/**
	 * The wiki family or project name (like 'wikibooks').
	 * @return string
	 */
	public function family();

	/**
	 * The domain portion of the URL (like 'en.wikisource.org'). This may be null for closed wikis.
	 * @return string|null
	 */
	public function domain();

	/**
	 * The number of articles on the wiki.
	 * @return int
	 */
	public function size();

	/**
	 * The number of the server on which the wiki's replicated database is located (like 2).
	 * @return int
	 */
	public function serverNumber();

	/**
	 * The host name of the server on which the wiki's replicated database is located (like 'sql-s2-rr.toolserver.org').
	 * @return string
	 */
	public function serverHost();

	/**
	 * Whether the wiki is a meta-project like the Wikimedia Foundation wiki or Metawiki.
	 * @return bool
	 */
	public function isMeta();

	/**
	 * Whether the wiki is locked and no longer editable by the public.
	 * @return bool
	 */
	public function isClosed();

	/**
	 * Whether the wiki has multilingual content.
	 * @return bool
	 */
	public function isMultilingual();
}
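
The dbname=>domain hash described above could be built from the wiki list roughly as follows. This is a sketch only: it uses plain arrays in place of IWiki objects, the function name `sketchDomainHash` is hypothetical, and 'examplewiki' is a made-up closed wiki.

```php
<?php
// Hypothetical wiki records as plain arrays; the real framework exposes IWiki objects.
function sketchDomainHash(array $wikis, $includeClosed = false) {
	$hash = array();
	foreach ($wikis as $wiki) {
		// skip closed wikis unless the caller asked for them
		if ($wiki['isClosed'] && !$includeClosed)
			continue;
		$hash[$wiki['name']] = $wiki['domain'];
	}
	return $hash;
}

$sampleWikis = array(
	array('name' => 'enwiki', 'domain' => 'en.wikipedia.org', 'isClosed' => false),
	array('name' => 'frwikibooks', 'domain' => 'fr.wikibooks.org', 'isClosed' => false),
	array('name' => 'examplewiki', 'domain' => null, 'isClosed' => true), // made-up closed wiki
);
$openDomains = sketchDomainHash($sampleWikis);
```

Because closed wikis may have no domain, callers that pass true for $includeClosed should be prepared for null values in the hash.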

Factory

/**
 * Provides implementation instances for the ToyEngine framework.
 */
interface IFactory {
	/**
	 * Get an object which reads and writes temporary data to an underlying key=>value storage with expiry.
	 * @return ICache
	 */
	public function getCache();

	/**
	 * Get an object which provides data about the Wikimedia wikis and their replicated Toolserver databases.
	 * @return IWikis
	 */
	public function getWikis();

	/**
	 * Get an object which manages connections and reads or writes data to the database.
	 * @return IDatabase
	 */
	public function getDatabase();

	/**
	 * Get an object which writes debug messages to the output.
	 * @return ITracer
	 */
	public function getTracer();
}

Extending the framework

The framework uses PHP 5 interfaces for all its modules. Modules can be replaced with new implementations by extending the ToyEngine object, or by overriding methods on the Factory object and passing the factory instance into the ToyEngine constructor.
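
The factory approach might look like the sketch below. To keep it runnable without the framework, it uses small stand-in classes; every name here (`ISketchLog`, `SketchEngine`, `SketchFactory`, and so on) is hypothetical and only mirrors the shape of ToyEngine and IFactory.

```php
<?php
// Minimal stand-ins so the sketch runs without the framework; every name below is hypothetical.
interface ISketchLog {
	public function info($msg);
}
class SketchEchoLog implements ISketchLog {
	public function info($msg) { echo $msg, "\n"; }
}
class SketchFactory {
	public function getTracer() { return new SketchEchoLog(); }
}
class SketchEngine {
	private $factory;
	public function __construct($factory = null) {
		// fall back to the default factory when none is given
		$this->factory = $factory !== null ? $factory : new SketchFactory();
	}
	public function getTracer() { return $this->factory->getTracer(); }
}

// A replacement module that keeps messages in memory, for example in tests.
class SketchMemoryLog implements ISketchLog {
	public $messages = array();
	public function info($msg) { $this->messages[] = $msg; }
}

// Override one factory method; everything else keeps the default behaviour.
class SketchQuietFactory extends SketchFactory {
	private $tracer;
	public function getTracer() {
		if ($this->tracer === null)
			$this->tracer = new SketchMemoryLog();
		return $this->tracer;
	}
}

$engine = new SketchEngine(new SketchQuietFactory());
$engine->getTracer()->info('hello');
```

Code that only depends on the interface keeps working unchanged, since the engine hands out whatever the factory returns.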

To do

  • Templating? This should probably be done with something like Smarty instead.