This is a quick introduction to using Apibot. For a simple hello world program, see the quick start guide.

Using the bot

To use the bot, you will typically create a simple PHP file. In it you include some bot modules, create some bot objects and have the objects do the work you need.

This PHP file is the only one you have to write yourself. It usually requires very little PHP knowledge, just the basics of using Apibot. For an average computer user this will be about an hour of learning. :-)

A hands-on introduction follows.

The Core object

If you build a house, or factory, or pretty much anything else, you should provide it with electricity, water, phone, Internet etc. Collectively, all these utilities are often called core.

Apibot has its Core, too - a set of objects that provide the different basic functions and utilities for the bot. They are contained in the Core class. An object of this class is needed to create almost every bot module.

To create an object of the Core class, you need a wiki login and (optionally) a set of bot settings. In Apibot, wiki logins are typically stored in an array called $logins, which is defined in a file called logins.php. The bot settings are typically stored in an array called $bot_settings, which is defined in settings.php. Finally, the Core class is defined in the file core/core.php.

To create a Core object, put lines like these near the beginning of your PHP file:

require_once ( dirname ( __FILE__ ) . '/apibot/settings.php' );
require_once ( dirname ( __FILE__ ) . '/apibot/logins.php' );
require_once ( dirname ( __FILE__ ) . '/apibot/core/core.php' );

$core = new Core ( $logins[''], $bot_settings );

(Hint: Copy and paste these lines. The example assumes that Apibot is in a subdirectory of the current directory named "apibot", and that the $logins array contains a login with the key used in the example. If that is not the case, edit where appropriate.)

Internally, the Core object is a container for several other objects that export the functionality needed by the bot modules and/or by the bot user / programmer. All are public, that is, can be accessed directly from external modules. Here they are:

  • Settings - stores and exports the settings the bot was created with
  • Browser - handles HTTP(S) connections, e.g. those between the bot and the MediaWiki server
  • Exchanger - handles the MediaWiki-specific data transfers. (As the API and the Web backends require different request handling, there are two Exchanger objects - exchanger_api and exchanger_web.)
  • Identity - stores and exports the account identity related settings and info
  • Info - stores, refreshes, handles and exports all kinds of info about the wiki the bot works with, the user account it uses etc. (That is a lot of info!)
  • Infostore - saves to file(s) and retrieves information. (Used by the Info and Identity objects, but you also may use it for your own info.)
  • Tokens - stores and exports the MediaWiki request tokens. (As the API and the Web backends may require different request tokens, there are two Tokens objects - tokens_api and tokens_web.)
  • Log - provides logging for the bot modules

The user interfaces

Apibot is highly modular and Lego-like. It has objects that provide different tasks, different levels and ease of access, etc. If you know some PHP and/or are knowledgeable about the MediaWiki API, you can even make your own bot from Apibot parts. :-) However, most users will probably opt for its standard user interfaces.

Apibot has two different user interfaces - Bridge and Assembly line. A short description of both follows.

The Bridge interface


Bridge is the classic Apibot interface, similar to most MediaWiki bots (including Apibot 0.3x).

It is implemented by the class Bridge, found in bridge.php. Using its functions, you can feel like an admiral that commands a nuclear carrier from its bridge. :-)

The class exports two types of functions. One type is the wiki tasks - fetching a page, submitting the page text, blocking / unblocking users, uploading files, reverting vandal edits, etc. The other type is the wiki queries - functions that return objects that can list wiki items (pages, users, page revisions, log entries etc.).

In addition, this class exports an Info class - a wealth of information, more than most users would believe to exist. It offers over 500 functions that return different kinds of info about the wiki you work with, the user account you use, etc.

The upside of this interface is that it is very similar to Apibot 0.3x, as well as to most other MediaWiki bots. The downside is that using it requires slightly more PHP knowledge than the Assembly line interface.


To use it, you must create an object of the Bridge class, implemented in the file bridge.php. Insert in your PHP file, under the lines that create the $core object, lines like the following:

require_once ( dirname ( __FILE__ ) . '/apibot/bridge/bridge.php' );

$bridge = new Bridge ( $core );

Query functions

The Bridge interface gives you the means to query the wiki for information. MediaWiki can supply many types of info. Accordingly, the Bridge interface allows for many different types of queries.

Typically you will call a Bridge function that returns a Query object of a specific type. Some of the functions need no parameters at all (but can be given some). For example, the query that lists the wiki users needs no mandatory info. Other query functions need some key info - for example, the query that lists the revisions of a page needs the page title or pageid.

Here are some examples of creating a Query object:

$query_revs = $bridge->query_title_revisions ( $title );
$query_l_au = $bridge->query_list_allusers();

All query functions can take two additional parameters. The first is an array of parameters that direct the data retrieval, for example to retrieve only the page revisions between two given dates and times. The second holds the config settings for this query. If you don't want to supply a parameter, specify NULL for it, or just leave it out if it is the last one. For example:

$query_revs = $bridge->query_title_revisions ( $title, $params, $settings );
$query_revs = $bridge->query_title_revisions ( $title, $params );
$query_revs = $bridge->query_title_revisions ( $title, NULL, $settings );

$query_l_af = $bridge->query_list_allfiles ( $params, $settings );
$query_l_af = $bridge->query_list_allfiles ( $params );
$query_l_af = $bridge->query_list_allfiles ( NULL, $settings );
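
If you are new to PHP, the rule about NULL and omitted parameters may be easier to see in isolation. The sketch below uses a made-up function, demo_query, that is not part of Apibot. It only assumes that the optional parameters are declared with NULL defaults, which is the usual PHP way to get the behaviour described above.

```php
<?php
// Hypothetical function, for illustration only; NOT part of Apibot.
// Both trailing parameters default to NULL, so callers may omit them.
function demo_query ( $title, $params = NULL, $settings = NULL ) {
    return array (
        'title'    => $title,
        'params'   => ( $params === NULL ) ? 'defaults' : $params,
        'settings' => ( $settings === NULL ) ? 'defaults' : $settings,
    );
}

// Trailing parameters are simply left out.
$only_title = demo_query ( 'Main Page' );

// A parameter in the middle cannot be left out, so NULL holds its place.
// ('example_setting' is a made-up key, not a real Apibot setting.)
$with_settings = demo_query ( 'Main Page', NULL, array ( 'example_setting' => true ) );

print_r ( $only_title );
print_r ( $with_settings );
```

A parameter can only be left out when everything after it is left out too. That is why the third call form in the examples above passes NULL instead of $params.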

(Click here for a full list of the Bridge Query functions.)

Query objects can supply wiki information according to their type. Here is the typical way to use them:

$result = $query->xfer();           // fetch the first batch of data
while ( $result ) {
  foreach ( $query->data as $element ) {
     // process the data element
  }
  $result = $query->next();         // fetch the next batch, if any
}

Queries also export some other functions - setting parameters to them, etc. (Click here for a full list of the functions of a Query object.)
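
To make the batch-by-batch logic of that loop easier to follow, here is a small self-contained sketch. Demo_Query is a made-up stand-in, not Apibot's Query class. It only imitates the three things the loop relies on (xfer(), next() and the data property) and serves a fixed list in batches of two, assuming both functions return false once nothing is left.

```php
<?php
// Hypothetical stand-in for a Query object; NOT Apibot's actual class.
// It serves pre-split batches of elements the way the loop expects.
class Demo_Query {
    public $data = array();   // the current batch, as read by the loop
    private $batches;         // the batches still to be served

    public function __construct ( array $elements, $batch_size ) {
        $this->batches = array_chunk ( $elements, $batch_size );
    }

    // Loads the first batch; returns false when there is nothing to load.
    public function xfer() {
        return $this->next();
    }

    // Loads the following batch; returns false when no batches remain.
    public function next() {
        if ( empty ( $this->batches ) ) {
            $this->data = array();
            return false;
        }
        $this->data = array_shift ( $this->batches );
        return true;
    }
}

// Collects every element from every batch, using the same loop shape.
function demo_collect_all ( $query ) {
    $all = array();
    $result = $query->xfer();
    while ( $result ) {
        foreach ( $query->data as $element ) {
            $all[] = $element;
        }
        $result = $query->next();
    }
    return $all;
}

$titles = demo_collect_all ( new Demo_Query ( array ( 'A', 'B', 'C', 'D', 'E' ), 2 ) );
echo implode ( ', ', $titles ), "\n";   // prints "A, B, C, D, E"
```

The outer loop runs once per batch and the inner loop once per element, so the code that processes an element never has to care how the wiki split the results.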

Non-Query functions

These functions typically change the wiki data (page contents, user rights), perform some action, or retrieve HTML code.

Most functions shown here support more parameters, which can be found in their descriptions. Also, most will return either what they fetched (a page object, HTML text or other), or a Boolean value telling whether their action was successful.

// Edit a page
$page = $bridge->fetch_editable ( "My Pride of a Page" );  // get the page I work on
$page->replace_string ( "Teh", "The" );                    // replace everywhere in it this nasty typo
$bridge->edit ( $page, "Fixed a nagging typo", true );     // update the page with this comment and a Minor change flag

// Block a user
$bridge->block ( "Nasty Vandal", "You deserved it!" );

// Delete a page (on most wikis you will need administrator rights)
// ... by title:
$bridge->delete_title ( "My Failed Attempt", "This was a garbage, sorry" );
// ... by pageid (these are unique numeric page identifiers that do not change with renaming the page):
$bridge->delete_pageid ( 32187, "Created by mistake" );

// Send an email to another user, by username (some wikis do not allow this, or require certain rights):
$bridge->emailuser ( "My Friend", $subject, $message_text );

// Expand page templates (as if they are directly embedded in the text instead of called by name):
$expanded_text = $bridge->expandtemplates ( $text );  // $text contains some wikitext

// Import text from interwiki (on a multi-language installation):
$bridge->import_interwiki ( "My New Page", "Will translate this tomorrow", "de" );

// Import text from an XML file:
$bridge->import_xml ( $xml_file, "These pages were prepared offline - get them!" ); // $xml_file contains the file

// Move (rename) a page (on some wikis you might need certain rights):
// ... by title:
$bridge->move_title ( "Old Title", "New Title", "This name suits it better" );
// ... by pageid:
$bridge->move_pageid ( 473843, "New Title", "This name suits it better" );  // 473843 is the pageid of the page

// Parse (render to HTML) a page:
$html = $bridge->parse_page ( "My Beloved Page" ); // $html will contain the HTML code for the page.

// Parse (render to HTML) arbitrary text:
$html = $bridge->parse_text ( $text );  // $text contains the text to be parsed to HTML

// Patrol a recentchange by ID (mark it to tell the other users "Already checked - don't waste time on it"):
$bridge->patrol ( 387393783 );  // 387393783 is the RCID you want to mark. On most wikis you need certain rights.

// Protect a page (require minimal rights to change it; on most wikis you need certain rights to do this):
$bridge->protect ( "Disputed Page", "Calm down that edit war!", $protections );  // $protections contains a desc

// Purge page cache:
$bridge->purge ( "Lagging Page" );

// Rollback (revert the last user's changes back to the previous user's revision):
$bridge->rollback ( "Vandalized Page", "Cleaning the muck", "Nasty Vandal" );

// Unblock a user:
$bridge->unblock ( "Framed Innocent", "Sorry, that was a mistake - my apologies!" );

// Undelete deleted page revisions:
$bridge->undelete ( "Page Title", "Deleted by mistake", $timestamps ); // $timestamps are those of the deleted revisions

// Undo page edits:
$bridge->undo ( "Page Title", "Misguided changes", $from_revid, $to_revid );  // the revision IDs to start from and to stop at

// Unwatch (mark page as not watched; typically you do this for your account, not for a bot one :-) ):
$bridge->unwatch ( "Page Title" );

// Upload a file:
$bridge->upload_file ( "My Photo.JPG", "Will use it on my user page", $file_body );
// (There are also other page upload functions.)

// Change user rights:
$bridge->userrights ( "Elected For Sysop", "Giving him the privileges", array ( "sysop" ), array() );
// (The first array contains the groups the user must gain. The second one contains the groups s/he must lose.)
// (In most wikis you need certain privileges to change user rights.)

// Watch (mark page as watched; typically you do this for your account, not for a bot one :-) ):
$bridge->watch ( "Page Title" );

Click here for a full list of the Bridge Non-Query functions.


Click here for more examples on using the Bridge interface.


The Assembly line interface


This is the newer interface of Apibot. It follows the logic of an assembly line.

An assembly line typically starts with a person who loads it with things to process (in Apibot parlance, a feeder). Then some people may look at the things and remove those that do not fit the line criteria (filters). Next comes a row of people who work on the things being processed: some modify them (workers), others supply additional details (fetchers). Finally, there are people who unload the finished product from the line - that is, send the results back to the wiki, or write them somewhere (writers).

The Apibot assembly line consists of PHP objects, linked one to another. Each belongs to one of the types mentioned above - feeders, filters, workers, fetchers, writers... Every line starts with a feeder. A chain of other objects follows, linked in the desired order. Finally, there comes a writer that puts the changes into effect - sends them to the wiki, or writes them into a file or database, etc. You just have to string the objects together like Dr. Frankenstein assembling body parts, and start your monster. :-)

Using the line objects is very convenient, since they are highly specialized. Some have no need for setting up at all - for example, a Filter_Page_IsNonRedirect will expect to be passed wiki pages, will discard all pages that are redirects and will let through the pages that are not redirects. Others may be given some parameters - eg. a Feeder_Query_List_Recentchanges can be given a start and end date and time for the recentchanges it should feed. (And a lot of other parameters, should you need to be more specific.) Some may need to be told explicitly what to do - eg. a Worker_EditPage_ReplaceStrings object must be told which strings to replace in the page text, and with what, or it will do nothing. (Currently most line objects cannot read your mind.)

The logic of stringing line objects is similar to the logic of stringing UNIX command-line utilities, and is intuitive. You can see this in the examples on using the Assembly line interface. Object chains can be branched and merged, thus providing the opportunity for very complex processing.
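
If the idea of linked line objects feels abstract, the following miniature may help. It is a rough sketch of the pattern only. Every Demo_ class here is made up for illustration and none of them is an Apibot class, and the way Apibot passes data between its objects internally may well differ. The sketch assumes that each object simply hands the items it accepts to the object linked after it.

```php
<?php
// Hypothetical miniature line objects; NOT Apibot's actual classes.
// Each object remembers the object linked after it and pushes items on.
abstract class Demo_Line_Object {
    protected $next = NULL;

    // Attaches this object after $previous in the line.
    public function link_to ( Demo_Line_Object $previous ) {
        $previous->next = $this;
    }

    // Hands an item to the next object in the line, if there is one.
    protected function pass_on ( $item ) {
        if ( $this->next !== NULL ) {
            $this->next->process ( $item );
        }
    }

    abstract public function process ( $item );
}

// Feeder: loads the line with a fixed list of pages.
class Demo_Feeder extends Demo_Line_Object {
    private $items;
    public function __construct ( array $items ) { $this->items = $items; }
    public function process ( $item ) { $this->pass_on ( $item ); }
    public function feed() {
        foreach ( $this->items as $item ) {
            $this->process ( $item );
        }
    }
}

// Filter: drops pages whose text starts with a redirect marker.
class Demo_Filter_NonRedirect extends Demo_Line_Object {
    public function process ( $item ) {
        if ( stripos ( $item['text'], '#REDIRECT' ) !== 0 ) {
            $this->pass_on ( $item );
        }
    }
}

// Worker: replaces one string with another in the page text.
class Demo_Worker_Replace extends Demo_Line_Object {
    private $string;
    private $with;
    public function __construct ( $string, $with ) {
        $this->string = $string;
        $this->with = $with;
    }
    public function process ( $item ) {
        $item['text'] = str_replace ( $this->string, $this->with, $item['text'] );
        $this->pass_on ( $item );
    }
}

// Writer: ends the line; here it just collects the results in memory.
class Demo_Writer_Collect extends Demo_Line_Object {
    public $written = array();
    public function process ( $item ) {
        $this->written[$item['title']] = $item['text'];
    }
}

$feeder = new Demo_Feeder ( array (
    array ( 'title' => 'Cat',   'text' => 'Teh cat sat.' ),
    array ( 'title' => 'Kitty', 'text' => '#REDIRECT [[Cat]]' ),
) );
$filter = new Demo_Filter_NonRedirect();
$filter->link_to ( $feeder );
$worker = new Demo_Worker_Replace ( 'Teh ', 'The ' );
$worker->link_to ( $filter );
$writer = new Demo_Writer_Collect();
$writer->link_to ( $worker );

$feeder->feed();
print_r ( $writer->written );   // only 'Cat' arrives, with the typo fixed
```

Each object knows only about the one linked after it, so objects can be reordered or swapped without touching the rest of the line. Branching and merging are not shown in this sketch.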

Using the Assembly line interface is generally easier than the Bridge interface, especially for complex tasks and novice PHP users. Even if you don't speak PHP, most examples are straightforward enough to be easily modified to suit your needs.


For example, let's fix a popular typo in all wiki articles. Copy-paste the code below in your PHP file, and change it according to your goals.

After including the needed files, we first create a feeder - an object that will supply the data to be processed. (In this case - all wiki articles.)

Then, we create a filter that will remove the redirects from the conveyor belt (these are not articles), and link it to the feeder.

Then, we create a worker that replaces the typo in the text (and provides a report for the page edit subject). We tell it what to replace and with what, and link it to the filter. (See how the line grows? :-) )

Then, we create a writer that submits the edited page (if it was changed) back to the wiki, and link it to the worker.

Finally, we kick the feeder to start feeding. Voila! :-)

require_once ( dirname ( __FILE__ ) . '/apibot/line/feeders/query/list/allpages.php' );
require_once ( dirname ( __FILE__ ) . '/apibot/line/filters/page/text/non_redirect.php' );
require_once ( dirname ( __FILE__ ) . '/apibot/line/workers/edit/replace_strings.php' );
require_once ( dirname ( __FILE__ ) . '/apibot/line/writers/wiki/edit.php' );

$feeder = new Feeder_Query_List_Allpages ( $core );
$feeder->namespace = 0;  // will move over namespace 0 - that of the articles

$filter_nr = new Filter_IsNonRedirect ( $core ); // redirects are not articles
$filter_nr->link_to ( $feeder );

$replaces = array (
  'string' => 'Teh ',
  'with' => 'The ',
  'report' => '$1 typo(s) fixed',
);
$worker_rs = new Worker_EditPage_ReplaceStrings ( $core, $replaces );
$worker_rs->link_to ( $filter_nr );

$writer_edit = new Writer_Edit ( $core );
$writer_edit->link_to ( $worker_rs );


Or, let's say that a vandal spoiled hundreds of pages overnight. Why rollback all by hand? Let the bot do the job:

require_once ( dirname ( __FILE__ ) . '/apibot/line/feeders/query/list/usercontribs.php' );
require_once ( dirname ( __FILE__ ) . '/apibot/line/filters/misc/unique.php' );
require_once ( dirname ( __FILE__ ) . '/apibot/line/writers/wiki/rollback.php' );

$feeder = new Feeder_Query_List_Usercontribs ( $core );
$feeder->set_list_param ( 'user', "Nasty Vandal" );  // the vandal username, obviously

$filter = new Filter_Unique ( $core );
$filter->link_to ( $feeder );

$writer = new Writer_Wiki_Rollback ( $core );
$writer->user = "Nasty Vandal";
$writer->link_to ( $filter );


(Create a feeder that supplies all contributions of the vandal. Then link to it a filter that removes duplicates - let's save some traffic. Then, link a writer that rolls back the page named in each user contribution, or more generally whatever page title the data passed to it contains. Finally, start the feeder. :-) )
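
Why remove duplicates at all? A MediaWiki rollback reverts all consecutive edits by the same user at the top of a page's history in one go, so one rollback per page is enough even if the vandal edited the page several times. The sketch below shows the idea on made-up data. The function demo_unique_by_title is hypothetical, and it is only an assumption that the filter used above compares contributions by page title in a similar way.

```php
<?php
// Made-up user contributions; several edits may touch the same page.
$contribs = array (
    array ( 'title' => 'Page A', 'revid' => 101 ),
    array ( 'title' => 'Page B', 'revid' => 102 ),
    array ( 'title' => 'Page A', 'revid' => 103 ),
);

// Hypothetical helper: keeps only the first contribution seen for each title.
function demo_unique_by_title ( array $contribs ) {
    $seen = array();
    $unique = array();
    foreach ( $contribs as $contrib ) {
        if ( ! isset ( $seen[$contrib['title']] ) ) {
            $seen[$contrib['title']] = true;
            $unique[] = $contrib;
        }
    }
    return $unique;
}

$targets = demo_unique_by_title ( $contribs );
echo count ( $targets ), " pages to roll back\n";   // prints "2 pages to roll back"
```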


Click here for more examples on using the Assembly line interface.