Wikidata:Pywikibot - Python 3 Tutorial/Talk Page Messages

Bots can post helpful messages and templates on talk pages. This tutorial will provide you with everything you need to know to do that.

We will use the following lines to get the page object of the test.wikidata main page's talk page.

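import pywikibot

# Connect to the test instance of Wikidata.
site = pywikibot.Site("test", "wikidata")
repo = site.data_repository()
# Page object for the talk page of the main page.
page = pywikibot.Page(site, "Wikidata talk:Main Page")

print(type(page))  # the class of the object
print(dir(page))   # the names of its attributes and methods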

The first print statement shows that the call to pywikibot.Page() returns a pywikibot.page.Page object. The second lists the attributes and methods of this object:

['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__unicode__', '__weakref__', '_cmpkey', '_cosmetic_changes_hook', '_getInternals', '_get_parsed_page', '_latest_cached_revision', '_link', '_revisions', '_save', 'applicable_protections', 'aslink', 'autoFormat', 'backlinks', 'botMayEdit', 'canBeEdited', 'categories', 'change_category', 'content_model', 'contributingUsers', 'contributors', 'coordinates', 'data_item', 'data_repository', 'defaultsort', 'delete', 'editTime', 'embeddedin', 'encoding', 'exists', 'expand_text', 'extlinks', 'fullVersionHistory', 'full_url', 'get', 'getCategoryRedirectTarget', 'getCreator', 'getDeletedRevision', 'getLatestEditors', 'getMovedTarget', 'getOldVersion', 'getRedirectTarget', 'getReferences', 'getRestrictions', 'getTemplates', 'getVersionHistory', 'getVersionHistoryTable', 'image_repository', 'imagelinks', 'interwiki', 'isAutoTitle', 'isCategory', 'isCategoryRedirect', 'isDisambig', 'isEmpty', 'isFlowPage', 'isImage', 'isIpEdit', 'isRedirectPage', 'isStaticRedirect', 'isTalkPage', 'is_flow_page', 'iterlanglinks', 'itertemplates', 'langlinks', 'lastNonBotUser', 'latestRevision', 'latest_revision', 'latest_revision_id', 'linkedPages', 'loadDeletedRevisions', 'markDeletedRevision', 'move', 'moved_target', 'namespace', 'oldest_revision', 'pageAPInfo', 'permalink', 'preloadText', 'previousRevision', 'previous_revision_id', 'properties', 'protect', 'protection', 'purge', 'put', 'put_async', 'removeImage', 'replaceImage', 'revision_count', 'revisions', 'save', 'section', 'sectionFreeTitle', 'set_redirect_target', 'site', 'templates', 'templatesWithParams', 'text', 'title', 'titleForFilename', 'titleWithoutNamespace', 'toggleTalkPage', 'touch', 'undelete', 'urlname', 'userName', 
'version', 'watch']
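Most of the names at the start of this list are internal. As a small sketch, the list can be narrowed to the public names; public_names is a hypothetical helper, not part of Pywikibot, and a plain string stands in for the page object so the example runs without a wiki connection.

```python
def public_names(obj):
    """Return the attribute names of obj that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


# With the tutorial's page object this would be: print(public_names(page))
print(public_names("example")[:3])
```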

Two of these are useful for what we are trying to do. We can either read and change the text of the page (through the text attribute, which is a property rather than a method) and then save it, or we can call the save() method directly, which also accepts arguments that add content.
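The rest of this tutorial follows the second route. For comparison, here is a minimal sketch of the first one, assuming that page.text can be read and assigned as a string. The names append_section and post_via_text are hypothetical helpers introduced for illustration; only the first one is executed here, so the example runs without Pywikibot installed.

```python
def append_section(old_text, heading, content):
    """Return old_text with a new signed section appended."""
    section = "== {} ==\n{} --~~~~".format(heading, content)
    # Separate the new section from existing text with a blank line.
    if old_text.strip():
        return old_text.rstrip("\n") + "\n\n" + section
    return section


def post_via_text(title, heading, content, summary):
    """Read the page text, append a section, and save the result."""
    import pywikibot  # imported here so the helper above runs without Pywikibot

    site = pywikibot.Site("test", "wikidata")
    page = pywikibot.Page(site, title)
    page.text = append_section(page.text, heading, content)
    page.save(summary=summary, minor=False)


print(append_section("Old text", "Test edit", "This is a test edit"))
```

This route sends the whole page text back to the wiki, whereas the appendtext keyword discussed later only sends the addition.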

Let us first look at the save() method. If you open a terminal and navigate to the Pywikibot project directory, you can use grep to find the source of this method:

$ grep -rn "def save"
..
pywikibot/page.py:1046:    def save(self, summary=None, watch=None, minor=True, botflag=None,
...

The -rn option tells grep to search through all directories recursively (-r) and to print the line number of each match (-n). The text between the double quotes is the expression that is matched.
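If grep is not available, Python's inspect module can report similar information from within the interpreter. The following is a sketch only: where_defined is a hypothetical helper, and json.dumps serves as a stand-in function that is always installed. With Pywikibot importable, where_defined(pywikibot.Page.save) is expected to point into the installed package.

```python
import inspect
import json


def where_defined(func):
    """Return (source file, first line number) for a Python function."""
    path = inspect.getsourcefile(func)
    _, first_line = inspect.getsourcelines(func)
    return path, first_line


# json.dumps is only a stand-in; any function written in Python works.
path, line = where_defined(json.dumps)
print("{}:{}".format(path, line))
# The signature lists the parameters without opening the file.
print(inspect.signature(json.dumps))
```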

We can see that we need to open the page.py file and look at line 1046 for the definition of the function (please note that the line number will differ in your copy, as pywikibot is under constant development). The function header looks like this:

    def save(self, summary=None, watch=None, minor=True, botflag=None,
             force=False, async=False, callback=None,
             apply_cosmetic_changes=None, **kwargs):

Now we can see everything we can pass to the function (apart from self, which the object already provides). The docstring tells us that summary is a string containing the edit summary, watch lets us add the page to our watchlist after saving, and minor marks the edit as a minor edit. The botflag argument marks the edit as a bot edit. You should set this to True so that people can interpret the edit history more easily. If a page is protected from bot edits (page.botMayEdit() returns False), then force=True can override this setting. Think carefully about whether that is a good idea before you use it. We can ignore the other keywords for the moment.

The **kwargs is the next argument that interests us. It means that we can pass additional keyword arguments (kwargs) that are interpreted by other functions. To find out where the **kwargs are used, we have to read the code below the function header. If you follow the function calls correctly, you should end up at the line def editpage() in site.py (otherwise try to grep for it). This is the place where many of the **kwargs are handled. The docstring shows us that the keyword appendtext can be used to add content to the page. We will try this with our next script:
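The way **kwargs carries a keyword from one function to the next can be seen in a toy example. The two functions below are simplified stand-ins written for illustration. They borrow the names save and editpage but are not Pywikibot's real code, and they make no edits.

```python
def editpage(summary=None, minor=True, appendtext=None, **kwargs):
    """Toy stand-in for the lower-level function that consumes the extras."""
    return {"summary": summary, "minor": minor, "appendtext": appendtext}


def save(summary=None, minor=True, **kwargs):
    """Toy stand-in for save(): it does not name appendtext itself."""
    # Anything save() does not name lands in kwargs and is passed on unchanged.
    return editpage(summary=summary, minor=minor, **kwargs)


result = save(summary="Testing", minor=False, appendtext="\n\nHello")
print(result)
```

Although save() never mentions appendtext, the value still reaches editpage(), which is the pattern the real call chain relies on.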

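import pywikibot

site = pywikibot.Site("test", "wikidata")
repo = site.data_repository()
page = pywikibot.Page(site, "Wikidata talk:Main Page")

heading = "== Test edit =="
content = "This is another test edit"
# Two leading newlines separate the new section from the existing text;
# --~~~~ is expanded to the bot's signature when the edit is saved.
message = "\n\n{}\n{} --~~~~".format(heading, content)

# The async argument from the function header is left at its default:
# async is a reserved word since Python 3.7, so writing async=False is a
# syntax error there (newer Pywikibot releases call it asynchronous).
page.save(summary="Testing", watch=None, minor=False, botflag=True,
          force=False, callback=None,
          apply_cosmetic_changes=None, appendtext=message)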

As you can see, we just define the message that we would like to post and call the page.save method. We pass every argument as a keyword to make the script easier to understand; the last keyword, appendtext, is the only one that ends up in the **kwargs dictionary. This program posts a new section on https://test.wikidata.org/wiki/Wikidata_talk:Main_Page, and after running it you can look at that page to check that everything worked.
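The check can also be done in code. The sketch below looks for a level-2 heading in a piece of wikitext; has_section is a hypothetical helper, and the sample string stands in for the page text so the example runs offline.

```python
import re


def has_section(wikitext, heading):
    """Return True if wikitext contains a level-2 section with this heading."""
    pattern = r"^==\s*{}\s*==\s*$".format(re.escape(heading))
    return re.search(pattern, wikitext, flags=re.MULTILINE) is not None


sample = "Intro\n\n== Test edit ==\nThis is another test edit --Example"
print(has_section(sample, "Test edit"))
print(has_section(sample, "Missing"))
```

Called as has_section(page.text, "Test edit") before saving, the same helper could keep a bot from posting the same section twice.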