Version 1 (modified by GreenTea <tgreenwoodgeer@…>, 9 years ago)

It's not perfect, but it works

Wiki Diff Tutorial

TurboGears 0.8a3

This is a follow-up to the previous tutorials.

At this point, it is assumed that you:

  • have installed TG
  • have completed the previous tutorials
  • are ready to add mo' good stuff

Here is what this tutorial is all about: wiki diffs. I'd like to add users, user prefs, and the ability to see the diffs of various wiki edits. However, I'm holding off on all the session/user stuff while others round out that framework.

Here we will:

  • refactor the model so that pages have entries
  • refactor the controller to query against pages and entries
  • refactor the controller query code into a helper class
  • add code to diff current against (current - 1)
  • add a diff.kid template to display the diffs

Later, it would be cool to:

  • track who is modifying the wiki pages
  • show latest changes
  • diff arbitrary versions

Refactor Model

Add the Entry class, and set it up as a 1:M relationship with Page.

  • Page highlights
    class Page(SQLObject):
        entries = MultipleJoin('Entry')
        #data is now in the entry's table
    
  • Entry (new) highlights
    class Entry(SQLObject):
        data = StringCol()
        page = ForeignKey('Page')
    

#model.py

from sqlobject import *
from turbogears.database import PackageHub
from datetime import datetime

hub = PackageHub("toddswiki")
__connection__ = hub

class Page(SQLObject):
    pagename = StringCol(alternateID=True, length=30)
    attached_files = RelatedJoin('UploadedFile')
    entries = MultipleJoin('Entry')

class UploadedFile(SQLObject):
    filename = StringCol(alternateID=True)
    abspath = StringCol()
    size = IntCol()
    referenced_in_pages = RelatedJoin('Page')

class Entry(SQLObject):
    data = StringCol()
    #pass the callable (no parentheses) so each new row gets its own timestamp
    mod_date_time = DateTimeCol(default=datetime.now)
    #mod_date_time = DateTimeCol(default=sqlbuilder.func.now())
    revision = IntCol(default=1)
    author = StringCol(length=30, default="anonymous")
    page = ForeignKey('Page')

Page.createTable(ifNotExists=True)
UploadedFile.createTable(ifNotExists=True)
Entry.createTable(ifNotExists=True)
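
To see the shape of the 1:M relationship without SQLObject in the way, here is a minimal sketch using the standard-library sqlite3 module. The table and column names (page, entry, page_id) are assumptions picked to mirror the model above; the schema SQLObject actually generates may differ.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (
        id INTEGER PRIMARY KEY,
        pagename VARCHAR(30) NOT NULL UNIQUE
    );
    CREATE TABLE entry (
        id INTEGER PRIMARY KEY,
        data TEXT,
        revision INTEGER DEFAULT 1,
        author VARCHAR(30) DEFAULT 'anonymous',
        page_id INTEGER REFERENCES page(id)
    );
""")

# One page with two entries pointing back at it (the "many" side).
conn.execute("INSERT INTO page (pagename) VALUES ('FrontPage')")
page_id = conn.execute(
    "SELECT id FROM page WHERE pagename = 'FrontPage'").fetchone()[0]
for revision, data in [(1, "first draft"), (2, "second draft")]:
    conn.execute(
        "INSERT INTO entry (data, revision, page_id) VALUES (?, ?, ?)",
        (data, revision, page_id))

# All entries for the page, oldest first.
entries = conn.execute(
    "SELECT revision, data, author FROM entry "
    "WHERE page_id = ? ORDER BY revision", (page_id,)).fetchall()
print(entries)
```

The page row holds only the name; every edit becomes a new entry row that carries the page's id.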

Create a DB Wrapper

The queries we do are a wee bit complicated, so let's localize the code. What we are querying for is this:

  • Page has a one-to-many (1:M) relationship to Entry
  • Usually, we want the last entry (by revision #) for a given Page
  • Then, we usually add an entry to the page with revision # + 1
  • The code could be cleaner, esp. with respect to the try/except blocks in addPageEntry (but it's working, it's late, and it's going in as is :o) )
    page = Page.byPagename(pagename)
    uploads = [item.filename for item in page.attached_files]
    try:
        maxRevision = Entry.select(AND(Entry.q.pageID==Page.q.id, Page.q.pagename==pagename)).max('revision')
        entry = Entry(page          = page,
                      data          = data,
                      mod_date_time = datetime.datetime.now(),
                      revision      = maxRevision + 1)
    except:
        #no entries yet for this page, so start at revision 1
        entry = Entry(page=page, data=data, mod_date_time=datetime.datetime.now(), revision=1)
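
The "revision # + 1" step can be sketched without any try/except at all. This is only an illustration: next_revision is a hypothetical helper, not part of the wrapper, and it assumes the max('revision') query gives back None when a page has no entries yet.

```python
def next_revision(max_revision):
    """Return the revision number for a new entry.

    max_revision is the highest existing revision for the page, or None
    when the page has no entries yet (an assumption about what the
    aggregate query returns on an empty result).
    """
    if max_revision is None:
        return 1
    return max_revision + 1


print(next_revision(None))  # first entry for a brand new page
print(next_revision(3))     # page already has revisions 1..3
```

With a helper like this, the two nearly identical Entry(...) calls could collapse into one.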
    

Here's the full wrapper class:

#WikiDBWrapper.py

from turbogears import controllers
from model import Page, UploadedFile, Entry
from sqlobject import * 
import os, datetime

class WikiDBWrapper(controllers.Root):
    def getLatestPage(self, pagename, data="getlatestpage\n-------------\n"):
        #verify page exists, if not, return false
        try:
            page = Page.byPagename(pagename)
            uploads = [item.filename for item in page.attached_files]
        except SQLObjectNotFound:
            return False
    
        #get the latest entry
        try:
            maxRev = Entry.select(AND(Entry.q.pageID==Page.q.id, Page.q.pagename==pagename)).max('revision')
            entry = Entry.select(AND(Entry.q.pageID==Page.q.id, Page.q.pagename==pagename, Entry.q.revision==maxRev))[0]
        except:
            #entry does not exist, so create a default entry
            page = self.addPageEntry(pagename, data)
            entry = page.get('entry')
            
        #return all this as a dict
        return dict(pagename=pagename, uploads=uploads, entry=entry)

    def addPageEntry(self, pagename, data="addpageentry\n---------------\n"):
        try:
            page = Page.byPagename(pagename)
            uploads = [item.filename for item in page.attached_files]
            try:
                maxRevision=Entry.select(AND(Entry.q.pageID==Page.q.id, Page.q.pagename==pagename)).max('revision')
                entry = Entry(page          =    page, 
                            data            =    data, 
                            mod_date_time   =    datetime.datetime.now(), 
                            revision        =    maxRevision + 1
                            )
            except:
                entry = Entry(page    =    page, 
                    data              =    data, 
                    mod_date_time     =    datetime.datetime.now(), 
                    revision          =    1
                    )
        except SQLObjectNotFound:
            #page doesn't exist, create it
            page = Page(pagename=pagename)
            uploads=[]
            entry = Entry(page        =    page, 
                    data              =    data, 
                    mod_date_time     =     datetime.datetime.now(), 
                    revision          =     1
                    )
        return dict(pagename=pagename, uploads=uploads, entry=entry)

Refactor the Controller

The controller needs to use our new wrapper class.

  • init the wiki db wrapper
    wiki = WikiDBWrapper()
    
  • change the index method to use the wrapper
    @turbogears.expose(html="toddswiki.templates.page")
    def index(self, pagename="FrontPage"):
        page = wiki.getLatestPage(pagename)
        if not page:
            raise cherrypy.HTTPRedirect(turbogears.url("/notfound",pagename=pagename))
    
  • pass back the expected dict, only referencing the dict returned from getLatestPage
    return dict(data=content, pagename=page.get('pagename'), \
    uploads=page.get('uploads'), date=page.get('entry').mod_date_time, \
    rev=page.get('entry').revision)
    
  • make similar changes to the edit and save methods

#controller.py

import turbogears, cherrypy, re
from turbogears import controllers, validators
from model import Page, hub, UploadedFile, Entry
from docutils.core import publish_parts
from sqlobject import * 
import os, datetime
from cherrypy.lib.cptools import serveFile
import pkg_resources
from WikiDBWrapper import WikiDBWrapper

#regex to find the uppercaseworduppercaseword thingy that is a wiki word
wikiwords = re.compile(r"\b([A-Z]\w+[A-Z]+\w+)")

#default upload dir to ./uploads
UPLOAD_DIR = cherrypy.config.get("wiki.uploads", os.path.join(os.getcwd(),"uploads"))
if not os.path.exists(UPLOAD_DIR):
    os.makedirs(UPLOAD_DIR) 

#init the wiki db wrapper
wiki = WikiDBWrapper()

class Root(controllers.Root):
    @turbogears.expose()
    def favicon_ico(self):
        return serveFile(pkg_resources.resource_filename("toddswiki", "static/favicon.ico"))

    #the quickstart "welcome" index is gone: the index below would silently replace it anyway

    @turbogears.expose(html="toddswiki.templates.page")
    def index(self, pagename="FrontPage"):
        count = cherrypy.session.get('count', 0) + 1
        cherrypy.session['count'] = count
        print 'Counter: %s' % count
    
        page = wiki.getLatestPage(pagename)
        if not page:
            raise cherrypy.HTTPRedirect(turbogears.url("/notfound",pagename=pagename))

        try:
            content = publish_parts(page.get('entry').data, writer_name="html")["html_body"]
        except:
            content = page.get('entry').data

        root = str(turbogears.url("/"))
        content = wikiwords.sub(r'<a href="%s\1">\1</a>' % root, content)
        content = content.encode("utf8")
        return dict(data=content, pagename=page.get('pagename'), uploads=page.get('uploads'), date=page.get('entry').mod_date_time, rev=page.get('entry').revision)

    @turbogears.expose(html="toddswiki.templates.edit")
    def notfound(self, pagename):
        return dict(pagename=pagename, data='Insert Data\n-----------\n', uploads=[])

    @turbogears.expose(html="toddswiki.templates.edit")
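    def edit(self, pagename):
        page = wiki.getLatestPage(pagename)
        if not page:
            #getLatestPage returns False for an unknown page
            raise cherrypy.HTTPRedirect(turbogears.url("/notfound",pagename=pagename))
        return dict(pagename=page.get('pagename'), data=page.get('entry').data, uploads=page.get('uploads'))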

    @turbogears.expose()
    def save(self, pagename, data, submit):
        #the 'new' flag is gone; the database now tracks whether a page exists
        hub.begin()
        wiki.addPageEntry(pagename=pagename, data=data)
        hub.commit()
        hub.end()
        turbogears.flash("Changes saved")
        raise cherrypy.HTTPRedirect(turbogears.url("/%s" % pagename))

    @turbogears.expose(html="toddswiki.templates.page")
    def default(self, pagename):
        return self.index(pagename)
    
    @turbogears.expose(html="toddswiki.templates.pagelist")
    def pagelist(self):
        pages = [page.pagename for page in Page.select(orderBy=Page.q.pagename)]
        return dict(pages=pages)

    @turbogears.expose()
    def upload(self, upload_file, pagename, **keywords):
        try:
            p = Page.byPagename(pagename)
        except SQLObjectNotFound:
            turbogears.flash("Must save page first")
            raise cherrypy.HTTPRedirect(turbogears.url("/%s" % pagename))
        
        total_data=''
        while True:
            data = upload_file.file.read(8192)
            if not data:
                break
            total_data += data
        target_file_name = os.path.join(os.getcwd(),UPLOAD_DIR,upload_file.filename)
        try:
            u =  UploadedFile.byFilename(upload_file.filename)
            turbogears.flash("File already uploaded: %s is already at %s" %  (upload_file.filename, target_file_name))
        except SQLObjectNotFound:
            #binary mode, so uploads are written byte for byte
            f = open(target_file_name, 'wb')
            f.write(total_data)
            f.close()
            turbogears.flash("File uploaded successfully: %s saved as : %s" % (upload_file.filename, target_file_name))
            u = UploadedFile(filename=upload_file.filename, abspath=target_file_name, size=len(total_data))
            
        Page.byPagename(pagename).addUploadedFile(u)
        raise cherrypy.HTTPRedirect(turbogears.url("/%s" % pagename))


    @turbogears.expose()
    def download(self, filename):
        uf = UploadedFile.byFilename(filename)
        return cherrypy.lib.cptools.serveFile(uf.abspath, "application/x-download", "attachment", uf.filename)
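
The wiki-word linking done in index can be tried on its own. A small sketch, using the same regex and substitution as the controller; link_wikiwords is a hypothetical wrapper name, and the root argument stands in for turbogears.url("/").

```python
import re

# Same pattern as the controller: a capital, some word characters,
# another capital, more word characters (e.g. FrontPage).
wikiwords = re.compile(r"\b([A-Z]\w+[A-Z]+\w+)")


def link_wikiwords(html, root="/"):
    """Wrap every wiki word in an anchor pointing at root + word."""
    return wikiwords.sub(r'<a href="%s\1">\1</a>' % root, html)


linked = link_wikiwords("See FrontPage for details.")
print(linked)

# Ordinary capitalized words have no second capital, so they are left alone.
unchanged = link_wikiwords("Hello world")
print(unchanged)
```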

Add Diff Code to the Controller

Add the code to generate the diffs:

  • We don't use the db wrapper here, but these calls could be in that class, too.
       #template name matches the diff.kid file added below
       @turbogears.expose(html="toddswiki.templates.diff")
       def diffLastEntry(self, pagename, rev):
           try:
               maxRev = Entry.select(AND(Entry.q.pageID==Page.q.id, Page.q.pagename==pagename)).max('revision')
               textA = Entry.select(AND(Entry.q.pageID==Page.q.id, Page.q.pagename==pagename, Entry.q.revision==maxRev))[0].data 
               textB = Entry.select(AND(Entry.q.pageID==Page.q.id, Page.q.pagename==pagename, Entry.q.revision==(maxRev -1) ))[0].data 
           except:
               turbogears.flash("Error in diff")
               raise cherrypy.HTTPRedirect(turbogears.url("/%s" % pagename))
    
           return dict(diffresults=self.diffTwoEntries(textA, textB))
    

  • Use the difflib.Differ class to generate the diffs
     def diffTwoEntries(self, textA, textB):
         from difflib import Differ
         d = Differ()
         return d.compare(textB.splitlines(1), textA.splitlines(1))
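
To get a feel for what Differ hands back, here is a standalone sketch with two made-up revisions. The function mirrors diffTwoEntries but wraps the result in a list so it can be printed; the sample text is invented for illustration.

```python
from difflib import Differ


def diff_two_entries(text_a, text_b):
    """Compare the older text_b against the newer text_a, line by line."""
    differ = Differ()
    # splitlines(True) keeps the line endings, as the controller code does.
    return list(differ.compare(text_b.splitlines(True), text_a.splitlines(True)))


old = "Front Page\n----------\nHello wiki\n"
new = "Front Page\n----------\nHello wiki\nA second line\n"

result = diff_two_entries(new, old)
for line in result:
    print(repr(line))
```

Each line comes back with a two-character prefix: two spaces for unchanged lines, "+ " for lines only in the newer text, "- " for lines only in the older text, and "? " for hint lines that mark changes within a line.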
    

Add the diff.kid template

I'm just using a simple unordered list to display the results. Something fancy would be nice.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:py="http://purl.org/kid/ns#"
    py:extends="'master.kid'">

<head>
    <meta content="text/html; charset=UTF-8" http-equiv="content-type" py:replace="''"/>
    <title>TurboGears Wiki</title>
</head>

<body>
    <ul>
        <li py:for="diff in diffresults">
            ${diff}
        </li>
    </ul>
    
</body>
</html>
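
As one possible step toward "something fancy", the controller could tag each diff line with a CSS class before handing it to the template. This is only a sketch: classify_diff_lines and the class names (added, removed, hint, same) are made up for illustration, and the template would need to be changed to use the pairs.

```python
# Map the two-character Differ prefixes to hypothetical CSS class names.
CLASS_BY_PREFIX = {"+ ": "added", "- ": "removed", "? ": "hint", "  ": "same"}


def classify_diff_lines(diff_lines):
    """Turn Differ output into (css_class, text) pairs."""
    classified = []
    for line in diff_lines:
        css_class = CLASS_BY_PREFIX.get(line[:2], "same")
        # Drop the prefix and the trailing newline; the class carries the meaning.
        classified.append((css_class, line[2:].rstrip("\n")))
    return classified


sample = ["  one\n", "- two\n", "+ three\n"]
rows = classify_diff_lines(sample)
print(rows)
```

Returning plain pairs rather than HTML strings leaves the markup, and the escaping, to the template.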

Notes

  • Removed the references to the 'new' variable in the controller and in the templates from the previous tutorial; this example uses the database to track state.
  • Files are zipped and attached to this doc.

References

  • http://sqlobject.org/SQLObject.html
  • http://www-128.ibm.com/developerworks/library/os-pythonsqlo/?ca=drs
  • http://groups.google.com/group/turbogears/search?group=turbogears&q=sqlobject&qt_g=1&searchnow=Search+this+group

Attachments