Ticket #1765 (closed task: fixed)

Opened 9 years ago

Last modified 9 years ago

Document a long standing problem in SO and bug SO maintainer to fix it

Reported by: faide Owned by: Chris Arndt
Priority: normal Milestone: 1.0.x bugfix
Component: Installation Version: 1.0.4.4
Severity: normal Keywords: needs review
Cc:

Description

See Jason's post on how he fixed the TG locking issue in this thread:

 http://groups.google.fr/group/turbogears/browse_thread/thread/5d14e2ec645f680c

Change History

comment:1 Changed 9 years ago by Chris Arndt

  • Owner changed from anonymous to Chris Arndt
  • Type changed from defect to task
  • Component changed from TurboGears to Documentation
  • Summary changed from Document a long standing problem in SO and bug Ian Bickin to Document a long standing problem in SO and bug SO maintainer to fix it

comment:2 Changed 9 years ago by Chris Arndt

The Problem

From Richard Clark's initial post on the above mentioned thread:

After running for a random interval, anything between 5 minutes and a day, the app concerned will stop responding to requests, i.e. all in-progress or new requests will simply hang. An strace indicates the process is waiting on a futex, which suggests some kind of lock acquire jamup.

The Diagnosis

From Jason's final post on the thread:

It's a bug in SQLObject. We're using an old version, but the same code is in the newer versions too.

Below is an explanation of the issue.

Postgres creates many kinds of locks as a side effect of ordinary operations. The threadSafeMethod also creates a lock, but this one is local to a single Python process.

Let's go over an example. You have two threads (A and B) that are in the middle of different db transactions. Thread A has locks on tables I and II. When thread B creates a row on table III that has a foreign key to table I, it will wait for thread A's lock to be given up before actually creating that row. Then if thread A gets context again and tries to create or access a row in table III, it will block because of the threadSafeMethod's lock.

Postgres will see one thread trying to create a row and the other thread (the one that the former thread is waiting for a lock from) idle in transaction. Neither one will continue doing anything.

Usually other requests come in so new threads pick them up and try to query tg-visit, but the db has certain types of locks on tg-visit rows so those threads end up waiting for thread A and thread B to give up, even though they never will.
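The cycle described above can be reproduced in miniature without a database. The following is only a sketch under stated assumptions: db_lock stands in for the Postgres lock held by thread A's open transaction, process_lock stands in for the process-local lock that threadSafeMethod puts around __init__, and all names are hypothetical. Timeouts replace the real, unbounded waits so that the script terminates instead of hanging.

```python
import threading


def simulate_deadlock(timeout=0.2):
    """Return which thread managed to take the lock it was waiting for."""
    db_lock = threading.Lock()        # assumption: models a Postgres lock
    process_lock = threading.Lock()   # assumption: models the __init__ lock

    a_holds_db = threading.Event()
    b_holds_process = threading.Event()
    both_tried = threading.Barrier(2)  # keep both locks held until both gave up
    results = {}

    def thread_a():
        with db_lock:                  # A is mid-transaction and holds a DB lock
            a_holds_db.set()
            b_holds_process.wait()     # B is now inside the guarded __init__
            # A tries to create a row object, which needs the process lock.
            got = process_lock.acquire(timeout=timeout)
            results["A"] = got
            if got:
                process_lock.release()
            both_tried.wait()

    def thread_b():
        a_holds_db.wait()
        with process_lock:             # B entered the guarded __init__
            b_holds_process.set()
            # B's INSERT needs the DB lock that A's transaction still holds.
            got = db_lock.acquire(timeout=timeout)
            results["B"] = got
            if got:
                db_lock.release()
            both_tried.wait()

    threads = [threading.Thread(target=thread_a),
               threading.Thread(target=thread_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(simulate_deadlock())
```

Neither thread obtains the lock it is waiting for. In the real application there is no timeout on either side, and Postgres cannot detect the cycle because half of it lives inside the Python process, so both threads wait forever.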

It turns out that the code (see below - Ed.) was originally put into SQLObject on Alberto's request:

 http://sourceforge.net/tracker/index.php?func=detail&aid=1407684&group_id=74338&atid=540674

I've tried recreating his problem, but was never able to.

The Solution

The problem is in declarative.py, in the use of threadSafeMethod.

If you want the quick fix, comment out this line in declarative.py:

cls.__init__ = threadSafeMethod(lock)(cls.__init__)

We've been running a modified version of sqlobject for months in production with no problems.
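To illustrate what the commented-out line does, here is a minimal sketch of a lock-wrapping decorator applied to __init__ in the same shape as the line above. It is not SQLObject's actual code: thread_safe_method, PlainRow, GuardedRow and finishes_while_lock_held are hypothetical names, and the real threadSafeMethod may differ in detail.

```python
import threading
from functools import wraps


def thread_safe_method(lock):
    """Hypothetical decorator factory: run the wrapped method under `lock`."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(self, *args, **kwargs):
            with lock:
                return fn(self, *args, **kwargs)
        return wrapper
    return decorator


class PlainRow:
    """Constructor without any locking (the state after the quick fix)."""
    def __init__(self, value):
        self.value = value


class GuardedRow(PlainRow):
    """Constructor serialized through a process-wide lock."""


init_lock = threading.Lock()
# Same shape as the line the quick fix comments out.
GuardedRow.__init__ = thread_safe_method(init_lock)(PlainRow.__init__)


def finishes_while_lock_held(cls, lock, wait=0.2):
    """Check whether another thread can construct `cls` while `lock` is held."""
    done = threading.Event()

    def build():
        cls(1)
        done.set()

    with lock:
        worker = threading.Thread(target=build)
        worker.start()
        finished = done.wait(wait)
    worker.join()  # the lock is released here, so the worker can complete
    return finished


if __name__ == "__main__":
    print(finishes_while_lock_held(PlainRow, init_lock))
    print(finishes_while_lock_held(GuardedRow, init_lock))
```

With the wrapper in place, every construction in the process queues behind whichever thread currently holds the lock. If that thread is itself blocked on the database, all other constructions stall with it, which is the behavior the quick fix removes.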

comment:3 Changed 9 years ago by Chris Arndt

It appears that the code section (in class DeclarativeMeta) containing the line from the above-mentioned fix does not exist any more in the latest trunk version (revision 3396) of declarative.py. It is still present in the latest release version of SQLObject (0.10.0) though. Can someone confirm whether this issue still exists when using SQLObject from SVN?

comment:4 Changed 9 years ago by faide

  • Milestone changed from 1.1 to 1.1.1

comment:5 Changed 9 years ago by faide

  • Milestone changed from 1.6 to 1.5

comment:6 Changed 9 years ago by Chris Arndt

  • Owner changed from Chris Arndt to faide
  • Component changed from Documentation to Installation
  • Milestone changed from 1.5 to 1.1

This problem was fixed in SQLObject 0.10.1.

Are we ok to bump up the SQLObject dependency for TG 1.0 and 1.1 to this version? I get no (db-related) test failures in TG 1.0 or 1.1 with this SQLObject version and Python 2.4 / 2.5. I haven't tested with Python 2.3 though.

comment:7 Changed 9 years ago by Chris Arndt

  • Milestone changed from 1.1 to 1.0.x bugfix

comment:8 Changed 9 years ago by Chris Arndt

  • Owner changed from faide to Chris Arndt
  • Keywords needs review added
  • Status changed from new to assigned

comment:9 Changed 9 years ago by faide

Ok to bump SO version. You asked on the ML and no one yelled.

comment:10 Changed 9 years ago by Chris Arndt

  • Status changed from assigned to closed
  • Resolution set to fixed

Fixed in r5386.
