English 中文(简体)
Python: Analyzing complex statements during execution
原标题:

I am wondering if there is any way to get some meta information about the interpretation of a python statement during execution.

Let s assume this is a complex statement of some single statements joined with or (A, B, ... are boolean functions)

if A or B and ((C or D and E) or F) or G and H:

and I want to know which part of the statement is causing the statement to evaluate to True so I can do something with this knowledge. In the example, there would be 3 possible candidates:

A
B and ((C or D and E) or F)
G and H

And in the second case, I would like to know if it was (C or D and E) or F that evaluated to True and so on...

Is there any way without parsing the statement? Can I hook up to the interpreter in some way or utilize the inspect module in a way that I haven t found yet? I do not want to debug, it s really about knowing which part of this or-chain triggered the statement at runtime.

Edit - further information: The type of application that I want to use this in is a categorizing algorithm that inputs an object and outputs a certain category for this object, based on its attributes. I need to know which attributes were decisive for the category.
As you might guess, the complex statement from above comes from the categorization algorithm. The code for this algorithm is generated from a formal pseudo-code and contains about 3,000 nested if-elif-statements that determine the category in a hierarchical way like

if obj.attr1 < 23 and (is_something(obj.attr10) or eats_spam_for_breakfast(obj)):
    return  Category1 
elif obj.attr3 ==  Welcome Home  or count_something(obj) >= 2:
    return  Category2a 
elif ...

So aside from the category itself, I need to flag the attributes that were decisive for that category, so if I d delete all other attributes, the object would still be assigned to the same category (due to the ors within the statements). The statements can be really long, up to 1,000 chars, and deeply nested. Every object can have up to 200 attributes.

Thanks a lot for your help!

Edit 2: Haven t found time in the last two weeks. Thanks for providing this solution, it works!

最佳回答

Could you recode your original code:

if A or B and ((C or D and E) or F) or G and H:

as, say:

e = Evaluator()
if e( A or B and ((C or D and E) or F) or G and H ):

...? If so, there s hope!-). The Evaluator class, upon __call__, would compile its string argument, then eval the result with (an empty real dict for globals, and) a pseudo-dict for locals that actually delegates the value lookups to the locals and globals of its caller (just takes a little black magic, but, not too bad;-) and also takes note of what names it s looked up. Given Python s and and or s short-circuiting behavior, you can infer from the actual set of names that were actually looked up, which one determined the truth value of the expression (or each subexpression) -- in an X or Y or Z, the first true value (if any) will be the last one looked up, and in a X and Y and Z, the first false one will.

Would this help? If yes, and if you need help with the coding, I ll be happy to expand on this, but first I d like some confirmation that getting the code for Evaluator would indeed be solving whatever problem it is that you re trying to address!-)

Edit: so here s coding implementing Evaluator and exemplifying its use:

import inspect
import random

class TracingDict(object):

  def __init__(self, loc, glob):
    self.loc = loc
    self.glob = glob
    self.vars = []

  def __getitem__(self, name):
    try: v = self.loc[name]
    except KeyError: v = self.glob[name]
    self.vars.append((name, v))
    return v


class Evaluator(object):

  def __init__(self):
    f = inspect.currentframe()
    f = inspect.getouterframes(f)[1][0]
    self.d = TracingDict(f.f_locals, f.f_globals)

  def __call__(self, expr):
    return eval(expr, {}, self.d)


def f(A, B, C, D, E):
  e = Evaluator()
  res = e( A or B and ((C or D and E) or F) or G and H )
  print  R=%r from %s  % (res, e.d.vars)

for x in range(20):
  A, B, C, D, E, F, G, H = [random.randrange(2) for x in range(8)]
  f(A, B, C, D, E)

and here s output from a sample run:

R=1 from [( A , 1)]
R=1 from [( A , 1)]
R=1 from [( A , 1)]
R=1 from [( A , 0), ( B , 1), ( C , 1)]
R=1 from [( A , 1)]
R=1 from [( A , 0), ( B , 0), ( G , 1), ( H , 1)]
R=1 from [( A , 1)]
R=1 from [( A , 1)]
R=1 from [( A , 0), ( B , 1), ( C , 1)]
R=1 from [( A , 1)]
R=1 from [( A , 0), ( B , 1), ( C , 1)]
R=1 from [( A , 1)]
R=1 from [( A , 1)]
R=1 from [( A , 1)]
R=0 from [( A , 0), ( B , 0), ( G , 0)]
R=1 from [( A , 1)]
R=1 from [( A , 1)]
R=1 from [( A , 1)]
R=0 from [( A , 0), ( B , 0), ( G , 0)]
R=1 from [( A , 0), ( B , 1), ( C , 1)]

You can see that often (about 50% of the time) A is true, which short-circuits everything. When A is false, B evaluates -- when B is also false, then G is next, when B is true, then C.

问题回答

As far as I remember, Python does not return True or False per se:

Important exception: the Boolean operations or and and always return one of their operands.

The Python Standard Library - Truth Value Testing
Therefore, following is valid:

A = 1
B = 0
result = B or A # result == 1

The Python interpreter doesn t give you a way to introspect the evaluation of an expression at runtime. The sys.settrace() function lets you register a callback that is invoked for every line of source code, but that s too coarse-grained for what you want to do.

That said, I ve experimented with a crazy hack to have the function invoked for every bytecode executed: Python bytecode tracing.

But even then, I don t know how to find the execution state, for example, the values on the interpreter stack.

I think the only way to get at what you want is to modify the code algorithmically. You could either transform your source (though you said you didn t want to parse the code), or you could transform the compiled bytecode. Neither is a simple undertaking, and I m sure there are a dozen difficult hurdles to overcome if you try it.

Sorry to be discouraging...

BTW: What application do you have for this sort of technology?

I would just put something like this before the big statement (assuming the statement is in a class):

for i in ("A","B","C","D","E","F","G","H"):
    print i,self.__dict__[i]

"""I do not want to debug, it s really about knowing which part of this or-chain triggered the statement at runtime.""": you might need to explain what is the difference between "debug" and "knowing which part".

Do you mean that you the observer need to be told at runtime what is going on (why??) so that you can do something different, or do you mean that the code needs to "know" so that it can do something different?

In any case, assuming that your A, B, C etc don t have side effects, why can t you simply split up your or-chain and test the components:

part1 = A
part2 = B and ((C or D and E) or F)
part3 = G and H
whodunit = "1" if part1 else "2" if part2 else "3" if part3 else "nobody"
print "Perp is", whodunit
if part1 or part2 or part3:
    do_something()

??

Update:

"""The difference between debug and knowing which part is that I need to assign a flag for the variables that were used in the statement that first evaluated to True (at runtime)"""

So you are saying that given the condition "A or B", that if A is True and B is True, A gets all the glory (or all the blame)? I m finding it very hard to believe that categorisation software such as you describe is based on "or" having a short-circuit evaluation. Are you sure that there s an intent behind the code being "A or B" and not "B or A"? Could the order be random, or influenced by the order that the variables where originally input?

In any case, generating Python code automatically and then reverse-engineering it appears to be a long way around the problem. Why not just generate code with the part1 = yadda; part2 = blah; etc nature?





相关问题
Can Django models use MySQL functions?

Is there a way to force Django models to pass a field to a MySQL function every time the model data is read or loaded? To clarify what I mean in SQL, I want the Django model to produce something like ...

An enterprise scheduler for python (like quartz)

I am looking for an enterprise tasks scheduler for python, like quartz is for Java. Requirements: Persistent: if the process restarts or the machine restarts, then all the jobs must stay there and ...

How to remove unique, then duplicate dictionaries in a list?

Given the following list that contains some duplicate and some unique dictionaries, what is the best method to remove unique dictionaries first, then reduce the duplicate dictionaries to single ...

What is suggested seed value to use with random.seed()?

Simple enough question: I m using python random module to generate random integers. I want to know what is the suggested value to use with the random.seed() function? Currently I am letting this ...

How can I make the PyDev editor selectively ignore errors?

I m using PyDev under Eclipse to write some Jython code. I ve got numerous instances where I need to do something like this: import com.work.project.component.client.Interface.ISubInterface as ...

How do I profile `paster serve` s startup time?

Python s paster serve app.ini is taking longer than I would like to be ready for the first request. I know how to profile requests with middleware, but how do I profile the initialization time? I ...

Pragmatically adding give-aways/freebies to an online store

Our business currently has an online store and recently we ve been offering free specials to our customers. Right now, we simply display the special and give the buyer a notice stating we will add the ...

Converting Dictionary to List? [duplicate]

I m trying to convert a Python dictionary into a Python list, in order to perform some calculations. #My dictionary dict = {} dict[ Capital ]="London" dict[ Food ]="Fish&Chips" dict[ 2012 ]="...

热门标签