UnpicklingError: NEWOBJ class argument isn't a type object

I'm using a custom pickler that replaces any unpicklable objects (such as sockets or files) with a string representation of them, based on the code from Shane Hathaway in the question "Python: Pickling a dict with some unpicklable items".

It works most of the time, but when I try to unpickle a Django HttpResponse, I get the following error: UnpicklingError: NEWOBJ class argument isn't a type object

I have no clue what the error actually means. If it pickles okay, why should it not be able to unpickle? I've found three references to this error on Google, but no real explanation of why it occurs or how to fix it.

Here is my code:

from cPickle import Pickler, Unpickler, UnpicklingError

class FilteredObject:
    def __init__(self, about):
        self.about = about
    def __repr__(self):
        return 'FilteredObject(%s)' % repr(self.about)

class MyPickler(object):
    def __init__(self, file, protocol=2):
        pickler = Pickler(file, protocol)
        pickler.persistent_id = self.persistent_id
        self.dump = pickler.dump
        self.clear_memo = pickler.clear_memo

    def persistent_id(self, obj):
        if not hasattr(obj, '__getstate__') and not isinstance(obj,
                (basestring, bool, int, long, float, complex, tuple, list, set, dict)):
            return ["filtered:%s" % str(obj)]
        else:
            return None

class MyUnpickler(object):
    def __init__(self, file):
        unpickler = Unpickler(file)
        unpickler.persistent_load = self.persistent_load
        self.load = unpickler.load
        self.noload = unpickler.noload

    def persistent_load(self, obj_id):
        if obj_id[0].startswith('filtered:'):
            return FilteredObject(obj_id[0][9:])
        else:
            raise UnpicklingError('Invalid persistent id')

###### serialize to file

f = open('test.txt','wb')
p = MyPickler(f)
p.dump(data)
f.close()

###### unserialize from file

f = open('test.txt','rb')
pickled_data = f.read()
f.seek(0)
u = MyUnpickler(f)
data = u.load()    

Successful pickling happens in two steps: the dump performed by the Pickler, and the load performed by the Unpickler. The Pickler converts an object to a serialized format (e.g., a string), and the Unpickler reads the pickled data and builds a new object that should be equivalent to the original. Pickle has several functions that can be used to dump pickles, so part one is getting the objects to convert to the serialized format at all. With a custom pickler, you can bypass some of Python's safeguards and pickle objects that pickle itself can't handle. Following your example, I could create a simple Pickler that converts lambdas and the like to strings by replacing each object with its __repr__.

>>> x = lambda x:x
>>> repr(x)
'<function <lambda> at 0x4d39cf0>'
>>> 
>>> import pickle
>>> l = repr(x)
>>> pickle.dumps(l)
"S'<function <lambda> at 0x4d39cf0>'\np0\n."

This is definitely picklable, since it's a string. The problem is how to rebuild an object from the saved string. For a lambda, if you had a function that could look up the memory address noted in the string, you could get the object back, but only while the original object is still alive in memory, so that's no good. The trick of converting to strings therefore only works when the __repr__ string contains enough information to build a new object. You could get fancier about what you store, but you will most likely run into problems eventually by converting objects to strings. So this is a case where your Pickler would work, but your Unpickler would fail.
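
The point above, that the dump succeeds but the original object cannot be rebuilt, can be sketched roughly as follows. This is only an illustration, not the asker's code: it assumes Python 3's pickle module (the question uses Python 2's cPickle), and the names ReprPickler, ReprUnpickler and roundtrip are hypothetical.

```python
import io
import pickle


class ReprPickler(pickle.Pickler):
    """Hypothetical pickler that swaps functions for a tagged repr string."""

    def persistent_id(self, obj):
        # In this sketch only callables that are not classes are filtered.
        if callable(obj) and not isinstance(obj, type):
            return "filtered:" + repr(obj)
        return None  # everything else is pickled normally


class ReprUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that can only hand back the stored string."""

    def persistent_load(self, pid):
        if isinstance(pid, str) and pid.startswith("filtered:"):
            # All that survives is the text of the repr, not the object.
            return pid[len("filtered:"):]
        raise pickle.UnpicklingError("Invalid persistent id")


def roundtrip(obj):
    """Dump obj with ReprPickler, then load it back with ReprUnpickler."""
    buf = io.BytesIO()
    ReprPickler(buf, protocol=2).dump(obj)
    buf.seek(0)
    return ReprUnpickler(buf).load()


original = {"n": 1, "f": lambda x: x}
restored = roundtrip(original)
```

After the round trip, restored["n"] is still the integer 1, but restored["f"] is only a string such as '<function <lambda> at 0x...>'. Nothing in that string lets the Unpickler recreate the function.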

Dictionaries are interesting to pickle, because they can hold anything, and usually do pretty quickly. One of the nastiest dictionaries is the globals() dictionary. To serialize it, I'd use dill, which can serialize almost anything in Python.

>>> import dill
>>> 
>>> def foo(a):
...   def bar(x):
...     return a*x
...   return bar
... 
>>> class baz(object):
...   def __call__(self, a,x):
...     return foo(a)(x)
... 
>>> b = baz()
>>> b(3,2)
6
>>> c = baz.__call__
>>> c(b,3,2)
6
>>> g = dill.loads(dill.dumps(globals()))
>>> g
{'dill': <module 'dill' from '/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/dill-0.2a.dev-py2.7.egg/dill/__init__.pyc'>, 'c': <unbound method baz.__call__>, 'b': <__main__.baz object at 0x4d61970>, 'g': {...}, '__builtins__': <module '__builtin__' (built-in)>, 'baz': <class '__main__.baz'>, '_version': '2', '__package__': None, '__name__': '__main__', 'foo': <function foo at 0x4d39d30>, '__doc__': None}

Actually, dill registers its types into the pickle registry, so if you have some black-box code that uses pickle and you can't really edit it, then just importing dill can magically make it work without monkeypatching the third-party code.

Or, if you want the whole interpreter session sent over as a "Python image", dill can do that too.

>>> # continuing from above
>>> dill.dump_session('foobar.pkl')
>>>
>>> ^D
dude@sakurai>$ python
Python 2.7.5 (default, Sep 30 2013, 20:15:49) 
[GCC 4.2.1 (Apple Inc. build 5566)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.load_session('foobar.pkl')
>>> c(b,3,2)
6

Dill also has some good tools for helping you understand what is causing your pickling to fail.
