
As a data scientist or engineer, you will very likely work with big datasets.
If you try to pickle an object larger than 2GB, you may encounter this error on Mac OS X:
    OSError                                   Traceback (most recent call last)
    <ipython-input-5-5b2910a2f7f4> in <module>()
    ----> 1 df.to_pickle('test.pkl')

    /Users/.../lib/python3.5/site-packages/pandas/core/generic.py in to_pickle(self, path)
       1175         """
       1176         from pandas.io.pickle import to_pickle
    -> 1177         return to_pickle(self, path)
       1178
       1179     def to_clipboard(self, excel=None, sep=None, **kwargs):

    /Users/.../lib/python3.5/site-packages/pandas/io/pickle.py in to_pickle(obj, path)
         18     """
         19     with open(path, 'wb') as f:
    ---> 20         pkl.dump(obj, f, protocol=pkl.HIGHEST_PROTOCOL)
         21
         22

    OSError: [Errno 22] Invalid argument
This issue has been known for over a year; see Issue 24658. It is not strictly a pickle problem, though.
Fortunately, it has now been fixed, but the fix may take some time to reach a release. Until then, you can either:
- apply the patch yourself, or
- split the target file into pieces smaller than 2GB.
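
The second option can be sketched as follows. This is a minimal sketch under one assumption: "splitting" here means writing a single pickle file in several pieces that each stay under 2GB, rather than producing several separate files. The helper names `dump_in_chunks` and `load_in_chunks` and the `MAX_BYTES` constant are hypothetical and are not part of pandas or pickle.

```python
import pickle

# Assumed per-call limit: keep each read/write just under 2GB.
MAX_BYTES = 2**31 - 1


def dump_in_chunks(obj, path, chunk_size=MAX_BYTES):
    """Pickle obj to path, writing at most chunk_size bytes per call."""
    # Serialize in memory first; this needs enough RAM for the full pickle.
    data = memoryview(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))
    with open(path, 'wb') as f:
        for start in range(0, len(data), chunk_size):
            # Slicing a memoryview does not copy the underlying bytes.
            f.write(data[start:start + chunk_size])


def load_in_chunks(path, chunk_size=MAX_BYTES):
    """Read a pickle from path, reading at most chunk_size bytes per call."""
    chunks = []
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
    return pickle.loads(b"".join(chunks))
```

With these helpers, `dump_in_chunks(df, 'test.pkl')` would replace `df.to_pickle('test.pkl')`, and `df = load_in_chunks('test.pkl')` would read the object back. The file on disk is an ordinary pickle, since only the individual writes are split. The trade-off is that the whole serialized object is held in memory before being written.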