Pickle vs JSON — Which is Faster?

If you’re here for the short answer: in my benchmarks JSON was about 25 times faster at reading (loads) and 15 times faster at writing (dumps). I’ve been thinking about this since I wrote the ObjectProperty and JsonProperty classes for Google App Engine. They’re both easy to use and work as expected. I did have some trouble with ObjectProperty, but I figured it out in the end.

As my previous posts mention, the ObjectProperty class uses Python’s pickle module, while JsonProperty works with simplejson (included in the standard library as the json module since Python 2.6, and available through django.utils in Google App Engine). I decided to measure the performance of these two.

Unfortunately I couldn’t do much benchmarking on top of Google App Engine, since there’s too much lag between the application server and Google’s Datastore. Instead, I decided to write simple standalone benchmarks to find out which is faster, pickle or JSON. I started out by constructing the dataset I’d be pickling and “jsoning”: some random lists, dictionaries and nested dictionaries containing lorem ipsum text.
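To give an idea of what such a dataset could look like, here is a minimal sketch. It is not the original benchmark source; the names LOREM, random_text and build_dataset, the sizes and the fixed seed are all assumptions made for illustration.

```python
import random

# Hypothetical word pool; the original dataset's exact texts are not known.
LOREM = ("lorem ipsum dolor sit amet consectetur adipiscing elit sed do "
         "eiusmod tempor incididunt ut labore et dolore magna aliqua").split()


def random_text(rng, words=8):
    """Return a short lorem ipsum sentence built from random words."""
    return " ".join(rng.choice(LOREM) for _ in range(words))


def build_dataset(seed=0, items=50):
    """Build a list, a flat dictionary and a nested dictionary of texts."""
    rng = random.Random(seed)  # seeded so repeated runs use the same data
    return {
        "list": [random_text(rng) for _ in range(items)],
        "dict": dict(("key%d" % i, random_text(rng)) for i in range(items)),
        "nested": dict(
            ("outer%d" % i, {
                "text": random_text(rng),
                "numbers": [rng.randint(0, 1000) for _ in range(5)],
            })
            for i in range(items)
        ),
    }


dataset = build_dataset()
```

Everything in it is made of strings, integers, lists and dictionaries, so both pickle and JSON can serialize it.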

I then used Python’s timeit module to measure how long it took to “dumps” and “loads” the dataset using pickle and simplejson. I also measured the length of the resulting pickle/JSON strings to see which would be smaller, and guess what: JSON wins in all rounds. I ran the tests 10, 20, 50, 100, 500 and 1000 times for reading, writing and length comparison. Below are three charts illustrating the results:
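The timing approach can be sketched roughly like this. It is an illustration rather than the original benchmark code: it uses the standard library json module in place of simplejson, and the sample data and the helper names time_call and benchmark are assumptions. The numbers it prints depend on the interpreter and machine, so they need not match the figures quoted in this post.

```python
import json
import pickle
import timeit

# Small stand-in dataset (assumed); the real benchmark used a larger one.
data = {
    "texts": ["lorem ipsum dolor sit amet"] * 100,
    "nested": {"a": {"b": list(range(100))}},
}


def time_call(func, number):
    """Return the total seconds taken to call func() `number` times."""
    return timeit.timeit(func, number=number)


def benchmark(data, number=10):
    """Time dumps and loads for pickle and json on the same data."""
    pickled = pickle.dumps(data)  # serialize once so loads has input
    jsoned = json.dumps(data)
    return {
        "pickle.dumps": time_call(lambda: pickle.dumps(data), number),
        "json.dumps": time_call(lambda: json.dumps(data), number),
        "pickle.loads": time_call(lambda: pickle.loads(pickled), number),
        "json.loads": time_call(lambda: json.loads(jsoned), number),
    }


results = benchmark(data, number=10)
for name in sorted(results):
    print("%-13s %.6f s" % (name, results[name]))
```

Raising number to 500 or 1000 repeats each call that many times, which is how the larger runs were scaled.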

As you can see, dumps with JSON is much faster: by almost 1500%, or roughly 15 times faster than pickling! Now let’s see what happens with loads:

Loads shows even more goodies for JSON lovers: a massive 2500%, how’s that!? Of course some of you might be concerned with size, memory usage, etc. I didn’t measure memory usage directly; as a simple proxy for size, I used Python’s len function to count the number of characters in the resulting pickle/JSON string.
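A minimal sketch of that size comparison follows, again using the standard library json module and an assumed sample dataset. With json.dumps at its default settings the output is pure ASCII, so the character count and the UTF-8 byte count are the same number.

```python
import json
import pickle

# Assumed sample data, standing in for the benchmark dataset.
data = {
    "texts": ["lorem ipsum dolor sit amet"] * 20,
    "nested": {"a": {"b": list(range(20))}},
}

pickle_size = len(pickle.dumps(data))  # length of the pickled output

json_text = json.dumps(data)
json_chars = len(json_text)                  # characters in the JSON string
json_bytes = len(json_text.encode("utf-8"))  # bytes once encoded

print("pickle: %d, json: %d chars / %d bytes"
      % (pickle_size, json_chars, json_bytes))
```

Which of the two comes out smaller depends on the data and on the pickle protocol in use, so it is worth checking against your own objects.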

So yes, JSON wins in all three aspects: writing, reading and size. If you’d like to experiment yourself, feel free to use the source code I wrote. Beware of running the 500/1000 tests, those can take hours ;)

The benchmark was done on an Ubuntu 10.10 64-bit machine with Python 2.6 installed, and I don’t expect the relative results to differ much on other setups, although I haven’t tested any. The conclusion is this: if you need to store complex objects, such as functions, class instances, etc., you have to use pickle. If you’re only looking for a way to store simple objects, lists and nested dictionaries, then you’re better off with JSON.
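Here is a small sketch of that difference. It uses a datetime.date from the standard library as the example of a class instance; that choice, and the variable names, are purely illustrative.

```python
import datetime
import json
import pickle

released = datetime.date(2010, 10, 10)  # an instance of a non-JSON type

# pickle round-trips the instance and gives back an equal date object.
restored = pickle.loads(pickle.dumps(released))

# json refuses objects it has no mapping for and raises TypeError.
try:
    json.dumps(released)
    json_accepts_date = True
except TypeError:
    json_accepts_date = False

# Plain lists and nested dictionaries work fine with json.
simple = {"name": "benchmark", "runs": [10, 20, 50], "meta": {"ok": True}}
simple_restored = json.loads(json.dumps(simple))
```

To push an object like this through JSON you would first have to convert it to simple types yourself, for example an ISO date string, and convert it back after loading.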

Thank you for reading and retweeting ;)

Update: If you’re sticking to pickling objects and have the freedom to use C compiled libraries, then go ahead with cPickle instead of pickle, although it still lags behind JSON, which remains about twice as fast in both loading and dumping. As for App Engine, I tried running a short benchmark with cPickle vs simplejson from the django.utils package. The results were better for pickle, but still not enough to beat JSON, which was 30% faster. I know there are other libraries worth mentioning here, but my hands are tied since I’m running App Engine with Python 2.5 and am not allowed to install extra modules ;) Cheers and thanks for the comments!
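If you want to prefer cPickle where it exists without breaking elsewhere, a common import fallback looks like the sketch below. The sample data is an assumption. On Python 3 there is no cPickle module, and the plain pickle module already uses its C implementation when one is available.

```python
try:
    import cPickle as pickle  # Python 2: C implementation of pickle
except ImportError:
    import pickle  # fallback: the standard pickle module

# Assumed sample data for a quick round-trip check.
data = {"numbers": [1, 2, 3], "text": "lorem ipsum"}
roundtrip = pickle.loads(pickle.dumps(data))
```

The rest of the code keeps calling pickle.dumps and pickle.loads unchanged, whichever module ended up being imported.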