I dislike benchmarks in general so don't copy this code. I kind of stole it
from Beazley in another great talk he did on concurrency in Python. He said
not to copy it so I'm telling you not to copy it.
$ python manage.py shell
>>> import time
>>> from django.conf import settings
>>> from django.core.cache import caches
>>> for key in settings.CACHES.keys():
...     caches[key].clear()
Fool me once, strike one. Fool me twice? Strike three.
Filebased cache has two severe drawbacks.
Culling is random.
set() uses glob.glob1() which slows linearly with directory size.
DiskCache
Wanted to solve Django-filebased cache problems.
Felt like something was missing in the landscape.
Found an unlikely hero in SQLite.
I'd rather drive a slow car fast than a fast car slow
Story: driving down the Grapevine in SoCal in a friend's 1960s VW Bug.
Features
Lots of features. Maybe a few too many. Ex: I've never used the tag metadata
and eviction feature.
Use Case: Static file serving with read()
Some fun features. Data is stored in files and web servers are good at
serving files.
Use Case: Analytics with incr()/pop()
Tried to create really functional APIs.
All write operations are atomic.
Case Study: Baby Web Crawler
Convert from ephemeral, single-process to persistent, multi-process.
"get" Time vs Percentile
Trade off cache latency against miss rate using timeout.
"set" Time vs Percentile
Django-filebased cache so slow, can't plot.
Design
Cache is a single shard. FanoutCache uses multiple shards. Trick is
cross-platform hash.
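A minimal sketch of the sharding idea, assuming string keys and a checksum from zlib. Python's built-in hash() is randomized per process for strings, so it can't route the same key to the same shard across processes. The name shard_index and the shard count of 8 are made up for illustration; this is not necessarily how FanoutCache does it.

```python
import zlib

SHARDS = 8  # assumed shard count, for illustration only


def shard_index(key, shards=SHARDS):
    """Map a string key to a shard number that is stable across processes."""
    data = key.encode('utf-8')
    # Mask to an unsigned 32-bit value so the result is the same on
    # Python 2 and 3 and on every platform.
    checksum = zlib.crc32(data) & 0xFFFFFFFF
    return checksum % shards


print(shard_index('user:42'))
```

Every process that computes shard_index('user:42') gets the same shard, which is what lets multiple processes share one set of shard directories.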
Pickle can actually be fast if you use a higher protocol. Default is 0 on
Python 2. Up to 4 now.
Don't choose higher than 2 if you want to be portable between Python 2
and 3.
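A quick way to see the protocol difference for yourself. The record below is made-up sample data; sizes will vary with your data, so treat this as a sketch rather than a benchmark.

```python
import pickle

# Hypothetical sample record, only for comparing serialized sizes.
record = {
    'url': 'http://example.com/',
    'status': 200,
    'sizes': list(range(1000)),
}

for protocol in (0, 2, pickle.HIGHEST_PROTOCOL):
    blob = pickle.dumps(record, protocol=protocol)
    assert pickle.loads(blob) == record  # every protocol round-trips
    print(protocol, len(blob))

# Protocol 2 is the highest one that Python 2 can also read.
portable = pickle.dumps(record, protocol=2)
```

Protocol 0 is a text format, so the integers alone cost several bytes each. The binary protocols pack them much tighter.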
Size limit really indicates when to start culling. Limit number of items
deleted.
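Roughly what "size limit starts culling, cull limit bounds the work" could look like. This is a sketch with a made-up table layout and made-up constants (SIZE_LIMIT, CULL_LIMIT, set_item), not DiskCache's actual schema or eviction policy. It evicts oldest-stored first.

```python
import sqlite3

SIZE_LIMIT = 100  # assumed: total bytes at which culling starts
CULL_LIMIT = 2    # assumed: at most this many rows deleted per write

con = sqlite3.connect(':memory:')
con.execute(
    'CREATE TABLE cache ('
    'key TEXT PRIMARY KEY, value BLOB, size INTEGER, stored INTEGER)'
)


def total_size(con):
    (total,) = con.execute('SELECT COALESCE(SUM(size), 0) FROM cache').fetchone()
    return total


def set_item(con, key, value, clock):
    """Store value, then cull a bounded number of old rows if over the limit."""
    with con:  # commit on success, roll back on error
        con.execute(
            'INSERT OR REPLACE INTO cache VALUES (?, ?, ?, ?)',
            (key, value, len(value), clock),
        )
        if total_size(con) > SIZE_LIMIT:
            con.execute(
                'DELETE FROM cache WHERE key IN ('
                'SELECT key FROM cache ORDER BY stored LIMIT ?)',
                (CULL_LIMIT,),
            )


for clock, key in enumerate(['k1', 'k2', 'k3', 'k4', 'k5'], start=1):
    set_item(con, key, b'x' * 30, clock)

keys = [row[0] for row in con.execute('SELECT key FROM cache ORDER BY stored')]
```

Because each write deletes at most CULL_LIMIT rows, a single set() never pays for a huge cleanup. The cost is that the cache can sit above the limit for a while.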
SQLite
Trade off cache latency against miss rate using timeout.
SQLite supports 64-bit integers and floats, UTF-8 text and binary blobs.
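A small check of those native types using the stdlib sqlite3 module. The table and column names are made up. The point is that these four Python types can go into SQLite as-is, with no pickling.

```python
import sqlite3

con = sqlite3.connect(':memory:')
# A column with no declared type stores each value in its own storage class.
con.execute('CREATE TABLE native (value)')

samples = [2 ** 63 - 1, 3.5, u'caf\u00e9', b'\x00\x01\x02']
con.executemany('INSERT INTO native VALUES (?)', [(s,) for s in samples])

rows = con.execute('SELECT value, typeof(value) FROM native ORDER BY rowid').fetchall()
values = [row[0] for row in rows]
types = [row[1] for row in rows]
print(types)
```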
Use a context manager for isolation level management.
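A sketch of the context manager idea with the stdlib sqlite3 module, and of how atomic incr() and pop() could be built on top of it. The names transact, incr, pop and the counters table are hypothetical, not DiskCache's internals. It assumes the connection is opened with isolation_level=None so that we control BEGIN and COMMIT ourselves.

```python
import contextlib
import sqlite3


@contextlib.contextmanager
def transact(con):
    """Run the body inside one explicit write transaction."""
    con.execute('BEGIN IMMEDIATE')  # take the write lock before reading
    try:
        yield con
    except BaseException:
        con.execute('ROLLBACK')
        raise
    else:
        con.execute('COMMIT')


def incr(con, key, delta=1):
    """Atomically add delta to a counter and return the new value."""
    with transact(con):
        con.execute('INSERT OR IGNORE INTO counters VALUES (?, 0)', (key,))
        con.execute(
            'UPDATE counters SET value = value + ? WHERE key = ?', (delta, key)
        )
        (value,) = con.execute(
            'SELECT value FROM counters WHERE key = ?', (key,)
        ).fetchone()
    return value


def pop(con, key, default=None):
    """Atomically read and delete a counter."""
    with transact(con):
        row = con.execute(
            'SELECT value FROM counters WHERE key = ?', (key,)
        ).fetchone()
        con.execute('DELETE FROM counters WHERE key = ?', (key,))
    return default if row is None else row[0]


# isolation_level=None: sqlite3 never opens a transaction behind our back.
con = sqlite3.connect(':memory:', isolation_level=None)
con.execute('CREATE TABLE counters (key TEXT PRIMARY KEY, value INTEGER)')
```

Since the read and the write happen under the same lock, two processes calling incr() on the same key can't lose an update.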
Pragmas tune the behavior and performance of SQLite.
Default is robust and slow.
Use write-ahead-log so writers don't block readers.
Memory-map pages for fast lookups.
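For reference, this is how those pragmas can be set from Python. The specific values (NORMAL synchronous, a 64 MiB memory map) are illustrative assumptions, not a claim about what DiskCache ships with. WAL needs a real file, so the sketch uses a temporary directory.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'cache.db')
con = sqlite3.connect(path, isolation_level=None)

# Write-ahead log: readers keep reading while a writer appends to the log.
(journal_mode,) = con.execute('PRAGMA journal_mode = WAL').fetchone()

# Fewer fsync calls than the default FULL setting (assumed trade-off).
con.execute('PRAGMA synchronous = NORMAL')

# Ask SQLite to memory-map up to 64 MiB of the database file (assumed size).
# Builds of SQLite without mmap support quietly ignore this.
con.execute('PRAGMA mmap_size = 67108864')

(synchronous,) = con.execute('PRAGMA synchronous').fetchone()
print(journal_mode, synchronous)
```

journal_mode is stored in the database file, but synchronous and mmap_size apply per connection, so they have to be set again each time you connect.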
Best way to make money in photography? Sell all your gear.
Who saw eclipse? Awesome, right?
Hard to really photograph the experience.
This is me, staring up at the sun, blinding myself as I hold my glasses and
my phone to take a photo. Clearly lousy.
Software talks are hard to get right and I can't cover everything related to
caching in 20 minutes. I hope you've learned something tonight or at least
seen something interesting.
Conclusion
Windows support mostly "just worked".
SQLite is truly cross-platform.
Filesystems are a little different.
AppVeyor was about half as fast as Travis.
check() to fix inconsistencies.
Caveats:
NFS and SQLite do not play nice.
Not well suited to queues (want read:write at 10:1 or higher).
Alternative databases: BerkeleyDB, LMDB, RocksDB, LevelDB, etc.
Engage with me on GitHub, find bugs, complain about performance.
If you like the project, star it on GitHub and share it with friends.