{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Brief comparison between standalone zlib vs zlib in Blosc "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook compares the performance of a standalone zlib compression library against the one that runs inside Blosc.\n",
    "\n",
    "The run below has been executed on a machine with a Xeon E3-1240 v3 @ 3.40GHz with 4 physical cores and hyperthreading support."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import bcolz\n",
    "from bcolz.utils import human_readable_size"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Store an array in bcolz"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'762.94 MB'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Use a more or less random dataset\n",
    "np.random.seed(11)\n",
    "a = np.random.random_integers(0, 1000, 100*1000*1000)\n",
    "human_readable_size(a.nbytes)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('bcolz version:', '1.0.1.dev35')\n"
     ]
    }
   ],
   "source": [
    "print(\"bcolz version:\", bcolz.__version__)\n",
    "bcolz.cparams.setdefaults(cname='zlib', clevel=3, shuffle=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 8.56 s, sys: 2.32 s, total: 10.9 s\n",
      "Wall time: 1.88 s\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "carray((100000000,), int64)\n",
       "  nbytes := 762.94 MB; cbytes := 129.95 MB; ratio: 5.87\n",
       "  cparams := cparams(clevel=3, shuffle=1, cname='zlib', quantize=0)\n",
       "  chunklen := 100000; chunksize: 800000; blocksize: 131072\n",
       "[921 703  80 ..., 506 453 366]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Create a dataset (compress)\n",
    "chunks = (100*1000,)   # 800 KB cache size\n",
    "%time ca = bcolz.carray(a, chunklen=chunks[0])\n",
    "ca"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 1.6 s, sys: 344 ms, total: 1.95 s\n",
      "Wall time: 395 ms\n"
     ]
    }
   ],
   "source": [
    "# Get a numpy array (decompress)\n",
    "%time a2 = ca[:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looking at the wall and cpu times, we see that zlib can use multiple threads thanks to Blosc machinery."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Store the array using zlib in HDF5 in-memory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import h5py\n",
    "import tempfile\n",
    "import operator\n",
    "\n",
    "# Some utilities to use HDF5 files in memory\n",
    "def h5fmem(**kwargs):\n",
    "    \"\"\"Convenience function to create an in-memory HDF5 file.\"\"\"\n",
    "\n",
    "    # need a file name even tho nothing is ever written\n",
    "    fn = tempfile.mktemp()\n",
    "\n",
    "    # file creation args\n",
    "    kwargs['mode'] = 'w'\n",
    "    kwargs['driver'] = 'core'\n",
    "    kwargs['backing_store'] = False\n",
    "\n",
    "    # open HDF5 file\n",
    "    h5f = h5py.File(fn, **kwargs)\n",
    "\n",
    "    return h5f\n",
    "\n",
    "\n",
    "def h5d_diagnostics(d):\n",
    "    \"\"\"Print some diagnostics on an HDF5 dataset.\"\"\"\n",
    "    \n",
    "    print(d)\n",
    "    nbytes = reduce(operator.mul, d.shape) * d.dtype.itemsize\n",
    "    cbytes = d._id.get_storage_size()\n",
    "    if cbytes > 0:\n",
    "        ratio = nbytes / cbytes\n",
    "    else:\n",
    "        ratio = np.inf\n",
    "    r = '  compression: %s' % d.compression\n",
    "    r += '; compression_opts: %s' % d.compression_opts\n",
    "    r += '; shuffle: %s' % d.shuffle\n",
    "    r += '\\n  nbytes: %s' % human_readable_size(nbytes)\n",
    "    r += '; nbytes_stored: %s' % human_readable_size(cbytes)\n",
    "    r += '; ratio: %.1f' % ratio\n",
    "    r += '; chunks: %s' % str(d.chunks)\n",
    "    print(r)\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('h5py version:', '2.6.0')\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<HDF5 file \"tmph1tSPR\" (mode r+)>"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "print(\"h5py version:\", h5py.__version__)\n",
    "h5f = h5fmem()\n",
    "h5f"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 6.48 s, sys: 72 ms, total: 6.55 s\n",
      "Wall time: 6.56 s\n",
      "<HDF5 dataset \"h1\": shape (100000000,), type \"<i8\">\n",
      "  compression: gzip; compression_opts: 3; shuffle: True\n",
      "  nbytes: 762.94 MB; nbytes_stored: 128.66 MB; ratio: 5.0; chunks: (100000,)\n"
     ]
    }
   ],
   "source": [
    "# Create a dataset (compress)\n",
    "%time ha = h5f.create_dataset('h1', data=a, chunks=chunks, compression='gzip', compression_opts=3, shuffle=True)\n",
    "h5d_diagnostics(ha)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 1.09 s, sys: 200 ms, total: 1.29 s\n",
      "Wall time: 1.29 s\n"
     ]
    }
   ],
   "source": [
    "# Get a numpy array (decompress)\n",
    "%time a2 = ha[:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we see that the zlib filter (named 'gzip' in HDF5) only use a single thread (similar wall and cpu times). "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using zarr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('zarr version', '1.0.1.dev11+dirty')\n"
     ]
    }
   ],
   "source": [
    "import zarr\n",
    "print('zarr version', zarr.__version__)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 6.96 s, sys: 1.48 s, total: 8.44 s\n",
      "Wall time: 2.5 s\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "zarr.core.Array((100000000,), int64, chunks=(100000,), order=C)\n",
       "  compression: blosc; compression_opts: {u'cname': u'zlib', u'shuffle': 1, u'clevel': 3}\n",
       "  nbytes: 762.9M; nbytes_stored: 129.2M; ratio: 5.9; initialized: 1000/1000\n",
       "  store: __builtin__.dict"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Create a dataset (compress)\n",
    "%time za = zarr.array(a, chunks=chunks, compression=\"blosc\", compression_opts=dict(cname='zlib', clevel=3))\n",
    "za"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 1.43 s, sys: 316 ms, total: 1.74 s\n",
      "Wall time: 680 ms\n"
     ]
    }
   ],
   "source": [
    "# Get a numpy array (decompress)\n",
    "%time a2 = za[:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We see how zarr can also use zlib in multithreaded mode too (via Blosc)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Discussion"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "The zlib library integrated in Blosc can make use of its multithreading machinery, giving to better performance.\n",
    "\n",
    "On the other hand, it is worth noting that Blosc splits chunks in internal blocks  (in this case, 128 KB, which fits in L2 comfortably) in order to make better use of the caches, so making the compression/decompression faster too.  You can notice that this does not affect compression ratios a lot.\n",
    "\n",
    "Finally, bcolz seems to compress/decompress noticeably faster than zarr in this case.  This is probably due to zarr only using 4 threads internally, whereas bcolz can use the full 8 (logical processors).  That means that Intel hyper-threading implementation can be used for a good advanatge here (1.3x faster in compression and up to 1.7x faster in decompression)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}