Aug 25 2011

Recently, I came across this post from PetaPixel about JPEGmini.

It details a company that has come up with a way to optimize an image so that it looks virtually identical to the original but takes up a fraction of the original file size. That’s awesome! I even tested it against some image files:

Black Sea Nettles – JPEG shrunk using custom algorithm.
  • 24 megapixel JPEG / 12MB in size => 1.6MB resulting file after JPEGmini.
  • 12 megapixel JPEG / 4.8MB in size => 1.1MB resulting file after JPEGmini.
I performed a difference test against the original and the JPEGmini version. They are virtually identical. If you take the resulting difference image and adjust the levels, they are in fact different, but the difference looks like minor random noise rather than compression artifacts. Amazing!
DIY JPEG Compression Improvement
Being the kind of guy I am, I was wondering… how did they do this? According to the JPEGmini site:
  • improved algorithms
  • compressing to a level just before artifacting becomes an issue
Well, thinking that I could at least get some improvement, I leveraged ImageMagick and some shell scripting.
I tested with the same images and came up with the following:
  • 24MP / 12MB => 2.8MB (1.7MB with updates below) (jpeg compression quality @73%)
  • 12MP / 4.6MB => 1.1MB (jpeg compression quality @87%)
The images produced were likewise virtually the same, and a difference mask between the two in Photoshop shows nothing unless you auto-level. The amount of difference between my reduced images and JPEGmini’s reduced images was comparable. No new JPEG compression algorithm on my part, just applying basic JPEG compression guidelines from FileFormat’s JPEG Page:

The JPEG library supplied by the Independent JPEG Group uses a quality setting scale of 1 to 100. To find the optimal compression for an image using the JPEG library, follow these steps:

  • Encode the image using a quality setting of 75 (-Q 75). If you observe unacceptable defects in the image, increase the value, and re-encode the image. If the image quality is acceptable, decrease the setting until the image quality is barely acceptable. This will be the optimal quality setting for this image.  Repeat this process for every image you have (or just encode them all using a quality setting of 75).

The process that my script goes through:

  1. Compress original image at 99% image quality
  2. Do a comparison metric between the original and the new image.
  3. If the compression has resulted in a difference greater than a certain threshold, use the compression quality percentage prior to the current one.
  4. If the difference is less than the threshold, decrease image quality by 1%, re-compress, and repeat from step #2.
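The four steps above can be sketched as a small search loop. This is only an illustration, not the unpublished script: `find_lowest_quality` and `distortion_at` are hypothetical names, and `distortion_at(quality)` stands in for "compress at this quality and measure the difference against the original". The fake distortion curve at the bottom is made up purely so the loop can be run without any image files.

```python
from typing import Callable


def find_lowest_quality(distortion_at: Callable[[int], float],
                        threshold: float,
                        start: int = 99,
                        floor: int = 1) -> int:
    """Step quality down one point at a time and return the last quality
    whose measured distortion stayed at or below the threshold.

    If even the starting quality exceeds the threshold, the starting
    quality is returned, since there is no earlier setting to fall back to.
    """
    best = start
    for quality in range(start, floor - 1, -1):
        if distortion_at(quality) > threshold:
            break  # too much damage: keep the previous quality
        best = quality
    return best


# Made-up distortion curve (NOT real measurements): distortion grows
# by 10 points for every quality point removed.
def fake_distortion(quality: int) -> float:
    return (100 - quality) * 10.0


print(find_lowest_quality(fake_distortion, threshold=265))  # 74
```

Every quality level costs one full compress-and-compare pass, which is why this straightforward version is slow on large files.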
Using this process against the whole image, I could reduce <12MP files down to ranges similar to what JPEGmini was able to achieve, albeit slower. For files >12MP, I would achieve great savings in space, but not as great as JPEGmini, for a given threshold of difference.
Edit: Upon reviewing my notes, it looks like the threshold for <12MP and for >12MP is different from the JPEGmini files. When optimizing with the expanded threshold for larger files, I am able to get results comparable to what JPEGmini gets.
How I Did The Compression Process
  • Use ImageMagick’s “compare” to generate “image distortion” readings:
    • convert -quality PERCENTAGE% original-image.jpg compressed-image.jpg
    • compare -verbose -metric mae original-image.jpg compressed-image.jpg
  • Using the example file from JPEGmini, establish a baseline:
    • 24MP image = ceiling of 370 points of distortion
    • 12MP image = ceiling of 265 points of distortion
  • Starting at 99% quality for JPEG compression, continue to reduce quality until the distortion measured exceeds the baseline. If it does, use the compression level from before.
  • The level of acceptable compression will differ from image to image. If the image has more pixels, it can absorb more points of distortion before the image quality suffers.
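For anyone who would rather use Python hooks than a shell wrapper, one way to drive the two ImageMagick commands is sketched below. The function names are hypothetical. Two things differ from the commands listed above and are assumptions on my part: `-verbose` is dropped so the report is a single line, and `null:` is passed as the output so no difference image is written. The parser also assumes the report starts with the absolute error followed by a normalized value in parentheses, for example `265.5 (0.00405)`; check what your ImageMagick version prints before relying on it.

```python
import re
import subprocess


def convert_cmd(src: str, dst: str, quality: int) -> list:
    """Command line that re-encodes src as a JPEG at the given quality."""
    return ["convert", "-quality", f"{quality}%", src, dst]


def compare_cmd(original: str, compressed: str) -> list:
    """Command line that reports the mean absolute error between two images.
    "null:" discards the difference image; only the metric is wanted."""
    return ["compare", "-metric", "mae", original, compressed, "null:"]


def parse_mae(report: str) -> float:
    """Pull the first number out of the metric report (assumed format)."""
    match = re.search(r"\d+(?:\.\d+)?(?:[eE][-+]?\d+)?", report)
    if match is None:
        raise ValueError(f"no metric found in {report!r}")
    return float(match.group(0))


def measure(original: str, compressed: str, quality: int) -> float:
    """Compress `original` to `compressed` and return the distortion."""
    subprocess.run(convert_cmd(original, compressed, quality), check=True)
    # compare prints the metric on stderr and exits non-zero when the
    # images differ, so the exit status is deliberately not checked here.
    result = subprocess.run(compare_cmd(original, compressed),
                            stderr=subprocess.PIPE, text=True)
    return parse_mae(result.stderr)
```

A call such as `measure("original-image.jpg", "compressed-image.jpg", 87)` would then return a number that can be compared against the 265 or 370 point ceilings.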
    Optimizations (speed)
    To optimize the speed, especially for larger images, I scaled down the original image and performed the tests on the scaled down images. Once the ideal was located, I performed the compression on the original with the quality determined from the scaled down image.  For very large images, I take a crop from center of the image, which results in a more accurate quality prediction.
    I can see why most applications don’t bother with this process… it’s basically exhaustive testing to determine how much you can compress before image quality is degraded unacceptably. In addition to scaling down the image, I could also do quality percentage point skipping and backtracking when quality is degraded.
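Here is a rough sketch of what point skipping with backtracking could look like. It is one possible reading of the idea, not the logic the script actually uses: the names are hypothetical, `distortion_at(quality)` again stands in for a compress-and-compare pass, and the approach assumes distortion only gets worse as quality drops.

```python
from typing import Callable


def find_quality_with_skipping(distortion_at: Callable[[int], float],
                               threshold: float,
                               start: int = 99,
                               floor: int = 1,
                               step: int = 10) -> int:
    """Coarse pass in jumps of `step`, then backtrack one point at a time."""
    good = start
    quality = start
    # Coarse pass: skip downward until the threshold is exceeded.
    while quality >= floor and distortion_at(quality) <= threshold:
        good = quality
        quality -= step
    # Backtrack: only the gap between the last good setting and the
    # first bad (or out-of-range) one still needs testing.
    lowest_untested = max(quality + 1, floor)
    for candidate in range(good - 1, lowest_untested - 1, -1):
        if distortion_at(candidate) > threshold:
            break
        good = candidate
    return good


# Made-up distortion curve for demonstration only, with a call counter.
calls = []

def fake_distortion(quality: int) -> float:
    calls.append(quality)
    return (100 - quality) * 10.0


best = find_quality_with_skipping(fake_distortion, threshold=265)
print(best, len(calls))
```

With this made-up curve the search needs far fewer compress-and-compare passes than walking down from 99 one point at a time, which is where the time savings would come from.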
    What would be ideal is an API and library set that does an image quality analysis and gives you an optimally compressed JPEG out of the box. That is the value add of what JPEGmini is offering, and I think that it is a technology which any company involved with image storage should look into.
    The Code
    Note, the shell script/etc are not published, as the goal was to explore whether or not it could be done. Based on the description above, one can easily write the appropriate wrapper script or use the appropriate java/python/php/etc hooks.
    • Improved optimization to the quality locating logic resulted in a reduction from 5 minutes to 13 seconds to locate the quality setting and compress a 24MP image.
    • Planning on creating a Lightroom 3 export plugin for this scripted method.
    • Planning on trying to reduce the time required from 13 seconds down to under 10 seconds.
    Aug 09 2011


    This is a followup to my previous article, “Managing Resources – Part I“. It’s been half a year since I wrote that article and it has been a mixed blessing kind of trip. So, here are some of my thoughts from the attempt at managing my storage through complexity, and my thoughts going forward.

    How Did It Go?

    If you read part 1, you’ll note I went with a big box, lots of CPU cores, RAM, and storage. I opted for complexity and multiple layers of abstraction to get all the features I wanted. I opted to get ZFS by employing Nexenta and other Solaris-derived solutions, to use virtual machines to get the best of both worlds, and to leverage PCI passthru to get a desktop experience on a virtualized server platform.

    In short, this did not work out very well. :(

    What Went Wrong

    From the start, there were problems:

    • PCI passthru only worked with a small subset of devices, and the devices it did work with were primarily storage controllers. Only specific, expensive storage controllers. Display controllers were more or less out of the question, except for use as CUDA processors.
    • Without PCI passthrough for the storage controllers that I had, the option of using Nexenta in an ESXi VM with direct access to the underlying hardware to present storage options was a bust as well. Tests with Nexenta running on top of VMFS as presented through ESXi resulted in horrible performance numbers. Think something along the lines of 5MB/second read/write.
    • Juggling 10-12 1TB hard drives proved to be a nightmare in a conventional tower case. Likewise, the heat and noise generated by the drives alone made running the server 24/7 a problematic proposition.
    • The issues proved to be made worse by the fact that the OpenSolaris instance I was using had an advanced zpool version, which was not importable, at the time. This made importing and accessing the original data in my array a non-starter.
    Where Are Things Now
    In the end, I opted to get the OWC Elite Quad Pro and load it up with 4 x 1TB drives, setting it up for RAID 5, and hooking it up to the Airport Extreme to use as a Time Machine drive for laptops and as personal dumping space, while the big server is being debugged.
    The main server is currently a mass of SATA and power cables. I’ve added 2 quad-port PCIe x4 SATA cards to the server to handle more drives. I’m planning on re-imaging the box and loading it up with Linux (Ubuntu or CentOS) and employing ZFSonLinux to implement ZFS storage pools and get RAIDZ2 up and running. This would remove the need for virtual machines, as I would get ZFS in Linux and would be able to run the applications I wanted to run, natively.
    I’ve also acquired 2 x 60GB SSD(s) to serve as L2ARC caching and ZIL logging for ZFS to improve the performance of the system. The intent is to have these two SSD(s) mirrored and having the mirrored device used for the caching and logging.
    What Needs To Get Done
    The remaining roadblock now is how to get the data off of the existing ZPOOL (Solaris) and onto the new ZPOOL (Linux). I’m thinking it will take the following form:
    1. Build new Linux environment / ZPOOL environment. (pool1)
    2. Export pool1 from Linux environment.
    3. Boot into old OpenSolaris environment and import all pools (pool0, pool1)
    4. While booted up in OpenSolaris, zfs send from pool0 to pool1.
    5. Once completed, export pool1 from OpenSolaris environment.
    6. Boot into Linux environment and import pool1.
    7. Confirm data is all good, then re-purpose all disks in the original pool0 OpenSolaris environment.
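Spelled out as commands, steps 2 through 6 might look roughly like the plan below. Treat it as a sketch under assumptions rather than a tested procedure: the snapshot name `migrate` and the function name are made up, a recursive snapshot is added because `zfs send` works from snapshots, and the plan is only printed, never executed. It also assumes pool1 is created at a pool version that the older OpenSolaris build can import, which is exactly the kind of mismatch that bit me before.

```python
def migration_plan(old_pool: str = "pool0",
                   new_pool: str = "pool1",
                   snapshot: str = "migrate") -> list:
    """Return (environment, command) pairs for moving data between pools."""
    snap = f"{old_pool}@{snapshot}"
    return [
        ("linux", f"zpool export {new_pool}"),
        ("opensolaris", f"zpool import {new_pool}"),
        # zfs send operates on snapshots, so take a recursive one first.
        ("opensolaris", f"zfs snapshot -r {snap}"),
        # -R sends the whole dataset tree; -d keeps the dataset names
        # under the new pool; -F lets the receive overwrite the target.
        ("opensolaris", f"zfs send -R {snap} | zfs receive -F -d {new_pool}"),
        ("opensolaris", f"zpool export {new_pool}"),
        ("linux", f"zpool import {new_pool}"),
    ]


for environment, command in migration_plan():
    print(f"[{environment}] {command}")
```

Step 7, confirming the data before re-purposing the old disks, is deliberately left out of the plan since it is a manual check.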
    The current capacity of the server is 14 SATA ports:
    • 6 on the motherboard :: mb0
    • 4 on quad sata 1  :: qs1
    • 4 on quad sata 2  :: qs2
    After taking into account the 8 disks for the existing array, this leaves 6 ports available for the new environment to be built off of. Building small RAIDZ groups, I’m planning on grouping the space as follows:
    • RAIDZ :: mb0:2 + qs1:0 + qs2:0    [ + 2TB ]
    • RAIDZ :: mb0:3 + qs1:1 + qs2:1    [ + 2TB ]
    This way, the six ports are spread across the motherboard and each of the 2 additional controllers. When the data is transferred over and the original 8 disks are freed up, they will be added to pool1 in a similar fashion:
    • RAIDZ :: mb0:4 + qs1:2 + qs2:2   [ + 2TB ]
    • RAIDZ :: mb0:5 + qs1:3 + qs2:3   [ + 2TB ]
    The remaining ports on the motherboard, mb0:0 and mb0:1, will be used for the SSD(s) for the L2ARC and the ZIL, as well as for booting the Linux OS.
    The effective space will be 8TB of usable RAIDZ data. However, it is 8TB spread out over 12 x 1TB drives vs the original 7TB spread out over 8 x 1TB drives. The arrangement also ensures that should one controller card fail, the arrays will be recoverable.
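The capacity figures above follow from simple arithmetic: each single-parity RAIDZ group gives up one disk's worth of space to parity. A quick sketch of that calculation is below; the function names are made up, and it ignores filesystem metadata overhead and the gap between marketing terabytes and what the OS reports.

```python
def raidz_usable_tb(disks: int, disk_tb: float = 1.0, parity: int = 1) -> float:
    """Approximate usable space of one RAIDZ group."""
    if disks <= parity:
        raise ValueError("a RAIDZ group needs more disks than parity devices")
    return (disks - parity) * disk_tb


def pool_usable_tb(groups: list, disk_tb: float = 1.0, parity: int = 1) -> float:
    """Approximate usable space of a pool built from several RAIDZ groups."""
    return sum(raidz_usable_tb(n, disk_tb, parity) for n in groups)


new_layout = pool_usable_tb([3, 3, 3, 3])  # four 3-disk RAIDZ groups
old_layout = pool_usable_tb([8])           # one 8-disk RAIDZ group

print(new_layout, new_layout / 12)  # usable TB and fraction of raw space
print(old_layout, old_layout / 8)
```

The smaller groups trade raw-space efficiency (two thirds usable instead of seven eighths) for the ability to lose one disk per group.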
    Upgrading storage and/or resilvering a replacement disk would also be significantly sped up by the fact that each RAIDZ group is only comprised of 3 disks.
    If/when I decide to upgrade to 2TB disks, I can do so 3 at a time. Ideally, they would all be RAIDZ2 with perhaps more disks per group, but funds are not infinite. :( What this configuration buys me is more IOPS across the disks, and hopefully better performance.
    Future Upgrades & Potential Issues
    Once this is up and running, I may bring the OWC Elite unit into the mix via an eSATA controller, with the 4 drives in the unit presented as individual drives. However, because of the way that ZFS works, I would not be able to add a drive to each of the RAIDZ(s), since ZFS lacks the ability to reshape the geometry of the zpool as well as the ability to shrink a zpool by removing a device from it. So, adding storage in the future would require ever more disks/space to enact reconfigurations, which makes using ZFS a bit problematic.
    One Avenue To Upgrade Via Adding More Disks, While Retaining Redundancy
    I suppose I _could_ swap out a disk from some of the RAIDZ groups and replace it with a disk from the external quad enclosure (es0). Starting from the current layout:
    • RAIDZ :: mb0:2 + qs1:0 + qs2:0   [ + 2TB ]
    • RAIDZ :: mb0:3 + qs1:1 + qs2:1   [ + 2TB ]
    • RAIDZ :: mb0:4 + qs1:2 + qs2:2   [ + 2TB ]
    • RAIDZ :: mb0:5 + qs1:3 + qs2:3   [ + 2TB ]
    the two freed ports (qs1:0 and qs2:1) plus one enclosure slot would form a fifth group:
    • RAIDZ :: mb0:2 + es0:1 + qs2:0   [ + 2TB ]
    • RAIDZ :: mb0:3 + qs1:1 + es0:2   [ + 2TB ]
    • RAIDZ :: mb0:4 + qs1:2 + qs2:2   [ + 2TB ]
    • RAIDZ :: mb0:5 + qs1:3 + qs2:3   [ + 2TB ]
    • RAIDZ :: es0:0 + qs1:0 + qs2:1   [ + 2TB ]
    • + 1 x 1TB hot spare : es0:3
    In this fashion, another 2TB of RAIDZ storage is added to the zpool, making it 10TB usable across 5 x RAIDZ blocks. The extra slot in the external SATA enclosure can be used to house a hot spare, to allow for auto-rebuilding.
    Basically, if you want to add additional physical storage devices… you will need to ensure that the devices are re-juggled around in such a way that the failure of any one device will not result in a failure of any RAIDZ component that makes up the zpool.
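That juggling rule is easy to check mechanically. The sketch below treats each controller (including the external enclosure) as the unit that can fail and verifies that no single-parity RAIDZ group has more than one disk behind the same controller. The function names are made up; the port labels are the ones used in the rearranged layout listed above.

```python
from collections import Counter


def controller(port: str) -> str:
    """'qs1:3' -> 'qs1' (the part before the colon names the controller)."""
    return port.split(":")[0]


def survives_single_controller_failure(groups: list, parity: int = 1) -> bool:
    """True if losing any one controller costs each group at most `parity` disks."""
    for ports in groups:
        disks_per_controller = Counter(controller(p) for p in ports)
        if max(disks_per_controller.values()) > parity:
            return False
    return True


rearranged = [
    ["mb0:2", "es0:1", "qs2:0"],
    ["mb0:3", "qs1:1", "es0:2"],
    ["mb0:4", "qs1:2", "qs2:2"],
    ["mb0:5", "qs1:3", "qs2:3"],
    ["es0:0", "qs1:0", "qs2:1"],
]

print(survives_single_controller_failure(rearranged))
```

A group such as `["qs1:0", "qs1:1", "mb0:2"]` would fail the check, since losing the qs1 card would take out two of its three disks.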
    In any case, storage and data management continues to be a pain on the home front. :)