Usage

Although the library was designed to enable the development of more advanced codecs, using the provided implementations is straightforward.

In the most basic scenario we can compress a single list of values (e.g. Int64) like this:

DeltaCodec_Singleton.png

An internal constructor is defined that allows for alternative finishers:

DeltaCodec_InternalConstructor.png

Although this creates a perfectly legitimate codec, doing so anywhere except in tests is a very BAD idea. Internals are shared with the provided test assembly, so you can get away with this kind of experimentation there. But the constructor has been restricted for a good reason.

Sealing Concrete Classes

By convention, codecs are defined as sealed classes because the hash code of the codec's full name is used as the “magic number” (serialized as part of the headers). Unless a user controls ad hoc codec construction for both encoding and decoding, there is apt to be a mismatch, and an exception will be thrown when decoding. For production use, create a sealed codec with the specific transform and finisher type desired, and then use only the static (singleton) instance!
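To make the mismatch concrete, here is a minimal sketch of a name-derived magic number. It is not the library's code: `MagicNumber`, `ProductionCodec`, and `AdHocCodec` are hypothetical names, and an FNV-1a-style hash stands in for whatever hash the library really uses (one reason for the stand-in is that `string.GetHashCode()` is randomized per process on newer .NET runtimes).

```csharp
using System;

// Hypothetical helper: derives a header "magic number" from a type's full name.
public static class MagicNumber
{
    // FNV-1a-style 32-bit hash over the UTF-16 code units of the name.
    // This is an assumed stand-in, not the hash used by the library.
    public static uint For(Type codecType)
    {
        string name = codecType.FullName ?? codecType.Name;
        uint hash = 2166136261u;
        foreach (char c in name)
        {
            hash ^= c;
            hash = unchecked(hash * 16777619u);
        }
        return hash;
    }

    // Decoder-side check: throws if the header was written by a different type.
    public static void Verify(uint headerMagic, Type decoderType)
    {
        if (headerMagic != For(decoderType))
            throw new InvalidOperationException(
                "Header was not written by " + decoderType.FullName);
    }
}

// Hypothetical codec types; they are empty because only their names matter here.
public sealed class ProductionCodec { }
public sealed class AdHocCodec { }
```

In this sketch, a header written with `MagicNumber.For(typeof(AdHocCodec))` fails `MagicNumber.Verify(..., typeof(ProductionCodec))`, which mirrors the decoding exception described above. A sealed type that is only ever used through one static instance gives both sides the same name to hash.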

Creating a new custom codec is actually quite easy:

DeltaCodec_CreatingCustom.png

The simple transforms supplied with the core library are deliberately left unsealed so that you can inherit from them and reuse some of their functionality, overriding only what you need to customize. For example, you might only want to change the transformation for floating point types. Or perhaps you only need to transform a subset of types for a specific complex data structure. But it is still a good idea to seal your custom transform implementation to be safe.

Codecs, on the other hand, are meant to coordinate the final results of transformation and finishing compression. They are responsible for managing the header information and final serialization. Thus, it makes sense to always make them sealed.

Codecs can still override the virtual methods of the base DeltaCodec class. And they can add new APIs to handle more complex data structures or method signatures.

This separation of concerns makes it very easy to create arbitrarily sophisticated codecs. Abiding by a few simple conventions keeps the logic orderly and robust.

Method Signatures

The virtual methods of the base class for codecs have a standard signature:

DeltaCodec_Signatures.png

This signature doesn’t restrict a user to passing in only lists of intrinsic data types. Any type can be passed in because concrete implementations can decide to support whatever they need. The list is passed first; all other parameters are optional:
  • NumBlocks – How many blocks will be encoded in parallel (default = 1, i.e. no parallelism).
  • Level – Strongly typed to the System.IO.Compression.CompressionLevel enumeration.
    • NoCompression
    • Fastest (default)
    • Optimal
  • Granularity – Specifies how to factor the data, if at all.
    • Null = No Factoring (default)
    • 0 = Auto Factor (calculate the granularity and use that to factor the data)
    • Any other = Use the provided value to factor the data
  • Monotonicity – An enumeration value describing the ordering of the data, which an implementation may be able to use as an optimization hint.
    • None
    • NonDecreasing – Might be useful for DateTime, TimeSpan, accumulations, etc.
    • NonIncreasing – Might be useful for decaying series of various types.
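The three Granularity cases can be sketched for Int64 data as follows. This is an illustration under assumptions, not the library's implementation: the `Factoring` class and its methods are hypothetical, and "granularity" is taken here to mean the greatest common divisor of the values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating the three Granularity cases for Int64 data.
public static class Factoring
{
    // Greatest common divisor of all values (0 for an empty or all-zero list).
    // Note: Math.Abs throws for long.MinValue; the sketch does not handle that.
    public static long AutoGranularity(IEnumerable<long> values)
    {
        long g = 0;
        foreach (long v in values)
        {
            long a = Math.Abs(v);
            long b = g;
            while (b != 0) { long t = a % b; a = b; b = t; } // Euclid's algorithm
            g = a;
        }
        return g;
    }

    // null => no factoring; 0 => compute the granularity; other => use as given.
    // 'applied' reports the divisor actually used (1 means the data is unchanged).
    public static long[] Factor(IList<long> values, long? granularity, out long applied)
    {
        applied = 1;
        if (granularity == null) return values.ToArray();

        long g = granularity.Value == 0 ? AutoGranularity(values) : granularity.Value;
        if (g == 0) return values.ToArray(); // empty or all-zero input

        if (values.Any(v => v % g != 0))
            throw new ArgumentException("Values are not multiples of the granularity.");

        applied = g;
        return values.Select(v => v / g).ToArray();
    }

    // Inverse of Factor: multiply each value by the divisor that was applied.
    public static long[] Unfactor(IList<long> factored, long applied)
    {
        return factored.Select(v => v * applied).ToArray();
    }
}
```

For the input `{ 100, 250, 400 }` with a granularity of 0, this sketch computes a divisor of 50 and produces `{ 2, 5, 8 }`. The smaller values leave smaller deltas for the finishing compression to deal with, and the divisor has to be stored so that decoding can reverse the step.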

For the basic codecs included with the library, the number of blocks and the compression level parameters have the most profound effect on performance. But algorithms that depend more on transformation than finishing compression may require granularity, monotonicity, or other parameters to function well. Derived codecs can, of course, change or extend parameter lists in any way necessary.
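As a rough illustration of why the block count and compression level matter, the following sketch splits a list into independent blocks, delta-encodes each one, and deflates the blocks in parallel. `BlockSketch` and its byte layout are hypothetical and are not the library's format; only `DeflateStream`, `CompressionLevel`, and `Parallel.For` are real framework APIs.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

// Hypothetical sketch: delta-encode and deflate a list in independent blocks.
public static class BlockSketch
{
    public static byte[][] Encode(long[] values, int numBlocks = 1,
        CompressionLevel level = CompressionLevel.Fastest)
    {
        if (numBlocks < 1) throw new ArgumentOutOfRangeException(nameof(numBlocks));
        var blocks = new byte[numBlocks][];
        Parallel.For(0, numBlocks, b =>
        {
            // Block b covers [start, end); block sizes differ by at most one.
            int start = (int)((long)values.Length * b / numBlocks);
            int end = (int)((long)values.Length * (b + 1) / numBlocks);
            using (var buffer = new MemoryStream())
            {
                using (var deflate = new DeflateStream(buffer, level, leaveOpen: true))
                using (var writer = new BinaryWriter(deflate))
                {
                    writer.Write(end - start); // element count for this block
                    long previous = 0;         // each block restarts from zero
                    for (int i = start; i < end; i++)
                    {
                        writer.Write(unchecked(values[i] - previous));
                        previous = values[i];
                    }
                }
                blocks[b] = buffer.ToArray();
            }
        });
        return blocks;
    }

    public static long[] Decode(byte[][] blocks)
    {
        var parts = new long[blocks.Length][];
        Parallel.For(0, blocks.Length, b =>
        {
            using (var deflate = new DeflateStream(
                new MemoryStream(blocks[b]), CompressionMode.Decompress))
            using (var reader = new BinaryReader(deflate))
            {
                var part = new long[reader.ReadInt32()];
                long previous = 0;
                for (int i = 0; i < part.Length; i++)
                {
                    previous = unchecked(previous + reader.ReadInt64());
                    part[i] = previous; // running sum restores the original value
                }
                parts[b] = part;
            }
        });

        // Concatenate the decoded blocks in their original order.
        int total = 0;
        foreach (var p in parts) total += p.Length;
        var result = new long[total];
        int offset = 0;
        foreach (var p in parts) { p.CopyTo(result, offset); offset += p.Length; }
        return result;
    }
}
```

Because every block restarts its delta chain, blocks can be encoded and decoded on separate threads. The cost is one extra full-width value and a separate deflate stream per block, so more blocks can trade compression ratio for speed.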


Last edited Jun 10, 2015 at 6:58 AM by bstabile, version 14