<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Olhovsky</title>
	<atom:link href="http://www.olhovsky.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.olhovsky.com</link>
	<description>Miscellaneous Coding</description>
	<lastBuildDate>Thu, 03 Jun 2010 16:30:41 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Fields vs. properties performance on the Xbox 360.</title>
		<link>http://www.olhovsky.com/2010/04/fields-vs-properties-performance-on-the-xbox-360/</link>
		<comments>http://www.olhovsky.com/2010/04/fields-vs-properties-performance-on-the-xbox-360/#comments</comments>
		<pubDate>Sat, 17 Apr 2010 20:32:00 +0000</pubDate>
		<dc:creator>olhovsky</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[XNA]]></category>

		<guid isPermaLink="false">http://www.olhovsky.com/?p=140</guid>
		<description><![CDATA[I took Sam Allen&#8217;s performance test and ran a similar test (below) on the Xbox 360 with XNA.

static string _backing; // Backing store for property
static string Property  // Getter and setter
{
    get
    {
        return _backing;
    }
   [...]]]></description>
			<content:encoded><![CDATA[<p>I took <a href="http://dotnetperls.com/property-test">Sam Allen&#8217;s performance test</a> and ran a similar test (below) on the Xbox 360 with XNA.</p>
<pre class="brush:c#">
static string _backing; // Backing store for property
static string Property  // Getter and setter
{
    get
    {
        return _backing;
    }
    set
    {
        _backing = value;
    }
}
static string Field;    // Static field

public FieldPropertyTest(Game game)
    : base(game)
{

    long[] results1 = new long[10];
    long[] results2 = new long[10];

    Property = "string";
    Field = "string";
    const int m = 1000000;
    for (int x = 0; x < 10; x++) // Ten tests
    {
        Stopwatch s1 = new Stopwatch();
        s1.Start();
        for (int i = 0; i < m; i++) // Test property
        {
            Property = "string";
            //if (Property == "cat")
            //{
            //}
        }
        s1.Stop();
        Stopwatch s2 = new Stopwatch();
        s2.Start();
        for (int i = 0; i < m; i++) // Test field
        {
            Field = "string";
            //if (Field == "cat")
            //{
            //}
        }
        s2.Stop();
        results1[x] = s1.ElapsedMilliseconds;
        results2[x] = s2.ElapsedMilliseconds;
    }
}</pre>
<p>The results:</p>
<p><strong>Get:</strong><br />
Property: 341 ms<br />
Field: 307 ms</p>
<p><strong>Set:</strong><br />
Property: 85 ms<br />
Field: 66 ms</p>
<p>These results differ from what Allen measured on the more robust desktop .NET CLR. To my knowledge, that is because the .NET Compact CLR (which the Xbox 360 uses) doesn't inline functions at all.</p>
<p>Of course, the small size of the numbers above is good news: it took 10 million gets to produce a 34 ms difference between the property and the field.<br />
It's hard to imagine a scenario where using a field in place of a property would yield a gain large enough to affect your framerate.</p>
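<p>To put those numbers in per-access terms, here is a small Python sketch of the arithmetic. It assumes the timings above are totals over all 10 million accesses (ten runs of one million iterations); the helper names are made up for illustration.</p>

```python
# Timings reported above, in milliseconds.
# Assumption: each figure is the total for all 10 million accesses.
TOTAL_ACCESSES = 10 * 1000000
GET_MS = {"property": 341, "field": 307}
SET_MS = {"property": 85, "field": 66}

def overhead_ns(timings_ms, accesses=TOTAL_ACCESSES):
    """Extra cost of one property access over one field access, in nanoseconds."""
    extra_ms = timings_ms["property"] - timings_ms["field"]
    return extra_ms * 1000000 / accesses  # 1 ms = 1,000,000 ns

def accesses_per_ms(ns_per_access):
    """Number of accesses needed before the overhead adds up to one millisecond."""
    return 1000000 / ns_per_access

get_ns = overhead_ns(GET_MS)
set_ns = overhead_ns(SET_MS)
print("get overhead: %.1f ns per access" % get_ns)
print("set overhead: %.1f ns per access" % set_ns)
print("gets needed to cost 1 ms: %d" % round(accesses_per_ms(get_ns)))
```

<p>Under those assumptions a property get costs about 3.4 ns more than a field get, so it would take roughly 294,000 extra property gets in one frame to use up a single millisecond.</p>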
<p>The usual <a href="http://blogs.msdn.com/shawnhar/archive/2009/07/14/the-perils-of-microbenchmarking.aspx">caveats with micro-benchmarks</a> apply here too: in real-world use, the difference may be smaller (or larger) than what you see above.</p>
<p>So, I'd take <a href="http://www.codinghorror.com/blog/2006/08/properties-vs-public-variables.html">Jeff Atwood's advice</a> and simply always use (auto-)properties, unless you have a very tight, performance-sensitive loop that accesses many properties.</p>
<p>Edit:<br />
The <a href="http://blogs.msdn.com/b/netcfteam/archive/2006/12/22/managed-code-performance-on-xbox-360-for-the-xna-framework-1-0.aspx?wa=wsignin1.0">NetCF team blog</a> states that getters and setters are inlined, which is probably a source of some confusion. It's also possible that my results above are wrong, if I've fallen into a microbenchmarking trap. Let me know if you get results that contradict what I've written above.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.olhovsky.com/2010/04/fields-vs-properties-performance-on-the-xbox-360/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Occurrences of one string in another.</title>
		<link>http://www.olhovsky.com/2009/12/occurences-of-one-string-in-another/</link>
		<comments>http://www.olhovsky.com/2009/12/occurences-of-one-string-in-another/#comments</comments>
		<pubDate>Wed, 30 Dec 2009 20:24:40 +0000</pubDate>
		<dc:creator>olhovsky</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Misc]]></category>

		<guid isPermaLink="false">http://www.olhovsky.com/?p=128</guid>
		<description><![CDATA[The string &#8220;ABC&#8221; occurs in &#8220;ABBC&#8221; twice, if you remove any characters you wish from &#8220;ABBC&#8221;.
The following algorithm finds such occurrences in O(n) time given any two strings.
One assumption worth noting for the O(n) time bound is that the size of the alphabet is limited to some constant size.
Nested loops in this algorithm scream O(n^2)!! [...]]]></description>
			<content:encoded><![CDATA[<p>The string &#8220;ABC&#8221; occurs in &#8220;ABBC&#8221; twice, if you remove any characters you wish from &#8220;ABBC&#8221;.</p>
<p>The following algorithm finds such occurrences in O(n) time given any two strings.<br />
One assumption worth noting for the O(n) time bound is that the size of the alphabet is limited to some constant size.</p>
<p>Nested loops in this algorithm scream O(n^2)!! However, those loops are bounded by a constant that is proportional to the size of the alphabet, squared. Since the alphabet is bounded by a constant (by my assumption above), these nested loops will be bounded by a constant &#8212; so they don&#8217;t take us out of O(n) time.</p>
<pre class="brush:python">__author__ = "Kris Olhovsky"
__date__ = "$Dec 29, 2009 6:54:11 PM$"

class Char:
    def __init__(self, char):
        self.char = char
        self.paths = {} # Key is length of paths, value is the # of paths.

    ''' Return the number of paths to this char of length len. '''
    def retrieve_matches(self, len):
        count = 0
        for c in self.paths:
            if len in self.paths[c]:
                count += self.paths[c][len]
                self.paths[c][len] = 0 # Delete recorded matches.
        return count

    ''' Copies paths from char c to this char, adding 1 to the
        length of every path. Paths of length greater than len are ignored. '''
    def add_paths(self, c, len):
        temp = {} # Avoid altering map during iteration.
        for char in c.paths:
            for length in c.paths[char]:
                if length + 1 <= len:
                    if c not in temp:
                        temp[c] = {}
                    if length + 1 not in temp[c]:
                        temp[c][length+1] = 0
                    temp[c][length+1] += c.paths[char][length]

        for k in temp: # Apply changes to map.
            for j in temp[k]:
                if k not in self.paths:
                    self.paths[k] = {} # k is the only key in temp, i.e. c.
                if j not in self.paths[k]:
                    self.paths[k][j] = 0
                self.paths[k][j] += temp[k][j]

''' Returns the number of occurrences of string s1 in string s2. '''
def subseq(s1, s2):
    if s1 == "": # Special case defined by the problem.
        return 1

    prev = {}   # Maps a char to a dict of chars that appear one index to
                # the left of that char anywhere in s1.

    chars = {}  # All of the chars that appear in s1.

    for c in s1: # Initialize maps.
        prev[c] = {}
        chars[c] = Char(c)

    i = 1
    while i < len(s1): # Populate maps with data from s1.
        prev[s1[i]][s1[i-1]] = chars[s1[i-1]]
        i += 1

    start = Char('start')
    chars[s1[0]].paths[start] = {}
    chars[s1[0]].paths[start][0] = 0
    occurences = 0 # Record number of occurrences of s1 in s2.
    for char in s2:

        if char in chars: # A char in s2 matches a char in s1.
            if char in prev[char]: # If same exists, prioritize this.
                chars[char].add_paths(prev[char][char], len(s1) - 1)
            for c in prev[char]:
                if c != char:
                    chars[char].add_paths(prev[char][c], len(s1) - 1)
            if char == s1[0]: # First char in s1 always starts a new path.
                chars[s1[0]].paths[start][0] += 1

            if char == s1[-1]: # Update occurrences when last char of s1 seen.
                occurences += chars[char].retrieve_matches(len(s1) - 1)

    return occurences</pre>
<p>Here are some example usages:</p>
<pre class="brush:python">if __name__ == "__main__":
    # print(...) with a single argument works in both Python 2 and Python 3.
    s1 = "ABC"
    s2 = "ABBC"
    print(subseq(s1, s2))

    s1 = "ABCB"
    s2 = "ABBCBBCB"
    print(subseq(s1, s2))

    s1 = "ABCB"
    s2 = "ABCB"
    print(subseq(s1, s2))

    s1 = "ABBB"
    s2 = "ABBBBB"
    print(subseq(s1, s2))

    s1 = "ABBB"
    s2 = "ABABB"
    print(subseq(s1, s2))</pre>
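<p>If you want something to check results against, here is a sketch of the textbook dynamic-programming count. To be clear, this is a different and slower technique than the algorithm above: it runs in O(len(pattern) * len(text)) time, and the function name is my own choice for illustration.</p>

```python
def count_subsequences(pattern, text):
    """Count the ways `pattern` occurs in `text` when characters of `text`
    may be deleted freely (i.e. as a subsequence)."""
    # ways[j] = number of ways to form pattern[:j] from the text seen so far.
    # The empty prefix can always be formed in exactly one way.
    ways = [1] + [0] * len(pattern)
    for ch in text:
        # Walk right to left so this text character extends each partial
        # match at most once.
        for j in range(len(pattern), 0, -1):
            if pattern[j - 1] == ch:
                ways[j] += ways[j - 1]
    return ways[len(pattern)]

if __name__ == "__main__":
    pairs = [("ABC", "ABBC"), ("ABCB", "ABBCBBCB"), ("ABCB", "ABCB"),
             ("ABBB", "ABBBBB"), ("ABBB", "ABABB")]
    for s1, s2 in pairs:
        print(count_subsequences(s1, s2))
```

<p>On the five example pairs above, this reference version returns 2, 10, 1, 10 and 1. Like the code above, it returns 1 for an empty pattern.</p>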
]]></content:encoded>
			<wfw:commentRss>http://www.olhovsky.com/2009/12/occurences-of-one-string-in-another/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Extract Longest Non-Decreasing Sequence From Any Sequence</title>
		<link>http://www.olhovsky.com/2009/11/extract-longest-increasing-sequence-from-any-sequence/</link>
		<comments>http://www.olhovsky.com/2009/11/extract-longest-increasing-sequence-from-any-sequence/#comments</comments>
		<pubDate>Tue, 10 Nov 2009 01:07:19 +0000</pubDate>
		<dc:creator>olhovsky</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Misc]]></category>

		<guid isPermaLink="false">http://www.olhovsky.com/?p=106</guid>
		<description><![CDATA[I wrote some python code that extracts the longest non-decreasing subsequence from any given sequence.
This runs in O(n log n) time, and uses O(n) memory.
''' An item in the final sequence, used to form a linked list. '''
class SeqItem():
    val = 0      # This item's value.
  [...]]]></description>
			<content:encoded><![CDATA[<p>I wrote some python code that extracts the longest non-decreasing subsequence from any given sequence.</p>
<p>This runs in O(n log n) time, and uses O(n) memory.</p>
<pre class="brush:python">''' An item in the final sequence, used to form a linked list. '''
class SeqItem():
    val = 0      # This item's value.
    prev = None  # The value before this one.
    def __init__(self, val, prev):
        self.val = val
        self.prev = prev

''' Extract longest non-decreasing subsequence from sequence seq.'''
def extract_sorted(seq):
    if not seq: # An empty sequence has an empty longest subsequence.
        return []

    subseqs = [SeqItem(seq[0], None)] # Track decreasing subsequences in seq.
    for i in range(1, len(seq)):
        search_insert(subseqs, seq[i], 0, len(subseqs))

    # Build Python list from custom linked list:
    final_list = []
    result = subseqs[-1] # Longest nondecreasing subsequence is found by
                         # traversing the linked list backwards starting from
                         # the final smallest value in the last nonincreasing
                         # subsequence found.
    while(result != None and result.val != None):
        final_list.append(result.val)
        result = result.prev # Walk backwards through longest sequence.

    final_list.reverse()
    return final_list

''' Seq tracks the smallest value of each nonincreasing subsequence constructed.
Find smallest item in seq that is greater than search_val.
If such a value does not exist, append search_val to seq, creating the beginning
of a new nonincreasing subsequence.
If such a value does exist, replace the value in seq at that position, and
search_val will be considered the new candidate for the longest subseq if
a value in the following nonincreasing subsequence is added.
Seq is guaranteed to be in increasing sorted order.
Returns the SeqItem that was created for search_val. '''
def search_insert(seq, search_val, start, end):
    median = (start + end) // 2 # Integer division, in Python 2 and 3 alike.

    if end - start < 2: # End of the search.
        if seq[start].val > search_val:
            if start > 0:
                new_item = SeqItem(search_val, seq[start - 1])
            else:
                new_item = SeqItem(search_val, None)

            seq[start] = new_item
            return new_item
        else: # seq[start].val <= search_val
            if start + 1 < len(seq):
                new_item = SeqItem(search_val, seq[start])
                seq[start + 1] = new_item
                return new_item
            else:
                new_item = SeqItem(search_val, seq[start])
                seq.append(new_item)
                return new_item

    if search_val < seq[median].val: # Search left side
        return search_insert(seq, search_val, start, median)
    else: #search_val >= seq[median].val: # Search right side
        return search_insert(seq, search_val, median, end)</pre>
<p>And here is an example usage:</p>
<pre class="brush:python">
import random
if __name__ == '__main__':
    seq = []
    for i in range(100000):
        seq.append(int(random.random() * 1000))

    print(extract_sorted(seq))</pre>
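<p>A quick way to sanity-check the length of the result is the usual binary-search formulation built on Python's bisect module. This is only a sketch: it computes the length and not the subsequence itself, and the function name is made up for illustration.</p>

```python
from bisect import bisect_right

def longest_nondecreasing_length(seq):
    """Length of the longest non-decreasing subsequence of seq."""
    # tails[k] is the smallest value that can end a non-decreasing
    # subsequence of length k + 1 found so far. tails stays sorted.
    tails = []
    for value in seq:
        # bisect_right places equal values to the right, so duplicates
        # extend a subsequence instead of replacing its tail.
        pos = bisect_right(tails, value)
        if pos == len(tails):
            tails.append(value)
        else:
            tails[pos] = value
    return len(tails)

if __name__ == "__main__":
    print(longest_nondecreasing_length([3, 1, 2, 2, 5, 4]))
```

<p>For the sequence 3, 1, 2, 2, 5, 4 this gives 4 (for instance 1, 2, 2, 5). Like the code above it needs O(n log n) time and O(n) memory.</p>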
]]></content:encoded>
			<wfw:commentRss>http://www.olhovsky.com/2009/11/extract-longest-increasing-sequence-from-any-sequence/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Foreach through non-primitive types creates garbage.</title>
		<link>http://www.olhovsky.com/2009/09/code-common-to-many-xna-examples-generates-unneccesary-garbage/</link>
		<comments>http://www.olhovsky.com/2009/09/code-common-to-many-xna-examples-generates-unneccesary-garbage/#comments</comments>
		<pubDate>Sat, 19 Sep 2009 02:44:42 +0000</pubDate>
		<dc:creator>olhovsky</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[XNA]]></category>
		<category><![CDATA[garbage collection]]></category>
		<category><![CDATA[speed up XNA]]></category>

		<guid isPermaLink="false">http://www.olhovsky.com/?p=81</guid>
		<description><![CDATA[Time and again I have seen code like this used in XNA tutorials.

effectDrawBlock.Begin();
foreach (EffectPass pass in effectDrawBlock.CurrentTechnique.Passes)
{
    pass.Begin(); 

    gd.DrawIndexedPrimitives(PrimitiveType.TriangleStrip, 0, 0, 35, 0, 70); 

    pass.End();
}
effectDrawBlock.End();

When you use a foreach over an array of ints, you will not create garbage, and it&#8217;s fast, so that&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>Time and again I have seen code like this used in XNA tutorials.</p>
<pre class="brush:c#">
effectDrawBlock.Begin();
foreach (EffectPass pass in effectDrawBlock.CurrentTechnique.Passes)
{
    pass.Begin(); 

    gd.DrawIndexedPrimitives(PrimitiveType.TriangleStrip, 0, 0, 35, 0, 70); 

    pass.End();
}
effectDrawBlock.End();
</pre>
<p>When you use a foreach over an array of ints, you will not create garbage, and it&#8217;s fast, so that&#8217;s perfectly okay.</p>
<p>However, in the above code, &#8220;EffectPass pass in&#8221; creates a managed effect pass object, used to iterate through the collection. That object then needs to be handled by the garbage collector later.</p>
<p>The fix is to use a for loop when iterating over non-primitive types.<br />
In my case, by changing my code to something more like what you see below, I was able to reduce the number of managed objects generated by my code from 2500/sec to 200/sec as measured by the XNA Framework Remote Perf Monitor.</p>
<pre class="brush:c#">effectDrawBlock.Begin();
for (int i = 0; i < effectDrawBlock.CurrentTechnique.Passes.Count; i++)
{
    effectDrawBlock.CurrentTechnique.Passes[i].Begin();

    gd.DrawIndexedPrimitives(PrimitiveType.TriangleStrip, 0, 0, 35, 0, 70);

    effectDrawBlock.CurrentTechnique.Passes[i].End();
}
effectDrawBlock.End();</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.olhovsky.com/2009/09/code-common-to-many-xna-examples-generates-unneccesary-garbage/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ToString Garbage Creation in C#</title>
		<link>http://www.olhovsky.com/2009/09/tostring-garbage-creation-in-c/</link>
		<comments>http://www.olhovsky.com/2009/09/tostring-garbage-creation-in-c/#comments</comments>
		<pubDate>Thu, 17 Sep 2009 05:24:39 +0000</pubDate>
		<dc:creator>olhovsky</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[XNA]]></category>
		<category><![CDATA[c#]]></category>
		<category><![CDATA[garbage collector]]></category>

		<guid isPermaLink="false">http://www.olhovsky.com/?p=70</guid>
		<description><![CDATA[I profiled my XNA game project today using the XNA Framework Remote Perf Monitor and discovered that I was generating about 8000 more managed objects per second than I was expecting to.
It turns out that this code was generating 7400 managed objects per second:
            [...]]]></description>
			<content:encoded><![CDATA[<p>I profiled my XNA game project today using the XNA Framework Remote Perf Monitor and discovered that I was generating about 8000 more managed objects per second than I was expecting to.</p>
<p>It turns out that this code was generating 7400 managed objects per second:</p>
<pre class="brush:c#">            text[0] = "FPS: " + fps.FPS.ToString();
            text[1] = "charPos: " + charPos.ToString();
            text[2] = "tLOD: " + tLOD.ToString();
            text[3] = "tTris: " + tTris.ToString();
            text[4] = "tBlocksDrawn: " + tBlocksDrawn.ToString();
            text[5] = "tBlocksCulled: " + tBlocksCulled.ToString();
            text[6] = "drawCodeTime: " + drawCodeTime.ToString();
</pre>
<p>A complicated optimization for code that isn&#8217;t even going to get compiled into the release build would be silly. The easiest fix here is to update the content of these strings only periodically (say, once per second instead of once per frame).<br />
Some of the garbage could also be avoided by splitting the array in two: one array for the static labels (&#8220;drawCodeTime: &#8221;, for example) and another for the converted values such as drawCodeTime.ToString(), so that no concatenation is needed. Sadly, that would still leave thousands of objects created per second.</p>
<p>If your game relies on many string conversions, your best bet is to use StringBuilder objects (which have a predefined size) and replace subsections of their contents with the new text using the built-in .Replace method that StringBuilders have.</p>
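<p>The once-per-second refresh suggested above can be sketched as follows. It is written in Python only to show the idea; the class and its members are hypothetical names, not part of XNA, and in a real game the time would come from the game clock.</p>

```python
class ThrottledText:
    """Rebuild debug strings at most once per `interval` seconds."""

    def __init__(self, interval=1.0):
        self.interval = interval
        self.last_update = None  # Time of the last rebuild (None before the first).
        self.lines = []          # Cached strings, reused between rebuilds.
        self.rebuilds = 0        # How many times the strings were rebuilt.

    def update(self, now, values):
        """Return the cached lines, rebuilding them only once the interval
        has elapsed. `now` is in seconds, `values` maps label to value."""
        if self.last_update is None or now - self.last_update >= self.interval:
            self.lines = ["%s: %s" % (label, value)
                          for label, value in values.items()]
            self.last_update = now
            self.rebuilds += 1
        return self.lines

# Simulate three seconds of a 60 fps game loop.
overlay = ThrottledText(interval=1.0)
for frame in range(180):
    overlay.update(frame / 60.0, {"FPS": 60, "tTris": frame})
print(overlay.rebuilds)
```

<p>Over those 180 simulated frames the strings are rebuilt 3 times instead of 180 times. The displayed values lag by up to a second, which is usually fine for a debug overlay.</p>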
]]></content:encoded>
			<wfw:commentRss>http://www.olhovsky.com/2009/09/tostring-garbage-creation-in-c/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Convert greyscale images to Alpha8 format with a Custom Content Processor in XNA 3.1.</title>
		<link>http://www.olhovsky.com/2009/09/convert-greyscale-images-to-alpha8-format-with-a-custom-content-processor-in-xna-3-1/</link>
		<comments>http://www.olhovsky.com/2009/09/convert-greyscale-images-to-alpha8-format-with-a-custom-content-processor-in-xna-3-1/#comments</comments>
		<pubDate>Mon, 14 Sep 2009 01:34:31 +0000</pubDate>
		<dc:creator>olhovsky</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[XNA]]></category>
		<category><![CDATA[alpha8]]></category>
		<category><![CDATA[content pipeline]]></category>
		<category><![CDATA[custom content processor]]></category>

		<guid isPermaLink="false">http://www.olhovsky.com/?p=62</guid>
		<description><![CDATA[I noticed that at least two terrain engine examples in XNA are reading heightmap images into 4 channel textures instead of single channel textures.
To create a custom content processor that will permit you to convert any Texture2D compatible input format into an Alpha8 format texture, do the following:
Right click your game solution -> add new [...]]]></description>
			<content:encoded><![CDATA[<p>I noticed that at least two terrain engine examples in XNA are reading heightmap images into 4 channel textures instead of single channel textures.</p>
<p>To create a custom content processor that will permit you to convert any Texture2D compatible input format into an Alpha8 format texture, do the following:</p>
<p>Right click your game solution -> add new item<br />
Under XNA 3.1 select Custom Content Pipeline, and replace the default class in this solution with this code:</p>
<pre class="brush:c#">
using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.Xna.Framework.Content.Pipeline;
using Microsoft.Xna.Framework.Content.Pipeline.Processors;
using Microsoft.Xna.Framework.Content.Pipeline.Graphics;
using Microsoft.Xna.Framework.Graphics.PackedVector;
using Microsoft.Xna.Framework;
using System.ComponentModel;

namespace ContentPipelineExtension1
{
    [ContentProcessor]
    [DesignTimeVisible(true)]
    class HeightMapTextureProcessor : ContentProcessor<TextureContent,TextureContent>
    {
        /// <summary>
        /// Process converts displacement maps to Alpha8 textures.
        /// </summary>
        public override TextureContent Process(TextureContent input,
            ContentProcessorContext context)
        {
            // Convert to data that we can work with:
            input.ConvertBitmapType(typeof(PixelBitmapContent<Vector4>));

            // Select the mipmap chain of the first face, there should only be one:
            MipmapChain mipmapChain = input.Faces[0];

            // There should only be one bitmap, but it doesn't hurt to write this loop:
            foreach (PixelBitmapContent<Vector4> bitmap in mipmapChain)
            {
                for (int x = 0; x < bitmap.Width; x++)
                {
                    for (int y = 0; y < bitmap.Height; y++)
                    {
                        Vector4 pixel = bitmap.GetPixel(x, y);

                        // Move R channel to A channel:
                        bitmap.SetPixel(x, y, new Vector4(0, 0, 0, pixel.X));
                    }
                }
            }

            // Alpha8 contains values from 0 - 1 in the W channel.
            input.ConvertBitmapType(typeof(PixelBitmapContent<Alpha8>));

            input.GenerateMipmaps(false);
            return input;
        }
    }
}
</pre>
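<p>To see what the inner loop does to each pixel, and why a single channel format is worth the trouble, here is a rough Python sketch. It is not XNA code: pixels are plain (r, g, b, a) tuples of floats, both helper names are made up, and the 4 bytes per pixel figure assumes an 8 bit per channel format for the 4 channel texture.</p>

```python
def red_to_alpha(pixels):
    """Move the red channel of every (r, g, b, a) pixel into the alpha
    channel and zero the other channels, as the processor above does."""
    return [[(0.0, 0.0, 0.0, r) for (r, g, b, a) in row] for row in pixels]

def texture_bytes(width, height, bytes_per_pixel):
    """Size in bytes of one uncompressed mip level."""
    return width * height * bytes_per_pixel

if __name__ == "__main__":
    grey = [[(0.25, 0.25, 0.25, 1.0), (0.5, 0.5, 0.5, 1.0)]]
    print(red_to_alpha(grey))
    print(texture_bytes(512, 512, 4))  # 4 channels, 1 byte each (assumed)
    print(texture_bytes(512, 512, 1))  # Alpha8: 1 byte per pixel
```

<p>For a 512 x 512 heightmap that is 1,048,576 bytes with 4 channels versus 262,144 bytes as Alpha8, before mipmaps.</p>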
<p>Now add a reference to your new content pipeline project to the &#8220;Content&#8221; project in your game solution.</p>
<p>Right click on the heightmap file(s) that you wish to be converted to Alpha8 and select the HeightMapTextureProcessor as the content processor.</p>
<p>That&#8217;s it!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.olhovsky.com/2009/09/convert-greyscale-images-to-alpha8-format-with-a-custom-content-processor-in-xna-3-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2D CDF 9/7 Wavelet Transform in Python</title>
		<link>http://www.olhovsky.com/2009/03/2d-cdf-97-wavelet-transform-in-python/</link>
		<comments>http://www.olhovsky.com/2009/03/2d-cdf-97-wavelet-transform-in-python/#comments</comments>
		<pubDate>Tue, 17 Mar 2009 11:40:48 +0000</pubDate>
		<dc:creator>olhovsky</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[compression]]></category>
		<category><![CDATA[daubechies]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[jpeg2000]]></category>
		<category><![CDATA[lifting]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[wavelet]]></category>

		<guid isPermaLink="false">http://www.olhovsky.com/?p=17</guid>
		<description><![CDATA[As promised, here is an implementation of the Cohen-Daubechies-Feauveau 9 tap / 7 tap wavelet transform on a 2D signal in Python. This is the same transform used in the JPEG2000 codec.
'''
2D CDF 9/7 Wavelet Forward and Inverse Transform (lifting implementation)

This code is provided "as is" and is given for educational purposes.
2008 - Kris Olhovsky - code.inquiries@olhovsky.com
'''

from PIL [...]]]></description>
			<content:encoded><![CDATA[<p>As promised, <a href="http://olhovsky.com/content/wavelet/2dwavelet97lift.py" target="_blank">here is an implementation</a> of the Cohen-Daubechies-Feauveau 9 tap / 7 tap wavelet transform on a 2D signal in Python. This is the same transform used in the JPEG2000 codec.</p>
<pre class="brush:python">'''
2D CDF 9/7 Wavelet Forward and Inverse Transform (lifting implementation)

This code is provided "as is" and is given for educational purposes.
2008 - Kris Olhovsky - code.inquiries@olhovsky.com
'''

from PIL import Image # Python Imaging Library, a third-party package (not in the standard library)

''' Example matrix as a list of lists: '''
mat4x4 = [
         [0,   1,  2,  3], # Row 1
         [4,   5,  6,  7], # Row 2
         [8,   9, 10, 11], # Row 3
         [12, 13, 14, 15], # Row 4
         ]                 # We don't do anything with this matrix.
                           # It's just here for clarification.

def fwt97_2d(m, nlevels=1):
    ''' Perform the CDF 9/7 transform on a 2D matrix signal m.
    nlevels is the desired number of times to recursively transform the
    signal. '''

    w = len(m[0])
    h = len(m)
    for i in range(nlevels):
        m = fwt97(m, w, h) # cols
        m = fwt97(m, w, h) # rows
        w //= 2 # Integer division, so w and h stay valid range() bounds.
        h //= 2

    return m

def iwt97_2d(m, nlevels=1):
    ''' Inverse CDF 9/7 transform on a 2D matrix signal m.
        nlevels must be the same as the nlevels used to perform the fwt.
    '''

    w = len(m[0])
    h = len(m)

    # Find starting size of m:
    for i in range(nlevels-1):
        h //= 2 # Integer division, so w and h stay valid range() bounds.
        w //= 2

    for i in range(nlevels):
        m = iwt97(m, w, h) # rows
        m = iwt97(m, w, h) # cols
        h *= 2
        w *= 2

    return m

def fwt97(s, width, height):
    ''' Forward Cohen-Daubechies-Feauveau 9 tap / 7 tap wavelet transform
    performed on all columns of the 2D n*n matrix signal s via lifting.
    The returned result is s, the modified input matrix.
    The lowpass and highpass results are stored on the left half and right
    half of s respectively, after the matrix is transposed. '''

    # 9/7 Coefficients:
    a1 = -1.586134342
    a2 = -0.05298011854
    a3 = 0.8829110762
    a4 = 0.4435068522

    # Scale coeff:
    k1 = 0.81289306611596146 # 1/1.230174104914
    k2 = 0.61508705245700002 # 1.230174104914/2
    # Another k used by P. Getreuer is 1.1496043988602418

    for col in range(width): # Do the 1D transform on all cols:
        ''' Core 1D lifting process in this loop. '''
        ''' Lifting is done on the cols. '''

        # Predict 1. y1
        for row in range(1, height-1, 2):
            s[row][col] += a1 * (s[row-1][col] + s[row+1][col])
        s[height-1][col] += 2 * a1 * s[height-2][col] # Symmetric extension

        # Update 1. y0
        for row in range(2, height, 2):
            s[row][col] += a2 * (s[row-1][col] + s[row+1][col])
        s[0][col] +=  2 * a2 * s[1][col] # Symmetric extension

        # Predict 2.
        for row in range(1, height-1, 2):
            s[row][col] += a3 * (s[row-1][col] + s[row+1][col])
        s[height-1][col] += 2 * a3 * s[height-2][col]

        # Update 2.
        for row in range(2, height, 2):
            s[row][col] += a4 * (s[row-1][col] + s[row+1][col])
        s[0][col] += 2 * a4 * s[1][col]

    # de-interleave
    temp_bank = [[0]*width for i in range(height)]
    for row in range(height):
        for col in range(width):
            # k1 and k2 scale the vals
            # simultaneously transpose the matrix when deinterleaving
            if row % 2 == 0: # even
                temp_bank[col][row//2] = k1 * s[row][col]
            else:            # odd
                temp_bank[col][row//2 + height//2] = k2 * s[row][col]

    # write temp_bank to s:
    for row in range(width):
        for col in range(height):
            s[row][col] = temp_bank[row][col]

    return s

def iwt97(s, width, height):
    ''' Inverse CDF 9/7. '''

    # 9/7 inverse coefficients:
    a1 = 1.586134342
    a2 = 0.05298011854
    a3 = -0.8829110762
    a4 = -0.4435068522

    # Inverse scale coeffs:
    k1 = 1.230174104914
    k2 = 1.6257861322319229

    # Interleave:
    temp_bank = [[0]*width for i in range(height)]
    for col in range(width//2):
        for row in range(height):
            # k1 and k2 scale the vals
            # simultaneously transpose the matrix when interleaving
            temp_bank[col * 2][row] = k1 * s[row][col]
            temp_bank[col * 2 + 1][row] = k2 * s[row][col + width//2]

    # write temp_bank to s:
    for row in range(width):
        for col in range(height):
            s[row][col] = temp_bank[row][col]

    for col in range(width): # Do the 1D transform on all cols:
        ''' Perform the inverse 1D transform. '''

        # Inverse update 2.
        for row in range(2, height, 2):
            s[row][col] += a4 * (s[row-1][col] + s[row+1][col])
        s[0][col] += 2 * a4 * s[1][col]

        # Inverse predict 2.
        for row in range(1, height-1, 2):
            s[row][col] += a3 * (s[row-1][col] + s[row+1][col])
        s[height-1][col] += 2 * a3 * s[height-2][col]

        # Inverse update 1.
        for row in range(2, height, 2):
            s[row][col] += a2 * (s[row-1][col] + s[row+1][col])
        s[0][col] +=  2 * a2 * s[1][col] # Symmetric extension

        # Inverse predict 1.
        for row in range(1, height-1, 2):
            s[row][col] += a1 * (s[row-1][col] + s[row+1][col])
        s[height-1][col] += 2 * a1 * s[height-2][col] # Symmetric extension

    return s

def seq_to_img(m, pix):
    ''' Copy matrix m to pixel buffer pix.
    Assumes m has the same number of rows and cols as pix. '''
    for row in range(len(m)):
        for col in range(len(m[row])):
            pix[col,row] = m[row][col]</pre>
<p>And here is an example usage:</p>
<pre class="brush:python">if __name__ == "__main__":
    # Load image.
    im = Image.open("test1_512.png") # Must be a single band image! (grey)

    # Create an image buffer object for fast access.
    pix = im.load()

    # Convert the 2d image to a 1d sequence:
    m = list(im.getdata())

    # Convert the 1d sequence to a 2d matrix.
    # Each sublist represents a row. Access is done via m[row][col].
    m = [m[i:i+im.size[0]] for i in range(0, len(m), im.size[0])]

    # Cast every item in the list to a float:
    for row in range(0, len(m)):
        for col in range(0, len(m[0])):
            m[row][col] = float(m[row][col])

    # Perform a forward CDF 9/7 transform on the image:
    m = fwt97_2d(m, 3)

    seq_to_img(m, pix) # Convert the list of lists matrix to an image.
    im.save("test1_512_fwt.png") # Save the transformed image.

    # Perform an inverse transform:
    m = iwt97_2d(m, 3)

    seq_to_img(m, pix) # Convert the inverse list of lists matrix to an image.
    im.save("test1_512_iwt.png") # Save the inverse transformation.</pre>
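<p>If the 2D bookkeeping gets in the way, the lifting steps are easier to follow on a 1D signal. The sketch below reuses the coefficients from fwt97 and checks that the inverse undoes the forward transform. The function names are made up for illustration, and it assumes an even-length signal with at least two samples.</p>

```python
# Lifting and scaling coefficients, copied from fwt97 above.
A1, A2, A3, A4 = -1.586134342, -0.05298011854, 0.8829110762, 0.4435068522
K = 1.230174104914

def lift(s, coeff, first):
    """Add coeff * (left + right neighbour) to every second sample, starting
    at index `first`. A missing neighbour is mirrored (symmetric extension)."""
    n = len(s)
    for i in range(first, n, 2):
        left = s[i - 1] if i > 0 else s[i + 1]
        right = s[i + 1] if i + 1 < n else s[i - 1]
        s[i] += coeff * (left + right)

def fwt97_1d(signal):
    """Forward transform: returns lowpass coefficients, then highpass."""
    s = [float(x) for x in signal]
    lift(s, A1, 1)  # Predict 1 (odd samples)
    lift(s, A2, 0)  # Update 1 (even samples)
    lift(s, A3, 1)  # Predict 2
    lift(s, A4, 0)  # Update 2
    low = [x / K for x in s[0::2]]
    high = [x * K / 2 for x in s[1::2]]
    return low + high

def iwt97_1d(coeffs):
    """Inverse transform: undo the scaling, then the lifting steps in reverse."""
    half = len(coeffs) // 2
    s = [0.0] * len(coeffs)
    s[0::2] = [x * K for x in coeffs[:half]]
    s[1::2] = [x * 2 / K for x in coeffs[half:]]
    lift(s, -A4, 0)
    lift(s, -A3, 1)
    lift(s, -A2, 0)
    lift(s, -A1, 1)
    return s

if __name__ == "__main__":
    signal = [0, 1, 2, 3, 4, 5, 6, 7]
    restored = iwt97_1d(fwt97_1d(signal))
    print(max(abs(a - b) for a, b in zip(signal, restored)))
```

<p>Each lifting step only changes the odd samples using the even ones, or the other way round, so it can be undone exactly by subtracting the same amount. That is why the round trip error stays at floating point noise level.</p>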
<p>This is a first step in a larger task of compressing a certain type of vertex data, which can be decompressed entirely by a GPU via an implementation written in shader code.</p>
<p>The reason for a Python implementation is to have a working, easy-to-understand version of the transform, not only for myself but for the internets.</p>
<p>Imagine sending 1/10th the data when world geometry information needs to change. This would use far less CPU and GPU time transferring data to the GPU, and it would free up CPU time that would normally be spent decompressing the data in situations where HDD space is the reason for the compression.</p>
<p>HDD reads will be shorter, and space in the GPU RAM will be saved.</p>
<p>Coming soon: a C or C# implementation of the CDF 9/7 transform, along with a quantizer and encoder, to realize a GPU based compression scheme.</p>
<p>The goals of my compression scheme are to reduce both compression and decompression time (while balancing compression ratio). None of the other schemes I&#8217;ve seen on the internets seem to focus on reducing the compression time. As you&#8217;ll soon see, I care about it because my larger project requires fast compression (to allow quick alterations of compressed vertex data).</p>
<p>The Python code does use &#8220;height&#8221; and &#8220;width&#8221; variables, but some adjustments need to be made before it can handle rectangular images. This code also assumes that your images have width and height of the form 2^n, e.g. 1024 x 1024, 512 x 512, etc.</p>
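<p>A small guard for the size assumption might look like this. It is a sketch with made-up helper names; the limit on nlevels comes from the lifting loops above needing at least two samples per row and column at every level.</p>

```python
def is_power_of_two(n):
    """True if n is of the form 2^k for some k >= 0."""
    return n > 0 and (n & (n - 1)) == 0

def max_levels(size):
    """Largest nlevels for a size x size image such that the last level
    still has at least 2 samples per row and column."""
    if not is_power_of_two(size) or size < 2:
        raise ValueError("size must be a power of two and at least 2")
    return size.bit_length() - 1  # log2(size) for a power of two

if __name__ == "__main__":
    for size in (256, 512, 1024):
        print(size, max_levels(size))
```

<p>For a 512 x 512 image this allows up to 9 levels (512, 256, and so on down to 2), so the 3 levels used in the example above are well within range.</p>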
<div id="attachment_25" class="wp-caption alignleft" style="width: 266px"><img class="size-full wp-image-25" title="test1_256" src="http://www.olhovsky.com/wp/wp-content/uploads/2009/03/test1_256.png" alt="Sample input image for transforming." width="256" height="256" /><p class="wp-caption-text">Sample input image for transforming.</p></div>
<div id="attachment_23" class="wp-caption alignleft" style="width: 266px"><img class="size-full wp-image-23" title="256fwt1_eg" src="http://www.olhovsky.com/wp/wp-content/uploads/2009/03/256fwt1_eg.png" alt="Image after a single level CDF 9/7 transform." width="256" height="256" /><p class="wp-caption-text">Image after a single level CDF 9/7 transform.</p></div>
<div id="attachment_24" class="wp-caption alignleft" style="width: 266px"><img class="size-full wp-image-24" title="256fwt2_eg" src="http://www.olhovsky.com/wp/wp-content/uploads/2009/03/256fwt2_eg.png" alt="2 level CDF 9/7 wavelet transformed image" width="256" height="256" /><p class="wp-caption-text">2 level CDF 9/7 wavelet transformed image</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.olhovsky.com/2009/03/2d-cdf-97-wavelet-transform-in-python/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
