Sunday 23 December 2012

Typo T-Shirt


ty·po : /ˈtīpō/ (noun) An acronym for Take Your Pantries Off

Saturday 15 September 2012

Movie Budgets

I've been fascinated by the movie business. Who invests such large amounts of cash up front to make a movie? How much of a gamble is it?

I've been looking at a movie budget table published by The Numbers. It lists over 3600 (mainly US and British) movies with estimates of their budgets and grosses. It includes some unreleased and recently-released films for which the grosses are unavailable or unreliable. For the purposes of this discussion, I've excluded films before 1946 and after 2011, and those that have no income information whatsoever. That left me with 3512 movies.

I bucketed the films by logarithmic budget:

Budget                     | Movies | Losing Movies | Total Budget     | Total Profit     |    ROI
---------------------------|--------|---------------|------------------|------------------|-------
Up to $9,999               |      7 |       2 (29%) |          $40,100 |       $3,063,092 |  7739%
$10,000 to $99,999         |     33 |       9 (27%) |       $1,102,000 |     $302,684,325 | 27567%
$100,000 to $999,999       |    155 |      57 (37%) |      $66,670,000 |   $1,242,723,838 |  1964%
$1,000,000 to $9,999,999   |    858 |     309 (36%) |   $3,876,788,054 |  $14,419,232,185 |   472%
$10,000,000 to $99,999,999 |   2256 |     781 (35%) |  $78,584,771,638 | $123,049,314,610 |   257%
$100,000,000 and over      |    203 |      26 (13%) |  $28,762,300,000 |  $57,812,766,174 |   301%
Totals                     |   3512 |    1184 (34%) | $111,292,671,792 | $196,829,784,224 |   277%

Even from such a simple table, there are a number of interesting observations:
  1. Very few low-budget films make it to the cinema. Those that do generally make huge profits (speaking relatively).
  2. Only two-thirds of cinema releases recoup their budgets (modulo Hollywood accounting).
  3. The vast majority of releases are in the $10M to $100M bracket; but this has the lowest return on investment (ROI, calculated here as [Budget+Profit]÷Budget).
  4. Nine-figure budget movies are much less likely to lose money.
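
The ROI column follows directly from the two money columns; a quick Python sketch using a few bucket totals transcribed from the table above:

```python
# ROI as defined above: (Budget + Profit) / Budget, as a percentage.
def roi_percent(total_budget, total_profit):
    return round((total_budget + total_profit) / total_budget * 100)

# Bucket totals transcribed from the table.
assert roi_percent(40_100, 3_063_092) == 7739                # Up to $9,999
assert roi_percent(28_762_300_000, 57_812_766_174) == 301    # $100M and over
assert roi_percent(111_292_671_792, 196_829_784_224) == 277  # Totals
```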
We can also plot ROI (y-axis) against budget (x-axis):


This produces some interesting outliers:
  • "A" is Paranormal Activity (2007) which supposedly cost $15,000 but grossed $200M worldwide. That's over one million percent ROI.
  • "B" is My Date with Drew (2004) which supposedly cost $1,100 to make, but grossed over $180,000. It's the cheapest movie I could find with a successful cinema release.
  • "C" is Pirates of the Caribbean: At World's End (2007), the most expensive film on the list at $300M. It grossed over $660M.
  • "D" is Evan Almighty (2007), the most expensive film ($175M) on the list that failed to make a profit, but only just.
  • "E" is Nomad (2005), a film in which the Kazakh government invested $40M. It only made $79,123 at the US box office. [Actually, these figures are slightly out-of-date: the film made $3M in international releases]
  • "F" is Ed and His Dead Mother (1993) which cost $1,800,000 but only made $673 (or $1097 depending on your sources) at the box office.
Also of note:
  • Mars Needs Moms (2011) which only recouped $40M of its estimated $150M costs, making it the greatest absolute dollar loser on the list.
  • Avatar (2009) made a profit of $2547M on its costs of $237M.
  • Rocky (1976) which made $225M off its $1M costs. This is the best ROI for a movie costing at least $1M.
The naive conclusion seems to be either (i) make lots of movies for about $100,000 in the hopes that one of them is a huge success; or (ii) persuade people to give you more than $100,000,000 thereby guaranteeing a sure-fire success.

Friday 7 September 2012

Map Demo 3.0

For several years on-and-off, I've been working on a demo to showcase compression techniques and to try to sate my curiosity in cartography and country-based statistics. I've just uploaded the latest incarnation: Version 3.0.


This is a bog-standard 32-bit Windows application with all the data embedded within it. Compiling the supplied source code with Visual C++ 2008 produces an executable of just 118KB; compressing this with UPX brings it down to 84KB, mainly through instruction-stream compression.

The pertinent features are: 
  1. Coastline and border data for 214 countries (including South Sudan and Montenegro);
  2. Over one hundred statistics and codes for each country;
  3. Vector-drawn flags for each country;
  4. 128 map projections;
  5. Choropleth statistical mapping;
  6. The ability to save the data table as a CSV file (significantly larger than the executable!); and
  7. The ability to save maps and flags as Windows Enhanced Metafiles.

Monday 20 August 2012

RGB/HCL in HLSL

The HCL colour space by M. Sarifuddin and Rokia Missaoui (not to be confused with CIELCH) is another colour space that tries to improve on HSL and HSV. It is a cylindrical space which means that, for an accurate implementation, trigonometric functions are necessary. One interesting feature is that it tries to adjust the hue metric to be more perceptually meaningful using piecewise linear interpolation.

Here's the optimised HLSL code to convert from linear RGB:

float HCLgamma = 3;
float HCLy0 = 100;
// HCLmaxL == exp(HCLgamma / HCLy0) - 0.5
float HCLmaxL = 0.530454533953517;

float3 RGBtoHCL(in float3 RGB)
{
  float3 HCL;
  float H = 0;
  float U, V;
#if NO_ASM
  U = -min(RGB.r, min(RGB.g, RGB.b));
  V = max(RGB.r, max(RGB.g, RGB.b));
#else
  float4 RGB4 = RGB.rgbr;
  asm { max4 U, -RGB4 };
  asm { max4 V, RGB4 };
#endif
  float Q = HCLgamma / HCLy0;
  HCL.y = V + U;
  if (HCL.y != 0)
  {
    H = atan2(RGB.g - RGB.b, RGB.r - RGB.g) / PI;
    Q *= -U / V;
  }
  Q = exp(Q);
  HCL.x = frac(H / 2 - min(frac(H), frac(-H)) / 6);
  HCL.y *= Q;
  HCL.z = lerp(U, V, Q) / (HCLmaxL * 2);
  return HCL;
}


All components are scaled to fit into the expected [0,1] range.

The weird-looking statement with "frac()" terms is performing the piecewise adjustment of hue.

I'm very happy with this code as it is considerably faster than a simplistic transliteration of the reference code that Sarifuddin Madenda was good enough to send me.
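
As a sanity check, here's a rough scalar Python transliteration of the forward transform (my own test harness, not shader code). Greys should come out with zero chroma and a luminance equal to the grey level, and pure red should sit at hue zero with full chroma:

```python
import math

HCLgamma, HCLy0 = 3.0, 100.0
HCLmaxL = math.exp(HCLgamma / HCLy0) - 0.5  # == 0.530454533953517

def frac(x):
    # HLSL frac(): x - floor(x)
    return x - math.floor(x)

def rgb_to_hcl(r, g, b):
    u = -min(r, g, b)
    v = max(r, g, b)
    q = HCLgamma / HCLy0
    h = 0.0
    c = v + u  # max - min
    if c != 0:
        h = math.atan2(g - b, r - g) / math.pi
        q *= -u / v
    q = math.exp(q)
    hue = frac(h / 2 - min(frac(h), frac(-h)) / 6)  # piecewise hue adjustment
    return (hue, c * q, (u + (v - u) * q) / (HCLmaxL * 2))

# Greys: zero chroma, L equal to the grey level.
assert rgb_to_hcl(0.5, 0.5, 0.5)[1] == 0
assert abs(rgb_to_hcl(0.5, 0.5, 0.5)[2] - 0.5) < 1e-12
# Pure red: hue zero, full chroma.
assert abs(rgb_to_hcl(1.0, 0.0, 0.0)[0]) < 1e-12
assert abs(rgb_to_hcl(1.0, 0.0, 0.0)[1] - 1) < 1e-12
```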

However, the reverse transformation is far from perfect and needs more work to reduce the branching nature of the algorithm:

float3 HCLtoRGB(in float3 HCL)
{
  float3 RGB = 0;
  float U = 0, V = 0;
  if (HCL.z != 0)
  {
    float H = HCL.x;
    float C = HCL.y;
    float L = HCL.z * HCLmaxL;
    float Q = exp((1 - C / (2 * L)) * (HCLgamma / HCLy0));
    U = (2 * L - C) / (2 * Q - 1);
    V = C / Q;
    float T = tan((H + min(frac(2 * H) / 4, frac(-2 * H) / 8)) * PI * 2);
    H *= 6;
    if (H <= 1)
    {
      RGB.r = 1;
      RGB.g = T / (1 + T);
    }
    else if (H <= 2)
    {
      RGB.r = (1 + T) / T;
      RGB.g = 1;
    }
    else if (H <= 3)
    {
      RGB.g = 1;
      RGB.b = 1 + T;
    }
    else if (H <= 4)
    {
      RGB.g = 1 / (1 + T);
      RGB.b = 1;
    }
    else if (H <= 5)
    {
      RGB.r = -1 / T;
      RGB.b = 1;
    }
    else
    {
      RGB.r = 1;
      RGB.b = -T;
    }
  }
  return RGB * V + U;
}


The multiple if-statements could be rationalised (binary search style) but I can't help thinking there's some simple trigonometric identity that can be utilised to eradicate them completely.

Monday 6 August 2012

sRGB Approximations for HLSL

The sRGB colour space is non-linear. Many transformations to and from the space require the RGB components to be mapped to a linear form.

The "official" transformation for each RGB component value in the range [0,1] is:

  if (C_srgb <= 0.04045)
      C_lin = C_srgb / 12.92;
  else
      C_lin = pow((C_srgb + 0.055) / 1.055, 2.4);

This is often approximated using the "gamma 2.2" formula:

  C_lin_1 = pow(C_srgb, 2.2);

This works after a fashion, but it is fairly inaccurate. The graph below uses the right-hand axis for the absolute difference:


For example, if the values are quantized to eight bits, for the sRGB component value 197/255, the linear output is 145/255 instead of 142/255. Can we do better?
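
That worked example is easy to verify in Python:

```python
def srgb_to_linear(c):
    # Official piecewise sRGB-to-linear transformation.
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

s = 197 / 255
assert round(srgb_to_linear(s) * 255) == 142  # official
assert round(s ** 2.2 * 255) == 145           # "gamma 2.2" approximation
```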

In fact, if we simply change the "magic number" to 2.233333... we get better results:

  C_lin_2 = pow(C_srgb, 2.233333333);


However, the "pow" functionality is either prohibitively expensive or non-existent on many platforms. So, if we limit ourselves to simple arithmetic, a good approximation I found is the cubic:

  C_lin_3 = 0.012522878 * C_srgb +
            0.682171111 * C_srgb * C_srgb +
            0.305306011 * C_srgb * C_srgb * C_srgb;



This can be computed in HLSL using:

  float3 RGB = sRGB * (sRGB * (sRGB * 0.305306011 + 0.682171111) + 0.012522878);
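
As a check, the cubic can be compared against the official transformation in Python (the sampling grid and the error bound are my own choices, not derived analytically):

```python
def srgb_to_linear(c):
    # Official piecewise sRGB-to-linear transformation.
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def cubic(c):
    # The cubic approximation above, in Horner form.
    return c * (c * (c * 0.305306011 + 0.682171111) + 0.012522878)

# Maximum absolute error over a uniform grid of the [0,1] input range.
err = max(abs(cubic(i / 1000) - srgb_to_linear(i / 1000)) for i in range(1001))
assert err < 0.005
assert abs(cubic(1.0) - 1.0) < 1e-9  # the coefficients sum to one
```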

The reverse transformation (from linear to sRGB) is more problematic:

Again, the "official" transformation is piecewise:

  if (C_lin <= 0.0031308)
    C_srgb = C_lin * 12.92;
  else
    C_srgb = 1.055 * pow(C_lin, 1.0 / 2.4) - 0.055;

This is usually poorly approximated with the inverse of the computation of "C_lin_1":

  C_srgb_1 = pow(C_lin, 0.4545454545);


In fact, the linear portion of the official graph is tiny, so an almost-perfect approximation is:

  C_srgb_2 = max(1.055 * pow(C_lin, 0.416666667) - 0.055, 0);


The clamp ("max(..., 0)") is free on many platforms, but the formula does use the "pow" functionality. If we assume we only have square-root operations at our disposal, a good approximation I found was:

  C_srgb_3 = 0.585122381 * sqrt(C_lin) +
             0.783140355 * sqrt(sqrt(C_lin)) -
             0.368262736 * sqrt(sqrt(sqrt(C_lin)));


This can be computed in HLSL using:

  float3 S1 = sqrt(RGB);
  float3 S2 = sqrt(S1);
  float3 S3 = sqrt(S2);
  float3 sRGB = 0.585122381 * S1 + 0.783140355 * S2 - 0.368262736 * S3;

An even better approximation (at the cost of an additional 'mad') is:


  float3 S1 = sqrt(RGB);
  float3 S2 = sqrt(S1);
  float3 S3 = sqrt(S2);
  float3 sRGB = 0.662002687 * S1 + 0.684122060 * S2 - 0.323583601 * S3 - 0.0225411470 * RGB;

Depending on your platform architecture, this may be faster using multiplication by a constant matrix for the final step.
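
Both square-root approximations can be checked the same way in Python (again, the grid and bound are my choices; accuracy degrades very close to black, so the grid starts at 0.003):

```python
import math

def linear_to_srgb(c):
    # Official piecewise linear-to-sRGB transformation.
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

def approx3(c):
    # Three-term square-root approximation.
    s1 = math.sqrt(c); s2 = math.sqrt(s1); s3 = math.sqrt(s2)
    return 0.585122381 * s1 + 0.783140355 * s2 - 0.368262736 * s3

def approx4(c):
    # Four-term version with the additional 'mad'.
    s1 = math.sqrt(c); s2 = math.sqrt(s1); s3 = math.sqrt(s2)
    return 0.662002687 * s1 + 0.684122060 * s2 - 0.323583601 * s3 - 0.0225411470 * c

grid = [0.003 + i * (1 - 0.003) / 1000 for i in range(1001)]
err3 = max(abs(approx3(c) - linear_to_srgb(c)) for c in grid)
err4 = max(abs(approx4(c) - linear_to_srgb(c)) for c in grid)
assert err3 < 0.01
assert err4 < 0.01
```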

Sunday 5 August 2012

RGB/HCY in HLSL

The HCY colour space is a tractable hue/chroma/luminance scheme developed by Kuzma Shapran. It is ideal for pixel shaders, being only slightly more expensive than the HSV and HSL schemes. However, it tries to be more "meaningful" in terms of human perception.

The three components are:
  1. Hue (H) computed in the same manner as HSV and HSL;
  2. Chroma (C) computed as the scaled difference between the maximum unweighted RGB component and the minimum unweighted RGB component; and
  3. Luminance (Y) computed as the weighted sum of RGB components.
Note that the chroma is post-scaled so that the maximum weighted luminance for this hue is always one.

The HLSL conversions are as follows:
// The weights of RGB contributions to luminance.
// Should sum to unity.
float3 HCYwts = float3(0.299, 0.587, 0.114);

float3 HUEtoRGB(in float H)
{
  float R = abs(H * 6 - 3) - 1;
  float G = 2 - abs(H * 6 - 2);
  float B = 2 - abs(H * 6 - 4);
  return saturate(float3(R,G,B));
}

float RGBCVtoHUE(in float3 RGB, in float C, in float V)
{
  float3 Delta = (V - RGB) / C;
  Delta.rgb -= Delta.brg;
  Delta.rgb += float3(2,4,6);
  // Zero the sector values except those whose corresponding channel equals V.
  Delta.brg = step(V, RGB) * Delta.brg;
  float H;
#if NO_ASM
  H = max(Delta.r, max(Delta.g, Delta.b));
#else
  float4 Delta4 = Delta.rgbr;
  asm { max4 H, Delta4 };
#endif
  return frac(H / 6);
}

float3 RGBtoHCY(in float3 RGB)
{
  float3 HCY = 0;
  float U, V;
#if NO_ASM
  U = -min(RGB.r, min(RGB.g, RGB.b));
  V = max(RGB.r, max(RGB.g, RGB.b));
#else
  float4 RGB4 = RGB.rgbr;
  asm { max4 U, -RGB4 };
  asm { max4 V, RGB4 };
#endif
  HCY.y = V + U;
  HCY.z = dot(RGB, HCYwts);
  if (HCY.y != 0)
  {
    HCY.x = RGBCVtoHUE(RGB, HCY.y, V);
    float Z = dot(HUEtoRGB(HCY.x), HCYwts);
    if (HCY.z > Z)
    {
      HCY.z = 1 - HCY.z;
      Z = 1 - Z;
    }
    HCY.y *= Z / HCY.z;
  }
  return HCY;
}

float3 HCYtoRGB(in float3 HCY)
{
  float3 RGB = HUEtoRGB(HCY.x);
  float Z = dot(RGB, HCYwts);
  if (HCY.z < Z)
  {
      HCY.y *= HCY.z / Z;
  }
  else if (Z < 1)
  {
      HCY.y *= (1 - HCY.z) / (1 - Z);
  }
  return (RGB - Z) * HCY.y + HCY.z;
}
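
To sanity-check the scheme, here's a rough scalar Python transliteration (my own test harness, using the standard max-channel hue formula rather than the shader's branch-free trick). HCY should round-trip exactly, up to floating-point noise:

```python
# Weights of RGB contributions to luminance; should sum to unity.
WTS = (0.299, 0.587, 0.114)

def clamp01(x):
    return max(0.0, min(1.0, x))

def hue_to_rgb(h):
    # Scalar version of the shader's HUEtoRGB.
    return (clamp01(abs(h * 6 - 3) - 1),
            clamp01(2 - abs(h * 6 - 2)),
            clamp01(2 - abs(h * 6 - 4)))

def luma(rgb):
    return sum(w * c for w, c in zip(WTS, rgb))

def rgb_to_hcy(rgb):
    r, g, b = rgb
    v, m = max(rgb), min(rgb)
    c, y = v - m, luma(rgb)
    h = 0.0
    if c != 0:
        # Standard max-channel hue, equivalent to RGBCVtoHUE.
        if v == r:
            h = ((g - b) / c) % 6
        elif v == g:
            h = (b - r) / c + 2
        else:
            h = (r - g) / c + 4
        h /= 6
        # Post-scale chroma by the luma of the pure hue.
        z = luma(hue_to_rgb(h))
        c *= z / y if y <= z else (1 - z) / (1 - y)
    return (h, c, y)

def hcy_to_rgb(hcy):
    h, c, y = hcy
    rgb = hue_to_rgb(h)
    z = luma(rgb)
    if y < z:
        c *= y / z
    elif z < 1:
        c *= (1 - y) / (1 - z)
    return tuple((t - z) * c + y for t in rgb)

for rgb in [(0.2, 0.5, 0.7), (1.0, 0.0, 0.0), (0.5, 0.5, 0.5), (0.9, 0.1, 0.4)]:
    back = hcy_to_rgb(rgb_to_hcy(rgb))
    assert all(abs(a - b) < 1e-9 for a, b in zip(rgb, back))
```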

I've folded the code into my web page on such conversions here.

Monday 25 June 2012

Gratuitous Aphorism #5

Diversity is twice as beneficial as university.

Thursday 21 June 2012

QR Codes

I've been looking at QR codes recently. I'm intrigued by the idea of being able to customise them by deliberately obscuring sections, in the knowledge that the contents can be reconstructed via error correction. Experimenting, I've created two images that still decode successfully as URLs:


The latter is more impressive, in my opinion, as it is only Version 2, with low data redundancy.

Sunday 17 June 2012

Skeleton Alphabet 4

Here are some alternative forms of the lowercase skeletons that only use straight lines and circular arcs.

To remove the nasty corner from the bowl of 'u' on the left, below, follow the red arc for the first half but the green arc for the second.


Similarly for 'n':


The 'm' is just two 'n' letters stuck together, with a small fillet in the middle:


We make the 'l' and 't' narrower:



Open the tail of 'g' and straighten the tail of 'y':



Smooth out the head of 'f' and shorten its crossbar:


We also make the central bars of 'a' and 'e' horizontal:



Alas, the last two changes make the 'a' a little unbalanced and the 'e' a bit wide.

Never mind, we continue by making the 'k' such that the arm and leg are at right angles. Draw the leg first, then fit the arm:


Finally, the 's' needs fettling to only use circular arcs. This turns out to be very tricky. Here's one, less than satisfactory, construction:


That needs a bit more work put into it!

Thus far, the amended skeleton lowercase alphabet looks like this:


The whole set looks like this (badly kerned):



Saturday 16 June 2012

Skeleton Alphabet 3

Now for lowercase letters. We take the unit square (blue) and create a smaller construction motif (green) of size 'x' centred at the bottom:


Ann Camp suggests an x-height of three-fifths, but if you choose 2/π (about 0.6366) things work out nicely later on (see below).


Ascenders go to the top of the unit square, thereby making them the same height as the capitals:


Similarly, descenders go below the baseline by the same amount:


In the 'g' above, the upper bowl is a circle of diameter one-half. The lower bowl is drawn by eye, according to Ann Camp.

Here are the trivial constructions:
















The 'm' and 'w' letters are constructed like two 'n' and 'v' letters glued together. Because of our strange choice of x-height, they are exactly one unit wide:



The upper portion of the 'f' is like the arc of 'r':


The tail of 'j' is a quarter circle:


The crux of 'k' is determined in a similar manner as that of the capital 'K':


The 's' is drawn by eye:


The vertical of 't' is half the height of a standard ascender:


The tail of 'y' is drawn by eye:


All this leads to the following lowercase grid, with shadow glyphs from Arial (pink), Calibri (green) and Lucida (blue):


As with the capitals, there are a few changes I would personally make, but I'll leave that until next time.