Friday, December 11, 2009

Screen-scraping an ASP.NET application in PHP

I recently was asked to assist a friend in developing a PHP script to automate authenticating to an ASP.NET web application and scraping a small piece of data from a secured page that changes periodically. The cURL library in PHP works well for this task, but there are a few important aspects of an ASP.NET application to recognize:

  • ASP.NET applications that use forms-based authentication typically process login input and return an encrypted authentication token in the form of a cookie to the browser. Capturing this cookie properly is necessary to automate the process.

  • ASP.NET also employs a mechanism known as ViewState that must be respected. The mechanism is provided for developers who need to maintain the illusion of a stateful environment across multiple postbacks to a page. This allows, for example, a user to enter a value in a textbox, click a submit button, and have that value remembered in the textbox as the page is refreshed. This kind of functionality is typically very complicated to build on other platforms, but ASP.NET is engineered to support it natively.

    It does so by passing a value to the browser with each participating page as a special hidden <input> tag with the name "__VIEWSTATE". The encrypted value is then automatically included as a POST parameter when the page is submitted back to the server. The ASP.NET application server then processes the encrypted value to restore user interface control values and properties as it reconstructs the "new" page it sends back to the browser (with an updated __VIEWSTATE tag).

    So a PHP script that automates posting to any ASP.NET form must respect this process and properly capture - and resubmit - the appropriate __VIEWSTATE value.

  • ASP.NET has additional mechanisms to validate page posts, supplying additional encrypted information in a hidden <input> named "__EVENTVALIDATION". If this value is present, the PHP script must capture it as well.

Ensuring that __VIEWSTATE and __EVENTVALIDATION are respected when submitting a form typically requires two calls to an ASP.NET page. The first is a simple GET, the results of which may be parsed with regular expressions to pull the appropriate hidden <input> values. The second is the POST, which submits the desired <input> values along with the previously retrieved __VIEWSTATE and __EVENTVALIDATION. Performing these two calls against a login form and submitting the appropriate account information returns the authentication cookie, which is then sent with each subsequent request for a secured page.
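The extraction half of that handshake is easy to prototype outside of PHP. Here's a minimal Python sketch run against a made-up fragment of a login page (the tag layout mirrors what ASP.NET emits, but the value strings are invented for illustration); the second call would simply POST these values back along with the login fields:

```python
import re

def extract_hidden_field(html, name):
    """Pull the value of a hidden <input> such as __VIEWSTATE or
    __EVENTVALIDATION out of an ASP.NET page."""
    match = re.search(name + r'" value="(.*?)"', html, re.IGNORECASE)
    return match.group(1) if match else ""

# a made-up fragment of the HTML an ASP.NET login page returns
sample_html = (
    '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dDwtMTY0OTY1=" />\n'
    '<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="AbCd123=" />'
)

viewstate = extract_hidden_field(sample_html, "__VIEWSTATE")
eventval = extract_hidden_field(sample_html, "__EVENTVALIDATION")
```

Note the non-greedy `(.*?)`: a greedy `.*` would run to the last quote on the line when the `<input>` tag carries additional attributes after `value`.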

The following script uses the cURL library to perform such a login and accesses a secured page. The variables at the top of the script may be modified to match the desired site and account information.

/*
 * ASP.NET web site scraping script;
 * Developed by
 * Copyright 2009 All rights reserved.
 * The use of this script is governed by the CodeProject Open License.
 * See the following link for full details on use and restrictions.
 * The above copyright notice must be included in any reproductions of this script.
 */

/* values used throughout the script */
// urls to call - the login page and the secured page
$urlLogin = "";
$urlSecuredPage = "";

// POST names and values to support login
$nameUsername = 'txtusername'; // the name of the username textbox on the login form
$namePassword = 'txtpassword'; // the name of the password textbox on the login form
$nameLoginBtn = 'btnlogin';    // the name of the login button (submit) on the login form
$valUsername  = 'myUsername';  // the value to submit for the username
$valPassword  = 'myPassword';  // the value to submit for the password
$valLoginBtn  = 'Login';       // the text value of the login button itself

// the path to a file we can read/write; this will
// store the cookies we need for accessing secured pages
$cookieFile = 'someReadableWritableFileLocation\cookie.txt';

// regular expressions to parse out the special ASP.NET hidden <input> values
$regexViewstate = '/__VIEWSTATE\" value=\"(.*?)\"/i';
$regexEventVal  = '/__EVENTVALIDATION\" value=\"(.*?)\"/i';

/*
 * utility function: regexExtract
 * use the given regular expression to extract
 * a value from the given text; $regs will
 * be set to an array of all group values
 * (assuming a match) and the nthValue item
 * from the array is returned as a string
 */
function regexExtract($text, $regex, $regs, $nthValue)
{
    if (preg_match($regex, $text, $regs)) {
        $result = $regs[$nthValue];
    }
    else {
        $result = "";
    }
    return $result;
}

/*
 * initialize a curl handle; we'll use this
 * handle throughout the script, and have curl
 * return fetched pages to us as strings
 */
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);

/*
 * first, issue a GET call to the ASP.NET login
 * page. This is necessary to retrieve the
 * __VIEWSTATE and __EVENTVALIDATION values
 * that the server issues
 */
curl_setopt($ch, CURLOPT_URL, $urlLogin);
$data = curl_exec($ch);

// from the returned html, parse out the __VIEWSTATE and
// __EVENTVALIDATION values
$viewstate = regexExtract($data, $regexViewstate, $regs, 1);
$eventval  = regexExtract($data, $regexEventVal, $regs, 1);

/*
 * now issue a second call to the Login page;
 * this time, it will be a POST; we'll send back
 * as post data the __VIEWSTATE and __EVENTVALIDATION
 * values the server previously sent us, as well as the
 * username/password. We'll also set up a cookie
 * jar to retrieve the authentication cookie that
 * the server will generate and send us upon login.
 */
$postData = '__VIEWSTATE=' . rawurlencode($viewstate)
          . '&__EVENTVALIDATION=' . rawurlencode($eventval)
          . '&' . $nameUsername . '=' . rawurlencode($valUsername)
          . '&' . $namePassword . '=' . rawurlencode($valPassword)
          . '&' . $nameLoginBtn . '=' . rawurlencode($valLoginBtn);

curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
curl_setopt($ch, CURLOPT_URL, $urlLogin);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);

$data = curl_exec($ch);

/*
 * with the authentication cookie in the jar,
 * we'll now issue a GET to the secured page;
 * we set curl's COOKIEFILE option to the same
 * file we used for the jar before to ensure the
 * authentication cookie is sent back to the
 * server
 */
curl_setopt($ch, CURLOPT_POST, FALSE);
curl_setopt($ch, CURLOPT_URL, $urlSecuredPage);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);

$data = curl_exec($ch);

// at this point the secured page may be parsed for
// values, or additional POSTs made to submit parameters
// and retrieve data. For this sample, we'll just
// echo the results.
echo $data;

// that's it! Close the curl handle
curl_close($ch);


Friday, November 20, 2009

Bring it on, Intel

I continue to find written communication to be challenging (evidenced by the fact that I've started about a dozen blog posts in the past three months without finishing any!). I have good thoughts, but writing them well is its own skill. Struggling with this, I respect those for whom good writing comes naturally that much more.

So when I read that there are some Intel researchers predicting that by 2020 we will have brain implant chips capable of controlling computers through thoughts, I thought: Bring it on, Intel! Maybe we're controlling computers in 2020, maybe by 2050 we have ubiquitous mind-reading machines. I'm not worried that someone someday could read my thoughts - I welcome it. Then I wouldn't have to struggle to put them into words.

Sunday, August 9, 2009

A Simple Way to Create Crosstab Queries in SQL Server - Part 2

Previously, I demonstrated a simple technique for creating SQL queries that produce crosstab-style output. The technique involves wrapping CASE statements in aggregate functions, which makes for clean, readable source.

In the previous examples, I used hardcoded individual cases to compute each shipper in its own column. While easy to do, this isn't exactly a data-driven solution. It would be nice, for example, to be able to add a fourth shipper to our database without having to recode these crosstab queries.

Fortunately, thanks to dynamic SQL, with a little more work we can construct our CASE statements by querying the Shippers table directly. To do this we'll define a variable @cases to hold the portion of the SELECT clause that comprises our computed columns. The template for each CASE statement looks like the following, with fields from the Shippers table in {curly braces}:
, sum(CASE WHEN ShipVia = {ShipperID} THEN Freight ELSE 0 END) as [{CompanyName}]

Notice the leading comma in the template. Since we're grouping by CustomerID and listing it first in the SELECT clause, we can have a leading comma in our first and each subsequent case. The first part of our SQL script queries the Shippers table to construct each CASE statement, storing the concatenated strings in the @cases variable.
declare @cases varchar(8000)
set @cases = ''
select @cases
= @cases + ', sum(CASE WHEN ShipVia = '
+ Convert(varchar(5),ShipperID)
+ ' THEN Freight ELSE 0 END) as [' + CompanyName + ']'
from Shippers

The second part of the script wraps our cases in the full SQL statement to aggregate by CustomerID. Since this is a dynamic SQL statement, we'll wrap the whole thing in an execute command to produce the desired crosstabular output:
execute ('SELECT CustomerID '
+ @cases
+ ' from Orders Group By CustomerID Order by CustomerID')

This technique becomes even more attractive when there are several crosstab columns to be defined. Consider the following example which computes total sales by customer, broken out by product categories in columns. Because there is some complexity to joining the necessary detail tables together, the first part of this script computes a subquery in the variable @sqlInner. The subquery is then aggregated, with cases computed dynamically by querying the Categories table. The final execute statement produces the crosstab output.

declare @sqlInner varchar(8000)
set @sqlInner = '(
select o.CustomerID
, (od.UnitPrice * od.Quantity) * (1 - od.Discount) as Sales
, p.CategoryID
from [order details] as od
inner join Products p on od.ProductID = p.ProductID
inner join Orders o on od.OrderID = o.OrderID
) x'

declare @cases varchar(8000)
set @cases = ''
select @cases
= @cases
+ ', sum(CASE WHEN CategoryID = '
+ Convert(varchar(5),CategoryID)
+ ' THEN Sales ELSE 0 end) as [' + CategoryName + ']'
from Categories

execute ('SELECT CustomerID as [Total Sales by Customer]'
+ @cases
+ ' from ' + @sqlInner
+ ' Group By CustomerID Order by CustomerID')

With the help of dynamic SQL execution, the CASE statements that support columns in crosstabular output may be generated in a data-driven way. By querying the appropriate table for individual cases, users may add or modify data without the need to recode the crosstab query.
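The string-building half of this technique is easy to check outside SQL Server. A small Python sketch, using a hypothetical shipper list in place of querying the Shippers table, mirrors the concatenation loop:

```python
# hypothetical rows standing in for "select ShipperID, CompanyName from Shippers"
shippers = [(1, "Speedy Express"), (2, "United Package"), (3, "Federal Shipping")]

cases = ""
for shipper_id, company_name in shippers:
    # mirrors the T-SQL concatenation: one summed CASE per shipper row
    cases += (", sum(CASE WHEN ShipVia = " + str(shipper_id)
              + " THEN Freight ELSE 0 END) as [" + company_name + "]")

# wrap the generated cases in the outer aggregate query
sql = ("SELECT CustomerID " + cases
       + " from Orders Group By CustomerID Order by CustomerID")
```

Adding a fourth shipper to the list adds a fourth column to the generated query with no other changes, which is the whole point of the data-driven approach.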

Saturday, August 8, 2009

A Simple Way to Create Crosstab Queries in SQL Server - Part 1

This week's programming exercise is using SQL to generate crosstab-style datasets from a SQL Server database. Using the Northwind database as our sample, we'll say that we want to derive crosstab reports to show aggregates of orders by customer.

A clever use of the SQL Server CASE statement is helpful here. The CASE statement in a query allows a column to be computed based on multiple possible conditions. It is like a switch statement in C, C++, C#, Java, or a Select Case statement in VB. Each "case" or condition is laid out, with a corresponding value for the computed column when the case is matched. The first case matched wins if multiple cases can match, and an ELSE default may be supplied to provide a value when no case is matched.

When wrapped in an aggregate function, like sum() or avg(), a CASE statement becomes a convenient means to create crosstab-like output in a simple query. Looking at Northwind then, our first aggregate query computes the total freight charges for each customer, with the shipping companies broken out as columns.

select CustomerID as [Total Freight by Customer]
, sum(CASE WHEN ShipVia = 1 then Freight ELSE 0 END) as [Speedy Express]
, sum(CASE WHEN ShipVia = 2 then Freight ELSE 0 END) as [United Package]
, sum(CASE WHEN ShipVia = 3 then Freight ELSE 0 END) as [Federal Shipping]
from Orders
group by CustomerID
order by CustomerID

Each CASE statement in this example specifies a single condition and a default ELSE. Before the sum() function is applied, the CASE statements add three computed detail columns to the rows of the normal Orders table. Two of the three computed fields will be zero; the column that matches the ShipVia condition will contain the value of the shipping Freight. When the sum() function is then applied, the aggregations accumulate in the correct columns.
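As a quick illustration of that mechanism, the following Python snippet simulates the CASE-then-sum() behavior over a few invented order rows (illustrative values only, not actual Northwind data):

```python
# a few invented (CustomerID, ShipVia, Freight) detail rows
orders = [
    ("ALFKI", 1, 10.0),
    ("ALFKI", 2, 5.0),
    ("ALFKI", 1, 2.5),
    ("ANATR", 3, 7.0),
]

totals = {}  # CustomerID -> [Speedy Express, United Package, Federal Shipping]
for customer, ship_via, freight in orders:
    row = totals.setdefault(customer, [0.0, 0.0, 0.0])
    # the CASE: freight lands in the matching shipper's column, zero elsewhere
    row[ship_via - 1] += freight
```

After the loop, ALFKI's two Speedy Express shipments have accumulated into the first column and its United Package shipment into the second, exactly as the grouped sum() does.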

A simple change to the ELSE defaults from zero to null allows the query to be adjusted to compute averages:

select CustomerID as [Average Freight by Customer]
, avg(CASE WHEN ShipVia = 1 then Freight ELSE null END) as [Speedy Express]
, avg(CASE WHEN ShipVia = 2 then Freight ELSE null END) as [United Package]
, avg(CASE WHEN ShipVia = 3 then Freight ELSE null END) as [Federal Shipping]
from Orders
group by CustomerID
order by CustomerID

To compute counts, one might intuitively think of the count() aggregation function, but I find it simpler to follow this same pattern and use sum(). The CASE statements are modified so that when the appropriate condition is matched the computed value is simply a static value of one (rather than the Freight field in the examples). The default ELSE computes a zero. When the sum() aggregation is applied across all the matching ones, the result is a successful counting of the appropriate records for the shipper:

select CustomerID as [Orders Shipped by Customer]
, sum(CASE WHEN ShipVia = 1 then 1 ELSE 0 END) as [Speedy Express]
, sum(CASE WHEN ShipVia = 2 then 1 ELSE 0 END) as [United Package]
, sum(CASE WHEN ShipVia = 3 then 1 ELSE 0 END) as [Federal Shipping]
from Orders
group by CustomerID
order by CustomerID

Wrapping CASE statements within aggregate functions is a clean and surprisingly readable way of creating crosstabular output. Tune in tomorrow for part 2 where we will use dynamic SQL to generalize this query, allowing the values in the Shippers table to drive the CASEs rather than hardcoding them as we've done here.

Thursday, August 6, 2009

Discoveries: Her & Kings County

Generally speaking, I am not a big fan of country music. When I first heard about the band Her & Kings County, I was told they're a good country band, actively gigging across the U.S.A and opening for the likes of Kid Rock. While the country influence is definitely there, to call Her & Kings County a country band is to unfairly limit them to a single genre when there are all kinds of musical influences present and well-blended: pop, blues, bluegrass, jazz, and good hard rock & roll. Should this group find success with a major label, or as an indie, I am sure they'll be classified as "crossover" in the best sense of the word.

In my opinion, these guys rock! They're definitely worth a listen.

Monday, August 3, 2009

How I became Wii-tarded

The Nintendo Wii has recently reached a milestone, selling over 50 million units since its debut in 2006. Its typically-bundled companion, Wii Sports, has worldwide lifetime sales in excess of 47 million. A quick look through Wikipedia suggests that Wii Sports is the best-selling video game of all time, and thus quite likely the most-played video game of all time.

Of course, Wii Sports' sales figures benefited greatly from the game being bundled with the Wii console, but that just shows the genius of Nintendo back in 2006. The Wii, with its motion-sensing remote control and nunchuck attachment, was a system with a built-in barrier: would users really take to the novelty of playing games with physical gestures as input, unlike any system that came before? Selling the system for hundreds less than the competing XBox 360 and PlayStation 3, combined with the bundling of Wii Sports - a game that was instantly accessible, proved the capability of the new input controls, and turned out to be fun for the whole family - more than anything else made the Wii an instant hit and the must-buy item for Christmas in 2006.

It is in recognition of the Wii's success that I present a homage in the form of my own story - the true story of a father's crazy hope to get that must-buy Christmas item and how it ultimately worked out in the end.

How I became Wii-tarded
A Christmas tale
December 2006

Twas the week before Christmas, and all through the town,
Some parents were stirring, for Wii's to be found.
Their bodies in lines, too stubborn to quit,
Till out came... oh screw this rhyming shit!

There were several big-chain stores near my house, within a block or two of each other, that were carrying Wii's on Sunday the 17th. I knew this because I did my research! Surprisingly, this research wasn't as easy as one might think. Asking the question "do you know if you'll have Wii's in stock on Sunday?" would always result in a "no". But asking the question "So what is the process you will be using this Sunday when you sell your Wii's?" got me much more useful information.

My target was Target, close to home, which would be, according to the phone voice of a clerk in electronics, "using a ticket system" (that answer confirming that they in fact would have Wii's in stock). Okay! Get in line early enough, get a ticket, I'm in business. Some more web research revealed that all kinds of stores would be selling Wii's on Sunday; Target seemed my best bet.

My Target opened at 8:00 Sunday morning, so I figured getting there by 7:00 would be fine. It has been a month since this system launched, after all. The scalpers are probably done. I actually pulled up around ten til 7:00 and saw one guy walking up to the door and no line as of yet. Great! This will give me a little time to get some coffee! I'm set.

On the way to Starbucks a block away, I passed Toys'r'Us, which also opened at 8:00 but had a significant line, about 25 people or so (not huge, but significant). A manager was outside pointing and counting. I pulled up and called out my window, "Is it worth parking and getting in line?". The manager yelled back "Nope- there's more than enough here to cover my stock". Okay - no biggie. There was only one person at Target, and I know they'll have 'em in stock.

I pulled up to Starbucks and got my Venti Americano. Sure, you'd think a Grande would be more than sufficient, and on a typical Sunday morning it would be. But it was cold and windy out, and I needed to pack some heat. It's a hunt after all! Venti it is.

There's a Circuit City right next to the Starbucks, so I walk over and see a more modest line - only about 18 people or so. Guessing that Circuit City might have 20 or so in stock, I get on the line and am informed by the others (who are in a friendly - you might even say jolly - mood) that I'm #14... even better! There are couples in the line, and only one Wii going to a household. And this wasn't a line of eBay scalpers - these were parents and grandparents who wanted the system for their families. They were downright enjoyable. I struck up a conversation with one guy in an Iowa Hawkeyes jacket, informed him I was a Hawkeye alumnus myself, and it turned out just about everyone on the line (evenly split between males and females) was not only a college football fan, but a Big Ten fan specifically. We had a fun conversation for about fifteen minutes, and everyone was in a good mood.

Until the manager came out and said he only had 8 units in stock.

Okay. At least he came out and didn't make everyone wait in vain. It's still early (about 7:15 by now). The folks on line had said Best Buy had already had a line of about 50 or so people, so no good going there. Yup - I was smart in casing Target. Still my ace in the hole.

I pull around back to Target, and now there is a substantial line. Again, not a Harry-Potter-when-book-7-comes-out sized line, but several folks, making me wish I had stuck to the plan and ditched the coffee run. I park, get on line, and am told I'm #13... then someone on the line says "A lady came out about 10 minutes ago saying she thought they had 12 Wii's in stock." Gulp. Okay, I'm not so smart. The guy in front of me (again, the line was a very friendly group of mostly parents and grandparents) suggested I wait because she had said she was going back to make sure. Just one more... find just one more...

The guy at the very front of the line (the one I saw walking up earlier) sounded less confident though. There was a GameStop in the same mall area as this Target, and there were about six people lined up there waiting. "GameStop opens at 10:00 - and those folks were there when I got here... why wouldn't they be in line over here?" Ahh... they're not thinking that Target will have Wii's today. Ha ha on them! We in our line knew better. Our line could beat up your line!

But then something really interesting happened. After another ten minutes or so, a group of five boys, looking either like they're seniors in high school or maybe in their first year or two of college, walk up. They're in a group like a gang, showing some attitude, definitely not dressed for the cold (sporting all the teenage angst of hip-hop lowriding shorts in 28-degree weather), and they come up to the line which had added about five or six more behind me at this point. One of the boys then says, "hey, what are we doing in the back of the line?" And the group proceeds to strut around to the front by the store doors.

Now, it's useful for me to point out that I had already resolved in my head that this was probably a losing proposition for me. Even before 7:00 I was prepared to accept that my family wouldn't get their Wii this Christmas. What the hell, I had thought... I'd at least give it a try and see. Hearing that there were only 12 in stock at this Target had made me come to terms with not getting a Wii as a certainty. Frankly, I was starting to get irritated with myself to still be sitting on the line with such little hope. I could be driving back to Starbucks and enjoying my coffee in a warm room with a newspaper, after all. But...

The appearance of this gang of boys added a drama that was too juicy to leave. The "whoosh" sound I heard may have been a gust of wind, or the collective jaws of the people in front of me hitting the ground. The tiny moment of uncomfortable silence was just enough for my brain to play the main theme from The Good, The Bad, and the Ugly.

One of the guys toward the front of the line - a man a few years older than me, burly and stocky, with a gruffness in his voice and stature that suggested he wouldn't take no crap from anyone even on a sunny day - stepped out to confront the gang. "So what do you think you're doing, boys? The line is back there." Now, if these boys had been a bit savvier, they may have realized what it looked like to all the grownups already on this line and responded differently than they did. As it was, one of them said, "We were here first." Clint Eastwood's hand is twitching over his holster!

I'm sure it was fortunate for everyone that there wasn't any real fighting that came of this, though a sick part of me wished I had a box of popcorn.

Parsing through the ensuing flurry of raising voices, the boys were contending that they had actually been standing in line at this Target since the night before. Gulp. The manager had come out at about 6:30 this morning and handed out 12 tickets to the first 12 people who had been in line overnight. Gulp gulp. Several of the grownups were very skeptical, but it was making sense to me. I knew this Target would be using a ticket system, and it would explain why the GameStop folks were where they were, if they knew Target's Wii's were already gone. Still, the skeptical grownups demanded proof, and one of the kids showed his "ticket"... hmmm... it hardly looks official - just a scrap of plain white paper with a number in magic marker and a scribble that might or might not be a signature? Okay, this could be a scam, and I decide to wait while the front-liners sort it out.

As the confrontation progresses (with no sign of a Target manager to confirm anything) it seems to me more and more that the boys are legit. They don't sound like a gang at all, they sound like normal kids to me. I start talking to one of them, and again have a conversation as enjoyable as I had had earlier at Circuit City. These are good kids - they're not scamming anyone. And sure enough, another group of three people - this time grownups - come up to the line and show their tickets. Now it is clear - everyone in the line since 7:00 have been waiting in vain, and some do leave. I'm now enjoying the conversation I'm having with the group of boys, and the newcomers (a young mother, a grandmother, and a man wearing a Michigan sweatshirt - it was Big Ten Sunday for me!) so I stay for a few more minutes.

The boys had had a lousy night. We had freezing rain that night in Vegas, with gusting cold winds. These boys definitely did not dress for the weather (one was noticeably shivering now). "It really wasn't so bad until around 1:00 in the morning... then the winds got really bad..." they told me. They were bright, had loads of energy - it was rejuvenating to be in that youthful company. They were all college-aged; two working full-time at this very Target, one getting an associates degree at community college. I remember standing in line for a Star Wars movie once when I was young ;-) This is a full month after the initial launch of this Wii system, and still there were people waiting in overnight cold lines to get one. Yea, they wanted it more than me, no hard feelings. Thanks for the great conversation! Happy Holidays guys, congratulations, have a blast with your new Wii's... and I started to walk away.

And then a woman pulled up in a VW bug (there was a whole new group in line behind me by this point, who had missed the initial information and weren't buying the ticket story). She called to the line, "is anyone willing to sell their spot?" Hmmmm... there's an idea.... and the lady doesn't even realize she's not offering to buy a spot in line, she's actually offering to buy a guaranteed ticket! Nobody responded, and the lady drove away. I looked at the kids and said, "Hey, any of you want to give me your ticket for fifty bucks?" I happened to have the cash on me (I normally don't) and to my amazement, one of these kids' eyes lit up. It turns out, he was perfectly happy playing his friend's Wii until he got his own next month, and was thrilled to get some extra cash for the holiday. I almost tried talking him out of it ("you waited all night you know... are you sure?") but by that point we had all gotten very friendly and he and I completed our transaction. I got ticket #6 for my troubles, and actually gave him $100, both because it was what I had on me, and because it just seemed like the right thing to do.

Another woman (a determined grandmother) saw our exchange and bought another ticket from one of the other four boys. We both thought it was money well spent.

At 8:22, I left my Target, with a Wii, a box of crayons, and some kids' underwear. Target does have some other stuff you know. I'm not a complete freak.

Saturday, August 1, 2009

Slightly More Sophisticated WinForms Dirty Tracking in C#

The C# programming tip this week is a follow-up to last week's Simple Dirty Tracking for WinForms in which I demonstrated a simple technique for coding the functionality of tracking whether or not a user has changed a document since its last save. This is useful for prompting the user to "save changes" upon closing a "dirty" document. The technique involved creating a controller class that assigns event handlers to the "changed" events for tracked input controls, and exposes an IsDirty property.

This controller can tell if a user has changed the value in a text box, but it can't tell if the user changed the value back. The document wouldn't need saving in this case, but the simple controller would still report that the form is dirty based on the simple condition of text changing (in any way) in the TextBox control.

So this week's programming exercise is to create a slightly more sophisticated Dirty Tracker for WinForms - one that determines if the document is dirty by comparing values in input controls to the values remembered at the time of the last save (or initialization). This approach does not require event handling, but instead requires tracking a collection of controls and their "clean" values for later comparison.

To start, we create a ControlDirtyTracker class that performs the ... um... dirty work. It is within this class that we'll establish which control types are trackable, and record the clean control values as a private string. A slightly, slightly more sophisticated class might store control values as the more generic object type, but we'll use a string here for simplicity.

public class ControlDirtyTracker
{
    private Control _control;
    private string _cleanValue;

    // read only properties
    public Control Control { get { return _control; } }
    public string CleanValue { get { return _cleanValue; } }
}

We'll also decide which control types to support, and how to obtain a given input control's current value:
public class ControlDirtyTracker
{
    // static class utility method; return whether or not the control type
    // of the given control is supported by this class;
    // developers may modify this to extend support for other types
    public static bool IsControlTypeSupported(Control ctl)
    {
        // list of types supported
        if (ctl is TextBox) return true;
        if (ctl is CheckBox) return true;
        if (ctl is ComboBox) return true;
        if (ctl is ListBox) return true;

        // ... add additional types as desired ...

        // not a supported type
        return false;
    }

    // private method to determine the current value (as a string) of the control;
    // developers may modify this to extend support for other types
    private string GetControlCurrentValue()
    {
        if (_control is TextBox)
            return (_control as TextBox).Text;

        if (_control is CheckBox)
            return (_control as CheckBox).Checked.ToString();

        if (_control is ComboBox)
            return (_control as ComboBox).Text;

        if (_control is ListBox)
        {
            // for a listbox, create a list of the selected indexes
            StringBuilder val = new StringBuilder();
            ListBox lb = (_control as ListBox);
            ListBox.SelectedIndexCollection coll = lb.SelectedIndices;

            for (int i = 0; i < coll.Count; i++)
                val.AppendFormat("{0};", coll[i]);

            return val.ToString();
        }

        // ... add additional types as desired ...

        return "";
    }
}

Finally we add the constructor, passing the control to track, a method to establish the current control value as the "clean" value, and a method to determine if the control value has changed since the remembered "clean" value.
public class ControlDirtyTracker
{
    // constructor establishes the control and uses its current value as "clean"
    public ControlDirtyTracker(Control ctl)
    {
        // if the control type is not one that is supported, throw an exception
        if (ControlDirtyTracker.IsControlTypeSupported(ctl))
        {
            _control = ctl;
            EstablishValueAsClean();
        }
        else
        {
            throw new NotSupportedException(
                string.Format("The control type for '{0}' is not supported by the ControlDirtyTracker class."
                , ctl.Name)
                );
        }
    }

    // method to establish the current control value as "clean"
    public void EstablishValueAsClean()
    {
        _cleanValue = GetControlCurrentValue();
    }

    // determine if the current control value is considered "dirty";
    // i.e. if the current control value is different than the one
    // remembered as "clean"
    public bool DetermineIfDirty()
    {
        // compare the remembered "clean value" to the current value;
        // if they are the same, the control is still clean;
        // if they are different, the control is considered dirty.
        return (string.Compare(_cleanValue, GetControlCurrentValue(), false) != 0);
    }

} // end of the class

Since we'll be tracking multiple input controls, we'll create the collection class ControlDirtyTrackerCollection. In it we'll define methods to add controls from a form, to list all the tracked controls that are currently dirty, and to establish all tracked controls as clean.
public class ControlDirtyTrackerCollection : List<ControlDirtyTracker>
{
    // constructors
    public ControlDirtyTrackerCollection() : base() { }
    public ControlDirtyTrackerCollection(Form frm) : base()
    {
        // initialize to the controls on the passed in form
        this.AddControlsFromForm(frm);
    }

    // utility method to add the controls from a Form to this collection
    public void AddControlsFromForm(Form frm)
    {
        this.AddControlsFromCollection(frm.Controls);
    }

    // recursive routine to inspect each control and add to the collection accordingly
    public void AddControlsFromCollection(Control.ControlCollection coll)
    {
        foreach (Control c in coll)
        {
            // if the control is supported for dirty tracking, add it
            if (ControlDirtyTracker.IsControlTypeSupported(c))
                this.Add(new ControlDirtyTracker(c));

            // recursively apply to inner collections
            if (c.HasChildren)
                this.AddControlsFromCollection(c.Controls);
        }
    }

    // loop through all controls and return a list of those that are dirty
    public List<Control> GetListOfDirtyControls()
    {
        List<Control> list = new List<Control>();

        foreach (ControlDirtyTracker c in this)
        {
            if (c.DetermineIfDirty())
                list.Add(c.Control);
        }

        return list;
    }

    // mark all the tracked controls as clean
    public void MarkAllControlsAsClean()
    {
        foreach (ControlDirtyTracker c in this)
            c.EstablishValueAsClean();
    }
}
At this point we have in our collection class the means to track input controls on the form with little additional work. We instantiate the collection in the Load event of the form:

// form private member
private ControlDirtyTrackerCollection _trackedControls;

private void Form1_Load(object sender, EventArgs e)
{
    // in the Load event, initialize our tracking object
    _trackedControls = new ControlDirtyTrackerCollection(this);
}

We call _trackedControls.MarkAllControlsAsClean() whenever the document is saved. Then, when the form is closing, we can prompt the user to save again if any of the values in the tracked controls have changed.
private void Form1_FormClosing(object sender, FormClosingEventArgs e)
{
    // in the closing event, prompt to save if there are any dirty controls
    if (_trackedControls.GetListOfDirtyControls().Count > 0)
    {
        // prompt the user
        if (MessageBox.Show("Would you like to save changes before closing?"
                , "Save Changes"
                , MessageBoxButtons.YesNoCancel
                , MessageBoxIcon.Question)
            == DialogResult.Yes)
        {
            // if the user says Yes, call the form's save routine here
        }
    }
}

And that's it. It is a relatively small amount of code to add to the form for tracking input control changes between saves, and the tracking objects are reusable. There's a little more work in creating the tracker classes than in our previous attempt, but it is still simple to code and doesn't produce the false positives we had before.

Thursday, July 30, 2009

Discoveries: Public Domain Pictures

It is amazing to me what a difference a polished photo or image can make in a blog article. You can see a difference - blogs with great article content but no pictures feel like they are lacking something. It may be that the splash of color or other artistic composition in an image appeals to one part of the brain while the words appeal to another, providing a more holistic experience as a reader.

Companies with a good instinct (or department) for public relations understand this and will provide polished images as part of press kits. For the blog writer who is not a talented photographer, there are also sites like Public Domain Pictures which provide quality photos, free for personal or commercial use. These sites also offer photos from sponsors at a cost, but their libraries of free images are sizable.

Tuesday, July 28, 2009

The Emerging Competition over Streamed Video Games

It feels like we're on the cusp of a new industry - or at least an upheaval in an existing one. A battle is emerging among startups and established giants over streamed video games, one that I see as a precursor toward streamed computing.

The players are staking their claims. We have OnLive, whose approach centers on solving a growing problem for publishers: games cost more and more money to make, while profits are ironically harder and harder to find. OnLive also seems to value social interaction, with "spectating" and "brag clips" among its features. If technically and commercially viable, OnLive could usher in a console-less era for video games.

Then there's Gaikai, taking a browser-based approach, looking not so much to challenge console supremacy as to bring new players into the market. By operating as a service for publishers, Gaikai proposes to give them the flexibility to let newcomers try their games at little cost.

With over 100 employees, OnLive seems like a Goliath compared to the seven-person software company Otoy. But Otoy sports a strategic partnership with AMD to build "a new kind of supercomputer". AMD builds the massive graphics-crunching hardware, Otoy supplies the scalable graphics software, and the combination potentially affords another platform for distributing video games and PC applications, particularly those with intense graphics requirements.

Recently the company PlayCast launched a pilot in Israel, working with the cable network Hot to deploy streamed games on demand through cable set-top boxes. Streamed gaming and computing seem a natural extension for a cable company - yet another service to provide alongside media distribution, Internet service, and telephony. I would think that within five years most if not all of the major cable providers will be active in this space, be it through their own development or through strategic partnerships with companies like PlayCast.

And then there are the cell phone providers and manufacturers. Companies like Verizon already provide games-on-demand services, and manufacturers like Apple have shown that devices like the iPhone can prove a capable platform for video games. These efforts aren't exactly the "streaming" I have in mind when writing of companies like OnLive and Gaikai. Currently, with Verizon's and Apple's services, users download complete games to their device (PC or cell phone) and then play them through the processing power of the device. The streaming revolution will come when Verizon and others build their own server farms and wireless networks are fast enough to support latency-free streamed gaming directly to the cell phone.

Console makers Microsoft (Xbox Live) and Nintendo (WiiWare) each offer a downloadable games service too, but again, users download the games through the service and play them through the processing power of the console. Sony is presently shunning suggestions to deploy a games-on-demand service, but they, as well as Microsoft and Nintendo, would be natural competitors (and logical participants?) should any decide to offer a streaming solution.

It is exciting to consider the number of different approaches being taken toward streamed video games, and to realize how many different companies, and different types of companies, can take a serious stake in the development of this new industry. As streamed video games become mainstream (within five years perhaps?) we will also have the solution (solutions?) for streamed computing in general.

Sunday, July 26, 2009


I consider myself a terrible writer. Part of the reason I began this blog was the possibly misguided hope that forcing myself to write more frequently would improve my ability to communicate. It may be that I'm too hard on myself, but the process of writing is something I continue to find more tedious than joyful. I am hoping that in time, and with practice, that can change.

I do respect those however who have a talent for writing and can somehow find joy in the writing process itself. The well-chosen language we hear or read from someone else can have a profound impact on how we may then think about a given topic. A good command of language helps to frame thought.

Well-chosen language can be used against us, as marketers and politicians have known for ages. But I prefer today to think of the positive inspiration that can come from a well-written book or article. So, with an eye for improvement in vocabulary, my discovery for the day is the web site

I think I became a fan when I read the definition for intexticated.

Saturday, July 25, 2009

Simple Dirty Tracking for WinForms in C#

A common requirement for Windows Forms applications is to track whether or not a user has made changes to a document. Upon closing, the application can check if the document has been changed (considered "dirty") and prompt the user to save. If the document isn't "dirty" - if it hasn't been changed since its last save - the application can forego such a prompt.

A simple way to create this behavior is to track a "dirty" flag for the form, and trap the appropriate event on input controls to catch value changes. Rather than create an event handler for every input control, and write all that code again for the next form, a developer can create a simple helper class - one that assigns the appropriate event handling code for each of a form's input controls, and tracks the "dirty" state on behalf of the form. Such a class can then be reused across many forms and projects.

Here's a sample of such a class that demonstrates this simple technique.

public class FormDirtyTracker
{
    private Form _frm;
    private bool _isDirty;

    // property denoting whether the tracked form is clean or dirty
    public bool IsDirty
    {
        get { return _isDirty; }
        set { _isDirty = value; }
    }

    // methods to make dirty or clean
    public void SetAsDirty()
    {
        _isDirty = true;
    }

    public void SetAsClean()
    {
        _isDirty = false;
    }

    // initialize in the constructor by assigning event handlers
    public FormDirtyTracker(Form frm)
    {
        _frm = frm;
        _isDirty = false;

        AssignHandlersForControlCollection(frm.Controls);
    }

    // recursive routine to inspect each control and assign handlers accordingly
    private void AssignHandlersForControlCollection(Control.ControlCollection coll)
    {
        foreach (Control c in coll)
        {
            if (c is TextBox)
                (c as TextBox).TextChanged += new EventHandler(FormDirtyTracker_TextChanged);

            if (c is CheckBox)
                (c as CheckBox).CheckedChanged += new EventHandler(FormDirtyTracker_CheckedChanged);

            // ... apply for other input types similarly ...

            // recursively apply to inner collections
            if (c.HasChildren)
                AssignHandlersForControlCollection(c.Controls);
        }
    }

    // event handlers
    private void FormDirtyTracker_TextChanged(object sender, EventArgs e)
    {
        _isDirty = true;
    }

    private void FormDirtyTracker_CheckedChanged(object sender, EventArgs e)
    {
        _isDirty = true;
    }
}


The class is instantiated in the form's Load handler like this:

private FormDirtyTracker _dirtyTracker;

public Form1()
{
    InitializeComponent();
}

private void Form1_Load(object sender, EventArgs e)
{
    // instantiate a tracker to determine if values have changed in the form
    _dirtyTracker = new FormDirtyTracker(this);
}

Then when the form is closing, the developer can simply check _dirtyTracker.IsDirty to determine if the user should be prompted to save.

This isn't a particularly sophisticated technique; it considers the document "dirty" upon any change, even if the user changes a value back to its original. It does track user input, however; it is simple to code and is reusable across multiple forms.

Friday, July 24, 2009

Discoveries: Information Aesthetics

A good portion of my job involves deriving useful information from data - through extraction, analysis, modeling, waterboarding... whatever it takes to make the data talk.

I am fascinated by the presentation of data, and discovered the web site Infosthetics. The name is a portmanteau of "information aesthetics", and the site is a blog highlighting interesting presentations of data.

It's an interesting site, worth a look even if you aren't a data geek.

Wednesday, July 22, 2009

Barnes & Noble vs. Amazon

Barnes & Noble announced this week the launch of their eBookstore, selling digitally distributed books akin to Amazon's Kindle service. B&N provides reader software for the iPhone/iPod Touch and BlackBerry devices as well as for PC and Mac operating systems. The announcement also included a strategic partnership with Plastic Logic, whose e-reader device currently under development will directly compete with the Kindle and Sony's Reader.

It is significant that, at least for the time being, Barnes & Noble is not manufacturing their own reader device. In the long run, I don't see the devices (Kindle, Plastic Logic) as the focus of competition for Amazon and Barnes & Noble. In fact, as e-book adoption becomes widespread, it wouldn't surprise me if Amazon got out of the business of device manufacturing entirely. Their long-term success is in the sales of the content, and it is here that they and B&N will compete: their online store experience and the service afforded to customers.

For example, the service Amazon provides with distribution of content via cellular networks to the Kindle is terrifically convenient, and of clear value to consumers who wish to avoid having to sync with a computer. Cynics may point out that it is also a great way to lock in consumers to the Amazon store, discouraging competition from others - and they would be right.

But while that sort of lock-in may be in Amazon's near-term interests, over time they will provide better service for individuals with devices other than the Kindle (they already have decent reader software for the iPhone). And though Barnes & Noble's electronic format is currently incompatible with the Kindle, I believe that in time consumers will demand choice here - to read their purchased content on whichever device suits them, regardless of the content vendor. Companies like Amazon and B&N will ultimately support that choice, because over the long term their success is in the sale of the content, not in the devices themselves.

Barnes & Noble's entry into this market is a good development for consumers. At the moment, the space is Amazon's and, to a lesser extent, Sony's. With a big player like B&N directly competing in areas of consumer service and convenience, Amazon will have good motivation to avoid complacency at the top.

Monday, July 20, 2009


From time to time I stumble upon a web site that offers something of unique utility for me. The other day I discovered, which as the domain name implies, offers a toolset for creating and printing custom blank sheet music.

There are some terrific music notation software packages available, but for composition I still appreciate the creative freedom that comes when sketching musical ideas with a pen and real paper. I appreciate the variety of options in's toolset, and it saves trips to the music store.

Thursday, July 16, 2009

Stop "Run as Administrator" in Vista

I had a strange exchange with Windows Vista in which I was prompted to run my favorite text editor as Administrator. I ended up clicking “Yes”, and then realized that my text editor was no longer recognizing the drives mapped by my favorite FTP-mapping-to-drive utility.

Assuming the elevation in access was putting me in a different user account context, I decided the solution was to re-run the text editor under my normal privileges. The shortcut icon in my Start menu didn’t seem to want to let me, though. It retained that familiar shield icon and launched with Administrator privileges every time from that point on. I’m not sure why I expected to find a “Stop Running as Administrator” command when right-clicking the icon, but I was frustrated that it didn’t exist. Ironically, “Run as Administrator” was still an available option, however redundant and, in this moment, tauntingly sardonic, like an evil curly-mustached landlord in a melodrama.

When displaying Properties for an application, there’s a convenient checkbox on the Compatibility tab to permanently set the app to always run with elevated privileges. Of course – this box must now be checked for my text editor… I was surprised to see that it wasn’t.

The actual solution lay in the Show settings for all users button immediately below the checkbox. Until this problem, I hadn’t realized that Vista maintains and applies a duplicate set of default compatibility settings for programs. Windows had applied my “Run as Administrator” acceptance on the default user setting rather than on my user account. Unchecking the box there solved the problem.

Incidentally, my favorite text editor is UltraEdit, and I like WebDrive as an FTP-mapping utility. Each is inexpensive and has helped me greatly in my daily productivity.

Thursday, July 9, 2009

Google is Announcing the Chrome OS

A month ago I wrote an article about several trends in web computing that were marginalizing the importance of the traditional PC operating system, at least from an average user's perspective as a platform for running applications. This month Google announced the Google Chrome OS - a lean operating system based on a Linux kernel that treats the Chrome browser as the application platform. Targeted initially at netbooks and at users who currently spend most of their PC time in a web browser, Chrome OS should afford a significant, low-cost offering for anyone using web-based applications.

As Google states on its blog: "For application developers, the web is the platform." Current momentum suggests that the web will continue to overtake the desktop OS as the ubiquitous application platform for average users. Google's development of Chrome is a logical step in that direction.

Saturday, June 6, 2009

Personal Computing: The End of the Operating System

Ah, the Operating System. That layer of software that speaks directly to the hardware in a computer, functioning as a host for applications. It is the interface that lets application developers avoid dealing with the complex inner workings of the hardware itself. OSes like CP/M, MS-DOS, System 1.0, and Windows 3.1 all seem humble today, but their evolution and their descendants have helped make personal computing ubiquitous.

In today's world, to choose a computer is to choose an operating system. The average user buys a PC with Windows on it, or maybe Linux, or buys a Mac with the latest flavor of OS X. The choice for the user typically starts with the OS. But with the rise of the Internet, we are nearing a new phase of application delivery and execution that begs the question: are we nearing the end of the personal computer operating system as we know it?

For the average user, the operating system is nothing more than a way to get to one's applications. The typical computer user doesn't care about OS intricacies like driver communication or internal file management. For the typical user, the operating system is simply a platform for executing applications and manipulating information through those applications. It is the application that is important, and the OS choice is made largely based on the applications the user wishes to run.

Over time, new kinds of application platforms have emerged. The concept of the virtual machine is one example of an application platform that, from an end user's perspective, marginalizes the operating system. Virtual machine environments like Java, and more recently .NET/Mono, focus on a "write once, run everywhere" promise with clear benefits for developers. End users see benefits as well, getting to run desired applications without having to forgo their choice of operating system. That's the promise, anyway; actual implementation proves more complicated, but virtual machine environments have succeeded enough for their evolution to continue on a large scale.

As Internet computing has grown, additional opportunities have arisen which challenge the personal computing paradigm. Google and Yahoo have led the way with AJAX frameworks that enable software to be developed and hosted by a provider, with the end user executing applications through the browser of his or her choice. The evolution of HTML standards, particularly HTML5, combined with intelligent JavaScript development and server hosting provides an application platform that diminishes the relevance of the end user's operating system.

With the development of the Chrome browser, Google has taken this effort one leap further. Chrome isn't so much a web browser as it is a multi-threaded JavaScript application execution environment. It could become the preferred front-end for executing these kinds of JavaScript-centric, web-hosted applications. Google's work with the Android operating system encourages robust application experiences on mobile phones, and with Chrome running on cheap netbooks (or possibly a Google-marketed netbook with Chrome embedded in place of an OS), users have an application execution paradigm that completely bypasses the traditional Personal Computer + Operating System model.

In addition to virtual machines and web-hosted JavaScript applications, an even more ambitious cloud computing model may be emerging. I have written previously about the potential for application processing in server farms, with audio/video streamed back to the user. OnLive is currently developing the infrastructure for streaming live video game experiences to subscribed users through broadband connections. In such a model, the application provider effectively becomes the operating system, with a user interfacing through cheap dumb terminal devices while still experiencing a rich, graphical application experience. Should this model prove technically and commercially viable, one may expect this new paradigm to emerge as a significant replacement for Personal Computers and Operating Systems.

One way or another, there is tremendous momentum to replace the Operating System as the dominant personal computing application platform. As Internet computing continues to grow, personal computing as we know it today will soon be considered quaint. As new application platforms emerge and provide useful experiences for average users, the Operating System as the focus of user choice will soon fade into obsolescence.

Wednesday, May 27, 2009

Cloud Computing for the Masses

I was in attendance recently at the Usual Suspects radio show, where the spry host asked each of us present for a definition of cloud computing. The ensuing conversation was interesting, and though we touched on Web 2.0-style applications, generally speaking we settled on something like the following definition: cloud computing is the delivery of hosted services over the Internet. Our conversation had been from a tech/server/provider perspective - not surprising as we were all computing professionals. Subsequently I have started to think about the question more from the perspective of an average computer user.

For the most part, current cloud computing implementations require a computer for the end user; even though there may be services provided through the Internet, there still is the need for the end user to have a machine with sufficient computing power to execute at least a browser and the application. But suppose the application were delivered without the need for the end user to have such a computer? What if the entire application experience were delivered over the Internet and required no application processing on the user's end, not even a browser?

This certainly isn't a new idea; mainframes have been delivering a complete application experience to dumb terminals for decades. For an end-user-friendly experience, however, a full, responsive GUI with rich audio and video must be delivered to a variety of client configurations:

  • in your office or home office: a 24" monitor, some speakers, a keyboard, a mouse, a printer
  • in your family room: a 50" plasma display with a remote, surround-sound speakers, game controllers, possibly wireless keyboards/mice
  • in your kitchen: a touch-screen flat display hanging on the wall
  • in your favorite Internet cafe: a 20" screen with a keyboard and game controllers (and cup holder)
  • in your pocket: a phone or other portable device with a small touch screen, built-in keyboard/game buttons

In all cases the "dumb terminal" client device would contain embedded software (probably on a specialized chip in the monitor) that manages a connection to an application provider, sends input, and receives and decompresses audio/video frame updates. But that is all it does - this embedded software is dumb in that it is not processing the user input or running the application. It only needs to know how to pass information both directions: sending input to the cloud, and receiving output from the cloud.
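The division of labor described above can be sketched in a few lines of Python. This is an illustrative sketch only, not a real protocol: the names `provider_process` and `thin_client_display` are hypothetical, and `zlib` stands in for a real audio/video codec.

```python
# Illustrative sketch of the dumb-terminal model: the client only
# forwards input and displays frames; the provider runs the application.
# zlib stands in for a real audio/video codec; all names are hypothetical.
import zlib

def provider_process(input_event):
    # Stand-in for the server farm: run the application logic and
    # return a compressed "video frame" for the client to display.
    frame = ("frame after %s" % input_event).encode("utf-8")
    return zlib.compress(frame)

def thin_client_display(events):
    # The dumb terminal's entire job: send input upstream, then
    # decompress and "display" whatever comes back.
    displayed = []
    for event in events:
        compressed = provider_process(event)   # network round trip
        displayed.append(zlib.decompress(compressed).decode("utf-8"))
    return displayed

frames = thin_client_display(["key:A", "mouse:click"])
```

The point of the sketch is the asymmetry: the client code contains no application logic at all, which is what would let any device with the embedded chip connect to any conforming provider.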

Standards would likely emerge among consumer electronics manufacturers and application providers so one device could connect to any of a number of providers. Conversely, you as the consumer could sign on to your application desktop from any of a number of client devices (office, family room, kitchen, portable, etc.). It is likely that as the functionality in the specialized chip became ubiquitous, it would be inexpensive to produce and thus add little to the cost of the device. As for application processing, the application provider does the heavy lifting, establishing powerful server farms for accepting client connections, processing application input, and sending back audio/video updates.

Assuming application providers can profitably deliver these experiences in a cost-effective way for consumers, imagine what the consumer no longer has to worry about:

  • Purchasing/repairing/frequently updating expensive computing hardware
  • Troubleshooting hardware/software conflicts
  • Workstation administration duties (file backups, applying operating system patches, virus and malware protection)
  • Managing multiple workstations, between job and home or multiple computers in the home
  • Maintaining a library of software and data disks
  • Documents and data stranded on the wrong machine ("the data I need is on the wrong computer...")
  • Document security (handled now by the provider)
  • The horrible customer assistance experience of being bounced back and forth between different hardware and software companies, each saying the problem is the other's fault

But is this vision of delivering a full application experience with a rich audio/video interface through the Internet really feasible?

The technical challenge that either enables it or makes it impossible boils down to this: Can input be transmitted over the Internet, processed in a server farm, with the resulting output transmitted back, decompressed and displayed fast enough to appear to the end user as if he or she controlled the action? This is a critical measure for a positive cloud computing experience from an end-user's perspective - the applications must appear responsive.

So how fast is "fast enough?" Robert B. Miller's Response time in man-computer conversational transactions, published by the Association for Computing Machinery in 1968, remains a useful reference for assessing response delay. I'll pick Miller's Topic 1 - Response to control activation as the yardstick for "fast enough" here. Miller suggested that an action such as the clicking of a typewriter key should be met with a response that appears "immediate" to the user - "perceived as a part of the mechanical action induced by the operator" [p. 271]. Miller suggested a time delay of no more than 0.1 second is perceived by the end user as a simultaneous response.

For example, if the user clicks a mouse on a spreadsheet cell and the visual display of highlighting that cell appears to the user within 0.1 second, the user's perception is that he or she controlled the action - he or she made the cell highlight. If the delay between user input and the resulting recognizable effect is greater than that, the user begins to feel more like the computer controlled the action - like he or she submitted a command that the computer processed, rather than he or she directly highlighted the cell. The user no longer feels in control.

This model of cloud computing is possible, then, if the process of transmitting input over the Internet for processing in a server farm, and transmitting and displaying the resulting output, takes at most around 0.1 second - 100 milliseconds. A would-be application provider would look to the following to keep this delay to 100 milliseconds or fewer:

  • A faster Internet. Better Internet bandwidth, all the way to the typical home or office; faster wireless as well.
  • Well constructed, powerful server farms. Have the most powerful hardware possible combined with the fastest grid-style operating software for managing connections to take less time processing user input. Have several located throughout the country (world?) to maximize proximity for consumers.
  • Exceptional audio/video compression and decompression. Reduce the amount of data being sent back to the clients, thus requiring less bandwidth, and reduce the time taken to display the compressed video.
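To make the target concrete, the 100-millisecond limit can be treated as a budget divided among the stages of the round trip. Every figure below is an illustrative assumption for the sake of the arithmetic, not a measurement of any real service:

```python
# Rough budget for the 100 ms "fast enough" target; all stage timings
# are illustrative assumptions, not measurements.
BUDGET_MS = 100

stages_ms = {
    "input sent upstream": 15,      # client -> server farm
    "server-side processing": 20,   # run the application, render the frame
    "audio/video compression": 10,
    "frame sent downstream": 40,    # compressed output -> client
    "decompress and display": 10,
}

total_ms = sum(stages_ms.values())
slack_ms = BUDGET_MS - total_ms
print(total_ms, slack_ms)  # 95 ms used, 5 ms of headroom
```

Under these assumptions the round trip just squeaks in, which is why each of the three items above matters: shaving the downstream leg with better compression, or the processing leg with faster farms, is what creates the headroom.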

The current commercial effort that most closely matches this vision of cloud computing is that of the company OnLive, which had something of a coming-out party at the Game Developers Conference in March 2009. OnLive is forging a potentially industry-shaking distribution model for streaming high-end video games, one with clear benefits for game publishers. OnLive claims to have enabled their streaming model by, among other things, developing a technological breakthrough in video compression. That claim will be put to the test in late 2009 and 2010 as their system scales up with actual subscribers. Whether or not OnLive succeeds, they have sparked the imagination for what cloud computing can ultimately mean. And even if they don't succeed as a distributor of video games, if they have accomplished their stated compression breakthrough, others will certainly license the technology or mimic it.

And should OnLive succeed in providing a great gaming experience through their model, overcoming the 100-millisecond challenge with pricing that is reasonable for consumers, haven't they effectively proven that this cloud computing model is the future for personal computing? After all, video games are just software applications. In fact, the highest-end video games are particularly complex software applications requiring a great deal of computing power. If OnLive succeeds technically and commercially with the toughest of applications in video games, the model can certainly work with a word processor or spreadsheet.

Isn't it just a matter of time before we see significant improvements in the areas that would concern providers? Faster Internet to the home, improved bandwidth, specialized server farm components developed cheaply, improvements in compression... we'll see positive steps if not leaps in all these areas over the coming years, the combination of which will enable a completely new paradigm for executing applications.