Friday, December 11, 2009

Screen-scraping an ASP.NET application in PHP

I recently was asked to assist a friend in developing a PHP script to automate authenticating to an ASP.NET web application and scraping a small piece of data from a secured page that changes periodically. The cURL library in PHP works well for this task, but there are a few important aspects of an ASP.NET application to recognize:

  • ASP.NET applications that use forms-based authentication typically process login input and return an encrypted authentication token in the form of a cookie to the browser. Capturing this cookie properly is necessary to automate the process.

  • ASP.NET also employees a mechanism known as ViewState that must be respected. The mechanism is provided for developers that need to maintain the illusion of a stateful environment across multiple postbacks to a page. This allows, for example, a user to enter a value in a textbox, click a submit button, and have that value remembered in the textbox as the page is refreshed. This kind of functionality in other platforms is typically very complicated to build, but ASP.NET is engineered to support it natively.

    It does so by passing a value to the browser with each participating page as a special hidden <input> tag with the name "__VIEWSTATE". The encrypted value is automatically included then as a POST parameter when the page is submitted back to the server. The ASP.NET application server then processes the encrypted value to restore user interface control values and properties as it reconstructs the "new" page it sends back to the browser (with an updated __VIEWSTATE tag).

    So a PHP script that automates posting to any ASP.NET form must respect this process and properly capture - and resubmit - the appropriate __VIEWSTATE value.

  • ASP.NET has additional mechanisms to validate page posts, supplying additional encrypted information in a hidden <input> named "__EVENTVALIDATION". If this value is present, the PHP script must capture it as well.

To ensure __VIEWSTATE and __EVENTVALIDATION are respected when submitting a form typically requires two calls to an ASP.NET page. The first is a simple GET, the results of which may be parsed with regular expressions to pull the appropriate hidden <input> values. The second is the POST which submits desired <input> values as well as the previously retrieved __VIEWSTATE and __EVENTVALIDATION. Performing these two calls on a login form, and submitting appropriate account information successfully returns the authentication cookie which is then submitted for each subsequent request for a secured page.

The following script uses the cURL library to perform such a login and accesses a secured page. The variables at the top of the script may be modified to match the desired site and account information.

<?php
/************************************************
* ASP.NET web site scraping script;
* Developed by MishaInTheCloud.com
* Copyright 2009 MishaInTheCloud.com. All rights reserved.
* The use of this script is governed by the CodeProject Open License
* See the following link for full details on use and restrictions.
* http://www.codeproject.com/info/cpol10.aspx
*
* The above copyright notice must be included in any reproductions of this script.
************************************************/

/************************************************
* values used throughout the script
************************************************/
// urls to call - the login page and the secured page
$urlLogin = "http://www.server.com/blahblah/login.aspx";
$urlSecuredPage = "http://www.server.com/blahblah/securedPage.aspx";

// POST names and values to support login
$nameUsername='txtusername'; // the name of the username textbox on the login form
$namePassword='txtpassword'; // the name of the password textbox on the login form
$nameLoginBtn='btnlogin'; // the name of the login button (submit) on the login form
$valUsername ='myUsername'; // the value to submit for the username
$valPassword ='myPassword'; // the value to submit for the password
$valLoginBtn ='Login'; // the text value of the login button itself

// the path to a file we can read/write; this will
// store cookies we need for accessing secured pages
$cookies = 'someReadableWritableFileLocation\cookie.txt';

// regular expressions to parse out the special ASP.NET
// values for __VIEWSTATE and __EVENTVALIDATION
$regexViewstate = '/__VIEWSTATE\" value=\"(.*)\"/i';
$regexEventVal = '/__EVENTVALIDATION\" value=\"(.*)\"/i';


/************************************************
* utility function: regexExtract
* use the given regular expression to extract
* a value from the given text; $regs will
* be set to an array of all group values
* (assuming a match) and the nthValue item
* from the array is returned as a string
************************************************/
function regexExtract($text, $regex, $regs, $nthValue)
{
if (preg_match($regex, $text, $regs)) {
$result = $regs[$nthValue];
}
else {
$result = "";
}
return $result;
}



/************************************************
* initialize a curl handle; we'll use this
* handle throughout the script
************************************************/
$ch = curl_init();


/************************************************
* first, issue a GET call to the ASP.NET login
* page. This is necessary to retrieve the
* __VIEWSTATE and __EVENTVALIDATION values
* that the server issues
************************************************/
curl_setopt($ch, CURLOPT_URL, $urlLogin);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
$data=curl_exec($ch);

// from the returned html, parse out the __VIEWSTATE and
// __EVENTVALIDATION values
$viewstate = regexExtract($data,$regexViewstate,$regs,1);
$eventval = regexExtract($data, $regexEventVal,$regs,1);


/************************************************
* now issue a second call to the Login page;
* this time, it will be a POST; we'll send back
* as post data the __VIEWSTATE and __EVENTVALIDATION
* values the server previously sent us, as well as the
* username/password. We'll also set up a cookie
* jar to retrieve the authentication cookie that
* the server will generate and send us upon login.
************************************************/
$postData = '__VIEWSTATE='.rawurlencode($viewstate)
.'&__EVENTVALIDATION='.rawurlencode($eventval)
.'&'.$nameUsername.'='.$valUsername
.'&'.$namePassword.'='.$valPassword
.'&'.$nameLoginBtn.'='.$valLoginBtn
;

curl_setOpt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
curl_setopt($ch, CURLOPT_URL, $urlLogin);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);

$data = curl_exec($ch);


/************************************************
* with the authentication cookie in the jar,
* we'll now issue a GET to the secured page;
* we set curl's COOKIEFILE option to the same
* file we used for the jar before to ensure the
* authentication cookie is sent back to the
* server
************************************************/
curl_setOpt($ch, CURLOPT_POST, FALSE);
curl_setopt($ch, CURLOPT_URL, $urlSecuredPage);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);

$data = curl_exec($ch);

// at this point the secured page may be parsed for
// values, or additional POSTS made to submit parameters
// and retrieve data. For this sample, we'll just
// echo the results.
echo $data;



/************************************************
* that's it! Close the curl handle
************************************************/
curl_close($ch);


?>

Friday, November 20, 2009

Bring it on, Intel

I continue to find written communication to be challenging (evidenced by the fact that I've started about a dozen blog posts in the past three months, without finishing any!) I have good thoughts, but writing them well is its own skill. Struggling with this, I respect those for whom good writing comes naturally that much more.

So when I read that there are some Intel researchers predicting that by 2020 we will have brain implant chips capable of controlling computers through thoughts, I thought: Bring it on, Intel! Maybe we're controlling computers in 2020, maybe by 2050 we have ubiquitous mind-reading machines. I'm not worried that someone someday could read my thoughts - I welcome it. Then I wouldn't have to struggle to put them into words.

Sunday, August 9, 2009

A Simple Way to Create Crosstab Queries in SQL Server - Part 2

Previously, I demonstrated a simple technique for creating SQL queries that produce crosstab-style output. The technique involves wrapping CASE statements in aggregate functions, which makes for clean, readable source.

In the previous examples, I used hardcoded individual cases to compute each shipper in its own column. While easy to do, this isn't exactly a data-driven solution. It would be nice, for example, to be able to add a fourth shipper to our database without having to recode these crosstab queries.

Fortunately, thanks to dynamic SQL, with a little more work we can construct our CASE statements by querying the Shippers table directly. To do this we'll define a variable @cases to hold the portion of the SELECT clause that comprises our computed columns. The template for each CASE statement looks like the following, with fields from the Shippers table in {curly braces}:
, sum(CASE WHEN ShipVia = {ShipperID} THEN Freight ELSE 0 END) as [{CompanyName}]


Notice the leading comma in the template. Since we're grouping by CustomerID and listing it first in the SELECT clause, we can have a leading comma in our first and each subsequent case. The first part of our SQL script queries the Shippers table to construct each CASE statement, storing the concatenated strings in the @cases variable.
declare @cases varchar(8000)
set @cases = ''
select @cases
= @cases + ', sum(CASE WHEN ShipVia = '
+ Convert(varchar(5),ShipperID)
+ ' THEN Freight ELSE 0 END) as [' + CompanyName + ']'
from Shippers


The second part of the script wraps our cases in the full SQL statement to aggregate by CustomerID. Since this is a dynamic SQL statement, we'll wrap the whole thing in an execute command to produce the desired crosstabular output:
execute(
'SELECT CustomerID '
+ @cases
+ ' from Orders Group By CustomerID Order by CustomerID'
)


This technique becomes even more attractive when there are several crosstab columns to be defined. Consider the following example which computes total sales by customer, broken out by product categories in columns. Because there is some complexity to joining the necessary detail tables together, the first part of this script computes a subquery in the variable @sqlInner. The subquery is then aggregated, with cases computed dynamically by querying the Categories table. The final execute statement produces the crosstab output.

declare @sqlInner varchar(8000)
set @sqlInner = '
(
select o.CustomerID
, (od.UnitPrice * od.Quantity) - od.Discount as Sales
, p.CategoryID
from [order details] as od
inner join Products p on od.ProductID = p.ProductID
inner join Orders o on od.OrderID = o.OrderID
) x
'

declare @cases varchar(8000)
set @cases = ''
select @cases
= @cases
+ ', sum(CASE WHEN CategoryID = '
+ Convert(varchar(5),CategoryID)
+ ' THEN Sales ELSE 0 end) as [' + CategoryName + ']'
from Categories


execute(
'SELECT CustomerID as [Total Sales by Customer]'
+ @cases
+ ' from ' + @sqlInner
+ ' Group By CustomerID Order by CustomerID'
)


With the help of dynamic SQL execution, CASE statements that support columns in crosstabular output may be generatated in a data-driven way. By querying the appropriate table for individual cases, users may add or modify data without the need to recode the crosstab query.

Saturday, August 8, 2009

A Simple Way to Create Crosstab Queries in SQL Server - Part 1

This week's programming exercise is using SQL to generate crosstab-style datasets from a SQL Server database. Using the Northwind database as our sample, we'll say that we want to derive crosstab reports to show aggregates of orders by customer.

A clever use of the SQL Server CASE statement is helpful here. The CASE statement in a query allows a column to be computed based on multiple possible conditions. It is like a switch statement in C, C++, C#, Java, or a Select Case statement in VB. Each "case" or condition is laid out, with a corresponding value for the computed column when the case is matched. The first case matched wins if multiple cases can match, and an ELSE default may be supplied to provide a value when no case is matched.

When wrapped in an aggregate function, like sum() or avg(), a CASE statement becomes a convenient means to create crosstab-like output in a simple query. Looking at Northwind then, our first aggregate query computes the total freight charges for each customer, with the shipping companies broken out as columns.

select CustomerID as [Total Freight by Customer]
, sum(CASE WHEN ShipVia = 1 then Freight ELSE 0 END) as [Speedy Express]
, sum(CASE WHEN ShipVia = 2 then Freight ELSE 0 END) as [United Package]
, sum(CASE WHEN ShipVia = 3 then Freight ELSE 0 END) as [Federal Shipping]
from Orders
group by CustomerID
order by CustomerID


Each CASE statement in this example specifies a single condition and a default ELSE. Before applying the sum() function, the CASE statements compute three columns of detail fields to the normal Orders table. Two of the three computed fields will be zero; the column which matches the ShipVia condition will contain the value of the shipping Freight. When the sum() function is then applied, the aggregations are computed in the correct columns.

A simple change to the ELSE defaults from zero to null allows the query to be adjusted to compute averages:

select CustomerID as [Average Freight by Customer]
, avg(CASE WHEN ShipVia = 1 then Freight ELSE null END) as [Speedy Express]
, avg(CASE WHEN ShipVia = 2 then Freight ELSE null END) as [United Package]
, avg(CASE WHEN ShipVia = 3 then Freight ELSE null END) as [Federal Shipping]
from Orders
group by CustomerID
order by CustomerID


To compute counts, one might intuitively think of the count() aggregation function, but I find it simpler to follow this same pattern and use sum(). The CASE statements are modified so that when the appropriate condition is matched the computed value is simply a static value of one (rather than the Freight field in the examples). The default ELSE computes a zero. When the sum() aggregation is applied across all the matching ones, the result is a successful counting of the appropriate records for the shipper:

select CustomerID as [Orders Shipped by Customer]
, sum(CASE WHEN ShipVia = 1 then 1 ELSE 0 END) as [Speedy Express]
, sum(CASE WHEN ShipVia = 2 then 1 ELSE 0 END) as [United Package]
, sum(CASE WHEN ShipVia = 3 then 1 ELSE 0 END) as [Federal Shipping]
from Orders
group by CustomerID
order by CustomerID


Wrapping CASE statements within aggregate functions is a clean and surprisingly readable way of creating crosstabular output. Tune in tomorrow for part 2 where we will use dynamic SQL to generalize this query, allowing the values in the Shippers table to drive the CASEs rather than hardcoding them as we've done here.

Thursday, August 6, 2009

Discoveries: Her & Kings County

Generally speaking, I am not a big fan of country music. When I first heard about the band Her & Kings County, I was told they're a good country band, actively gigging across the U.S.A and opening for the likes of Kid Rock. While the country influence is definitely there, to call Her & Kings County a country band is to unfairly limit them to a single genre when there are all kinds of musical influences present and well-blended: pop, blues, bluegrass, jazz, and good hard rock & roll. Should this group find success with a major label, or as an indie, I am sure they'll be classified as "crossover" in the best sense of the word.

In my opinion, these guys rock! They're definitely worth a listen.
http://www.myspace.com/her

Monday, August 3, 2009

How I became Wii-tarded

The Nintendo Wii has recently reached a milestone, selling over 50 million units since its debut in 2006. Its typically-bundled companion, Wii Sports, has worldwide lifetime sales in excess of 47 million. A quick look through Wikipedia suggests that Wii Sports is the best-selling video game of all time, and thus quite likely the most-played video game of all time.

Of course, Wii Sports' sales figures benefited greatly from the game being bundled with the Wii console, but that just shows the genius of Nintendo back in 2006. The Wii, with its motion-sensing remote control and nunchuck attachment, was a system with a built-in barrier: would users really take to the novelty of playing games with physical gestures as input, unlike any system that came before? Selling the system for hundreds less than the competing XBox 360 and PlayStation 3, combined with the bundling of Wii Sports - a game that was instantly accessible, proved the capability of the new input controls, and turned out to be fun for the whole family - more than anything else made the Wii an instant hit and the must-buy item for Christmas in 2006.

It is in recognition of the Wii's success that I present a homage in the form of my own story - the true story of a father's crazy hope to get that must-buy Christmas item and how it ultimately worked out in the end.

How I became Wii-tarded
A Christmas tale
December 2006

Twas the week before Christmas, and all through the town,
Some parents were stirring, for Wii's to be found.
Their bodies in lines, too stubborn to quit,
Till out came... oh screw this rhyming shit!

There were several big-chain stores near my house, within a block or two of eachother that were carrying Wii's on Sunday the 17th. I knew this because I did my research! Surprisingly, this research wasn't as easy as one might think. Asking the question "do you know if you'll have Wii's in stock on Sunday?" would always result in a "no". But asking the question "So what is the process you will be using this Sunday when you sell your Wii's?" got me much more useful information.

My target was Target, close to home, which would be, according to the phone voice of a clerk in electronics, "using a ticket system" (that answer confirming that they in fact would have Wii's in stock). Okay! Get in line early enough, get a ticket, I'm in business. Some more web research revealed that all kinds of stores would be selling Wii's on Sunday; Target seemed my best bet.

My Target opened at 8:00 Sunday morning, so I figured getting there by 7:00 would be fine. It has been a month since this system launched, after all. The scalpers are probably done. I actually pulled up around ten til 7:00 and saw one guy walking up to the door and no line as of yet. Great! This will give me a little time to get some coffee! I'm set.

On the way to Starbucks a block away, I passed Toys'r'Us, which also opened at 8:00 but had a significant line, about 25 people or so (not huge, but significant). A manager was outside pointing and counting. I pulled up and called out my window, "Is it worth parking and getting in line?". The manager yelled back "Nope- there's more than enough here to cover my stock". Okay - no biggie. There was only one person at Target, and I know they'll have 'em in stock.

I pulled up to Starbucks and got my Venti Americano. Sure, you'd think a Grande would be more than sufficient, and on a typical Sunday morning it would be. But it was cold and windy out, and I needed to pack some heat. It's a hunt after all! Venti it is.

There's a Circuit City right next to the Starbucks, so I walk over and see a more modest line - only about 18 people or so. Guessing that Circuit City might have 20 or so in stock, I get on the line and am informed by the others (who are in a friendly - you might even say jolly - mood) that I'm #14... even better! There are couples in the line, and only one Wii going to a household. And this wasn't a line of eBay scalpers - these were parents and grandparents who wanted the system for their families. They were downright enjoyable. I struck up a conversation with one guy in an Iowa Hawkeyes jacket, informed him I was Hawkeye alumni myself, and it turned out just about everyone on the line (evenly split between males and females) were not only college football fans, but Big Ten fans specifically. We had a fun conversation for about fifteen minutes, and everyone was in a good mood.

Until the manager came out and said he only had 8 units in stock.

Okay. At least he came out and didn't make everyone wait in vain. It's still early (about 7:15 by now). The folks on line had said Best Buy had already had a line of about 50 or so people, so no good going there. Yup - I was smart in casing Target. Still my ace in the hole.

I pull around back to Target, and now there is a substantial line. Again, not a Harry-Potter-when-book-7-comes-out sized line, but several folks, making me wish I had stuck to the plan and ditched the coffee run. I park, get on line, and am told I'm #13... then someone on the line says "A lady came out about 10 minutes ago saying she thought they had 12 Wii's in stock." Gulp. Okay, I'm not so smart. The guy in front of me (again, the line was a very friendly group of mostly parents and grandparents) suggested I wait because she had said she was going back to make sure. Just one more... find just one more...

The guy at the very front of the line (the one I saw walking up earlier) sounded less confident though. There was a GameStop in the same mall area as this Target, and there were about six people lined up there waiting. "GameStop opens at 10:00 - and those folks were there when I got here... why wouldn't they be in line over here?" Ahh... they're not thinking that Target will have Wii's today. Ha ha on them! We in our line knew better. Our line could beat up your line!

But then something really interesting happened. After another ten minutes or so, a group of five boys, looking either like they're seniors in high school or maybe in their first year or two of college, walk up. They're in a group like a gang, showing some attitude, definitely not dressed for the cold (sporting all the teenage angst of hip-hop lowriding shorts in 28-degree weather), and they come up to the line which had added about five or six more behind me at this point. One of the boys then says, "hey, what are we doing in the back of the line?" And the group proceeds to strut around to the front by the store doors.

Now, it's useful for me to point out that I had already resolved in my head that this was probably a losing proposition for me. Even before 7:00 I was prepared to accept that my family wouldn't get their Wii this Christmas. What the hell, I had thought... I'd at least give it a try and see. Hearing that there were only 12 in stock at this Target had made me come to terms with not getting a Wii as a certainty. Frankly, I was starting to get irritated with myself to still be sitting on the line with such little hope. I could be driving back to Starbucks and enjoying my coffee in a warm room with a newspaper, after all. But...

The appearance of this gang of boys added a drama that was too juicy to leave. The "whoosh" sound I heard may have been a gust of wind, or the collective jaws of the people in front of me hitting the ground. The tiny moment of uncomfortable silence was just enough for my brain to play the main theme from The Good, The Bad, and the Ugly.

One of the guys toward the front of the line - a man a few years older than me, burly and stocky, with a gruffness in his voice and stature that suggested he wouldn't take no crap from anyone even on a sunny day - stepped out to confront the gang. "So what do you think you're doing, boys? The line is back there." Now, if these boys had been a bit savvier, they may have realized what it looked like to all the grownups already on this line and responded differently than they did. As it was, one of them said, "We were here first." Clint Eastwood's hand is twitching over his holster!

I'm sure it was fortunate for everyone that there wasn't any real fighting that came of this, though a sick part of me wished I had a box of popcorn.

Parsing through the ensuing flurry of raising voices, the boys were contending that they had actually been standing in line at this Target since the night before. Gulp. The manager had come out at about 6:30 this morning and handed out 12 tickets to the first 12 people who had been in line overnight. Gulp gulp. Several of the grownups were very skeptical, but it was making sense to me. I knew this Target would be using a ticket system, and it would explain why the GameStop folks were where they were, if they knew Target's Wii's were already gone. Still, the skeptical grownups demanded proof, and one of the kids showed his "ticket"... hmmm... it hardly looks official - just a scrap of plain white paper with a number in magic marker and a scribble that might or might not be a signature? Okay, this could be a scam, and I decide to wait while the front-liners sort it out.

As the confrontation progresses (with no sign of a Target manager to confirm anything) it seems to me more and more that the boys are legit. They don't sound like a gang at all, they sound like normal kids to me. I start talking to one of them, and again have a conversation as enjoyable as I had had earlier at Circuit City. These are good kids - they're not scamming anyone. And sure enough, another group of three people - this time grownups - come up to the line and show their tickets. Now it is clear - everyone in the line since 7:00 have been waiting in vain, and some do leave. I'm now enjoying the conversation I'm having with the group of boys, and the newcomers (a young mother, a grandmother, and a man wearing a Michigan sweatshirt - it was Big Ten Sunday for me!) so I stay for a few more minutes.

The boys had had a lousy night. We had freezing rain that night in Vegas, with gusting cold winds. These boys definitely did not dress for the weather (one was noticeably shivering now). "It really wasn't so bad until around 1:00 in the morning... then the winds got really bad..." they told me. They were bright, had loads of energy - it was rejuvenating to be in that youthful company. They were all college-aged; two working full-time at this very Target, one getting an associates degree at community college. I remember standing in line for a Star Wars movie once when I was young ;-) This is a full month after the initial launch of this Wii system, and still there were people waiting in overnight cold lines to get one. Yea, they wanted it more than me, no hard feelings. Thanks for the great conversation! Happy Holidays guys, congratulations, have a blast with your new Wii's... and I started to walk away.

And then a woman pulled up in a VW bug (there was a whole new group in line behind me by this point, who had missed the initial information and weren't buying the ticket story). She called to the line, "is anyone willing to sell their spot?" Hmmmm... there's an idea.... and the lady doesn't even realize she's not offering to buy a spot in line, she's actually offering to buy a guaranteed ticket! Nobody responded, and the lady drove away. I looked at the kids and said, "Hey, any of you want to give me your ticket for fifty bucks?" I happened to have the cash on me (I normally don't) and to my amazement, one of these kids' eyes lit up. It turns out, he was perfectly happy playing his friend's Wii until he got his own next month, and was thrilled to get some extra cash for the holiday. I almost tried talking him out of it ("you waited all night you know... are you sure?") but by that point we had all gotten very friendly and he and I completed our transaction. I got ticket #6 for my troubles, and actually gave him $100, both because it was what I had on me, and because it just seemed like the right thing to do.

Another woman (a determined grandmother) saw our exchange and bought another ticket from one of the other four boys. We both thought it was money well spent.

At 8:22, I left my Target, with a Wii, a box of crayons, and some kids' underwear. Target does have some other stuff you know. I'm not a complete freak.

Saturday, August 1, 2009

Slightly More Sophisticated WinForms Dirty Tracking in C#

The C# programming tip this week is a follow-up to last week's Simple Dirty Tracking for WinForms in which I demonstrated a simple technique for coding the functionality of tracking whether or not a user has changed a document since its last save. This is useful for prompting the user to "save changes" upon closing a "dirty" document. The technique involved creating a controller class that assigns event handlers to the "changed" events for tracked input controls, and exposes an IsDirty property.

This controller can tell if a user has changed the value in a text box, but it can't tell if the user changed the value back. The document wouldn't need saving in this case, but the simple controller would still report that the form is dirty based on the simple condition of text changing (in any way) in the TextBox control.

So this week's programming exercise is to create a slightly more sophisticated Dirty Tracker for WinForms - one that determines if the document is dirty by comparing values in input controls to the values remembered at the time of the last save (or initialization). This approach does not require event handling, but instead requires tracking a collection of controls and their "clean" values for later comparison.

To start, we create a ControlDirtyTracker class that performs the ... um... dirty work. It is within this class that we'll establish which control types are trackable, and record the clean control values as a private string. A slightly, slightly more sophisticated class might store control values as the more generic object type, but we'll use a string here for simplicity.

public class ControlDirtyTracker                                                                
{
private Control _control;
private string _cleanValue;

// read only properties
public Control Control { get { return _control; } }
public string CleanValue { get { return _cleanValue; } }

...

}

We'll also decide which control types to support, and how to obtain a given input control's current value:
public class ControlDirtyTracker                                                                    
{
...

// static class utility method; return whether or not the control type
// of the given control is supported by this class;
// developers may modify this to extend support for other types
public static bool IsControlTypeSupported(Control ctl)
{
// list of types supported
if (ctl is TextBox) return true;
if (ctl is CheckBox) return true;
if (ctl is ComboBox) return true;
if (ctl is ListBox) return true;

// ... add additional types as desired ...

// not a supported type
return false;
}


// private method to determine the current value (as a string) of the control;
// developers may modify this to extend support for other types
private string GetControlCurrentValue()
{
if (_control is TextBox)
return (_control as TextBox).Text;

if (_control is CheckBox)
return (_control as CheckBox).Checked.ToString();

if (_control is ComboBox)
return (_control as ComboBox).Text;

if (_control is ListBox)
{
// for a listbox, create a list of the selected indexes
StringBuilder val = new StringBuilder();
ListBox lb = (_control as ListBox);
ListBox.SelectedIndexCollection coll = lb.SelectedIndices;
for (int i = 0; i < coll.Count; i++)
val.AppendFormat("{0};", coll[i]);

return val.ToString();
}

// ... add additional types as desired ...

return "";
}

Finally we add the constructor, passing the control to track, a method to establish the current control value as the "clean" value, and a method to determine if the control value has changed since the remembered "clean" value.
public class ControlDirtyTracker                                                                    
{
...

// constructor establishes the control and uses its current value as "clean"
public ControlDirtyTracker(Control ctl)
{
// if the control type is not one that is supported, throw an exception
if (ControlDirtyTracker.IsControlTypeSupported(ctl))
_control = ctl;
else
throw new NotSupportedException(
string.Format("The control type for '{0}' is not supported by the ControlDirtyTracker class."
, ctl.Name)
);

}


// method to establish the the current control value as "clean"
public void EstablishValueAsClean()
{
_cleanValue = GetControlCurrentValue();
}


// determine if the current control value is considered "dirty";
// i.e. if the current control value is different than the one
// remembered as "clean"
public bool DetermineIfDirty()
{
// compare the remembered "clean value" to the current value;
// if they are the same, the control is still clean;
// if they are different, the control is considered dirty.
return (string.Compare(_cleanValue, GetControlCurrentValue(), false) != 0);
}


// end of the class
}

Since we'll be tracking multiple input controls, we'll create the collection class ControlDirtyTrackerCollection. In it we'll define methods to add controls from a form, to list all the tracked controls that are currently dirty, and to establish all tracked controls as clean.
public class ControlDirtyTrackerCollection: List<ControlDirtyTracker>
{

// constructors
public ControlDirtyTrackerCollection() : base() { }
public ControlDirtyTrackerCollection(Form frm) : base()
{
// initialize to the controls on the passed in form
AddControlsFromForm(frm);
}


// utility method to add the controls from a Form to this collection
public void AddControlsFromForm(Form frm)
{
AddControlsFromCollection(frm.Controls);
}

// recursive routine to inspect each control and add to the collection accordingly
public void AddControlsFromCollection(Control.ControlCollection coll)
{
foreach (Control c in coll)
{
// if the control is supported for dirty tracking, add it
if (ControlDirtyTracker.IsControlTypeSupported(c))
this.Add(new ControlDirtyTracker(c));

// recurively apply to inner collections
if (c.HasChildren)
AddControlsFromCollection(c.Controls);
}
}

// loop through all controls and return a list of those that are dirty
public List<Control> GetListOfDirtyControls()
{
List<Control> list = new List<Control>();

foreach (ControlDirtyTracker c in this)
{
if (c.DetermineIfDirty())
list.Add(c.Control);
}

return list;
}


// mark all the tracked controls as clean
public void MarkAllControlsAsClean()
{
foreach (ControlDirtyTracker c in this)
c.EstablishValueAsClean();
}

}

At this point, we now have in our collection class the means to track input controls on the form with little additional work. We instantiate the collection in the Load event of the form:
// form private member
private ControlDirtyTrackerCollection _trackedControls;

private void Form1_Load(object sender, EventArgs e)
{
// in the Load event initialize our tracking object
_trackedControls = new ControlDirtyTrackerCollection(this);
_trackedControls.MarkAllControlsAsClean();
}

We call _trackedControls.MarkAllControlsAsClean() whenever the document is saved. Then, when the form is closing, we can prompt the user to save again if any of the values in the tracked controls have changed.
private void Form1_FormClosing(object sender, FormClosingEventArgs e)
{
// in the closing event, we should prompt to save if there are any dirty controls
if (_trackedControls.GetListOfDirtyControls().Count > 0)
{
// prompt the user
if (MessageBox.Show("Would you like to save changes before closing?"
, "Save Changes"
, MessageBoxButtons.YesNoCancel
, MessageBoxIcon.Question)
== DialogResult.Yes)
{
// if the user says Yes to save...
SaveTheDocumentHoweverYouUsuallyDo();
_trackedControls.MarkAllControlsAsClean();
}

}
}

And that's it. It is a relatively small amount of code to add to the form for tracking input control changes between saves, and the tracking objects are reusable. There's a little more work to do to create the controller classes than our previous attempt, but it is still simple to code and doesn't turn out the false positives we had before.