Wednesday, March 14, 2007

Sports and power ranking systems

I've had an interest for a long time in the science of sports team rankings, for various reasons, which was forcibly brought to mind when I was filling out my tournament bracket. I'm always extremely mediocre in such game-picking contests, and when one blogger whom I respect said something similar, I started to think about how rankings could be done for college basketball. I looked a little closer at the Pomeroy Rankings, which seem pretty nice, although if Ken has published his exact formula for creating them, I couldn't find it. IMO a ranking system can't be taken seriously if the method that is used isn't known. Take Jeff Sagarin: Everyone always prints his rankings very seriously, but we don't know what he's doing, so he might as well be making them up and just pretending it's math.

But Ken has something much more valuable on his site than a ranking system: a game database. For the most part, this information is not available in any easy-to-get-at form, so if you want to create the rankings, you have to get down and do the data entry every year, which is why I've never created any system that lasted more than a year. But now, with Ken's files, maybe something useful could be done.

So I did a little research, thinking that the most effective system probably was going to be some kind of balance between a single-game Pythagorean expectation and strength of the opponent, repeating until the numbers converged. I'm sure I read a paper about that some years ago, but I can't find it now. Instead, I found this, a technique which doesn't take into account the scores at all!

But it's interesting, because it's based on the age-old theory of game transitivity; to wit: my team beat team X and team X beat your team, so my team is better than yours. Yah. It's a principle that's been widely derided for years, and people make hobbies out of finding weird cycles of games proving that Prairie View A&M is really better than Michigan after all. But there's obviously a kernel of truth in it. The paper goes into a lot of detail about setting up the graphs and putting weights on things and, you know, math, but really the principle is pretty simple. It works like this:

For each game that my team wins, it gets partial credit for each win the team it beat has.
For each game that my team loses, it gets partial debit for each loss the team it lost to has.

That's it. The questions are, do you want to go deeper and credit my team for a third or fourth level, and just how much credit do you give for each "indirect win"? The second question is easier for our purposes, because the authors of the paper do a lot more of that math stuff and come up with a simple equation for us:

Let k equal the average number of games played by each team.
The credit is (2k) / ((k^2) - k ).
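
To make the arithmetic concrete, here is a minimal Python sketch of the two-level rule and the credit formula above. The function names, the toy list of games, and the choice to count each direct win or loss as exactly 1 are my own assumptions for illustration; this is only the simple two-level version described in this post, not the paper's full graph-based method.

```python
from collections import defaultdict


def credit_from_average_games(k):
    """Credit per indirect win, using the post's formula (2k) / (k^2 - k)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return (2 * k) / (k ** 2 - k)


def two_level_scores(games, credit):
    """Score every team from a list of (winner, loser) pairs.

    Assumption: a direct win counts +1 and a direct loss counts -1.
    Each win also earns `credit` per win of the beaten team, and each
    loss also costs `credit` per loss of the team that was lost to.
    """
    beaten = defaultdict(list)   # team -> opponents it beat, one entry per game
    lost_to = defaultdict(list)  # team -> opponents it lost to, one entry per game
    for winner, loser in games:
        beaten[winner].append(loser)
        lost_to[loser].append(winner)

    teams = set(beaten) | set(lost_to)
    scores = {}
    for team in teams:
        win_score = sum(
            1 + credit * len(beaten.get(opp, [])) for opp in beaten.get(team, [])
        )
        loss_score = sum(
            1 + credit * len(lost_to.get(opp, [])) for opp in lost_to.get(team, [])
        )
        scores[team] = win_score - loss_score
    return scores


if __name__ == "__main__":
    # Hypothetical four-team season, purely for illustration.
    toy_games = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
    credit = credit_from_average_games(21)  # 42 / 420, i.e. 10% of a win
    for team, score in sorted(two_level_scores(toy_games, credit).items()):
        print(team, round(score, 2))
```

Note that the formula simplifies to 2 / (k - 1), so an average of 21 games per team gives a credit of 0.1. In the toy season, A's two wins are each worth 1.1 because both beaten teams have one win of their own.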

For a third level, you'd square the credit, etc. But do you want to do the third level? Say the credit is .1, or 10% of a win. For the third level the credit would be .01, which doesn't seem like much, but you're talking quite a few games, too. So I'm going to have to use Ken's game database and do some research on this. Any code I create will be open-source, of course. I won't be able to do anything useful before this year's games start, but next year, watch out!

Monday, March 12, 2007

Generating classes from XML in .Net

The Oracle version of SQL has some nice keywords for returning your data in an XML format. (I suppose the other servers do too, but I've not used that feature.) When I get the XML back, I want to turn the XML into a set of business objects for easy serialization. XSD is the tool for that. Write the XML to a file, run XSD on it to generate a schema, then run XSD /c on the schema to generate the C# class file, and you've got a nice class. You can muck around with the XmlElement and XmlAttribute attributes to create nice field names, and it takes 30 seconds to put together a static Get() method that returns a class from the XML.

Except it didn't work. The serializer threw a File Not Found error. When the XmlSerializer class has a new type it needs to serialize, it just generates the code on-the-fly and throws it into a new assembly with a name like olkdzxc.dll, and returns the class from it; but when I called the serializer, it told me that olkdzxc.dll wasn't found. Very mysterious.

Luckily, I remembered Chris Sells' old tool that was made for debugging exactly this problem, XmlSerializerPreCompiler, which lets you see the compiler errors that occur while the serialization code is being generated, and one of those led me to the problem: When generating the class code for an array of objects, XSD was adding an extra set of brackets. So instead of having a class member myFoo[], I had a member myFoo[][]. Why did XSD do this? I have a hard time believing it's just a silly bug. I'd love to hear if anyone knows.

Thursday, March 08, 2007

Online and offline communities

I wrote earlier about creating online communities around business-to-business applications, with a vague promise to add more, but didn't. What rejuvenated my interest in the topic was going to a lecture and buying a book by one Joseph Myers, who lectures on creating small groups for churches. Churches generally don't like to be impersonal - it kind of misses the point - but there's no real alternative once the congregation grows beyond 100 or so people, so they like to subdivide into smaller groups so everyone has a group they can be comfortable with. What Joe Myers pointed out, as I understood it, is that it's very hard to create this sort of a group from the outside - intimacy generally arises from a group of people who like to be around each other. So, if you are a corporate executive, or a church leader, who is tasked with creating a community, how do you go about it? On the one hand, it's our job to create these communities; on the other, we know that they're not created, they just appear.

So the recommendation, for church leaders at least, is to define more exactly what groups already exist in the church. Church leaders want every small group to provide intimacy, but that's really only one way to relate to a group: the group can be more of a public group, or more of a social group, or just a personal group, and churches can take advantage of knowing how these groups relate to each other to encourage more fellowship in the church.

Is it applicable for online groups? I'm not sure. Here's the issue: at least if you're working with a church congregation, you can call a meeting, bring everyone in, discuss the issues, maybe figure out what the existing groups are and what they're doing. You can't do that online. Maybe the best thing someone tasked with creating an online group can do is simply to monitor the group, or groups, and make sure that the company is willing to go wherever the group takes them. Seems obvious, but is it? Check out Yahoo's handling of Flickr accounts, or Facebook's decision to allow non-college students to join. Or check out a lot of different online forums that die because people thought they were cool at first, but then they never changed again and everyone left for more responsive pastures.

I don't know the answers. But it's an interesting bunch of questions.

Tuesday, March 06, 2007

Quality of Local Political Blogs: Compare and Contrast

I've been writing various articles for Bloomingpedia in the last month or so, in an effort not so much to improve that site as to understand better the town in which I live. One of the keys to understanding a city, I think, is to gather a lot of different perspectives from a lot of different individuals with a stake in the matter. Take Indianapolis, for example: there are a lot of places you can go to get an impression of how the city is doing. Ruth Holladay; Matt Tully. Taking Down Words; Indy Undercover. You have to gather them all together before you can make a critical analysis of what's really going on; but they're there, that's the important thing.

Or you could read the Indianapolis Star. But I don't have much trust in the Main Stream Media. Their goal never seems to be so much the truth as it is finding someone who disagrees, no matter how foolish or inane that person may be, and unless you already know the subject matter pretty well, you can't tell from the way the article is written which is the inane perspective and which is sensible. So that leads you back to blogs.

Here are four local politicians who have been on my mind lately: Marty Hawk, Dave Rollo, Scott Tibbs, Sophia Travis. How easy is it to get their perspectives on local issues?

Far and away the best online writer in this group is Sophia Travis. If you just looked at the MSM, you wouldn't think much of her except that she's a little flaky (an accordion player with political aspirations? Weird!) But when you read her blog, not only is she talking about the tough political issues, but she's following up on comments people leave; leaving comments on other local blogs; sending in questions to local online chats; really being a part of the conversation about what Monroe County is, and what it should be. It would be great if every politician had an online presence like Sophia's.

Second best is Scott Tibbs. I actually started this post thinking about what I don't like about Scott's blog: there's no real comment area on it, just a link to a bulletin board, which I assume is also run by him, and which you have to register on before you can comment. He says that's to avoid spammers, but obviously a lot of bloggers manage to allow real comments without going to that extreme. But the point is, he writes, and discusses, and allows discussion of his views in some form. So I can't take too much umbrage, especially compared to:

Dave Rollo. He's got a web page; it's a start. The page is very static; the main page has a "last updated" date on it, but there's no way to find what was there before. There are only a few paragraphs discussing his views, there's no way to leave public comments, and if he's ever left a comment online I haven't seen it. Start a blog, Dave. He did participate in an online chat recently, and having a web page puts him ahead of:

Marty Hawk. Not much to say here, because I really couldn't find out anything. She gets quoted in the local paper from time to time, and you can go read the minutes of the Monroe Council meetings and find some things she said. But right now, the number 2 hit on Google when you search for her name is the article I wrote on her last week. So we really don't know too much about her at all. It leaves me defining her, rather than her defining herself. If that's what she wants, then that's fine.

So that's where we are in online local politics in Bloomington. It's a start. But I wish there were a lot more politicians in the conversation.