what i did

This visualization was a really fun one I did a couple of years ago. It shows how every team in a bracket fared over the course of a season.


This was one of the first visualizations I did, and I built it with PHP. Unfortunately, I've since disabled PHP on this site, so it won't work until I port it.

why i did

My daughter was playing competitive soccer and I wanted to see what the team's prospects were in relation to the rest of the bracket as the season progressed. I originally created this in Excel (yikes!), but maintaining it as the season went on became untenable: not only did I have to update her team's results every week, I also had to update every other game in the bracket. This got to be too much, so I wrote a Python screen-scraper to pull the data and then pasted the output into Excel.

All data from Presidio League.

This also got to be too much of a burden because I couldn't control Excel as much as I wanted to and because I wanted to display more info than Excel's graphics would allow. D3 and Python to the rescue.

how i did

My system works like this:

  1. Python screen-scraper to mysql table
  2. mysql scripts for analysis and error checking
  3. export data to json
  4. D3/javascript to show data

Once I had the above working the way I liked, I created a Python wrapper screen-scraper that found all of the Presidio brackets and called the original scraper as a function, running the above steps on each bracket. I had used mysql to store and analyze the results of the single bracket, so it was trivial to extend that to manage ALL the Presidio brackets by creating a new table for each bracket.
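For the curious, here's a rough sketch of what that wrapper loop looks like. This is not my actual code: it uses Python's built-in sqlite3 in place of mysql so it runs anywhere, and `scrape_bracket`, the table layout, the bracket names, and the sample scores are all made up for illustration.

```python
import json
import sqlite3

# Hypothetical sample data standing in for scraped pages:
# bracket name -> rows of (week, home, away, home_goals, away_goals)
SAMPLE_BRACKETS = {
    "girls_u12_a": [
        (1, "Fusion", "Matrix", 3, 0),
        (1, "Strikers", "Rebels", 1, 1),
    ],
    "girls_u12_b": [
        (1, "Surf", "Albion", 0, 2),
    ],
}


def scrape_bracket(name):
    """Stand-in for the real single-bracket scraper."""
    return SAMPLE_BRACKETS[name]


def store_bracket(conn, name, rows):
    """Create one table per bracket and load its game rows."""
    # Table names come from our own dict above, never from user input.
    table = f"games_{name}"
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(week INTEGER, home TEXT, away TEXT, "
        "home_goals INTEGER, away_goals INTEGER)"
    )
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?)", rows)


def export_bracket(conn, name):
    """Dump one bracket's table as a JSON string for D3 to load."""
    cur = conn.execute(
        "SELECT week, home, away, home_goals, away_goals "
        f"FROM games_{name} ORDER BY week, home"
    )
    columns = [col[0] for col in cur.description]
    return json.dumps([dict(zip(columns, row)) for row in cur.fetchall()])


def run_all(conn):
    """Wrapper: scrape, store, and export every known bracket."""
    exports = {}
    for name in SAMPLE_BRACKETS:
        store_bracket(conn, name, scrape_bracket(name))
        exports[name] = export_bracket(conn, name)
    return exports


conn = sqlite3.connect(":memory:")
exports = run_all(conn)
```

Each value in `exports` is a JSON string that the D3 side can load as a plain array of game objects.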

my take away

Here are my daughter's team's results. What you can't tell from this post-season graph is that, as the season progresses, it shows you the best possible outcome for any particular team. Towards the tail end of a close season it's hard to figure out which teams still have a shot at winning the bracket.
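The "best possible outcome" idea is simple enough to sketch in a few lines of Python. This is a simplified version, not the exact code behind the graph: it assumes the usual 3 points for a win, treats every team's remaining games as wins, and ignores the fact that two contenders can't both win when they play each other. The point totals below are invented.

```python
POINTS_PER_WIN = 3  # assumption: standard 3/1/0 soccer scoring


def best_possible(points, games_played, games_in_season):
    """Upper bound on each team's final points if it wins out."""
    return {
        team: points[team]
        + POINTS_PER_WIN * (games_in_season - games_played[team])
        for team in points
    }


def still_in_the_hunt(points, games_played, games_in_season):
    """Teams whose ceiling at least matches the current leader's points."""
    ceiling = best_possible(points, games_played, games_in_season)
    leader_points = max(points.values())
    return sorted(team for team, cap in ceiling.items() if cap >= leader_points)


# Hypothetical mid-season standings after 6 of 10 games
points = {"Fusion": 18, "Strikers": 12, "Matrix": 0}
games_played = {"Fusion": 6, "Strikers": 6, "Matrix": 6}

print(still_in_the_hunt(points, games_played, 10))  # ['Fusion', 'Strikers']
```

Because it's only an upper bound, this can keep a team "alive" a week or two longer than a full what-if analysis would, but it never writes off a team that can still win.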

I showed this to a couple of parents who either have kids in club sports or were competitive players themselves. They loved it and got it right away. It's much easier than having the "but these two teams still haven't played and if Team X wins against Team Y then Team Z can still run away with it..." conversation.

In this bracket's results it looks like Fusion had a lock on the season in the first half, but the "missing" data lines showed several teams were still in the hunt.

Each league has its own peculiarities in how it scores brackets. When I did the single bracket it was pretty easy because there weren't any exceptions. In some of the other brackets there were points deducted for red cards, games that were never played and so had their points averaged, and so on. I had to find all of these little quirks and codify them (remember kids: code is law!). In a couple of brackets there were some arbitrary point changes that I could not track down, even after asking the league why they'd made the changes.
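To give a flavor of what "codifying the quirks" means, here's a minimal Python sketch. The actual league rules aren't reproduced here, so treat everything as an assumption: 3/1/0 scoring, a one-point deduction per red card, and an unplayed game credited at the team's average points per played game. `TeamRecord` and `bracket_points` are names I made up for this example.

```python
from dataclasses import dataclass


@dataclass
class TeamRecord:
    wins: int
    draws: int
    losses: int
    red_cards: int = 0
    unplayed: int = 0  # scheduled games that never happened


def bracket_points(rec, red_card_penalty=1.0, average_unplayed=True):
    """Bracket points under the assumed rules described above."""
    played = rec.wins + rec.draws + rec.losses
    earned = 3 * rec.wins + rec.draws  # assumed 3/1/0 scoring
    total = float(earned)
    if average_unplayed and played > 0:
        # Credit each unplayed game with the team's average per played game.
        total += rec.unplayed * (earned / played)
    # Deduct the (assumed) penalty for every red card.
    total -= red_card_penalty * rec.red_cards
    return total


record = TeamRecord(wins=4, draws=2, losses=2, red_cards=1, unplayed=1)
print(bracket_points(record))  # 14.75
```

Keeping each quirk behind its own parameter made it easier to switch rules on and off per bracket, which mattered because no two brackets seemed to apply them the same way.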

When my daughter switched to a new league it was pretty easy to create some new screen scraper logic for the new website. I didn't bother asking for direct access to the database. :)

side note

In this graph my daughter's team came in first. I felt really bad for the Matrix team because they didn't win a single game. This last year my daughter's team was in the same situation. They were terrible (what's the opposite of "undefeated"?), but not winning was just a symptom of bigger problems. The upside was that, as keeper, my daughter got more experience than any other keeper in the bracket!