Evidence Based Decision Making
Feb. 2, 2009 by ravishan
Ever since I read Super Crunchers by Ian Ayres and then re-read Freakonomics by Steven Levitt, I have been energized by the notion that so much data is already available, or can be collected, to help guide many of our decisions. Many of our technology decisions tend to be driven by data that is either very biased or not representative. We typically talk about new technologies because we are excited about them and feel that they will benefit our user community, and then we start collecting data to support them in an incomplete fashion: mostly by contacting those who have already adopted them, rather than a proper sample that also includes those who have not, or those who wanted to but chose not to. Of course, as with everything else, there are exceptions to all such statements.
I would therefore encourage you to start thinking about using the enormous amount of data that already exists, or to suggest collecting new data, to guide our decision making. It goes without saying that when it comes to data, we need to be extremely careful to respect privacy. Simply because the technology allows us to collect data does not mean that we can be careless about these issues. We need to make sure to consult with the appropriate offices if we are talking about data that involves our own community.
I can provide some simple examples of where we have successfully used the data to guide us or can potentially use the data to guide us.
- We looked at how many students were forwarding their email elsewhere, and to which service, consulted with the WSA, and moved the students to Google Apps for Education.
- We have been looking at the students' network usage between 9 PM and 2 AM so that we can reshape the bandwidth, by tweaking per-user allocation and other techniques, and give them a better experience. We have engaged our student staff in both the decision making (OK, we were late to do this, but we are doing it now) and the testing. We are also trying to purchase additional bandwidth from the state just for this period (I wish they would move a little faster!!!).
- We also have other network monitors that we use for capacity planning, but we need to do more. For example, we increased the number of wireless access points in Park Washington and plan to do the same in High Rise. The network flow data we have should have told us that there were issues with both the extent of coverage and overcrowding, but we started to look at it only after we began receiving complaints.
- We have a plan to use our own web logs and link analyzers, as well as Google Analytics, to help us archive literally thousands of web pages on our web site. We are preparing a proposal to the Cabinet on this (as recommended by the Web Committee) so we can work with individual departments on this “Operation Clean Start” (this is not an official name!!!).
- The web logs give us some unique perspectives on the visitors to our websites. For example, we know that most of the monitors used to view our website have a resolution of 1024 x 768 or better. Of course, many of us probably see this in our own experience, but seeing the actual data from a diverse set of users goes a long way toward helping us design our websites. Similarly, there are other useful data, such as the types of browsers our users are running, that should help guide the application development process.
- We are beginning to use our Keyserver as a source of data (this data has been collected for a while; we just have not had an opportunity to analyze it yet) to understand public computer lab usage. After we have enough data, we will prepare a proposal (or not, depending on what the data says) on how best to reconfigure the computers in the labs to reflect the usage. For example, can we replace some of them with thin clients? Armed with the data, we can have a more meaningful conversation with the WSA and ATAC when we make proposals. Otherwise, all we have is incomplete data (“Whenever I walk by, I see the Science Tower Lab not full”).
Most of these examples refer to data that is already collected, and we try hard not to ask questions that cross privacy boundaries. In other words, the analyses of web usage, for example, are summaries and do not necessarily have personal identities associated with them. In the end, these summary data are much more useful for what we are trying to do than any data that has personal information associated with it.
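To make the idea of a summary without personal identities a little more concrete, here is a minimal sketch in Python. It is only an illustration: the sample records, the field names (`ip`, `user`, `browser`, `resolution`) and the helper functions are all made up for this example and are not how our web logs or Google Analytics actually store or report data.

```python
from collections import Counter

# Hypothetical visit records; every field name and value here is an
# assumption made for illustration, not real log data.
SAMPLE_VISITS = [
    {"ip": "10.0.0.1", "user": "visitor1", "browser": "Firefox", "resolution": "1024x768"},
    {"ip": "10.0.0.2", "user": "visitor2", "browser": "Internet Explorer", "resolution": "1024x768"},
    {"ip": "10.0.0.3", "user": "visitor3", "browser": "Firefox", "resolution": "1280x1024"},
    {"ip": "10.0.0.4", "user": "visitor4", "browser": "Safari", "resolution": "800x600"},
]

# Fields we treat as personally identifying (again, an assumption).
IDENTIFYING_FIELDS = {"ip", "user"}


def strip_identity(record):
    """Return a copy of the record without the identifying fields."""
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}


def summarize(records, field):
    """Return the percentage of visits for each value of `field`.

    Identifying fields are dropped before counting, so the result is an
    aggregate that says nothing about any individual visitor.
    """
    counts = Counter(strip_identity(r)[field] for r in records)
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


if __name__ == "__main__":
    print(summarize(SAMPLE_VISITS, "resolution"))
    print(summarize(SAMPLE_VISITS, "browser"))
```

The point of the sketch is simply that questions like "which resolutions or browsers are most common?" can be answered from counts alone, once the identifying fields have been set aside.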
We should also be looking at other sources of data to help us understand the trends. ECAR (EDUCAUSE Center for Applied Research) and The Horizon Report, to name a couple, have a tremendous amount of data. I also poll our colleagues in CLAC (Consortium of Liberal Arts Colleges) when I want to get a sense of what they are doing. I typically get 25-35 responses, which is not a lot, but they are very valuable because of our proximity to each other in terms of size and educational philosophy.
