Meeting with Administrative Systems Staff
Aug. 26, 2006 by ravishan
It has been a while since I wrote anything here, mostly because I have been very busy with meetings and vacation. There are many things I want to write about, so I had better get going…
I had a very interesting meeting with the administrative systems staff. Because of the size of the group and the length of time we met, I don’t think we got to hear from everyone. So Steve and I decided that we will follow this up with meetings with smaller groups.
The two main things that came up during the meeting were how very high expectations for system uptime cause problems, and what exactly our long-term strategy should be on the choice of programming languages and development environments.
The first issue is that our users’ expectations for service uptime are so high that they interfere with the need to do routine maintenance and other upgrades. Basically, it is very hard to find a time when we can take a system down for maintenance. In many cases this is true; however, we do have scheduled downtimes for selected services, and by properly informing our users we have been able to deal with this problem. We decided to examine exactly where this is currently causing problems and find a solution. It is true that our users’ uptime expectations are very high, but we have been able to meet them by working together as a group and by devising and investing in clever backend architecture. Because all our systems are so intertwined, if one of the underlying systems is down, many interconnected services go down with it. This simply means we need better coordination and communication to accomplish what we need.
The second issue is one about which we have had many earlier discussions. As someone who has done a lot of programming over the past several years (and continues to do so…), I have enjoyed the independence to choose the programming language as well as the development toolset. Organizations such as ours have always used different languages and toolsets – typically the systems programmers used a different set than the administrative programmers and those supporting academic users. However, recent developments in web-based service delivery have blurred these boundaries, and naturally this causes concern.
Almost all of what we do now has a database component, but the richness of the available toolset means that application development no longer needs to be tightly coupled to the database language. In other words, it is no longer necessary to do all the database programming in PL/SQL if Oracle is the backend DB. One can argue about the efficiency of doing it in PL/SQL vs Java or Perl or PHP, and that is indeed important. However, the fact that we have expertise in diverse areas, and that all of these people can participate in application development and deliver many more applications, is a very powerful argument for using these languages. At the same time, it is true that the traditional methods of application delivery and support practised by the administrative systems programmers are not followed in many cases. And this is the cause for concern.
I do not believe that there is an easy answer to the question of whether we should centralize on a single platform for all our application development. First of all, it is impractical – if the answer is yes, we would have to convert a ton of applications and retrain many of our staff in whatever that single platform turns out to be. And above all, this would turn out to be a very unhappy place to work, at least for a while… At the same time, we also cannot have a policy under which decentralized application development gets out of control – for example, someone loves Python and is developing something in Python! The current choice of languages seems to work fine for us, and maybe over the next few years we should strive to converge towards Java…
In my mind, the more important issue is one of quality control – are we developing applications in such a way that they are supportable, scalable and sustainable? This is where we have to continue to be very vigilant.

All I know is that I’ve been trying to get downtime for post1/post2 for well over a month so that we can cluster them together. Currently, if one fails, there is no takeover. Fixing this would probably take 5 minutes of downtime, if that. It has a good chance of being seamless, in truth.