cbrad: May 2008

Friday, May 30, 2008

JAOO Brisbane 2008, Day 1

The first Brisbane JAOO finished yesterday. This is the program I attended on Day 1 and some high-level thoughts.

Keynote: Why Functional Programming (Still) Matters, Erik Meijer
Erik argued that it will be critical to reduce side-effects (mutating state, IO, etc.) and make the remaining ones explicit in our programming languages in order to succeed in a software world where programs are increasingly large, connected, concurrent and asynchronous. IMHO this was the most valuable talk of the conference, as most developers I meet don't consciously realise the impact of side-effects because they are so used to them.

Designing for Scalability, Patrick Linskey
Given Erik's talk, it was interesting to observe that dealing with mutable shared state was an underlying theme.

Interaction Techniques Using the Wii Remote, Johnny Chung Lee
This was a fun talk to attend. The interactive whiteboard aspects of Johnny's work look very cool.

Enterprise Patterns, Martin Fowler
I went to this session primarily because I hadn't heard Martin speak before. He is a very strong and confident presenter. So much so, it was like an emotional steamroller. Trying to compel the audience by force of genuine belief and some anecdotal evidence or observations, but with little supporting rationale, was disappointing. To be fair, I could have given part of this talk about 3 years ago (perhaps not as eloquently) and would have used the same methods.

Introduction to F#, Don Syme
The material Don had to talk about was interesting, although I found it hard to get a feel for the language itself with the one large demo. I think it might have been a bit easier with a series of small demos that illustrate specific points. However, a colleague of mine remarked that it was a pleasant change to see a functional programming talk that didn't involve the Fibonacci sequence.

Introduction to Real-time Programming on the Java Platform, David Holmes
I have never done anything close to real-time programming, so I went to this session to learn a bit about it. David really knows his stuff, so some of the content was lost on me. I did learn what priority inversion meant and how it affected the Mars Pathfinder mission.

Languages Panel, Don Syme, Joel Pobar, Wayne Kelly, David Holmes
Four Australians, nice.

Keynote: Simplicity in Design, Erik Dornenburg & Martin Fowler
By this time, I was pretty tired. This session started out with a similar style to the Enterprise Patterns talk, so I ducked out early for a bit of a rest before the social event.

Social Event
Dinner was at the Belgian Beer Cafe Brussels. I met some more people and had some interesting discussions, which was great.

Tuesday, May 27, 2008

Task Lists in The Milk

I think I now have my tasks well organised in Remember The Milk.

After signing up, there are a number of default task lists, some of which are: Study, Personal and Work. I deleted the Study one immediately and started putting tasks into the other two. It quickly became frustrating to be switching between the two task lists during the day. What I needed was a consolidated view of both lists, so I tried out Smart Lists.

Smart Lists are essentially saved search queries. The query language is quite powerful, so it was easy to set up the consolidated view. Unfortunately though, if I was in that smart list and created a new task, then the new task would be created in the default Inbox task list, not one of the underlying task lists (Personal and Work).

After some experimenting, it seems that creating a task in a smart list defined by a simple query over one task list works in the desired manner - the new task is in the underlying task list, not the Inbox. However, once the query becomes more complex (I don't understand the rules), a newly created task is placed in the Inbox.

The second issue I had was seeing future dated tasks and current tasks all mixed in together. Ideally I want to work predominantly with one task list/view of current tasks during the day.

So my current approach is to only have one task list for all my tasks (I have called it Tasks, surprisingly enough) and two smart lists. The first smart list is my current work (show tasks with no due date). This smart list is open in a browser tab all day and I can pretty much do all my task operations in it, including creating tasks. The second smart list shows all tasks with a due date.

Each day I need to remember to spend a short block of time dealing with tasks that have a due date for that day, by either removing their due date so they appear in the current work task list or postponing them. Perhaps I need to set up some form of automated reminder on days where dated tasks are due.

This system worked well today.

Saturday, May 24, 2008

Remember The Milk

Earlier this week I signed up for Remember The Milk, a web-based task manager. So far I have been quite happy with it.

Background
Currently my work predominantly consists of many small tasks, many interruptions and adapting to changing priorities on a daily basis. So to handle this situation, I generally view my tasks as a queue and prioritise them by their position in the queue. When I am doing a particular task, often it will decompose into smaller tasks, so I am regularly adding, completing and prioritising tasks. Given the frequency of these activities, I am after a lightweight, efficient solution. Generally I am only interested in the title and priority of a task. Occasionally tasks have a due date, such as paying a bill, but otherwise I do not wish to expend effort in calculating and updating task attributes as my circumstances change.

In the past I have had a MacBook as my work machine and used iCal and Stickies (yes Stickies the bundled Mac app) to manage tasks. All tasks for the current day went on to a desktop sticky note, the rest into iCal. I split the sticky note into two sections: TODO and DONE. During the day I could then add a new task (as a line on the sticky note) to the TODO section, re-order (i.e. prioritise) by moving lines up and down and marking a task as done by cutting the line from TODO and pasting it under DONE. Very low tech, but very fast and low overhead.

On longer running activities, I also found the sticky notes useful to collect ideas and links.

This system has generally worked well, except that it was tied to the MacBook. Sometimes I would work from a different machine and in those cases would end up having sticky notes in different places. Secondly, there was no backup (and no I don't want to use Time Machine).

I have now switched from iCal to Google Calendar. This makes working on different (trusted) computers easy from a calendar perspective. However, Google Calendar does not do task management.

Experience with The Milk so far
It was quick, easy and free to sign up. The keyboard shortcuts are excellent; they greatly reduce the time it takes to create, prioritise and complete tasks. Generally the site is quite responsive, although not as fast as using my Stickies approach. The help is quite good.

Most of my tasks are not that private or sensitive in nature, but some are a little. The Milk site follows the same convention as the Google Apps, and will stay with HTTPS if you arrive at the site that way. Occasionally I have noticed it has reverted back to HTTP, but haven't figured out the pattern yet.

I set up a notes Task List just for storing ideas for longer running activities. You can add text snippets (called notes) to a task. So the tasks in this list are not tasks as such, but containers for the notes. The text box for editing notes is a little small; it would be great if it could be bigger, resizable or a little more accessible.

I haven't tried the Google Calendar integration yet and will do further posts once I have settled in to a good way of working with The Milk on a day-to-day basis.

So far the overall verdict is good and I can now get to my tasks from different computers. It is great to see some more Australian (the Milk crew are in Sydney) software success. Well done guys.

Thursday, May 22, 2008

Setting up an IP Printer in Windows XP

Yesterday I helped a colleague connect to a network printer directly via its IP address in Windows XP. In the Add Printer Wizard, we had to select:

Local printer attached to this computer

Does that seem just plain wrong (i.e. contradictory) to anyone else?

Wednesday, May 21, 2008

Haskell and Performance

I have just read Haskell and Performance via Planet Haskell. I haven't been following the recent discussions on the mailing lists that Neil refers to, but I did spend a good year or so working part-time on a pet project I had in Haskell.

Some Haskell libraries are poorly optimised
At one stage I wanted to use bitwise operations for performance. I tried Data.Bits, but that didn't help a great deal. Then I googled around and found a post about the implementation not being particularly fast.

Haskell's multi-threaded performance is amazing
I did not find this to be the case at all. I tried forkIO and friends as well as STM. From memory, the STM version was elegant, perhaps even beautiful, but not fast enough. I think the issue had to do with laziness - thunks not being evaluated in the worker threads - which I spent considerable time trying to solve.

Reading the Core is not easy
Some documentation I read, as well as comments from various people, suggested reading the Core. I tried it, but generally couldn't understand it well enough to be of use.

Optimisation without profiling is pointless
I agree with the point that optimising for performance is a waste of time if not necessary. Secondly, using GHC profiling was very helpful.

Final Thoughts
These comments are really only relevant to me and the codebase I was working on and not a generalisation for others. In fairness, when I started my project, I didn't know Haskell or functional programming in general. I had spent the previous years predominantly in Java. However I spent over a year working part-time on my project and in that time it went through many changes and a complete re-write.

For a portion of the project, performance was critical (in the end it was running on about 14 cores over 5 machines). I expended a significant percentage of my time on the project battling with optimising the Haskell code. I tried strictness annotations in the places where strictness would seem to be useful, arrays, unboxing, etc. In the end, I could not build a deterministic mental model of how applying these techniques would affect the program at runtime, especially when using GHC with -O2.

To solve the problem, I rewrote the performance critical portion in Java (I would have used Scala, but did not wish to spend the time getting up to speed with it at that point). My fingers got sore typing the amount of Java code necessary to do things that are trivial in Haskell (e.g. partial application). When finished, the Java version ran many times faster than the Haskell one (I wish I could remember by how much) and furthermore, I could actually reason about the impact on performance when making changes to the Java version.

Hopefully I can fix my gap in knowledge about performance optimisation in Haskell, so I don't have to resort to Java again.

Tuesday, May 6, 2008

Convert and Filter Subversion to Git

The Challenge
I have one large (25GB) Subversion repository that partly has a structure like this:


/brad
         /docs
                   /finances
                   /foo
                   ...

I wish to convert the docs subtree (including history) into its own Git repository, without the foo directory.

The Solution
One way to achieve this would have been to dump the repository, filter the history, create a new repository, load the filtered history and then convert with git-svnimport.

Instead, I did the following:

1. Convert the docs subtree into a Mercurial repository, excluding the foo directory.
$ hg convert --filemap filemap --config convert.hg.usebranchnames=False file:///path/to/svnrepos/brad/docs docs-hg

filemap is a file in the current directory with only one line in it:
exclude foo
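
As a minimal sketch, the filemap could be created from the shell as follows, run in the directory where hg convert is invoked. The path in the exclude rule is relative to the root of the tree being converted (docs in this case):

```shell
# Write the one-line filemap passed to `hg convert --filemap filemap`.
# "foo" is relative to the root of the converted source tree (docs).
printf 'exclude foo\n' > filemap

# Show the result so it can be checked before running the conversion.
cat filemap
```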

The effect of --config convert.hg.usebranchnames=False is to import onto the default branch in Mercurial. Without it, a docs branch would have been used and carried over to Git in the subsequent steps. I want the final Git repository to have just the conventional master branch.

2. Convert the Mercurial repository to Git.

$ mkdir docs
$ cd docs
$ git init

I installed Mercurial via MacPorts, so to get fast-export to work, I needed to use the right Python:

$ export PYTHON="/opt/local/bin/python2.5"
$ /path/to/fast-export/hg-fast-export.sh -A ../authors.txt -r ../docs-hg

The -A ../authors.txt simply maps the Subversion commit username to a normal Git author format. Same as git-svnimport.
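
For illustration only, an authors file might be created like this. The username, full name and email address below are placeholders (not taken from my repository), and the one-mapping-per-line "username=Full Name <email>" layout is assumed from the format documented for git-svnimport:

```shell
# Hypothetical author map: one line per Subversion commit username.
# Left of "=": the Subversion username. Right of "=": the Git author.
# "brad" and "Brad Example <brad@example.com>" are placeholder values.
cat > authors.txt <<'EOF'
brad=Brad Example <brad@example.com>
EOF
```

Since the export above is run from inside docs and refers to ../authors.txt, the file would need to sit in the parent directory.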

$ git checkout master

3. Remove the intermediate Mercurial repository:

$ cd ..
$ rm -rf docs-hg

I did a diff of the docs subdirectories in the Subversion and Git working copies and did a quick check of the history. Looks like it worked successfully.
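
A check along these lines can be sketched as a small shell function. This is an approximation rather than the exact commands I ran; the function name compare_trees is made up for the example, and the two arguments are whatever paths the working copies live at:

```shell
# Recursively compare two working copies, ignoring each tool's metadata
# directories (.svn, .git) and the foo directory that was deliberately
# excluded from the conversion. Note that -x matches by name at any depth.
# Exit status is 0 when the trees match and non-zero otherwise.
compare_trees() {
    diff -r -x .svn -x .git -x foo "$1" "$2"
}
```

For example, compare_trees /path/to/svn-wc/brad/docs /path/to/docs && echo "trees match". The history can then be skimmed with git log --stat in the new repository.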

Monday, May 5, 2008

And the Winner is: Git

I have decided to move from Subversion to a distributed VCS and have been considering Git and Mercurial. I have settled on Git for the following reasons:

Git seems more granular. I expect this to provide more flexibility to adapt to different circumstances, but at a greater learning time cost.

The way tags are managed in Mercurial (.hgtags) looks a bit odd.

The notion Git has of tracking content rather than files is interesting, although I don't understand the ramifications yet.

To be fair, either Mercurial or Git would be suitable for my current needs. Mercurial was initially more attractive as it seemed simpler to get up and going and the Subversion import works better on my existing repository.

There were a couple of interesting posts I found along the way: Experimenting with Git and The Differences Between Mercurial and Git.
