Regarding the current state of bioinformatics training

Todd Harris (@tharris) is on a bit of a roll at the moment. Last month I linked to his excellent blog post regarding community annotation, and today I find myself linking to his latest blog post:

Todd makes a convincing argument that bioinformatics education has largely failed, and he lists three reasons for this, the last of which is as follows:

Finally, the nature of much bioinformatics training is too rarefied. It doesn’t spend enough time on core skills like basic scripting and data processing. For example, algorithm development has no place in a bioinformatics overview course, more so if that is the only exposure to the field the student will have.

I particularly empathize with this point. There should be a much greater emphasis on core data processing skills in bioinformatics training, but ideally students should be getting access to some of these skills at an even earlier age. Efforts such as the Hour of Code initiative are helping to raise awareness of the need to teach coding skills (and it's good to see the President join in with this), but it would be so much better if coding were part of the curriculum everywhere. As Steve Jobs once said:

"I think everybody in this country should learn … a computer language because it teaches you how to think … I view computer science as a liberal art. It should be something that everybody learns, takes a year in their life, one of the courses they take is learn how to program" — Steve Jobs, 1995.

Taken from 'Steve Jobs: The Lost Interview'.

Maybe this is still a pipe dream, but if we can't teach useful coding skills to everyone, we should at least be doing this for everyone who is considering any sort of career in the biological sciences. During my time at UC Davis, I've helped teach some basic Unix and Perl skills to many graduate students, but frustratingly this teaching has often come at the end of their first year in graduate school. By this point in their training, they have often already encountered many data management problems without having been equipped with the skills needed to deal with them.

I think that part of the problem is that we still use the label 'bioinformatics training' and this reinforces the distinction from a more generic 'biological training'. It may once have been the case that bioinformatics was its own specialized field, but today I find that bioinformatics mostly just describes a useful set of data processing skills…skills which will be needed by anybody working in the life sciences.

Maybe we need to rebrand 'bioinformatics training' and use a name which better describes the general importance of these skills ('Essential data training for biologists'?). Whatever we decide to call it, it is clear that we need it more than ever. Todd ends his post with a great piece of advice for any current graduate students in the biosciences:

You should be receiving bioinformatics training as part of your core curriculum. If you aren’t, your program is failing you and you should seek out this training independently. You should also ask your program leaders and department chairs why training in this field isn’t being made available to you.