You can know the name of a bird in all the languages of the world, but when you’re finished, you’ll know absolutely nothing whatever about the bird… So let’s look at the bird and see what it’s doing — that’s what counts. I learned very early the difference between knowing the name of something and knowing something.

Richard Feynman: “What is Science?”, presented at the fifteenth annual meeting of the National Science Teachers Association, in New York City (1966), published in The Physics Teacher Vol. 7, issue 6 (1969)

Although there are standards for the abbreviation of author names (notably Brummitt in botany), these are not always followed and are often embellished. Furthermore, it is believed that the nomenclatural precision author names add is not worth the cost of including them. If author names were included, every variation of an authority string would result in a new URI, implying the existence of a new taxon. This would defeat the principal goal of speciesindex.org: to get people using the same URIs for the same things. Homonyms are rare, and it is even rarer that they cause problems outside of taxonomy and nomenclature.
Consider the following classification of confidence limits from the Intergovernmental Panel on Climate Change (taken from here):
virtually certain – more than 99%
extremely likely – more than 95%
very likely – more than 90%
likely – more than 60%
more likely than not – more than 50%
unlikely – less than 33%
very unlikely – less than 10%
extremely unlikely – less than 5%
Now consider the estimate in Paton et al. (2008) Taxon 57:602-611 that 4.1% of plant names have homonyms, i.e. it is “extremely unlikely” that any one name is a homonym. Also consider the following list of kinds of homonyms:
Nomenclatural Artefacts. These occur where the same taxon is published multiple times. Perhaps the same publication comes out in two languages or is published a second time with a slightly different title and set of authors. For all intents and purposes these do not matter, as the names are intended to refer to the same taxon.
Competitive Publication. New material is found. Two authors publish accounts based on it using the same names. The taxa are substantively the same.
Quickly Synonymised. An author publishes a new species only for someone to quickly realise that the name is a homonym and publish the fact. Subsequent publications place it in synonymy and it is never widely used. The name in circulation will almost always refer to the correct taxon, but the homonym will be kept in circulation because it is always mentioned as a homonym in monographs, floras and faunas. Modern indexing will exacerbate this situation.
Back From The Dead. Everyone is happy using a junior (or later) homonym without knowing it, when a taxonomist finds a publication containing the senior (earlier) homonym and overturns the nomenclatural apple cart. The rules of nomenclature say that the taxon now needs a new name even if the senior homonym is not currently the name of an accepted taxon. There is a case for nomenclatural conservation of the junior homonym or for rejecting the senior homonym. Either way, the original usage of the name is the most common.
Problematic Homonyms. The same name string is widely used for multiple taxon concepts. This is rarer in terms of nomenclatural homonyms (where different names have actually been published) than it is where authors have simply used the same name in different senses (taxon concepts and/or misapplied names). This is particularly common with European names being used for the “wrong” taxa in the New World. Author strings are of no help here, as the nomenclature is correct and only the usage is incorrect. A full-blown taxon-concept-based approach is needed to handle these situations.
Speciesindex.org takes the premise that names specified to nomenclatural code, rank, spelling and, in the case of zoological names, year are “virtually certain” to be referring to the same general taxon.
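
As a rough illustration of the reasoning above, the likelihood scale can be written as a small lookup. This is only a sketch: the function name `likelihood_label` is invented for this example, the thresholds are copied from the list as quoted in this post, and probabilities that fall between the bands (for example 40%) are simply left unlabelled.

```python
from typing import Optional

# "More than" bands, checked from the strictest threshold down.
# Values are the ones quoted in the list in this post.
MORE_THAN = [
    (0.99, "virtually certain"),
    (0.95, "extremely likely"),
    (0.90, "very likely"),
    (0.60, "likely"),
    (0.50, "more likely than not"),
]

# "Less than" bands, checked from the strictest threshold up.
LESS_THAN = [
    (0.05, "extremely unlikely"),
    (0.10, "very unlikely"),
    (0.33, "unlikely"),
]


def likelihood_label(probability: float) -> Optional[str]:
    """Return the strictest label that applies, or None if no band matches."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for threshold, label in MORE_THAN:
        if probability > threshold:
            return label
    for threshold, label in LESS_THAN:
        if probability < threshold:
            return label
    return None  # falls in the gap between the quoted bands


# The 4.1% homonym estimate from Paton et al. (2008), as cited above.
homonym_label = likelihood_label(0.041)
```

With the 4.1% estimate as input, the lookup lands in the “extremely unlikely” band, which is the reading used in the text.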

It is customary for scientists to cite the author of a scientific name whenever that name is used. Indeed, it is considered grossly amateurish in some circles to omit such details. This causes problems because, although there are standards for the abbreviation of author names (notably Brummitt in botany), these are not always followed and are often embellished. This means that the entire string of name characters is never guaranteed to be unique. To a machine, every variation of an authority string results in a new combination of characters and implies the existence of a new taxon.

What if we just stopped using author strings (other than in monographs) and ignored them when other people use them?
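
To make the machine's point of view concrete, here is a minimal sketch. The three name strings are illustrative spellings of one authority, and `canonical_binomial` is a hypothetical helper written for this example; it naively assumes a plain two-word binomial and is not how speciesindex.org or any real name parser works.

```python
def canonical_binomial(name_string: str) -> str:
    """Keep only the first two words (genus and specific epithet).

    Naive on purpose: assumes a plain binomial followed by an
    authority string, with no hybrids or infraspecific ranks.
    """
    parts = name_string.split()
    if len(parts) < 2:
        raise ValueError("expected at least a genus and an epithet")
    return " ".join(parts[:2])


# Illustrative variants: one name, three spellings of the authority.
variants = [
    "Bellis perennis L.",
    "Bellis perennis Linn.",
    "Bellis perennis Linnaeus",
]

# Compared character by character, every variant is a different string.
distinct_with_authors = set(variants)

# With the authority dropped, they collapse to a single string.
distinct_without_authors = {canonical_binomial(v) for v in variants}
```

Here `distinct_with_authors` holds three entries while `distinct_without_authors` holds one, which is the effect described above: the author strings multiply the apparent number of names without adding a taxon.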

Tremendous energy is required to re-animate the dead.

At last month’s TDWG2009 conference I was on a panel for a brief discussion at the end of a session. There were around 200 people in the audience and a handful of us up front as lambs for the slaughter.

One of the questions from the floor concerned the automation of the taxonomic process. I don’t recall the precise question but it triggered one of my (probably boring) canned responses.

I pointed out that the usual practice in software engineering, when asked to automate a system, is to produce a Domain Model based on an analysis of some Use Cases that then leads on to some Object Model or implementation model that is actually created in software. The assumption behind this is that whatever was being done was good but needs to be done faster – with computers!

In biodiversity informatics, and particularly in biological taxonomy, this is not such a good idea. Current working practice was developed in the light of the prevailing technology of the time. If computers and the internet had been available from the start, things would probably have been done differently. The worst thing we can do now is automate a paper-based system.

Keeping up with the nearly-year-old tradition of putting all outputs on my blog, here are the latest two reports I have submitted as part of the PESI project.

Read and enjoy!

I have been wrestling for some time with how to handle taxonomic hierarchies when combining multiple classifications. This is partly motivated by a pressure to produce consensus hierarchies for navigation (a task that I think is probably not worth doing but which is beyond the scope of this post) and partly from a need to carry out inference over multiple classifications using OWL (something that I think is an important research topic if we are to overcome the ‘taxonomic impediment’).

Take the simplest scenario where we have classification C1 that contains family Z with two genera X and Y that contain a total of three species Xa, Xb and Yc. Now let there be another classification C2 that is identical but for the species Xb being moved to the genus Y as Yb.
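
The scenario can be written down as plain data, which makes the difference between the two classifications easy to see. This is only a sketch of the setup, not of any solution: the nested-dictionary layout and the function names are invented for this example, and it assumes that a species can be tracked across classifications by its epithet alone (so Xb and Yb are treated as the same species).

```python
# family -> genus -> set of species epithets
C1 = {"Z": {"X": {"a", "b"}, "Y": {"c"}}}
C2 = {"Z": {"X": {"a"}, "Y": {"b", "c"}}}


def genus_by_epithet(classification: dict) -> dict:
    """Map each species epithet to the genus it is placed in."""
    placement = {}
    for genera in classification.values():
        for genus, epithets in genera.items():
            for epithet in epithets:
                placement[epithet] = genus
    return placement


def moved_species(first: dict, second: dict) -> dict:
    """Return {epithet: (genus in first, genus in second)} for moved species."""
    in_first = genus_by_epithet(first)
    in_second = genus_by_epithet(second)
    shared = in_first.keys() & in_second.keys()
    return {
        epithet: (in_first[epithet], in_second[epithet])
        for epithet in shared
        if in_first[epithet] != in_second[epithet]
    }


differences = moved_species(C1, C2)
```

For these two classifications `differences` contains a single entry, for epithet `b`, recording the move from genus X to genus Y. Matching on epithets is a simplification that only works for a toy case like this one.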