<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Roger Hyam &#187; Biodiversity Informatics</title>
	<atom:link href="http://www.hyam.net/blog/archives/category/biodiv/feed" rel="self" type="application/rss+xml" />
	<link>http://www.hyam.net/blog</link>
	<description>"truly pathetic verbiage"</description>
	<lastBuildDate>Wed, 01 Sep 2010 16:58:22 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>My Rhododendron Hymenanthes Thesis</title>
		<link>http://www.hyam.net/blog/archives/914</link>
		<comments>http://www.hyam.net/blog/archives/914#comments</comments>
		<pubDate>Tue, 24 Aug 2010 16:24:40 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://www.hyam.net/blog/archives/914">Roger Hyam</span></dc:creator>
				<category><![CDATA[Biodiversity Informatics]]></category>

		<guid isPermaLink="false">http://www.hyam.net/blog/?p=914</guid>
		<description><![CDATA[I&#8217;m looking to do some work on Rhododendron again and so have had to dig out the old digital copy of my PhD thesis. I have mangled the MS Word files together using OpenOffice derivative NeoOffice. For the record here is a PDF version of it. Molecular and Conventional Data Sets and the Systematics of <a href='http://www.hyam.net/blog/archives/914'>[...]</a>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m looking to do some work on <em>Rhododendron</em> again and so have had to dig out the old digital copy of my PhD thesis. I have mangled the MS Word files together using OpenOffice derivative NeoOffice. For the record here is a PDF version of it. <a href="http://www.hyam.net/publications/Hyam_Rhododendron_Thesis_2010_Edition.pdf">Molecular and Conventional Data Sets and the Systematics of </a><em><a href="http://www.hyam.net/publications/Hyam_Rhododendron_Thesis_2010_Edition.pdf">Rhododendron</a></em><a href="http://www.hyam.net/publications/Hyam_Rhododendron_Thesis_2010_Edition.pdf"> L. Subgenus </a><em><a href="http://www.hyam.net/publications/Hyam_Rhododendron_Thesis_2010_Edition.pdf">Hymenanthes</a></em><a href="http://www.hyam.net/publications/Hyam_Rhododendron_Thesis_2010_Edition.pdf"> (Blume) K.Koch</a> (5.2 megabytes).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hyam.net/blog/archives/914/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ratty&#8217;s Real Name &#8211; Arvicola amphibius/terrestris</title>
		<link>http://www.hyam.net/blog/archives/896</link>
		<comments>http://www.hyam.net/blog/archives/896#comments</comments>
		<pubDate>Thu, 17 Jun 2010 10:30:07 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://www.hyam.net/blog/archives/896">Roger Hyam</span></dc:creator>
				<category><![CDATA[Biodiversity Informatics]]></category>

		<guid isPermaLink="false">http://www.hyam.net/blog/?p=896</guid>
		<description><![CDATA[I put this talk together for a meeting just in case I needed to elaborate on a point in one of my reports. I never used it but post it here for the record.

The Water Vole &#8211; Arvicola terrestris / amphibius?
View more presentations from rogerhyam.


The slides are pretty self explanatory. The rules of nomenclature applied by some studious <a href='http://www.hyam.net/blog/archives/896'>[...]</a>]]></description>
			<content:encoded><![CDATA[<p>I put this talk together for a meeting just in case I needed to elaborate on a point in one of my reports. I never used it but post it here for the record.<br />
<center></p>
<div style="width:425px" id="__ss_4524512"><strong style="display:block;margin:12px 0 4px"><a href="http://www.slideshare.net/rogerhyam/arvicola-terrestris-amphibius" title="The Water Vole - Arvicola terrestris / amphibius?">The Water Vole &#8211; Arvicola terrestris / amphibius?</a></strong><object id="__sse4524512" width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=rat-100617050927-phpapp02&#038;stripped_title=arvicola-terrestris-amphibius" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><embed name="__sse4524512" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=rat-100617050927-phpapp02&#038;stripped_title=arvicola-terrestris-amphibius" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object>
<div style="padding:5px 0 12px">View more <a href="http://www.slideshare.net/">presentations</a> from <a href="http://www.slideshare.net/rogerhyam">rogerhyam</a>.</div>
</div>
<p></center></p>
<p>The slides are pretty self-explanatory. The rules of nomenclature applied by some studious nomenclaturists lead to a change in the official name of a protected rodent, a name that has been stable for years. Who does this name change help? What purpose does it serve outside of playing the nomenclatural game?</p>
<p>Your comments are most welcome, especially if you are a rodent specialist.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hyam.net/blog/archives/896/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Taxonomy, Nomenclature and PESI &#8211; An explanation for mortals.</title>
		<link>http://www.hyam.net/blog/archives/894</link>
		<comments>http://www.hyam.net/blog/archives/894#comments</comments>
		<pubDate>Wed, 09 Jun 2010 08:52:05 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://www.hyam.net/blog/archives/894">Roger Hyam</span></dc:creator>
				<category><![CDATA[Biodiversity Informatics]]></category>

		<guid isPermaLink="false">http://www.hyam.net/blog/archives/894</guid>
		<description><![CDATA[I just wrote 500 words explaining the relationship between Taxonomy, Nomenclature and PESI for use in the PESI portal. Here they are:
The process of creating a classification of life is split into two parts. Firstly experts decide which species exist. This process is called taxonomy. Secondly the experts work out what to call the species <a href='http://www.hyam.net/blog/archives/894'>[...]</a>]]></description>
			<content:encoded><![CDATA[<p>I just wrote 500 words explaining the relationship between Taxonomy, Nomenclature and PESI for use in the PESI portal. Here they are:</p>
<p>The process of creating a classification of life is split into two parts. Firstly experts decide which species exist. This process is called taxonomy. Secondly the experts work out what to call the species they recognise. This is called nomenclature.</p>
<p>The relationship between taxonomy and nomenclature is complex.<span id="more-894"></span> The same species may have been discovered and named more than once by different people. A single species may be found to contain more than one cryptic species. Changes in our understanding of the relationships of species can also result in changes of the names – even if the species themselves do not change.</p>
<p>Nomenclature is governed by a set of rules. For animals these rules are given in the code of the International Commission on Zoological Nomenclature (ICZN); for plants and fungi they are given in the International Code of Botanical Nomenclature (ICBN). Because the naming of a species is only a matter of applying the rules in these codes (and not a matter of judgement or scientific opinion), nomenclature is objective and could even be automated.</p>
<p>Taxonomy reflects current scientific opinion. Two experts may disagree about whether populations represent separate species or how species should be classified. As our knowledge of biodiversity increases, classifications may change, reflecting changing opinions through time. Taxonomy is therefore subjective.</p>
<p>Because nomenclature is objective, it should be straightforward to build a database of all the published names of organisms, along with where they were published and what their voucher specimens are, but without any indication as to whether they are the currently accepted names of species. This would make it very easy for scientists to work out the correct name for a species when they define it. Such databases of purely nomenclatural information are called nomenclators. Attempts have been made to build nomenclators for each of the nomenclatural codes. Examples include the International Plant Names Index (IPNI), which attempts to list all names of vascular plants, Index Fungorum, which lists names of fungi, and ZooBank, which lists names of animals.</p>
<p>PESI is a taxonomic resource. It is an annotated checklist of European organisms that reflects current scientific opinion. As with other scholarly taxonomic works PESI contains much nomenclatural information for organisms and so is also a valuable source of purely nomenclatural information.<br />
Because no nomenclator is yet complete and there is no single point of contact for all scientific names (plants and animals) there is an initiative to build a Global Names Architecture (GNA) that will unify all these efforts. There are further aspirations to produce a Global Names Usage Bank (GNUB) that will include a register of taxonomic opinion on these names.</p>
<p>PESI therefore has a two-way relationship with nomenclators. It uses them as sources for correct nomenclature and it can also provide nomenclatural data for inclusion in them where there are gaps. Going forward, PESI will be a part of both the GNA and the GNUB, contributing data and benefiting from the contributions of others.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hyam.net/blog/archives/894/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Words About Names &#8211; What I do for a living?</title>
		<link>http://www.hyam.net/blog/archives/855</link>
		<comments>http://www.hyam.net/blog/archives/855#comments</comments>
		<pubDate>Fri, 28 May 2010 14:05:25 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://www.hyam.net/blog/archives/855">Roger Hyam</span></dc:creator>
				<category><![CDATA[Biodiversity Informatics]]></category>

		<guid isPermaLink="false">http://www.hyam.net/blog/?p=855</guid>
		<description><![CDATA[Sometimes two things cross your desk at the same time and they say more than either one of them would on their own.
Firstly I was looking for a list of British birds and happened across the British Ornithologists&#8217; Union (BOU) list of bird names and how they have changed between 1923 and 2007. This is <a href='http://www.hyam.net/blog/archives/855'>[...]</a>]]></description>
			<content:encoded><![CDATA[<div id="attachment_860" class="wp-caption alignleft" style="width: 198px"><a href="http://www.hyam.net/blog/wp-content/uploads/2010/05/frog_lurking.jpg"><img class="size-medium wp-image-860    " title="frog_lurking" src="http://www.hyam.net/blog/wp-content/uploads/2010/05/frog_lurking-640x640.jpg" alt="" width="188" height="188" /></a><p class="wp-caption-text">The Frog in the Pond</p></div>
<p>Sometimes two things cross your desk at the same time and they say more than either one of them would on their own.</p>
<p>Firstly I was looking for a list of British birds and happened across the <a href="http://www.bou.org.uk">British Ornithologists&#8217; Union</a> (BOU) <a href="http://www.bou.org.uk/recbrlstbni.html">list of bird names</a> and how they have changed between 1923 and 2007. This is a most delightful list as it shows the English names are as stable as the scientific names &#8211; or both are equally unstable. If it hadn&#8217;t been for an attempt to standardise the use of the hyphen the English names would have been much more stable in my opinion (though by no means totally static). Here is a quote:<span id="more-855"></span></p>
<blockquote><p>English vernacular and international use names have proved more stable than scientific names in recent years (see the changes to the scientific names of terns, hirundines and tits in Sangster <em>et al.</em>, 2005, <em>Ibis</em> 147: 821-826 – this paper alone contained 15 changes to the 29 scientific names of species across these three groups, with no changes to any of the 29 English names) and will probably continue to be more stable as ongoing taxonomic research is more likely to see changes to scientific names than it is to English names.</p></blockquote>
<p>It is easy to see why the BOU should rate English names equally with scientific names. It is also interesting to see that there is no consideration of splitting or merging of taxa here. They are only looking at the names they call (presumably indisputably stable) entities. They map names to names.</p>
<p>The second thing to cross my desk was the publication of David Hawksworth&#8217;s <a href="http://www.gbif.org/communications/resources/print-and-online-resources/bionomenclature/">Terms Used In Bionomenclature</a>. This is a very useful book indeed for those trying to get to grips with the different codes of nomenclature and how they have been applied. Here&#8217;s the quote:</p>
<blockquote><p>This is a glossary of over 2,100 terms used in biological nomenclature &#8211; the naming of whole organisms of all kinds. It covers terms in use in the current editions of the different internationally mandated and proposed organismal Codes; i.e. those for botany (including mycology), cultivated plants, prokaryotes (archaea and bacteria), virology, and zoology, as well as the Draft BioCode and PhyloCode.</p></blockquote>
<p>At over two thousand words this is quite some vocabulary of technical terms. To get a handle on how expressive this is, compare it with <a href="http://en.wikipedia.org/wiki/Basic_English">Basic English</a>, which has under a thousand words. This quote is from the Wikipedia page:</p>
<blockquote><p>The 850 core words of Basic English are found in Wiktionary&#8217;s <em><a title="wikt:Appendix:Basic English word list" href="http://en.wiktionary.org/wiki/Appendix:Basic_English_word_list">Appendix:Basic English word list</a></em>. This core is theoretically enough for everyday life. However, Ogden prescribed that any student should learn an additional 150 word list for everyday work in some particular field, by adding a word list of 100 words particularly useful in a general field (e.g., science, verse, business, etc.), along with a 50-word list from a more specialised subset of that general field, to make a basic 1000 word vocabulary for everyday work and life.</p></blockquote>
<p>So you can get by in everyday work and life with 1000 words but a specialist vocabulary twice the size is needed for a system of scientific nomenclature which <strong>fails to produce stable names for everyday objects</strong> (British birds).</p>
<p>Last time I visited my mother-in-law she asked me what I did for a living and, for the umpteenth time, I tried to explain. I am finding it becomes harder to justify these things even to myself. Nomenclature seems to be a kind of <a href="http://en.wikipedia.org/wiki/The_Glass_Bead_Game">Glass Bead Game</a> only loosely connected to the real world. The real action is probably in <a href="http://www.boldsystems.org">barcoding</a> and <a href="http://en.wikipedia.org/wiki/Metagenomics">metagenomics</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hyam.net/blog/archives/855/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PESI Deliverable D4.3 &#8211; Application and Adoption of Taxonomic Standards</title>
		<link>http://www.hyam.net/blog/archives/848</link>
		<comments>http://www.hyam.net/blog/archives/848#comments</comments>
		<pubDate>Thu, 27 May 2010 09:11:36 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://www.hyam.net/blog/archives/848">Roger Hyam</span></dc:creator>
				<category><![CDATA[Biodiversity Informatics]]></category>

		<guid isPermaLink="false">http://www.hyam.net/blog/?p=848</guid>
		<description><![CDATA[I finally submitted deliverable D4.3 for the PESI project and in the great tradition of putting my outputs on my blog here is a PDF copy: Application and Adoption of Taxonomic Standards.
This will be of interest to those involved in taxonomic and nomenclature projects as it shows our collective attempts to get to grips with <a href='http://www.hyam.net/blog/archives/848'>[...]</a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.hyam.net/blog/wp-content/uploads/2010/05/PESI_WP4_D4.3_App_Adopt_06.pdf"><img class="alignleft size-full wp-image-849" title="d4.3" src="http://www.hyam.net/blog/wp-content/uploads/2010/05/d4.3.jpg" alt="" width="144" height="208" /></a>I finally submitted deliverable D4.3 for the <a href="http://www.eu-nomen.eu/pesi/">PESI</a> project and in the great tradition of putting my outputs on my blog here is a PDF copy: <a href="http://www.hyam.net/blog/wp-content/uploads/2010/05/PESI_WP4_D4.3_App_Adopt_06.pdf">Application and Adoption of Taxonomic Standards</a>.</p>
<p>This will be of interest to those involved in taxonomic and nomenclature projects as it shows our collective attempts to get to grips with GUIDs and RDF, probably in a more political than technical sense. It advocates the use of the <a href="http://code.google.com/p/gbif-ecat/wiki/DwCArchive">Darwin Core Archive</a> format as proposed by the <a href="http://www.gbif.org/">GBIF</a> ECAT project.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hyam.net/blog/archives/848/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Now what would a physicist know about taxonomy?</title>
		<link>http://www.hyam.net/blog/archives/820</link>
		<comments>http://www.hyam.net/blog/archives/820#comments</comments>
		<pubDate>Fri, 19 Feb 2010 09:48:06 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://www.hyam.net/blog/archives/820">Roger Hyam</span></dc:creator>
				<category><![CDATA[Biodiversity Informatics]]></category>

		<guid isPermaLink="false">http://www.hyam.net/blog/?p=820</guid>
		<description><![CDATA[You can know the name of a bird in all the languages of the world, but when you&#8217;re finished, you&#8217;ll know absolutely nothing whatever about the bird&#8230; So let&#8217;s look at the bird and see what it&#8217;s doing &#8212; that&#8217;s what counts. I learned very early the difference between knowing the name of something and <a href='http://www.hyam.net/blog/archives/820'>[...]</a>]]></description>
			<content:encoded><![CDATA[<blockquote><p>You can know the name of a bird in all the languages of the world, but when you&#8217;re finished, you&#8217;ll know absolutely nothing whatever about the bird&#8230; So let&#8217;s look at the bird and see what it&#8217;s doing &#8212; that&#8217;s what counts. I learned very early the difference between knowing the name of something and knowing something.</p></blockquote>
<p>Richard Feynman: &#8220;What is Science?&#8221;, presented at the fifteenth annual meeting of the National Science Teachers Association in New York City (1966), published in <em>The Physics Teacher</em> Vol. 7, issue 6 (1969)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hyam.net/blog/archives/820/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Are author names really necessary?</title>
		<link>http://www.hyam.net/blog/archives/796</link>
		<comments>http://www.hyam.net/blog/archives/796#comments</comments>
		<pubDate>Thu, 21 Jan 2010 15:43:18 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://www.hyam.net/blog/archives/796">Roger Hyam</span></dc:creator>
				<category><![CDATA[Biodiversity Informatics]]></category>

		<guid isPermaLink="false">http://www.hyam.net/blog/?p=796</guid>
		<description><![CDATA[Although there are standards for abbreviation of author names (notably Brummitt in botany) these are not always followed and often embellished. Furthermore it is believed that the added nomenclatural precision author names add is not worth the cost of their inclusion. If author names were included then every variation of authority string would result in <a href='http://www.hyam.net/blog/archives/796'>[...]</a>]]></description>
			<content:encoded><![CDATA[
<p><span style="font-family: Verdana, Helvetica, Arial; line-height: normal; font-size: small;">It is customary for scientists to cite the author of a scientific name whenever that name is used. Indeed it is considered grossly amateurish in some circles to omit such details. This causes problems because, although there are standards for abbreviation of author names (notably Brummitt in botany), these are not always followed and often embellished. This means that the entire string of name characters is never guaranteed to be unique. To a machine every variation of authority string would result in a new combination of characters and imply the existence of a new taxon.</span></p>
<p><strong>What if we just stopped using author strings (other than in monographs) and ignored them when other people use them?<span id="more-796"></span><br />
</strong></p>
<p>The added nomenclatural precision author names bring is not worth the cost of their inclusion. Homonyms, where two taxa have the same name string excluding the author string, are rare, especially if, in zoology, we still include the year of publication of the name. It is even rarer that homonyms cause problems outside of taxonomy and nomenclature.</p>
<p>Consider the following classification of confidence limits from the Intergovernmental Panel on Climate Change (taken from <a href="http://news.bbc.co.uk/1/hi/sci/tech/6324029.stm">here</a>):</p>
<ul>
<li>virtually certain &#8211; more than 99%</li>
<li>extremely likely &#8211; more than 95%</li>
<li>very likely &#8211; more than 90%</li>
<li>likely &#8211; more than 60%</li>
<li>more likely than not &#8211; more than 50%</li>
<li>unlikely &#8211; less than 33%</li>
<li>very unlikely &#8211; less than 10%</li>
<li>extremely unlikely &#8211; less than 5%</li>
</ul>
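<p>As a rough illustration, the scale above can be read as a lookup table. The sketch below is only that: the thresholds are copied from the list as given, while the function name and the choice to return nothing for the unlabelled 33% to 50% band are assumptions made for the example.</p>

```python
# Likelihood terms copied from the list above as (term, threshold) pairs.
# The names MORE_THAN, LESS_THAN and likelihood_term are hypothetical,
# chosen for this sketch only.
MORE_THAN = [
    ("virtually certain", 0.99),
    ("extremely likely", 0.95),
    ("very likely", 0.90),
    ("likely", 0.60),
    ("more likely than not", 0.50),
]
LESS_THAN = [
    ("extremely unlikely", 0.05),
    ("very unlikely", 0.10),
    ("unlikely", 0.33),
]


def likelihood_term(p):
    """Return the strongest listed term that applies to probability p.

    Returns None when p falls in the band the list leaves unlabelled.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    # Both tables are ordered from strongest to weakest term,
    # so the first match is the strongest one.
    for term, threshold in MORE_THAN:
        if p > threshold:
            return term
    for term, threshold in LESS_THAN:
        if p < threshold:
            return term
    return None


print(likelihood_term(0.041))  # extremely unlikely
print(likelihood_term(0.959))  # extremely likely
```

<p>Read this way, a proportion of 0.041 lands in the &#8220;extremely unlikely&#8221; band, and its complement of 0.959 in the &#8220;extremely likely&#8221; band.</p>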
<p>Now consider the estimate in Paton <em>et al.</em> (2008) &#8211; Taxon 57:602-611 &#8211; that 4.1% of plant names have homonyms, i.e. it is &#8220;extremely unlikely&#8221; that any one name is a homonym. Also consider the following list of kinds of homonyms I just made up:</p>
<ul>
<li><strong>Nomenclatural Artefacts</strong> These occur where the same taxon is published multiple times. Perhaps the same publication comes out in two languages or is published a second time with a slightly different title and set of authors. For all intents and purposes these do not matter as the names are intended to refer to the same taxon.</li>
<li><strong>Competitive Publication</strong> New material is found. Two authors publish accounts based on it using the same names. The taxa are substantively the same.</li>
<li><strong>Quickly Synonymised</strong> An author publishes a new species only for someone to quickly realise that this is a homonym and publish the fact. Subsequent publications place it in synonymy and it is never widely used. The name in circulation will almost always refer to the correct taxon but the homonym will be kept in circulation due to always being mentioned as being a homonym in monographs, floras and faunas. Modern indexing will exacerbate this situation.</li>
<li><strong>Back From The Dead</strong> Everyone is happy using a junior (or later) homonym without knowing it when a taxonomist finds a publication containing the senior (earlier) homonym and overturns the nomenclatural apple cart. The rules of nomenclature say that the taxon now needs a new name even if the senior homonym is not currently the name of an accepted taxon. There is a case for nomenclatural conservation of the junior homonym or rejecting the senior homonym. Either way the original usage of the name is the most common.</li>
<li><strong>Problematic Homonyms</strong> The same name string is widely used for multiple taxon concepts. This is rarer in terms of nomenclatural homonyms (where different names have actually been published) than it is where authors have simply used the same name in different senses (taxon concepts and/or misapplied names). This is particularly common with European names being used for the &#8220;wrong&#8221; taxa in the New World. Author strings are of no help here as the nomenclature is correct &#8211; only the usage incorrect. A full-blown taxon concept based approach is needed to handle these situations.</li>
</ul>
<p>Why not take the premise that names specified to nomenclatural code, rank, spelling and, in the case of zoological names, year are &#8220;virtually certain&#8221; to be referring to the same taxon? It might make our lives a little simpler.</p>
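<p>To make the premise concrete, here is a minimal sketch of a name key that ignores author strings. It assumes the code is identified by an abbreviation such as ICBN or ICZN; the <code>NameKey</code> class and <code>name_key</code> function are hypothetical names invented for this illustration, not part of any existing system.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NameKey:
    """Identity of a name: code, rank, spelling and (zoology only) year."""
    code: str
    rank: str
    spelling: str
    year: Optional[int] = None


def name_key(code, rank, spelling, author=None, year=None):
    """Build a key for a name, deliberately ignoring the author string.

    The author argument is accepted so callers can pass whatever they
    have, but it never influences the key.
    """
    code = code.strip().upper()
    # Collapse runs of whitespace so spacing differences do not matter.
    spelling = " ".join(spelling.split())
    # Only zoological names keep the year, as the premise above suggests.
    key_year = year if code == "ICZN" else None
    return NameKey(code, rank.strip().lower(), spelling, key_year)


# Three ways of citing the same botanical author give one key.
variants = ["L.", "Linn.", "Linnaeus"]
keys = {name_key("ICBN", "species", "Bellis perennis", author=a) for a in variants}
print(len(keys))  # 1

# In zoology the year stays in the key, so it can still separate homonyms.
vole = name_key("ICZN", "species", "Arvicola amphibius", author="Linnaeus", year=1758)
print(vole.year)  # 1758
```

<p>The point of the design is that every variation of the authority string collapses onto one key, while the year remains available to tell zoological homonyms apart.</p>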
]]></content:encoded>
			<wfw:commentRss>http://www.hyam.net/blog/archives/796/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Biodiversity Informatics &#8211; A &#8216;sackable offence&#8217;</title>
		<link>http://www.hyam.net/blog/archives/730</link>
		<comments>http://www.hyam.net/blog/archives/730#comments</comments>
		<pubDate>Tue, 08 Dec 2009 16:23:18 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://www.hyam.net/blog/archives/730">Roger Hyam</span></dc:creator>
				<category><![CDATA[Biodiversity Informatics]]></category>

		<guid isPermaLink="false">http://www.hyam.net/blog/?p=730</guid>
		<description><![CDATA[At last month&#8217;s TDWG2009 conference I was on a panel for a brief discussion at the end of a session. There were around 200 people in the audience and a handful of us up front as lambs for the slaughter.
One of the questions from the floor concerned the automation of the taxonomic process. I don&#8217;t recall <a href='http://www.hyam.net/blog/archives/730'>[...]</a>]]></description>
			<content:encoded><![CDATA[<div id="attachment_736" class="wp-caption alignleft" style="width: 204px"><a href="http://www.hyam.net/blog/wp-content/uploads/2009/12/Frankensteins_monster_Boris_Karloff.jpg"><img class="size-medium wp-image-736  " title="Frankenstein's Monster (Boris_Karloff)" src="http://www.hyam.net/blog/wp-content/uploads/2009/12/Frankensteins_monster_Boris_Karloff-478x640.jpg" alt="Frankenstein's Monster Required tremendous energy to re-animate." width="194" height="260" /></a><p class="wp-caption-text">Tremendous energy is required to re-animate the dead.</p></div>
<p>At last month&#8217;s <a href="http://www.tdwg.org/conference2009/">TDWG2009</a> conference I was on a panel for a brief discussion at the end of a session. There were around 200 people in the audience and a handful of us up front as lambs for the slaughter.</p>
<p>One of the questions from the floor concerned the automation of the taxonomic process. I don&#8217;t recall the precise question but it triggered one of my (probably boring) canned responses.</p>
<p>I pointed out that the usual practice in software engineering, when asked to automate a system, is to produce a <a href="http://en.wikipedia.org/wiki/Domain_model">Domain Model</a> based on an analysis of some <a href="http://en.wikipedia.org/wiki/Use_case">Use Cases</a> that then leads on to some <a href="http://en.wikipedia.org/wiki/Object_model">Object Model</a> or implementation model that is actually created in software. The assumption behind this is that whatever was being done was good but needs to be done faster &#8211; with computers!</p>
<p>In biodiversity informatics, and particularly in biological taxonomy, this is not such a good idea. Current working practice was developed in the light of the prevailing technology of the time. If computers and the internet had been available from the start things would probably have been done differently. The worst thing we can do now is automate a paper based system. <span id="more-730"></span>We should take the opportunity to re-engineer our working practices. To ask the dangerous questions (that are usually only asked by those students who drop out and become multi-millionaires) like &#8220;Why do we bother doing this bit?&#8221;</p>
<p>Imagine my delight when I got this email from <a href="http://mcs.open.ac.uk/djk263/#Publications">David King</a> at the <a href="http://www.open.ac.uk/">Open University</a> agreeing with me!</p>
<blockquote><p>You struck a chord with me in your wrap up session. Several decades ago my first job was with British Steel (remember them?). Anyway, I was in the Computer Department looking after a production service  mainframe with 8Mb of memory and leading edge technology like that. When we recruited applications programmers they were told in no uncertain terms that <strong>to simply take an existing paper based workflow and replace it with one that just mapped one piece of a paper to one screen was tantamount to a sackable offence</strong>. If we were going to the expense, bother and risk of changing an existing workflow then we should take the opportunity to review it in the light of what is now possible with a computer&#8230; [my emphasis added]</p></blockquote>
<p>Clearly great minds think alike (as fools seldom differ).</p>
<p>It is a good job nobody in the biodiversity informatics community is working to reproduce paper publications like this <a href="http://www.efloras.org/florataxon.aspx?flora_id=2&amp;taxon_id=10757">Ranunculaceae</a> page on eFloras! I am not picking on <a href="http://www.efloras.org">eFloras</a> here. There are many similar projects and I am tacitly involved in some of them. Perhaps we are all committing <strong>sackable offences</strong>. My point is this. If (and it is a big IF) it is necessary to present data like an &#8220;old fashioned&#8221; printed flora or fauna, then it should only be seen as a byproduct of the taxonomic process. The specimen, character and observational data should be primary as they can be re-purposed. Data in document form (even if it is hyperlinked) is effectively dead and requires an enormous effort to re-animate &#8211; just like Frankenstein&#8217;s monster. Perhaps we should stop producing documents like this completely.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hyam.net/blog/archives/730/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Latest PESI Reports on Nomenclators</title>
		<link>http://www.hyam.net/blog/archives/723</link>
		<comments>http://www.hyam.net/blog/archives/723#comments</comments>
		<pubDate>Thu, 05 Nov 2009 13:53:52 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://www.hyam.net/blog/archives/723">Roger Hyam</span></dc:creator>
				<category><![CDATA[Biodiversity Informatics]]></category>

		<guid isPermaLink="false">http://www.hyam.net/blog/?p=723</guid>
		<description><![CDATA[Keeping up with the nearly-year-old tradition of putting all outputs on my blog, here are the latest two reports I have submitted as part of the PESI project.

Report on authoritative taxonomic standards from multiple sources suitable for deployment within European Research Area &#8211; resubmission with some clarifications. (PDF = PESI_D4.1_Standards_Report_v2.1.pdf )
Report on Procedures and Mechanisms for the <a href='http://www.hyam.net/blog/archives/723'>[...]</a>]]></description>
			<content:encoded><![CDATA[<p>Keeping up with the nearly-year-old tradition of putting all outputs on my blog, here are the latest two reports I have submitted as part of the PESI project.</p>
<ul>
<li>Report on authoritative taxonomic standards from multiple sources suitable for deployment within European Research Area &#8211; resubmission with some clarifications. (PDF = <a href="http://www.hyam.net/blog/wp-content/uploads/2009/11/PESI_D4.1_Standards_Report_v2.1.pdf">PESI_D4.1_Standards_Report_v2.1.pdf</a> )</li>
<li>Report on Procedures and Mechanisms for the Functioning of Nomenclators within the e-Infrastructure. (PDF = <a href="http://www.hyam.net/blog/wp-content/uploads/2009/11/PESI_D4.2_Nomenclators_Role_Report_v1.1.pdf">PESI_D4.2_Nomenclators_Role_Report_v1.1</a> )</li>
</ul>
<p>Read and enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hyam.net/blog/archives/723/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Synonyms Are SubClasses And Higher Taxa Are Just Tags</title>
		<link>http://www.hyam.net/blog/archives/707</link>
		<comments>http://www.hyam.net/blog/archives/707#comments</comments>
		<pubDate>Mon, 26 Oct 2009 17:05:54 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://www.hyam.net/blog/archives/707">Roger Hyam</span></dc:creator>
				<category><![CDATA[Biodiversity Informatics]]></category>

		<guid isPermaLink="false">http://www.hyam.net/blog/?p=707</guid>
		<description><![CDATA[I have been wrestling for some time with how to handle taxonomic hierarchies when combining multiple classifications. This is partly motivated by a pressure to produce consensus hierarchies for navigation (a task that I think is probably not worth doing but which is beyond the scope of this post) and partly from a need to <a href='http://www.hyam.net/blog/archives/707'>[...]</a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.hyam.net/blog/wp-content/uploads/2009/10/strict_baptist_chapel.jpg"><img class="alignleft size-medium wp-image-714" title="strict_baptist_chapel" src="http://www.hyam.net/blog/wp-content/uploads/2009/10/strict_baptist_chapel-640x479.jpg" alt="strict_baptist_chapel" width="210" height="157" /></a>I have been wrestling for some time with how to handle taxonomic hierarchies when combining multiple classifications. This is partly motivated by a pressure to produce consensus hierarchies for navigation (a task that I think is probably not worth doing but which is beyond the scope of this post) and partly from a need to carry out inference over multiple classifications using OWL (something that I think is an important research topic if we are to overcome the &#8216;taxonomic impediment&#8217;).</p>
<p>Take the simplest scenario where we have classification C1 that contains family Z with two genera X and Y that contain a total of three species Xa, Xb and Yc. Now let there be another classification C2 that is identical but for the species Xb being moved to the genus Y as Yb.<span id="more-707"></span></p>
<h3>Classification C1</h3>
<ul>
<li>Family Z
<ul>
<li>Genus X
<ul>
<li>Species Xa</li>
<li>Species Xb</li>
</ul>
</li>
<li>Genus Y
<ul>
<li>Species Yc</li>
</ul>
</li>
</ul>
</li>
</ul>
<h3>Classification C2</h3>
<ul>
<li>Family Z
<ul>
<li>Genus X
<ul>
<li>Species Xa</li>
</ul>
</li>
<li>Genus Y
<ul>
<li>Species Yc</li>
<li>Species Yb
<ul>
<li>syn: Xb</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>It doesn&#8217;t matter here whether C1 comes before or after C2 historically or whether one is preferred by the current expert over the other. The mere fact that both these classifications exist and that they <strong>may</strong> have been used to score data in different studies is enough for us, in the biodiversity community, to have to account for them.</p>
<h2>What is comparable between these two classifications?</h2>
<p>If we can say that some of the taxa in these two classifications are equivalent then we can map them using owl:equivalentClass or owl:sameAs assertions. Unfortunately, because it isn&#8217;t clear how higher taxa are defined, it is not possible to put our finger on what has changed.</p>
<h3>Hypothesis 1: The Genera Are The Same</h3>
<p>Genera are circumscribed either by (1) the list of species they contain (denotation, or membership by extension), (2) their written description (connotation, or membership by intension), or (1&amp;2) a combination of the two.</p>
<p>If 1 (denotation) then the genera have to be different because they have different membership between classifications. If 2 (connotation) then the genera have to have changed because the description of X <em>sensu</em> C1 included species Xb but genus X <em>sensu</em> C2 excludes it, and the complementary argument applies to genus Y. If 1&amp;2 then both the above arguments apply.</p>
<p>Therefore we must reject hypothesis 1 &#8211; the genera are not equivalents.</p>
<h3>Hypothesis 2: The Families Are The Same</h3>
<p>Families are circumscribed by 1 (denotation), 2 (connotation) or 1&amp;2. Here the family contains genera of the same names in both classifications but we have just established that, although these genera bear the same names, they are in fact not equivalents. If we circumscribe families by denotation then Z is not equivalent between the two classifications because the genera are different.</p>
<p>Circumscribing families by connotation is a little more tricky. In this very simplified example, if the family description in C1 contained the variation in X and Y it will still contain it in C2. We could merely have tweaked the genus descriptions by moving descriptors/characters from one genus to the other. Z <em>sensu</em> C1 and Z <em>sensu</em> C2 have the same membership in terms of species (though this is uncertain because we haven&#8217;t established that the species are the same). In real life it is unlikely that families in one classification will recognize exactly the same genera as in other classifications and so the same argument as was used in hypothesis 1 would apply.</p>
<p>Therefore we must largely reject hypothesis 2 &#8211; the families probably are different although there are cases where they may be considered the same.</p>
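<p>The membership arguments in hypotheses 1 and 2 can be checked mechanically. The following is only a sketch using plain Python sets rather than OWL, and it assumes (for the sake of the comparison) that epithet &#8220;b&#8221; is the same species whether it is called Xb or Yb. The variable names are illustrative.</p>

```python
# Each genus is modelled purely by its list of member species.
# Species are keyed by epithet only, assuming for this sketch that
# epithet "b" is the same species whether it is called Xb or Yb.
C1 = {"X": frozenset({"a", "b"}), "Y": frozenset({"c"})}
C2 = {"X": frozenset({"a"}), "Y": frozenset({"c", "b"})}

# Hypothesis 1: same-named genera compared by their membership.
genus_same = {genus: C1[genus] == C2[genus] for genus in C1}

# Hypothesis 2, family Z circumscribed by the genera it contains.
family_same_by_genera = set(C1.values()) == set(C2.values())

# Hypothesis 2, family Z circumscribed by the species it contains.
species_c1 = frozenset().union(*C1.values())
species_c2 = frozenset().union(*C2.values())
family_same_by_species = species_c1 == species_c2

print(genus_same)
print(family_same_by_genera)
print(family_same_by_species)
```

<p>Under these assumptions both genera differ, the family differs when defined by its genera, and the family only matches when defined by its species, which is the one case where it &#8220;may be considered the same&#8221;.</p>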
<h3>Hypothesis 3: The Unmoved Species Are The Same</h3>
<p>Species Xa and Yc have not moved and so we could assume they are the same but this involves a whopper of an assumption. <strong>We must assume that a species description is free standing.</strong> By free standing I mean it does not derive some of its descriptors/characters from the genus description. We have just established (I hope) that genera change. If the species descriptions are bound to the genera then we would have to assume the species change as well &#8211; unless we examine them on a case by case basis to see what has changed. In real life taxonomists do not always repeat all the characters of the genus in every species description although some do.</p>
<p>A safer bet would be to take a <em>sensu lato</em> approach so Xa <em>sensu lato</em> includes all interpretations of Xa (both <em>sensu</em> C1 and <em>sensu</em> C2). A specimen identified to Xa <em>sensu</em> C1 can safely be assumed to belong to Xa <em>sensu lato</em> and it may belong to Xa <em>sensu</em> C2 but we can&#8217;t assert that for definite.</p>
<p>Therefore we probably have to reject hypothesis 3 &#8211; the unmoved species may be the same but we can&#8217;t guarantee it so should take a more cautious approach.</p>
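<p>A small sketch of what the <em>sensu lato</em> approach lets us conclude: each interpretation of Xa is recorded as a subclass of Xa <em>sensu lato</em>, and we only accept what follows from those links. The dictionary <code>SUBCLASS_OF</code> and the function <code>provable_classes</code> are hypothetical names, and plain Python stands in for an OWL reasoner.</p>

```python
# Subclass links: every interpretation of Xa sits under Xa sensu lato.
SUBCLASS_OF = {
    "Xa sensu C1": {"Xa sensu lato"},
    "Xa sensu C2": {"Xa sensu lato"},
}


def provable_classes(identified_as):
    """Return every class a specimen provably belongs to.

    Only membership that follows from the subclass links is returned;
    absence from the result means "not provable", not "false".
    """
    found = {identified_as}
    todo = [identified_as]
    while todo:
        current = todo.pop()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in found:
                found.add(parent)
                todo.append(parent)
    return found


classes = provable_classes("Xa sensu C1")
print(sorted(classes))
```

<p>A specimen identified to Xa <em>sensu</em> C1 is provably in Xa <em>sensu lato</em>, but nothing here lets us place it in Xa <em>sensu</em> C2.</p>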
<h3>Hypothesis 4: Synonymous Species Are The Same</h3>
<p>Species Xb and Yb are the same species just moved between genera and so we can assume they are equivalent &#8211; but only if we make the same assumption we made for hypothesis 3.</p>
<p>Therefore we have to at least partially reject hypothesis 4 &#8211; synonymous species may be the same but only by making a huge assumption and it would be better to take a <em>sensu lato</em> approach.</p>
<p>It appears there is no straight (i.e. equivalence) mapping that can be done between the two classifications.</p>
<h2>Disjoint Siblings</h2>
<p>There is another powerful reason we can&#8217;t join two classifications using equality relationships (i.e. owl:sameAs). In any single taxonomic classification a particular specimen should belong to only one taxon. All the taxa at each rank are disjoint from each other: meaning they don&#8217;t overlap at all. Nothing can be a member of two taxa at the same rank at the same time.</p>
<p>If we import two contradictory classifications into the same ontology and we assert that all taxa that have the same names are equivalent then we will generate ambiguity errors &#8211; the resultant ontology will be logically inconsistent.</p>
<p>Take the current example. Suppose we say that all taxa having the same names are equivalents, including the species Xb and Yb (the binomial only changes because of the rules of nomenclature &#8211; most zoologists would consider them to have the same name). Genera X and Y are disjoint (nothing can belong to both at the same time) but species b belongs to both genus X and Y &#8211; buzzzzz &#8211; logic error please disambiguate before continuing&#8230;.</p>
<p>This says a lot about why taxonomy gets so confusing as soon as you try to step outside a single classification world view and, as we will always have to account for classifications changing through time (even if we were all working on a single consensus classification), we will always have to handle multiple classifications. An uncharitable interpretation is that the current (traditional) approach to biological classification simply isn&#8217;t fit for purpose any more.</p>
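<p>The clash described above can be reproduced with a few lines of code. This is a sketch in plain Python rather than a real OWL reasoner: it merges same-named species across C1 and C2 and reports any species that ends up in more than one genus, which is what the disjointness of sibling genera forbids. The names <code>placements</code> and <code>find_conflicts</code> are made up for the example.</p>

```python
# Genus placement of each species in each classification. Xb and Yb
# are merged into one species "b", as the same-name mapping would do.
placements = {
    "C1": {"a": "X", "b": "X", "c": "Y"},
    "C2": {"a": "X", "b": "Y", "c": "Y"},
}


def find_conflicts(placements):
    """Return the species that land in more than one genus.

    Sibling genera are disjoint, so any species listed here would make
    the merged ontology logically inconsistent.
    """
    genera_of = {}
    for mapping in placements.values():
        for species, genus in mapping.items():
            genera_of.setdefault(species, set()).add(genus)
    return {
        species: sorted(genera)
        for species, genera in genera_of.items()
        if len(genera) != 1
    }


conflicts = find_conflicts(placements)
print(conflicts)
```

<p>Only species b is reported, placed in both X and Y, so the naive same-name mapping cannot be kept alongside disjoint genera.</p>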
<h2>Synonymous Relationships</h2>
<p>What we don&#8217;t have in our example are any rejected names. Suppose a classification C3 where we sink Xb into Xa.</p>
<h3>Classification C3</h3>
<ul>
<li>Family Z
<ul>
<li>Genus X
<ul>
<li>Species Xa
<ul>
<li>syn: Xb</li>
<li>syn: Xe</li>
<li>syn: Qf</li>
</ul>
</li>
</ul>
</li>
<li>Genus Y
<ul>
<li>Species Yc</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>What does this mean? How can we combine it with C1 and C2? In a strict nomenclatural sense it means that the types of Xa and Xb now fall within the same species and that Xa is the older name. The author of C3 clearly has a single vision for Xa/Xb in which Xb is some sub-part of it. That sub-part is at minimum a single specimen (the holotype) and at maximum the whole taxon (where the author effectively thinks Xa is a sub-part of Xb but Xa has the older name). As biodiversity infonauts we can&#8217;t know the answer to this beyond knowing to treat Xb <em>sensu</em> C3 as a subclass (or subset) of Xa <em>sensu</em> C3. The synonyms Xe and Qf can be treated similarly. In this way species synonyms form a layer in the taxonomy just as subspecies do but there is an important difference. Synonyms are not disjoint from each other. The same specimen can be a member of multiple synonyms at the same time. Remember that Xb, Xe and Qf also exist in other classifications where they are accepted taxa and that they are joined to these classifications in the same way we joined the species in the above example. Xb <em>sensu lato</em> is equivalent to the union of Xb <em>sensu</em> C1 and Xb <em>sensu</em> C2.</p>
<p>We can treat all synonyms as subclasses, as the arguments here apply equally, so Xb <em>sensu</em> C2 is also a subclass of Yb <em>sensu</em> C2.</p>
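<p>To make the difference between synonyms and disjoint siblings concrete, here is a sketch that models each taxon in C3 as a set of specimens. The specimen identifiers (s1 to s4) and the function <code>synonyms_are_subclasses</code> are invented for illustration; real data would not be this tidy.</p>

```python
# Hypothetical specimen sets for the accepted species and its synonyms
# in classification C3.
xa_c3 = {"s1", "s2", "s3", "s4"}
synonyms_c3 = {
    "Xb": {"s1", "s2"},
    "Xe": {"s2", "s3"},
    "Qf": {"s4"},
}


def synonyms_are_subclasses(accepted, synonyms):
    """True if every synonym is a subset of the accepted taxon."""
    return all(members.issubset(accepted) for members in synonyms.values())


all_subclasses = synonyms_are_subclasses(xa_c3, synonyms_c3)

# Synonyms may overlap: specimen s2 belongs to both Xb and Xe.
shared = synonyms_c3["Xb"].intersection(synonyms_c3["Xe"])

print(all_subclasses)
print(sorted(shared))
```

<p>Every synonym sits inside Xa <em>sensu</em> C3, yet Xb and Xe share a specimen, which would be illegal if they were treated as disjoint siblings.</p>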
<h2>Practical Strategy For Multiple Classifications In OWL</h2>
<p>Higher taxa don&#8217;t seem to mean much (this will make some people&#8217;s blood boil). It is therefore probably safest to simply treat all taxa above the level of species as &#8216;tags&#8217; or simple classes that are not mutually disjoint. That makes it safe to provide them all with owl:sameAs relationships to some common list of higher taxa. This means that all the subclassing assertions, as genera move about between families in different classifications etc., will just be additive and will allow discovery of species by any route that has been used. Species can happily belong to multiple genera and genera to multiple families upwards. This is similar to a SKOS type vocabulary or semantic network approach.</p>
<p>At species level and below we define <em>sensu lato</em> taxa for each <strong>name</strong> and these taxa are defined as the union of all the taxa (including when they occur as synonyms) for that name.</p>
<p>Species are disjoint from each other within any one classification.</p>
<p>Synonyms at species level and below are subclasses of accepted taxa but are not disjoint from their siblings.</p>
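<p>The species-level rules above can be sketched as a function that turns several classifications into a set of assertions. This is plain Python emitting tuples, not real OWL output. The function <code>build_axioms</code> and its input layout are assumptions for illustration, and the <em>sensu lato</em> union is approximated by one subClassOf link per member, which is weaker than a true union definition. Higher taxa (the &#8216;tags&#8217;) are left out to keep it short.</p>

```python
def build_axioms(classifications):
    """Turn {classification: {accepted name: [synonyms]}} into assertions.

    Hypothetical helper: returns a set of (relation, subject, object).
    """
    axioms = set()
    for cls, species in classifications.items():
        accepted = sorted(species)
        # Accepted species are pairwise disjoint within one classification.
        for i, first in enumerate(accepted):
            for second in accepted[i + 1:]:
                axioms.add(("disjointWith",
                            f"{first} sensu {cls}", f"{second} sensu {cls}"))
        for name, synonyms in species.items():
            # Every usage of a name falls under that name sensu lato.
            axioms.add(("subClassOf",
                        f"{name} sensu {cls}", f"{name} sensu lato"))
            for syn in synonyms:
                # Synonyms are subclasses of the accepted taxon and are
                # never declared disjoint from their siblings.
                axioms.add(("subClassOf",
                            f"{syn} sensu {cls}", f"{name} sensu {cls}"))
                axioms.add(("subClassOf",
                            f"{syn} sensu {cls}", f"{syn} sensu lato"))
    return axioms


axioms = build_axioms({
    "C1": {"Xa": [], "Xb": [], "Yc": []},
    "C2": {"Xa": [], "Yb": ["Xb"], "Yc": []},
    "C3": {"Xa": ["Xb", "Xe", "Qf"], "Yc": []},
})
print(len(axioms))
```

<p>With the three example classifications this yields, among others, the assertion that Xb <em>sensu</em> C2 is a subclass of Yb <em>sensu</em> C2, and no disjointness between the synonyms in C3.</p>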
<p>A more radical, and simpler, approach would be to abandon the notion of disjoint sibling classes and simply say that everything that has the same name (including homotypic binomial taxa as having the same name) is the same (owl:sameAs) but continue to treat all synonyms as subclasses of accepted taxa. This would be throwing away information that we do have (disjointness and the notion of a <em>sensu lato</em> taxon as used here) but may produce a more understandable ontology.</p>
<p>Either of these approaches should produce a logically consistent ontology (containing an arbitrary number of possibly contradictory classifications) that can be reasoned over in finite time using an OWL DL inference engine and could act as the basis for a global biological classification registry. Indeed such a registry could possibly support multiple methods of binding different classifications.</p>
<p>Whether inference across such ontologies could produce anything worthwhile is another matter entirely and something that needs researching.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hyam.net/blog/archives/707/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>
