Continuing the theme of taxonomic classification in Wikipedia, I'm perversely delighted that Wikipedia demonstrates Gregg's paradox so nicely. The late John R. Gregg wrote several papers and a book exploring the logical structure of taxonomy.
Wikipedia is wonderful, but parts of it are horribly broken. Take, for example, taxonomic classifications. A classification is a rooted tree, which means that each node in the tree (other than the root) has a single parent. We can store trees in databases in a variety of ways. For example, for each node we could store a list of its children, or we could store the single unique parent of each node. Ideally we'd choose to store one or the other, but not both.
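To make the point about storing one or the other concrete, here is a minimal sketch in Python. It is not how Wikipedia stores classifications; the taxon names are just a toy example, and the function names `children_from_parents` and `inconsistencies` are made up for illustration. The idea is that if we store only the parent of each node, the child lists can always be derived, whereas a separately stored child list can drift out of step with the parents.

```python
from collections import defaultdict

# Toy classification (illustrative only): each node maps to its single parent.
# The root has no parent, so it maps to None.
parent = {
    "Animalia": None,
    "Chordata": "Animalia",
    "Amphibia": "Chordata",
    "Anura": "Amphibia",
}


def children_from_parents(parent):
    """Derive the child lists from the parent map, so only one copy is stored."""
    children = defaultdict(list)
    for node, p in parent.items():
        if p is not None:
            children[p].append(node)
    return dict(children)


def inconsistencies(parent, children):
    """Return (parent, child) pairs where a separately stored child list
    disagrees with the parent map."""
    problems = []
    for p, kids in children.items():
        for kid in kids:
            if parent.get(kid) != p:
                problems.append((p, kid))
    return problems


derived = children_from_parents(parent)

# A hand-maintained child list that has drifted: it claims Anura is a
# child of Animalia, but the parent map says its parent is Amphibia.
stored_children = {"Animalia": ["Chordata", "Anura"]}
conflicts = inconsistencies(parent, stored_children)
```

Storing both directions means every edit has to update two places, and a check like `inconsistencies` is what you end up needing once they disagree.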
Time for a quick and dirty Friday afternoon hack. Based on responses to the BHL timeline I released two days ago, I've created a version that can compare the history of two names using sparklines (created using Google's Chart API). I use sparklines to give a quick summary of hits over time (grouped by decade). The demo is here. It's crude (minimal error checking, no progress bars while it talks to BHL), but it's home time.
Stumbled across this cool visualisation project by Petra Isenberg at Calgary University. Collaborative tree comparison uses a tabletop system to enable two (or more) people to interact when comparing (in this case) phylogenies.
One thing about the Encyclopedia of Life which bugs me no end is the awful way it displays the bibliography generated from the Biodiversity Heritage Library (BHL). The image on the right shows the bibliography for the frog Hyla rivularis Taylor, 1952. It's one long, alphabetical list of pages. How can a user make sense of this?
Hot on the heels of Geoffrey Nunberg's essay about the train wreck that is Google Books metadata (see my earlier post) comes Google Scholar’s Ghost Authors, Lost Authors, and Other Problems by Péter Jacsó. It's a fairly scathing look at some of the problems with the quality of Google Scholar's metadata. Now, Google Scholar isn't perfect, but it's come to play a key role in a variety of bibliographic tools, such as Mendeley and Papers.
I've been playing recently with the Biodiversity Heritage Library (BHL), and am starting to get a sense for the complexities (and limitations) of the metadata BHL stores about publications.
At the start of this week I took part in a biodiversity informatics workshop at the Naturhistoriska riksmuseet, organised by Kevin Holston. It was a fun experience, and Kevin was a great host, going out of his way to make sure that I and the other contributors were looked after.
Stumbled across Alex Wild's post Pyramica vs Strumigenys: why does it matter?, which takes as its starting point a minor edit war on the Wikipedia page for Pyramica. Alex gives the background to the argument about whether Pyramica is a synonym of Strumigenys, and investigates the issue using the surprisingly small amount of data available in GenBank.
Andrew Su has posted an analysis of Gene Wiki, a project to provide Wikipedia pages on every human gene. This result is interesting in that an existing resource (Gene Cards) beats Wikipedia, but only just.