Rogue Scholar

Published July 23, 2012

Half-baked idea time. Thinking about projects such as the Earth Microbiome Project and Genomic Observatories, the recent GBIC2012 meeting (I'm still digesting that meeting), and mulling over the book A Vast Machine, I keep thinking about the possible parallels between climate science and biodiversity science.

GBIC2012, GBIF, Ontology, Computer and Information Sciences

Post GBIC2012 thoughts

https://doi.org/10.59350/yywhk-57n80

Published July 6, 2012

Author Roderic Page

I'm back from Copenhagen and GBIC2012.

Biodiversity Informatics, Bowker, GBIC2012, GBIF, Planet Management, Computer and Information Sciences

Planet management, GBIF, and the future of biodiversity informatics

https://doi.org/10.59350/zhnm6-0vf92

Published June 29, 2012

Author Roderic Page

Next week I'm in Copenhagen for GBIC, the Global Biodiversity Informatics Conference. The goal of the conference is to: The collaboration referred to is the agreement to mobilise data and informatics capability to meet the Aichi Biodiversity Targets. I confess I have mixed feelings about the upcoming meeting. There will be something like 100 people attending the conference, with backgrounds ranging from pure science to intergovernmental policy.

BHL, Errors, Fictional Taxa, GBIF, Google, Computer and Information Sciences

Fictional taxa

https://doi.org/10.59350/qy4n8-y2770

Published June 18, 2012

Author Roderic Page

Anyone who works with taxonomic databases knows that they contain errors. Some taxonomic databases are restricted in scope to a particular taxon in which one or more people have expertise; these then get aggregated into larger databases, which may in turn be aggregated by databases whose scope is global.

GBIF, Linking, Linkout, NCBI, TreeBASE, Computer and Information Sciences

Linking NCBI taxonomy to GBIF

https://doi.org/10.59350/sg04y-k2b09

Published June 2, 2012

Author Roderic Page

In response to Rutger Vos's question I've started to add GBIF taxon ids to the iPhylo Linkout website. If you've not come across iPhylo Linkout, it's a Semantic Mediawiki-based site where I maintain links between the NCBI taxonomy and other resources, such as Wikipedia and the BBC Nature Wildlife finder. For more background see Page, R. D. M. (2011). Linking NCBI to Wikipedia: a wiki-based approach. PLoS Currents, 3, RRN1228.

BioStor, Classification, Data Cleaning, Error, GBIF, Computer and Information Sciences

The GBIF classification is broken — how do we fix it?

https://doi.org/10.59350/5a5re-kp839

Published May 30, 2012

Author Roderic Page

This post arose from an ongoing email conversation with Tony Rees about extracting and annotating taxonomic names. In BioStor I use the GBIF classification to display the taxonomic names found in the OCR text in the form of a tree.

BHL, Biomedical, GBIF, Linking, Mekong River Schistosomiasis, Computer and Information Sciences

BHL and GBIF as biomedical databases

https://doi.org/10.59350/8pp2p-9dh09

Published March 27, 2012

Author Roderic Page

When I think of the Biodiversity Heritage Library (BHL) or GBIF I tend to think of taxonomy and biodiversity. Folk wisdom has it that BHL is full of old books, mostly pre-1923. Great for finding old taxonomic names, or nice artwork, but not exactly "modern" biology. GBIF is mainly about displaying organism distributions based on museum specimens, the primary data of taxonomic research.

Annotation, Error, GBIF, Genbank, Identifiers, Computer and Information Sciences

Yet more reasons to have specimen identifiers: annotating GenBank sequences

https://doi.org/10.59350/k46hh-dz648

Published March 1, 2012

Author Roderic Page

One reason I'm pursuing the theme of specimen identifiers (and identifiers in general) is the central role they play in annotating databases. To give a concrete example, I (among others) have argued for a wiki-style annotation layer on top of GenBank to capture things such as sequencing errors, updated species names, etc. Annotation is a lot easier if we have consistent identifiers for the things being annotated.

BioStor, Digitisation, GBIF, Host, Lice, Computer and Information Sciences

GBIF specimens in BioStor: who are the top ten museums with citable specimens?

https://doi.org/10.59350/d97rd-ea309

Published February 28, 2012

Author Roderic Page

Brief update on yesterday's post about finding specimens in BioStor. BioStor has some 66,000 articles from BHL, from which I've extracted 143,000 cases of a specimen code being cited in the text.

BHL, BioStor, GBIF, Identifiers, Linking, Computer and Information Sciences

Linking GBIF and the Biodiversity Heritage Library

https://doi.org/10.59350/ehbwx-fjv34

Published February 27, 2012

Author Roderic Page

Following on from exploring links between GBIF and GenBank, here I'm going to look at links between GBIF and the primary literature, in this case articles scanned by the Biodiversity Heritage Library (BHL). The OCR text in BHL can be mined for a variety of entities. BHL itself has used uBio's tools to identify taxonomic names in the OCR text, and in my BioStor project I've extracted article-level metadata and geographic co-ordinates.

iPhylo

Microbiome as climate, macrobiome as weather, and a global model of biodiversity
