
iPhylo

Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed. ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.

Quick notes on an experimental feature I've added to BioNames. It attempts to identify possible taxonomic synonyms by extracting pairs of names that share the same species name and appear together on the same page of text. The text could be the full text of an open access article, OCR text from BHL, or the title and abstract of an article.
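The idea can be sketched in a few lines of Python. This is only a rough illustration, not the BioNames implementation: the regular expression is a deliberately naive stand-in for a proper name-finding step, the function name `candidate_synonyms` is hypothetical, and "Aus bus" and "Xus bus" are placeholder names rather than real taxa.

```python
import re
from collections import defaultdict
from itertools import combinations

# Assumption: a binomial looks like a capitalised genus followed by a
# lowercase species name. Real name finding is much harder than this.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+)\s+([a-z]{3,})\b")


def candidate_synonyms(page_text):
    """Return pairs of binomials on one page that share a species name."""
    genera_by_species = defaultdict(set)
    for genus, species in BINOMIAL.findall(page_text):
        genera_by_species[species].add(genus)

    pairs = []
    for species, genera in sorted(genera_by_species.items()):
        # Every pair of distinct genera sharing this species name is a candidate.
        for first, second in combinations(sorted(genera), 2):
            pairs.append((f"{first} {species}", f"{second} {species}"))
    return pairs


# Placeholder page text (hypothetical names, not real taxa).
page = "Aus bus Smith, 1900 was later transferred as Xus bus. Compare Aus cus."
print(candidate_synonyms(page))
```

A pattern this simple will also pick up ordinary phrases that happen to start with a capital letter, and it misses species names whose ending changes with the gender of the genus, so the output is a list of candidates to check rather than confirmed synonyms.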


Following on from the discussion of the African chameleon data, I've started to explore Angelique Hjarding's data in more detail. The data is available from figshare (doi:10.6084/m9.figshare.1141858), so I've grabbed a copy and put it on GitHub. Several things are immediately apparent. There is a lot of ungeoreferenced data.


Note to self for upcoming discussion with JournalMap. As of Monday August 25th, BioStor has 106,617 articles comprising 1,484,050 BHL pages. From the full text for these articles, I have extracted 45,452 distinct localities (i.e., geotagged with latitude and longitude). 15,860 BHL pages in BioStor have at least one geotag, and these pages belong to 5,675 BioStor articles. In summary, BioStor has 5,675 full-text articles that are geotagged.
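For what it's worth, the three counts above are just distinct counts over a table of geotags. A minimal sketch, assuming each geotag is stored as an (article id, page id, latitude, longitude) tuple; that layout and the name `summarise_geotags` are made up for illustration and are not how BioStor stores its data:

```python
def summarise_geotags(geotags):
    """Count distinct localities, geotagged pages, and geotagged articles.

    Assumes each geotag is an (article_id, page_id, latitude, longitude) tuple.
    """
    localities = {(lat, lon) for _, _, lat, lon in geotags}
    pages = {page_id for _, page_id, _, _ in geotags}
    articles = {article_id for article_id, _, _, _ in geotags}
    return {
        "localities": len(localities),
        "pages": len(pages),
        "articles": len(articles),
    }


# Toy data: two articles, three pages, one locality mentioned twice.
toy = [
    ("article-1", "page-10", -1.5, 36.8),
    ("article-1", "page-11", -1.5, 36.8),
    ("article-2", "page-20", 4.2, 9.7),
]
print(summarise_geotags(toy))
```

Using the figures quoted above, 5,675 of 106,617 articles works out to roughly 5.3% of BioStor articles having at least one geotag.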


This is a guest post by Angelique Hjarding in response to discussion on this blog about the paper below. Thank you for highlighting our recent publication and for the very interesting comments. We wanted to take the opportunity to address some of the issues raised in both your review and the reader comments. One of the most important of these is the sharing of cleaned and vetted datasets.