Rogue Scholar

Published February 18, 2011

Author Roderic Page

Quick note to express the frustration I experience sometimes when dealing with taxonomic literature.

Atlas Of Living Australia, Australian Faunal Directory, Google, Pagerank, Search, Computer and Information Sciences

Why is the Atlas of Living Australia is invisible to Google?

https://doi.org/10.59350/j5sn7-kws35

Published February 6, 2011

Author Roderic Page

Jeff Atwood, one of the co-founders of Stack Overflow, recently wrote a blog post, "Trouble In the House of Google", where he noted that several sites that scrape Stack Overflow content (which Stack Overflow's CC-BY-SA license permits) appear higher in Google's search rankings than the original Stack Overflow pages. When Stack Overflow chose the CC-BY-SA license they made the assumption that: [...] Jeff Atwood's post goes on to argue that something

BioStor, OpenURL, Screencast, Video, Web Hooks, Computer and Information Sciences

Web Hooks and OpenURL: the screencast

https://doi.org/10.59350/9dejt-p7t32

Published February 4, 2011

Author Roderic Page

Yesterday I posted notes on Web Hooks and OpenURL. That post was written when I was already late (you know, when you say to yourself "yeah, I've got time, it'll just take 5 minutes to finish this..."). The Web Hooks + OpenURL project is still very much a work in progress, but I thought a screencast would help explain why I think this is going to make my life a lot easier.

OpenURL, Programming, Web Hooks, Computer and Information Sciences

Web Hooks and OpenURL: making databases editable

https://doi.org/10.59350/2aj9w-gch38

Published February 3, 2011

Author Roderic Page

For me, one of the most frustrating things about online databases is that they often can't be edited. For example, I've recently created a version of the Australian Faunal Directory on CouchDB, which contains a list of all animals in Australia, and a fairly comprehensive bibliography of taxonomic publications on those animals. What I'd like to do is locate those publications online.

Australian Faunal Directory, BHL, CSS, Interface, Internet Explorer, Computer and Information Sciences

Quantum treemaps meet BHL and the Australian Faunal Directory

https://doi.org/10.59350/vgcn3-21t24

Published January 18, 2011

Author Roderic Page

One of the things I'm enjoying about the Australian Faunal Directory on CouchDB is the chance to play with some ideas without worrying about breaking lots of code or, indeed, upsetting any users ('cos, let's face it, there aren't any). As a result, I can start to play with ideas that may one day find their way into other projects. One of these ideas is to use quantum treemaps to display an author's publications.

Cool URIs, Crossref, DOI, Domain Names, Identifiers, Computer and Information Sciences

The demise of phthiraptera.org and the perils of using Internet domain names as identifiers

https://doi.org/10.59350/v5jjp-2mm35

Published January 14, 2011

Author Roderic Page

Geoffrey Bilder's comments about the unsuitability of URLs as long-term identifiers (as opposed, say, to DOIs) came to mind when I discovered that the domain phthiraptera.org is up for sale. This domain used to be home to a wealth of resources on lice (order Phthiraptera). I discovered that ownership of the domain had expired when a bunch of links to PDFs returned by an iSpecies search for Collodennyus all bounced to the holding page

Creative Commons, The Plant List, Computer and Information Sciences

Why won't The Plant List won't let me do this?

https://doi.org/10.59350/6pkgr-gbk94

Published January 11, 2011

Author Roderic Page

In my last post I discussed why I thought the decision of The Plant List to use a restrictive license (CC-BY-NC-ND) was such a poor choice. CC-BY-NC-ND states that [...] To make this point more concrete, I've created this site, Experiments with The Plant List, to show the kinds of things that The Plant List's choice of license prevents the taxonomic community from doing.

Creative Commons, Data, Kew, License, MOBOT, Computer and Information Sciences

The Plant List: nice data, shame it's not open

https://doi.org/10.59350/n4c3e-cnf74

Published December 29, 2010

Author Roderic Page

The Plant List (http://www.theplantlist.org/) has been released today, complete with glowing press releases. The list includes some 1,040,426 names. I eagerly looked for the Download button, but none is to be found.

BHL, Google, Names, OCR, Peter Norvig, Computer and Information Sciences

BHL and OCR

https://doi.org/10.59350/mt9qk-tww14

Published December 23, 2010

Author Roderic Page

Some quick notes on OCR. Revisiting my DjVu viewer experiments, it really struck me how "dirty" the OCR text is. It's readable, but if we were to display the OCR text rather than the images, it would be a little off-putting.

BHL, BioStor, Computer and Information Sciences

BioStor one year on: has it been a success?

https://doi.org/10.59350/r49f7-sgy24

Published December 20, 2010

Author Roderic Page

One year ago I released BioStor, which scratched my itch regarding finding articles in the Biodiversity Heritage Library. This anniversary seems to be a good time to think about where next with this project, but also to ask whether it's been successful.

iPhylo

Why metadata matters
