galah is an R interface to biodiversity data hosted by the ‘living
atlases’, a set of organisations that share a common codebase and act as
nodes of the Global Biodiversity Information Facility (GBIF). These
organisations collate and store observations of individual life forms,
using the ‘Darwin Core’ data standard.
galah enables users to locate and download species observations,
taxonomic information, record counts, or associated media such as images
or sounds. Users can restrict their queries to particular taxa or
locations, specify which columns and rows a query returns, or limit
results to observations that meet particular quality-control criteria.
With a few minor exceptions, all functions return a tibble as their
standard format.
To install from CRAN:
install.packages("galah")
Or install the development version from GitHub:
install.packages("remotes")
remotes::install_github("AtlasOfLivingAustralia/galah")
Load the package:
library(galah)
By default, galah downloads information from the Atlas of Living
Australia (ALA). To show the full list of atlases currently supported by
galah, use show_all(atlases).
show_all(atlases)
## # A tibble: 10 × 4
## atlas institution acronym url
## <chr> <chr> <chr> <chr>
## 1 Australia Atlas of Living Australia ALA https://www.ala.org.au
## 2 Austria Biodiversitäts-Atlas Österreich BAO https://biodiversityatlas.at
## 3 Brazil Sistemas de Informações sobre a Biodiversidade Brasileira SiBBr https://sibbr.gov.br
## 4 Estonia eElurikkus <NA> https://elurikkus.ee
## 5 France Inventaire National du Patrimoine Naturel INPN https://inpn.mnhn.fr
## 6 Guatemala Sistema Nacional de Información sobre Diversidad Biológica de Guatemala SNIBgt https://snib.conap.gob.gt
## 7 Portugal GBIF Portugal GBIF.pt https://www.gbif.pt
## 8 Spain GBIF Spain GBIF.es https://www.gbif.es
## 9 Sweden Swedish Biodiversity Data Infrastructure SBDI https://biodiversitydata.se
## 10 United Kingdom National Biodiversity Network NBN https://nbn.org.uk
Use galah_config() to set the atlas to use. This will automatically
populate the server configuration for your selected atlas. By default,
the atlas is Australia.
galah_config(atlas = "United Kingdom")
Functions that return data from the chosen atlas have the prefix atlas_.
For example, to find the total number of records in the atlas, use:
atlas_counts()
## # A tibble: 1 × 1
## count
## <int>
## 1 114080945
To build more complex queries, start with the galah_call() function and
pipe additional arguments to modify the query. Modifying functions have a
galah_ prefix and support non-standard evaluation (NSE).
galah_call() |>
galah_filter(year >= 2020) |>
atlas_counts()
## # A tibble: 1 × 1
## count
## <int>
## 1 14604620
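Non-standard evaluation means that an expression such as year >= 2020 is captured as written rather than evaluated immediately, which is why year does not need to exist as an R object. The following is a minimal base R sketch of the idea. The helper capture_filter() is hypothetical and is not part of galah; it only illustrates the mechanism and makes no claim about how galah_filter() is implemented.

```r
# Hypothetical helper (not part of galah): captures its argument unevaluated.
capture_filter <- function(expr) {
  captured <- substitute(expr)  # the unevaluated call, not its value
  deparse(captured)             # convert the call back to text
}

# `year` is not defined anywhere, yet this does not raise an error,
# because the expression is never evaluated.
capture_filter(year >= 2020)
## [1] "year >= 2020"
```

A function written this way can translate the captured expression into a query string for a web service instead of evaluating it locally.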
To narrow the search to a particular taxonomic group, use
galah_identify(). Note that this function only accepts scientific names
and is not case sensitive. It’s good practice to first use search_taxa()
to check that the names you provide return the correct taxonomic
results.
search_taxa("reptilia") # Check whether taxonomic info is correct
## # A tibble: 1 × 9
## search_term scientific_name taxon_concept_id rank match…¹ kingdom phylum class issues
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 reptilia REPTILIA https://biodiversity.org.au/afd/taxa/682e1228-5b3c-45ff-833b… class exactM… Animal… Chord… Rept… noIss…
## # … with abbreviated variable name ¹match_type
galah_call() |>
galah_filter(year >= 2020) |>
galah_identify("reptilia") |>
atlas_counts()
## # A tibble: 1 × 1
## count
## <int>
## 1 89643
The most common use case for galah is to download ‘occurrence’ records:
observations of plants or animals made by contributors to the atlas. To
download records, first register with the relevant atlas, then provide
your registration email.
galah_config(email = "email@email.com")
You can then specify the records you require and query the atlas in question:
result <- galah_call() |>
  galah_identify("Litoria") |>
  galah_filter(year >= 2020, cl22 == "Tasmania") |>
  galah_select(basisOfRecord, group = "basic") |>
  atlas_occurrences()
result |> head()
## # A tibble: 6 × 9
## decimalLatitude decimalLongitude eventDate scientificName taxonConceptID recor…¹ dataR…² occur…³ basis…⁴
## <dbl> <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 -43.4 147. 2020-09-04T14:00:00Z Litoria ewingii https://biodiversity.org.a… e8045b… FrogID PRESENT HUMAN_…
## 2 -43.4 146. 2020-01-01T13:00:00Z Litoria ewingii https://biodiversity.org.a… 44187a… FrogID PRESENT HUMAN_…
## 3 -43.4 146. 2020-01-01T13:00:00Z Litoria ewingii https://biodiversity.org.a… bc34a7… FrogID PRESENT HUMAN_…
## 4 -43.4 146. 2020-01-01T13:00:00Z Litoria ewingii https://biodiversity.org.a… ca4707… FrogID PRESENT HUMAN_…
## 5 -43.4 146. 2020-01-01T13:00:00Z Litoria burrowsae https://biodiversity.org.a… 9c71f5… FrogID PRESENT HUMAN_…
## 6 -43.4 146. 2020-01-01T13:00:00Z Litoria ewingii https://biodiversity.org.a… 4bbaad… FrogID PRESENT HUMAN_…
## # … with abbreviated variable names ¹recordID, ²dataResourceName, ³occurrenceStatus, ⁴basisOfRecord
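Field identifiers such as cl22 in the filter above are not self-explanatory. As a sketch, assuming a galah version that provides search_all() and show_values() alongside the show_all() function used earlier, a field can be looked up before it is used in a filter. Both calls need an internet connection, and what they return depends on the selected atlas, so no output is shown here.

```r
library(galah)

# Search the available fields for the identifier used in the filter above.
# Assumption: search_all() takes a type (fields) and a search term.
search_all(fields, "cl22")

# List the values that the matched field can take, to find valid filter terms.
search_all(fields, "cl22") |>
  show_values()
```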
Check out our other vignettes for more detail on how to use these functions.