Name: The challenge of combining 176 x #otherpeoplesdata to create the Biomass And Allometry Database (BAAD)
Start: 2016-06-28T17:21:00-0700
End: 2016-06-28T17:39:00-0700


The challenge of combining 176 x #otherpeoplesdata to create the Biomass And Allometry Database (BAAD)

Despite the hype around "big data", a more immediate problem facing many scientific analyses is that large-scale databases must be assembled from a collection of small, independent, and heterogeneous fragments -- the outputs of many isolated scientific studies conducted around the globe. Together with 92 co-authors, we recently published the Biomass And Allometry Database (BAAD) as a data paper in the journal Ecology, combining data from 176 different scientific studies into a single unified database. BAAD is unique in that the workflow -- from raw fragments to homogenised database -- is entirely open and reproducible. In this talk I introduce BAAD and illustrate solutions (using R) for some of the challenges of working with and distributing lots and lots of #otherpeoplesdata.
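
As a rough illustration of what homogenising heterogeneous fragments can involve, the sketch below maps each study's own column names and units onto one shared schema and then stacks the results. It is not the BAAD workflow, which the talk presents in R; it is written in Python purely for illustration, and every name in it (`STUDY_CONFIG`, `harmonise`, `combine`, the study IDs, the column names, and the unit conversions) is a hypothetical assumption rather than anything taken from BAAD.

```python
# Hypothetical per-study metadata: how each study's raw column names map onto
# a unified schema, and the factor converting its units to the target units.
STUDY_CONFIG = {
    "study_a": {
        "columns": {"ht_cm": "height_m", "mass_g": "biomass_kg"},
        "scale": {"height_m": 0.01, "biomass_kg": 0.001},
    },
    "study_b": {
        "columns": {"Height (m)": "height_m", "Total mass (kg)": "biomass_kg"},
        "scale": {"height_m": 1.0, "biomass_kg": 1.0},
    },
}


def harmonise(study_id, rows, config=STUDY_CONFIG):
    """Rename columns and convert units for one study's raw rows."""
    spec = config[study_id]
    records = []
    for row in rows:
        record = {"study": study_id}  # keep provenance for every record
        for raw_name, unified_name in spec["columns"].items():
            value = row.get(raw_name)
            if value is None or value == "":
                record[unified_name] = None  # missing stays missing
            else:
                record[unified_name] = float(value) * spec["scale"][unified_name]
        records.append(record)
    return records


def combine(fragments, config=STUDY_CONFIG):
    """Stack the harmonised rows of all studies into one list of records."""
    combined = []
    for study_id, rows in fragments.items():
        combined.extend(harmonise(study_id, rows, config))
    return combined


# Two made-up fragments with different column names, units, and a missing value.
fragments = {
    "study_a": [{"ht_cm": "250", "mass_g": "1200"}],
    "study_b": [{"Height (m)": "3.1", "Total mass (kg)": ""}],
}
database = combine(fragments)
```

Keeping the per-study mapping as data rather than as per-study code is one way to make such a pipeline easy to rerun from the raw fragments whenever a study is added or corrected.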
