Description
This dataset contains fish specimen collection records compiled from field surveys and curated for use in databases such as GBIF and OBIS. The original data were sourced from CSV and Excel files and processed in R to clean and reformat the records, standardize taxonomic and temporal information, and prepare catalog numbers for integration into biodiversity repositories.
Data Records
The data in this occurrence resource has been published as a Darwin Core Archive (DwC-A), which is a standardized format for sharing biodiversity data as a set of one or more data tables. The core data table contains 394 records.
This IPT archives the data and thus serves as the data repository. The data and resource metadata are available for download in the downloads section. The versions table lists other versions of the resource that have been made publicly available and allows tracking changes made to the resource over time.
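As a rough illustration of what "a set of one or more data tables" means in practice, the sketch below reads the core table out of a Darwin Core Archive using only the Python standard library. It relies on the `meta.xml` descriptor that a DwC-A carries. The function names `read_core_records` and `build_demo_archive`, the core file name `occurrence.txt`, the tab-delimited layout with a single header row, and the demo rows are all assumptions made for this example and are not taken from this resource.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace of the Darwin Core text descriptor (meta.xml).
NS = {"dwc": "http://rs.tdwg.org/dwc/text/"}


def read_core_records(archive_bytes):
    """Return the core table of a Darwin Core Archive as a list of dicts.

    Assumes the core file is tab-delimited UTF-8 with one header row.
    """
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        meta = ET.fromstring(archive.read("meta.xml"))
        # meta.xml names the file that holds the core table.
        location = meta.find("dwc:core/dwc:files/dwc:location", NS).text
        text = archive.read(location).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def build_demo_archive():
    """Build a tiny in-memory archive with made-up rows (demo only)."""
    meta = (
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">'
        '<core rowType="http://rs.tdwg.org/dwc/terms/Occurrence" '
        'ignoreHeaderLines="1">'
        "<files><location>occurrence.txt</location></files>"
        "</core></archive>"
    )
    rows = "id\tcatalogNumber\n1\tDEMO-1\n2\tDEMO-2\n"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("meta.xml", meta)
        archive.writestr("occurrence.txt", rows)
    return buffer.getvalue()


records = read_core_records(build_demo_archive())
print(len(records), records[0]["catalogNumber"])  # prints: 2 DEMO-1
```

For a real download, the same function could be given the bytes of the archive file. A production reader would also honor the delimiter, encoding, and field indexes declared in `meta.xml` instead of assuming them.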
Versions
The table below shows only published versions of the resource that are publicly accessible.
How to cite
Researchers should cite this work as follows:
Avery T, Oni E (2025). AUW Ichthyology. Version 2.0. Acadia University. Occurrence dataset. https://doi.org/10.5886/yhgsrk
Rights
Researchers should respect the following rights statement:
The publisher and rights holder of this work is Acadia University. To the extent possible under law, the publisher has waived all rights to these data and has dedicated them to the Public Domain (CC0 1.0). Users may copy, modify, distribute and use the work, including for commercial purposes, without restriction.
GBIF Registration
This resource has been registered with GBIF, and assigned the following GBIF UUID: 3c326b89-69fb-442d-a4a1-1bd54d633cac. Acadia University publishes this resource, and is itself registered in GBIF as a data publisher endorsed by Canada Biodiversity Information Facility.
Keywords
Occurrence; Specimen
Contacts
- Metadata Provider
- Editor
- Data manager
Geographic Coverage
Canada | Bahamas
Bounding Coordinates | South West [-90, -180], North East [90, 180] |
---|
Taxonomic Coverage
No Description available
Order | Gadiformes, Percopsiformes, Gasterosteiformes, Scleralcyonacea, Saccopharyngiformes, Tetraodontiformes, Stomiiformes, Acipenseriformes, Salmoniformes, Acanthuriformes, Mugiliformes, Aulopiformes, Myliobatiformes, Myxiniformes, Osmeriformes, Squaliformes, Syngnathiformes, Beloniformes, Scorpaeniformes, Perciformes, Esociformes, Lamniformes, Rajiformes, Siluriformes, Cypriniformes, Anguilliformes, Petromyzontiformes, Pleuronectiformes, Clupeiformes, Myctophiformes |
---|---|
Family | Liparidae, Psychrolutidae, Eurypharyngidae, Mugilidae, Agonidae, Rajidae, Sphyraenidae, Percopsidae, Carangidae, Myctophidae, Ictaluridae, Mopseidae, Synaphobranchidae, Anguillidae, Sternoptychidae, Gerreidae, Cyclopteridae, Triglidae, Pomacentridae, Monacanthidae, Lotidae, Nemichthyidae, Squalidae, Percidae, Stomiidae, Scophthalmidae, Chlorophthalmidae, Phycidae, Exocoetidae, Polynemidae, Howellidae, Esocidae, Etmopteridae, Catostomidae, Blenniidae, Bothidae, Gonostomatidae, Cetorhinidae, Ostraciidae, Paralepididae, Osmeridae, Scaridae, Salmonidae, Tetraodontidae, Pleuronectidae, Cyprinidae, Achiridae, Cichlidae, Sciaenidae, Macrouridae, Alosidae, Stromateidae, Ophichthidae, Zoarcidae, Ammodytidae, Sebastidae, Acipenseridae, Polyodontidae, Cynoglossidae, Muraenidae, Petromyzontidae, Stichaeidae, Pholidae, Paralichthyidae, Gempylidae, Serrivomeridae, Bramidae, Cottidae, Sparidae, Urotrygonidae, Acanthuridae, Labridae, Chaetodontidae, Gasterosteidae, Myxinidae, Lutjanidae, Syngnathidae, Gadidae, Hemitripteridae, Cryptacanthodidae, Centrarchidae, Merlucciidae |
Temporal Coverage
Start Date / End Date | 1836-01-01 / 2019-07-23 |
---|
Additional Metadata
Metadata Document: Fish Collection Data Cleaning and Transformation
Project Title: Data Cleaning and Standardization of Fish Collection Records
Prepared By: Eniola Oni
Last Updated: August 14, 2025
________________________________________
1. Description
This dataset contains fish specimen collection records compiled from field surveys and curated for use in databases such as GBIF and OBIS. The original data were sourced from CSV and Excel files and processed in R to clean and reformat the records, standardize taxonomic and temporal information, and prepare catalog numbers for integration into biodiversity repositories.
________________________________________
2. File Descriptions
File Name | Description |
---|---|
New_fish_database(Sheet1).csv | Raw dataset initially imported and cleaned. |
Fish_Database_Masterb.csv | Intermediate cleaned version with standardized column names. |
FISH_DATABASE_CORRECTED.xlsx | Corrected version with collection and determined dates separated into day/month/year. |
FISH_DATABASE_FIXED.xlsx | Final cleaned version with catalog numbers, genus-species combinations, and standardized month values. |
FISH_DATABASE_FIXEDdd.xlsx | Exported version ready for integration or sharing. |
________________________________________
3. Software & Libraries Used
• R version ≥ 4.0.0
• Libraries: readr, readxl, writexl, dplyr, tidyr, stringr
________________________________________
4. Data Cleaning & Transformation Summary
a. Column Renaming
Field names were standardized for clarity and consistency:
• authority → authority_agent_first_name
• date_collection → start_date
• depth → Bottom_distance
• gear → Collection_object_citation
• End_date → determined_date_1
• Determined_by → determiner_first_name_1
• number → Count_amount
• time → Start_time
• notes → Remarks
• collector → Collector_first_name
b. Species Separation
The binomial species name was split into:
• genus
• species
The two parts were then combined again as the scientificName used in the GBIF format:
FISH_DATABASE_FIXED$names <- paste(FISH_DATABASE_FIXED$`genus 1`, FISH_DATABASE_FIXED$`species 1`)
c. Date Handling
• collection_date and determined_date fields were separated into day, month, and year.
• Month names (e.g., "Jun", "Dec") were converted to numeric format using str_replace_all.
d. Catalog Number Standardization
Rules for catalog number formatting:
• If year missing → prefix with 2000
• If number missing → suffix with F0000 (0001, 0002, etc.)
• Hyphens normalized with spaces: "-" → " - "
• For catalog numbers that Specify still considers invalid, add 0 in front of F (e.g., 2012 - F03023)
e. NA Handling
Empty cells in date components were replaced with blank strings to avoid issues with database ingestion:
replace_na(list(determined_day = "", ...))
________________________________________
5. Important Notes for Database Integration (e.g., Specify)
Variable | Notes |
---|---|
Count_amount | Text 6 under Collection Object Attribute |
gear | Text 7 under Collection Object Attribute |
authority | Authority First Name 1 |
depth | Text 9 under Collection Object Attribute |
collection date | Text 11 under Collection Object Attribute |
determined date | Text 12 under Collection Object Attribute |
collection_day, collection_month, collection_year | Number 1, 2, integer 1 under Collection Object |
determined_day, determined_month, determined_year | Number 11, 12, 13 under Collection Object Property |
________________________________________
6. Data Dictionary
Variable | Unit | Description |
---|---|---|
verified | | Internal flag for data review |
check | | Internal flag for manual checks |
obis | Yes/No | Indicates if uploaded to OBIS |
num | | Index to help users follow catalog numbers |
cat_no | | Museum catalog/identification number |
order | | Taxonomic order |
family | | Taxonomic family |
species | | Species name (binomial) |
authority | | Name of the person who described the species |
locality | | Collection location (site name) |
latitude | Degrees | Geographic latitude |
longitude | Degrees | Geographic longitude |
date_collection | Date | When the specimen was collected |
collector | | Person who collected the specimen |
depth | meters | Collection depth in the water column |
gear | | Equipment used to collect the specimen |
determined | | Identifier of the specimen |
date_determined | Date | Date when specimen was identified |
number | Count | Number of specimens in the lot |
station number | | Rarely used, mostly irrelevant |
time | HH:MM | Time of specimen collection |
cruise | | Identifying term for sampling trip |
notes | | Miscellaneous notes |
________________________________________
7. GBIF Publishing
Column Title | Darwin Core Mapping |
---|---|
Remarks | Remarks |
Institution name | Institution ID |
Count amount | Individual count |
Gear | Preparations |
Authority | Identifiedby |
Species | SpecificEpithet |
Collector | Recorded by |
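The date-handling and catalog-number rules in section 4 of the metadata document above can be sketched in code. The project itself used R (`str_replace_all`, `replace_na`); the block below is an illustrative Python sketch under stated assumptions, not the authors' script. The function names `split_date` and `normalize_catalog_number`, the three-letter English month abbreviations, the accepted separators, and the reading of "year missing" as "does not start with a four-digit year" are all assumptions. Only two of the four catalog rules are covered, and the sample values are made up.

```python
import re

# Assumed month abbreviations (the source only mentions "Jun" and "Dec").
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def split_date(raw):
    """Split a date such as '12-Jun-1998' into day, month, and year strings.

    Components that are missing stay as "" (blank strings instead of NA).
    Only month names are recognized as months; purely numeric dates such
    as '06/12/1998' are ambiguous and not handled by this sketch.
    """
    day = month = year = ""
    for token in re.split(r"[\s/,-]+", (raw or "").strip()):
        key = token[:3].title()
        if key in MONTHS:
            month = str(MONTHS[key])      # month name -> number
        elif re.fullmatch(r"\d{4}", token):
            year = token                  # four digits -> year
        elif re.fullmatch(r"\d{1,2}", token):
            day = str(int(token))         # one or two digits -> day
    return {"day": day, "month": month, "year": year}


def normalize_catalog_number(raw):
    """Apply two rules: spaced hyphens and a default year of 2000."""
    # Exactly one space on each side of every hyphen.
    value = re.sub(r"\s*-\s*", " - ", raw.strip())
    # Assumed meaning of "year missing": no leading four-digit year.
    if not re.match(r"\d{4} - ", value):
        value = "2000 - " + value
    return value


print(split_date("12-Jun-1998"))
# prints: {'day': '12', 'month': '6', 'year': '1998'}
print(normalize_catalog_number("2012-F3023"))  # prints: 2012 - F3023
print(normalize_catalog_number("F0417"))       # prints: 2000 - F0417
```

Returning blank strings for missing components mirrors the NA handling described in section 4e, so a partial date such as "Dec 2001" yields an empty day instead of failing.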
Alternative Identifiers | 10.5886/yhgsrk |
---|---|
 | 3c326b89-69fb-442d-a4a1-1bd54d633cac |
 | https://data.canadensys.net/ipt/resource?r=auw-ichthyology |