Description
This dataset contains fish specimen collection records compiled from field surveys and curated for use in databases such as GBIF and OBIS. The original data were sourced from CSV and Excel files and processed in R to clean and reformat the records, standardize taxonomic and temporal information, and prepare catalog numbers for integration into biodiversity repositories.
Data Records
The data in this occurrence resource have been published as a Darwin Core Archive (DwC-A), the standard format for sharing biodiversity data as a set of one or more data tables. The core data table contains 394 records.
This IPT archives the data and thus serves as the data repository. The data and resource metadata are available for download in the downloads section. The versions table lists the other versions of the resource that have been made publicly available and allows changes to the resource to be tracked over time.
Versions
The table below shows only the published versions of the resource that are publicly accessible.
How to cite
Researchers should cite this resource as follows:
Avery T, Oni E (2025). AUW Ichthyology. Version 2.0. Acadia University. Occurrence dataset. https://doi.org/10.5886/yhgsrk
Rights
Researchers should respect the following rights statement:
The publisher and rights holder of this resource is Acadia University. To the extent possible under law, the publisher has waived all rights to these data and has dedicated them to the Public Domain (CC0 1.0). Users may copy, modify, distribute and use the work, including for commercial purposes, without restriction.
GBIF Registration
This resource has been registered with GBIF and assigned the following GBIF UUID: 3c326b89-69fb-442d-a4a1-1bd54d633cac. Acadia University publishes this resource and is itself registered in GBIF as a data publisher endorsed by the Canada Biodiversity Information Facility.
Keywords
Occurrence; Specimen
Contacts
- Metadata Provider
- Publisher
- Data manager
Geographic Coverage
Canada | Bahamas
Bounding Coordinates | South West [-90, -180], North East [90, 180] |
---|
Taxonomic Coverage
No description available
Order | Gadiformes, Percopsiformes, Gasterosteiformes, Scleralcyonacea, Saccopharyngiformes, Tetraodontiformes, Stomiiformes, Acipenseriformes, Salmoniformes, Acanthuriformes, Mugiliformes, Aulopiformes, Myliobatiformes, Myxiniformes, Osmeriformes, Squaliformes, Syngnathiformes, Beloniformes, Scorpaeniformes, Perciformes, Esociformes, Lamniformes, Rajiformes, Siluriformes, Cypriniformes, Anguilliformes, Petromyzontiformes, Pleuronectiformes, Clupeiformes, Myctophiformes |
---|---|
Family | Liparidae, Psychrolutidae, Eurypharyngidae, Mugilidae, Agonidae, Rajidae, Sphyraenidae, Percopsidae, Carangidae, Myctophidae, Ictaluridae, Mopseidae, Synaphobranchidae, Anguillidae, Sternoptychidae, Gerreidae, Cyclopteridae, Triglidae, Pomacentridae, Monacanthidae, Lotidae, Nemichthyidae, Squalidae, Percidae, Stomiidae, Scophthalmidae, Chlorophthalmidae, Phycidae, Exocoetidae, Polynemidae, Howellidae, Esocidae, Etmopteridae, Catostomidae, Blenniidae, Bothidae, Gonostomatidae, Cetorhinidae, Ostraciidae, Paralepididae, Osmeridae, Scaridae, Salmonidae, Tetraodontidae, Pleuronectidae, Cyprinidae, Achiridae, Cichlidae, Sciaenidae, Macrouridae, Alosidae, Stromateidae, Ophichthidae, Zoarcidae, Ammodytidae, Sebastidae, Acipenseridae, Polyodontidae, Cynoglossidae, Muraenidae, Petromyzontidae, Stichaeidae, Pholidae, Paralichthyidae, Gempylidae, Serrivomeridae, Bramidae, Cottidae, Sparidae, Urotrygonidae, Acanthuridae, Labridae, Chaetodontidae, Gasterosteidae, Myxinidae, Lutjanidae, Syngnathidae, Gadidae, Hemitripteridae, Cryptacanthodidae, Centrarchidae, Merlucciidae |
Temporal Coverage
Start Date / End Date | 1836-01-01 / 2019-07-23 |
---|
Additional Metadata
Metadata Document: Fish Collection Data Cleaning and Transformation
Project Title: Data Cleaning and Standardization of Fish Collection Records
Prepared By: Eniola Oni
Last Updated: August 14, 2025
________________________________________
1. Description
This dataset contains fish specimen collection records compiled from field surveys and curated for use in databases such as GBIF and OBIS. The original data were sourced from CSV and Excel files and processed in R to clean and reformat the records, standardize taxonomic and temporal information, and prepare catalog numbers for integration into biodiversity repositories.
________________________________________
2. File Descriptions
File Name | Description |
---|---|
New_fish_database(Sheet1).csv | Raw dataset initially imported and cleaned. |
Fish_Database_Masterb.csv | Intermediate cleaned version with standardized column names. |
FISH_DATABASE_CORRECTED.xlsx | Corrected version with collection and determined dates separated into day/month/year. |
FISH_DATABASE_FIXED.xlsx | Final cleaned version with catalog numbers, genus-species combinations, and standardized month values. |
FISH_DATABASE_FIXEDdd.xlsx | Exported version ready for integration or sharing. |
________________________________________
3. Software & Libraries Used
• R version ≥ 4.0.0
• Libraries: readr, readxl, writexl, dplyr, tidyr, stringr
________________________________________
4. Data Cleaning & Transformation Summary
a. Column Renaming
Field names were standardized for clarity and consistency:
• authority → authority_agent_first_name
• date_collection → start_date
• depth → Bottom_distance
• gear → Collection_object_citation
• End_date → determined_date_1
• Determined_by → determiner_first_name_1
• number → Count_amount
• time → Start_time
• notes → Remarks
• collector → Collector_first_name
b. Species Separation
The binomial species name was split into:
• genus
• species
The two parts were then recombined into the scientificName used in GBIF format:
FISH_DATABASE_FIXED$names <- paste(FISH_DATABASE_FIXED$`genus 1`, FISH_DATABASE_FIXED$`species 1`)
c. Date Handling
• The collection_date and determined_date fields were separated into day, month, and year.
• Month names (e.g., "Jun", "Dec") were converted to numeric format using str_replace_all.
d. Catalog Number Standardization
Rules for catalog number formatting:
• If the year is missing → prefix with 2000
• If the number is missing → suffix with F0000(0001,0002, etc)
• Hyphens normalized with surrounding spaces: "-" → " - "
• For catalog numbers that Specify still considers invalid, add 0 in front of F (e.g., 2012 - F03023)
e. NA Handling
Empty cells in date components were replaced with blank strings to avoid issues with database ingestion:
replace_na(list(determined_day = "", ...))
________________________________________
5. Important Notes for Database Integration (e.g., Specify)
Variable | Notes |
---|---|
Count_amount | Text 6 under Collection Object Attribute |
gear | Text 7 under Collection Object Attribute |
authority | Authority First Name 1 |
depth | Text 9 under Collection Object Attribute |
collection date | Text 11 under Collection Object Attribute |
determined date | Text 12 under Collection Object Attribute |
collection_day, collection_month, collection_year | Number 1, 2, integer 1 under Collection Object |
determined_day, determined_month, determined_year | Number 11, 12, 13 under Collection Object Property |
________________________________________
6. Data Dictionary
Variable | Unit | Description |
---|---|---|
verified | | Internal flag for data review |
check | | Internal flag for manual checks |
obis | Yes/No | Indicates if uploaded to OBIS |
num | | Index to help users follow catalog numbers |
cat_no | | Museum catalog/identification number |
order | | Taxonomic order |
family | | Taxonomic family |
species | | Species name (binomial) |
authority | | Name of the person who described the species |
locality | | Collection location (site name) |
latitude | Degrees | Geographic latitude |
longitude | Degrees | Geographic longitude |
date_collection | Date | When the specimen was collected |
collector | | Person who collected the specimen |
depth | meters | Collection depth in the water column |
gear | | Equipment used to collect the specimen |
determined | | Identifier of the specimen |
date_determined | Date | Date when specimen was identified |
number | Count | Number of specimens in the lot |
station number | | Rarely used, mostly irrelevant |
time | HH:MM | Time of specimen collection |
cruise | | Identifying term for sampling trip |
notes | | Miscellaneous notes |
________________________________________
7. GBIF Publishing
Column title | Darwin Core mapping |
---|---|
Remarks | Remarks |
Institution name | Institution ID |
Count amount | Individual count |
Gear | Preparations |
Authority | Identifiedby |
Species | SpecificEpithet |
Collector | Recorded by |
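
The date handling and NA handling steps described in the metadata document (month abbreviations such as "Jun" and "Dec" converted to numbers, and empty date components replaced with blank strings) can be illustrated with a small sketch. The original processing was done in R with str_replace_all and replace_na; the Python below is only an illustration of the same idea, not the authors' code. The `MONTHS` table and the `month_to_number` helper are hypothetical names, and the sketch assumes three-letter English abbreviations.

```python
import re

# Hypothetical lookup table (an assumption): three-letter English month
# abbreviations mapped to month numbers, stored as text.
MONTHS = {
    "Jan": "1", "Feb": "2", "Mar": "3", "Apr": "4",
    "May": "5", "Jun": "6", "Jul": "7", "Aug": "8",
    "Sep": "9", "Oct": "10", "Nov": "11", "Dec": "12",
}

# Match any abbreviation as a whole word, so longer words are left untouched.
_MONTH_RE = re.compile(r"\b(" + "|".join(MONTHS) + r")\b")


def month_to_number(value):
    """Replace month abbreviations with numbers; map a missing value to ''."""
    if value is None:
        # Mirrors the NA handling step: missing components become blank strings.
        return ""
    return _MONTH_RE.sub(lambda m: MONTHS[m.group(1)], str(value))
```

For example, `month_to_number("Jun")` gives `"6"`, while a value that is already numeric, such as `"7"`, passes through unchanged.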
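
Two of the catalog number rules listed in the metadata document, hyphen normalization and the 2000 prefix for a missing year, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the R code used for the dataset: it assumes catalog numbers follow the "YYYY - Fnnnnn" shape of the example 2012 - F03023, and `normalize_catalog_number` is a hypothetical helper name. The rules for missing numbers and for values that Specify rejects are deliberately left out, because the source does not describe them in enough detail.

```python
import re


def normalize_catalog_number(raw, default_year="2000"):
    """Normalize hyphen spacing and prefix a placeholder year when none is present.

    Assumption: a well-formed value starts with a four-digit year, as in
    the example "2012 - F03023".
    """
    # Put exactly one space on each side of every hyphen.
    text = re.sub(r"\s*-\s*", " - ", raw.strip())
    # Drop a dangling leading hyphen and any trailing whitespace.
    text = text.lstrip(" -").rstrip()
    # Prefix the placeholder year when the value does not start with a year.
    if not re.match(r"\d{4}\b", text):
        text = f"{default_year} - {text}"
    return text
```

Under these assumptions, `normalize_catalog_number("2012-F03023")` returns `"2012 - F03023"` and `normalize_catalog_number("F03023")` returns `"2000 - F03023"`.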
Alternative Identifiers | 10.5886/yhgsrk |
---|---|
3c326b89-69fb-442d-a4a1-1bd54d633cac | |
https://data.canadensys.net/ipt/resource?r=auw-ichthyology |