Overview

This repository contains code for investigating how often manuscripts in Ecology and Evolutionary Biology that cite the R software language make their R code available. The R scripts that this work relies upon are contained in the folder 'R_scripts'. The data generated by this work (which includes both scripted and manual components) are stored in the 'data' folder. The 'figures' folder contains figures produced for a related manuscript (in review). For more information, see the preprint at: https://www.authorea.com/doi/full/10.22541/au.170003886.68548206/v1

Important Data

Citation Data

The main data file in this repository is cite_data.RDS. This is an RDS file containing information on citation counts for R files and associated predictor variables. Many of the variables are returned by the rscopus package (https://cran.r-project.org/web/packages/rscopus/index.html), which queries the Scopus API. Metadata on the fields returned by the Scopus API are available at https://dev.elsevier.com/sc_apis.html. Below, we provide information on the fields which are NOT returned by the Scopus API (i.e., data which we collected ourselves).

uid = A unique ID assigned to each record.
r_scripts_available = A binary variable (yes/no) describing whether any R code was shared as part of the publication.
r_used = A binary variable (yes/no) describing whether R was used in the publication (as opposed to simply referenced without being used).
data_available = A binary variable (yes/no) describing whether the full data underlying the publication were included.
comments = Unstructured comments about the record. This may contain information about why a judgement was made or where code was found.
code location = Text string describing where the code was located. Options include: NA, "SI", "figshare", "website", "appendix", "dryad", "github", "Github", "zenodo", "environmental data initiative", "sciencebase.gov", "mendeley data", "osf", "bitbucket"
code format = Text string describing the format in which the code was shared. Options include: NA, "word", "pdf", "R", "typeset text", "rtf", "txt", "rmd"
code license = Text string describing the license for the shared code, if any. Note that the string "NA" means that a license was not specified, whereas a missing value (NA) means that we did not check. Options include: NA, "NA", "GPL", "CC0", "CC-BY", "MIT", "Open", "copyright"
n = A numeric index variable used to stratify randomization.
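
Because the license field uses both the string "NA" and a true missing value, the two need to be separated explicitly when tabulating. The following is a minimal sketch of one way to do that in base R. The vector `code_license` and its values are hypothetical stand-ins, and the actual column name in cite_data.RDS may differ.

```r
# Hypothetical values mimicking the "code license" field; these are not real records.
code_license <- c("MIT", "NA", NA, "GPL", "NA", NA)

# A true missing value: the license was not checked.
not_checked <- is.na(code_license)

# The literal string "NA": the license was checked, but none was specified.
not_specified <- !is.na(code_license) & code_license == "NA"

# Any other non-missing string: a license was stated.
has_license <- !is.na(code_license) & code_license != "NA"

counts <- c(
  not_checked   = sum(not_checked),
  not_specified = sum(not_specified),
  has_license   = sum(has_license)
)
print(counts)
```

An RDS file preserves this distinction. Exporting the data to CSV and reading it back with the default settings of `read.csv` would turn the string "NA" into a missing value, so the two cases could no longer be told apart.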

See https://dev.elsevier.com/sc_apis.html for information on the following fields:

title
author
year
doi
journal
issn
volume
pages
date
display_date
citations
article_type
open_access
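
As a rough illustration of how the fields above could be used, the sketch below loads an RDS file and computes the share of records using R that also shared R code. It builds a small toy data frame (invented values, not real records) so that it runs without the repository. With the repository checked out, the file would presumably be read with `readRDS("data/cite_data.RDS")`; that path is an assumption based on the folder description above.

```r
# Toy stand-in for cite_data.RDS; all values are invented for illustration only.
toy <- data.frame(
  uid = 1:6,
  r_used = c("yes", "yes", "yes", "no", "yes", NA),
  r_scripts_available = c("yes", "no", "no", "no", "yes", NA),
  stringsAsFactors = FALSE
)

# Write the toy data to a temporary RDS file, then read it back.
rds_path <- tempfile(fileext = ".RDS")
saveRDS(toy, rds_path)
cite_data <- readRDS(rds_path)

# Keep only records where R was actually used (not merely cited).
used_r <- cite_data[!is.na(cite_data$r_used) & cite_data$r_used == "yes", ]

# Proportion of those records that shared any R code.
share_with_code <- mean(used_r$r_scripts_available == "yes", na.rm = TRUE)
print(share_with_code)
```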

Impact Factor Data

The other important data file in this repository is impact_factor.csv. This is a CSV file containing information on the impact factors of journals used in this work, as recorded on June 16, 2023. This information on impact factor was provided by the R package "scholar" (https://cran.r-project.org/web/packages/scholar/index.html). Below we provide information on the fields included.

needed_journals = The list of journals submitted to the scholar R package. These were extracted from the "journal" field of the file cite_data.RDS (see above).
Journal = The journal title matched by scholar.
Cites = The number of citations of that journal.
ImpactFactor = The journal's impact factor.
Eigenfactor = The journal's Eigenfactor.
dist = The distance between the submitted journal name and the returned journal name, as calculated by scholar.
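
The fields above suggest a natural way to attach impact factors to the citation records: match `needed_journals` against the `journal` field and treat matches with a large `dist` with caution. The sketch below shows this with base R on toy data. All values are invented placeholders, the 0.1 cutoff is an arbitrary assumption, and this is not necessarily how the analysis scripts in this repository perform the join.

```r
# Toy stand-in for impact_factor.csv; the numbers are placeholders, not real impact factors.
toy_if <- data.frame(
  needed_journals = c("Ecology", "Evolution", "Oikos"),
  Journal = c("ECOLOGY", "EVOLUTION", "OIKOS"),
  ImpactFactor = c(1.0, 2.0, 3.0),
  dist = c(0, 0, 0.2),
  stringsAsFactors = FALSE
)

# Round-trip through a temporary CSV file to mimic reading impact_factor.csv.
csv_path <- tempfile(fileext = ".csv")
write.csv(toy_if, csv_path, row.names = FALSE)
impact_factor <- read.csv(csv_path, stringsAsFactors = FALSE)

# Toy stand-in for the uid and journal fields of cite_data.RDS.
toy_cites <- data.frame(
  uid = 1:4,
  journal = c("Ecology", "Oikos", "Evolution", "Ecology"),
  stringsAsFactors = FALSE
)

# Keep only close name matches; the 0.1 threshold is an arbitrary choice.
close_matches <- impact_factor[impact_factor$dist <= 0.1, ]

# Left join: every citation record is kept, unmatched journals get NA.
merged <- merge(
  toy_cites,
  close_matches[, c("needed_journals", "ImpactFactor")],
  by.x = "journal", by.y = "needed_journals",
  all.x = TRUE
)
print(merged)
```

Using a left join keeps records whose journal had no sufficiently close match, so they can be inspected rather than silently dropped.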

Important Code

There are two important R scripts in this repository: 1_data_collection.R and 2_analyses_and_figures.R. The former file was used to select publications for the study (along with relevant metadata). The latter file contains code underlying analyses and visualizations.
