Data sets

Data sets are increasingly recognized as a form of scholarly communication. They are collections of data gathered for research purposes and made available to other researchers for further analysis.

What is a data set: Understanding the definition of a data set and how it differs from related concepts such as databases and spreadsheets.
Types of data sets: Identifying the different types of data sets, including structured, unstructured, and semi-structured data and their characteristics.
Data set acquisition: Understanding where to find data sets, such as open data repositories or proprietary sources, and how to obtain access to them.
Data set preparation: Techniques for cleaning, transforming, and formatting data sets to ensure their quality and make them more usable.
Data set exploration: Methods for visually and statistically exploring a data set to identify patterns, trends, and potential outliers.
Data set analysis: Techniques for performing analyses on a data set, such as descriptive statistics, regression, clustering, and classification.
Data set visualization: Tools and techniques for creating visualizations that communicate insights from a data set, including charts, graphs, and dashboards.
Data set interpretation: Strategies for making sense of the results obtained from data set analysis and communicating these findings to others.
Data set management: Best practices for storing, backing up, and archiving data sets to ensure their long-term viability and accessibility.
Data set ethics: Understanding ethical considerations pertaining to the collection, use, and sharing of data sets, including issues related to privacy, confidentiality, and intellectual property.
Bibliographic data sets: Includes data related to books, journals, articles, and other scholarly publications that are indexed and searchable.
Citation data sets: Includes data related to the citation of scholarly publications, such as who cited whom, when, and in what context.
Usage data sets: Includes data related to the usage of scholarly publications, such as the number of downloads or views of an article or journal.
Authorship data sets: Includes data related to the authors of scholarly publications, such as their names, affiliations, and publications.
Funding data sets: Includes data related to the funding of scholarly research, such as grants, awards, and other forms of financial support.
Peer review data sets: Includes data related to the peer review process of scholarly publications, such as reviewers' comments and recommendations.
Altmetrics data sets: Includes data related to the impact of scholarly publications beyond traditional citation metrics, such as social media mentions, blog posts, and other online interactions.
Open access data sets: Includes data related to open access scholarly publications, such as the number of open access journals and articles, and their usage and impact.
Publishing data sets: Includes data related to the publishing industry, such as the number of publishers, journals, and articles, and their business models and practices.
Patent data sets: Includes data related to patents and patent filings, such as the number of patents issued, the fields of technology covered, and the companies and inventors involved.
"A data set (or dataset) is a collection of data."
"A data set corresponds to one or more database tables, where every column of a table represents a particular variable, and each row corresponds to a given record of the data set in question."
"Data sets can also consist of a collection of documents or files."
"In the open data discipline, data set is the unit to measure the information released in a public open data repository."
"The European data.europa.eu portal aggregates more than a million data sets."
"The data set lists values for each of the variables, such as for example height and weight of an object, for each member of the data set."
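
The table-based definition quoted above (columns as variables, rows as records) can be made concrete with a small sketch. The example below is only illustrative: the object labels, the measurement values, and the helper names `variables` and `column` are assumptions introduced here, not part of any real data set or library.

```python
from statistics import mean

# Hypothetical illustrative values; not taken from any real data set.
# Each dict is one record (a row); each key is one variable (a column).
records = [
    {"object": "A", "height_cm": 10.0, "weight_kg": 2.0},
    {"object": "B", "height_cm": 20.0, "weight_kg": 4.0},
    {"object": "C", "height_cm": 30.0, "weight_kg": 6.0},
]


def variables(rows):
    """Return the variable (column) names in first-seen order."""
    names = []
    for row in rows:
        for key in row:
            if key not in names:
                names.append(key)
    return names


def column(rows, name):
    """Return the values of one variable across all records."""
    return [row[name] for row in rows]


variable_names = variables(records)
mean_height = mean(column(records, "height_cm"))

print("variables:", variable_names)
print("records:", len(records))
print("mean height (cm):", mean_height)
```

Reading down one key gives every value of a single variable, while reading one dict gives all the values recorded for a single member of the data set. A list of dicts is just one convenient in-memory form; the same structure could equally live in a CSV file or a database table.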