Compare commits

2 Commits

Author  SHA1        Message                     Date
Nemo    e6c2c49dd4  Adds DOI and citation       2024-01-23 14:00:54 +05:30
Nemo    040bf1c1c5  add note about methodology  2024-01-23 13:40:07 +05:30
2 changed files with 38 additions and 1 deletions

CITATION.cff Normal file

@@ -0,0 +1,18 @@
cff-version: 1.2.0
message: "If you use this dataset, please cite it as below."
authors:
  - family-names: "Rana"
    given-names: "Abhay"
    orcid: https://orcid.org/0000-0002-7993-1363
title: "Stackshare Dataset, by Nemo"
version: 2023.01.23
url: "https://github.com/captn3m0/stackshare-dataset"
date-released: 2023-07-17
preferred-citation:
  url: "https://github.com/captn3m0/stackshare-dataset"
  type: data
  authors:
    - family-names: "Rana"
      given-names: "Abhay"
      orcid: "https://orcid.org/0000-0002-7993-1363"
  title: "Stackshare Dataset, by Nemo"


@@ -1,4 +1,6 @@
# stackshare-dataset
# stackshare-dataset [![DOI](https://zenodo.org/badge/747065915.svg)](https://zenodo.org/doi/10.5281/zenodo.10554436)
**DOI**: `10.5281/zenodo.10554437`
A dataset from stackshare.io providing lists of packages and various services. While a list of packages for
various ecosystems is easily available elsewhere, a list of services is much harder to find.
@@ -24,3 +26,20 @@ As long as you:
* **Attribute**: You must attribute any public use of the database, or works produced from the database, in the manner specified in the ODbL. For any use or redistribution of the database, or works produced from it, you must make clear to others the license of the database and keep intact any notices on the original database.
* **Share-Alike**: If you publicly use any adapted version of this database, or works produced from an adapted database, you must also offer that adapted database under the ODbL.
* **Keep open**: If you redistribute the database, or an adapted version of it, then you may use technological measures that restrict the work (such as DRM) as long as you also redistribute a version without such measures.
## Generating
Ensure you have GNU Make, Python, and wget installed:
```
make tools.csv
make packages.csv
```
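Before running the targets above, it can help to confirm that the prerequisites are on your `PATH`. The following is a minimal sketch, not part of this repository; the executable names in `REQUIRED` (in particular `python3`) are assumptions and may differ on your platform.

```python
import shutil


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Assumed executable names; adjust them for your platform.
REQUIRED = ["make", "python3", "wget"]

if __name__ == "__main__":
    missing = missing_tools(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```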
The scraper uses the following as sources:
1. Sitemap (https://stackshare.io/sitemap.xml)
2. StackShare Search for enriching service results (https://stackshare.io/search)
The package results are not enriched, since much better data for those is available elsewhere.
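As an illustration of the first source, the sketch below extracts page URLs from sitemap XML using only the Python standard library. It is not the scraper shipped in this repository. The function name `sitemap_locations` and the sample URLs are hypothetical, and the input is assumed to follow the standard sitemaps.org schema.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_locations(xml_text):
    """Return every <loc> URL found in a sitemap or sitemap index."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iter(f"{SITEMAP_NS}loc")
        if loc.text
    ]


# Hypothetical sample input; the real sitemap will contain different URLs.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://stackshare.io/example-tool</loc></url>
  <url><loc>https://stackshare.io/example-package</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in sitemap_locations(SAMPLE):
        print(url)
```

Because `root.iter` walks the whole tree, the same function also collects the nested sitemap URLs when the input is a sitemap index rather than a URL set.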