Compare commits

...

14 Commits

Author SHA1 Message Date
Nemo cdfda9714f Create FUNDING.yml 2022-05-30 14:49:58 +05:30
Nemo ed11dff030 1.2.4 2022-01-04 09:43:59 +05:30
Nemo 014231895c [deps] Dependency updates 2022-01-04 09:43:58 +05:30
Nemo 98721bf6ef v1.2.3 2020-07-19 19:14:11 +05:30
Nemo ab0f565a8c Update LICENSE date 2020-04-26 07:02:49 +05:30
Nemo c654c33f91 Fixes dc:date element 2020-01-28 02:24:33 +05:30
Nemo 884cdb42eb Document 1.2.1 features 2020-01-28 00:47:17 +05:30
Nemo a46bb80ac8 General improvements 2020-01-28 00:45:41 +05:30
Nemo b97fa62e7c 1.2.0 2020-01-16 14:44:08 +05:30
Nemo 4fc2818eea Adds Docker support (#2) 2020-01-16 14:39:34 +05:30
Nemo 2969530996 Adds Docker support 2020-01-16 14:14:41 +05:30
Nemo 1491153452 Updates usage 2020-01-16 13:43:01 +05:30
Nemo 83e383b003 v1.1.0 2020-01-15 19:17:49 +05:30
Nemo 7906f151e6 Adds support for cover image and title 2020-01-15 19:13:56 +05:30
11 changed files with 2235 additions and 1172 deletions

2
.dockerignore Normal file

@ -0,0 +1,2 @@
node_modules/
TODO.md

3
.github/FUNDING.yml vendored Normal file

@ -0,0 +1,3 @@
ko_fi: captn3m0
liberapay: captn3m0
github: captn3m0

34
CHANGELOG.md Normal file

@ -0,0 +1,34 @@
# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [1.2.3]
### Changed
- Dependency updates
## [1.2.2]
### Fixed
- Fixed `dc:date` element (publication date) in the generated EPUB
## [1.2.1]
### Added
- Default filename is now generated
- Added publisher and source tags in the generated EPUB
### Fixed
- Fixed a bug where the cover image was getting reused
## [1.2.0]
### Added
- Docker Image published
## 1.1.0
### Added
- External cover image
- Configurable title
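
The 1.2.1 entry "Default filename is now generated" corresponds to the `epubPath` fallback in the generate hunk further down, which slugifies the last path segment of the URL. A minimal sketch of that idea follows, under stated assumptions: `simpleSlug` is a simplified stand-in for the `slugify` package (its real rules differ), and `defaultEpubPath` is a hypothetical helper name that does not appear in the diff.

```javascript
const path = require("path");

// Simplified stand-in for the `slugify` package used in the diff.
// Assumption for illustration only: the real package applies different rules.
function simpleSlug(text) {
  return text
    .replace(/[^A-Za-z0-9]+/g, "-") // collapse any other characters into one dash
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
}

// Hypothetical helper: keep an explicit output path, else derive one from the URL.
function defaultEpubPath(url, epubPath) {
  if (epubPath) {
    return epubPath;
  }
  // basename() ignores trailing slashes, so ".../some-article/" yields "some-article".
  return simpleSlug(path.posix.basename(url)) + ".epub";
}

console.log(
  defaultEpubPath(
    "https://www.tor.com/2019/02/06/articulated-restraint-mary-robinette-kowal/"
  )
);
// prints "articulated-restraint-mary-robinette-kowal.epub"
```

Any falsy `epubPath` triggers the fallback, which is why the `output` option can default to `false` in the CLI definition.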

13
Dockerfile Normal file

@ -0,0 +1,13 @@
FROM node:10-slim
VOLUME /data
COPY . /app
WORKDIR /app
RUN npm ci && npm link
RUN apt-get update && apt-get install --yes pandoc
ENTRYPOINT ["/usr/local/bin/url-to-epub"]


@ -1,7 +1,7 @@
Copyright 2019 Abhay Rana
Copyright 2020 Abhay Rana
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


@ -1,5 +1,7 @@
# url-to-epub
![npm](https://img.shields.io/npm/v/url-to-epub?style=flat-square) ![Docker Build Status](https://img.shields.io/docker/build/captn3m0/url-to-epub?style=flat-square) ![GitHub issues](https://img.shields.io/github/issues/captn3m0/url-to-epub) ![Docker Pulls](https://img.shields.io/docker/pulls/captn3m0/url-to-epub?style=flat-square) ![npm bundle size](https://img.shields.io/bundlephobia/min/url-to-epub?style=flat-square)
A simple script that generates an EPUB from a single URL, taking care of the following:
1. Cover Image
@ -8,6 +10,10 @@ A simple script that generates an EPUB from a single URL, taking care for the fo
## Installation
You can either run this script as an NPM package, or as a docker container.
### NPM
You will need `pandoc` and `npm` installed. To install this script globally:
npm install --global url-to-epub
@ -15,8 +21,14 @@ You will need `pandoc` and `npm` installed. To install this script globally:
Please make sure that the global node_modules/bin directory is added to your PATH.
You can check that directory by running `npm bin --global`.
### Docker
`docker pull captn3m0/url-to-epub:latest`
## Usage
### NPM
```
url-to-epub <url>
@ -26,15 +38,55 @@ Positionals:
url The URL to download [string] [required]
Options:
--version Show version number [boolean]
-h Show help [boolean]
--output, -o Output file to save EPUB [string] [default: "url-to-epub.epub"]
--version Show version number [boolean]
-h Show help [boolean]
--output, -o Output file to save EPUB [string] [default: "url-to-epub.epub"]
--title, -t Title of the book, if not the same as the page title
[string] [default: null]
--cover-url Image URL to download as cover [string] [default: null]
--language, -l A valid language tag
[string] [choices: "af", "af-ZA", "ar", "ar-AE", "ar-BH", "ar-DZ", "ar-EG",
"ar-IQ", "ar-JO", "ar-KW", "ar-LB", "ar-LY", "ar-MA", "ar-OM", "ar-QA",
"ar-SA", "ar-SY", "ar-TN", "ar-YE", "az", "az-AZ", "az-Cyrl-AZ", "be",
"be-BY", "bg", "bg-BG", "bs-BA", "ca", "ca-ES", "cs", "cs-CZ", "cy", "cy-GB",
"da", "da-DK", "de", "de-AT", "de-CH", "de-DE", "de-LI", "de-LU", "dv",
"dv-MV", "el", "el-GR", "en", "en-AU", "en-BZ", "en-CA", "en-CB", "en-GB",
"en-IE", "en-JM", "en-NZ", "en-PH", "en-TT", "en-US", "en-ZA", "en-ZW", "eo",
"es", "es-AR", "es-BO", "es-CL", "es-CO", "es-CR", "es-DO", "es-EC", "es-ES",
"es-GT", "es-HN", "es-MX", "es-NI", "es-PA", "es-PE", "es-PR", "es-PY",
"es-SV", "es-UY", "es-VE", "et", "et-EE", "eu", "eu-ES", "fa", "fa-IR", "fi",
"fi-FI", "fo", "fo-FO", "fr", "fr-BE", "fr-CA", "fr-CH", "fr-FR", "fr-LU",
"fr-MC", "gl", "gl-ES", "gu", "gu-IN", "he", "he-IL", "hi", "hi-IN", "hr",
"hr-BA", "hr-HR", "hu", "hu-HU", "hy", "hy-AM", "id", "id-ID", "is", "is-IS",
"it", "it-CH", "it-IT", "ja", "ja-JP", "ka", "ka-GE", "kk", "kk-KZ", "kn",
"kn-IN", "ko", "ko-KR", "kok", "kok-IN", "ky", "ky-KG", "lt", "lt-LT", "lv",
"lv-LV", "mi", "mi-NZ", "mk", "mk-MK", "mn", "mn-MN", "mr", "mr-IN", "ms",
"ms-BN", "ms-MY", "mt", "mt-MT", "nb", "nb-NO", "nl", "nl-BE", "nl-NL",
"nn-NO", "ns", "ns-ZA", "pa", "pa-IN", "pl", "pl-PL", "ps", "ps-AR", "pt",
"pt-BR", "pt-PT", "qu", "qu-BO", "qu-EC", "qu-PE", "ro", "ro-RO", "ru",
"ru-RU", "sa", "sa-IN", "se", "se-FI", "se-NO", "se-SE", "sk", "sk-SK", "sl",
"sl-SI", "sq", "sq-AL", "sr-BA", "sr-Cyrl-BA", "sr-SP", "sr-Cyrl-SP", "sv",
"sv-FI", "sv-SE", "sw", "sw-KE", "syr", "syr-SY", "ta", "ta-IN", "te",
"te-IN", "th", "th-TH", "tl", "tl-PH", "tn", "tn-ZA", "tr", "tr-TR", "tt",
"tt-RU", "ts", "uk", "uk-UA", "ur", "ur-PK", "uz", "uz-UZ", "uz-Cyrl-UZ",
"vi", "vi-VN", "xh", "xh-ZA", "zh", "zh-CN", "zh-HK", "zh-MO", "zh-SG",
"zh-TW", "zu", "zu-ZA"] [default: "en-US"]
Examples:
url-to-epub -o articulated-restraint.epub
url-to-epub --title "Articulated Restraint" -o articulated-restraint.epub
"https://www.tor.com/2019/02/06/articulated-restraint-mary-robinette-kowal/"
```
### Docker
docker run --user $UID --volume /tmp:/data captn3m0/url-to-epub:latest --output /data/articulated-restraint.epub --cover-url https://i.gr-assets.com/images/S/compressed.photo.goodreads.com/books/1548962694l/43782466._SY475_.jpg https://www.tor.com/2019/02/06/articulated-restraint-mary-robinette-kowal/
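The generated file is written to `/data/articulated-restraint.epub` inside the container, which the volume mount above exposes as `/tmp/articulated-restraint.epub` on the host.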
## HACKING
There is a list of planned items in [TODO.md](TODO.md).
## License
Licensed under the [MIT License](https://nemo.mit-license.org/). See LICENSE file for details.


@ -6,4 +6,5 @@ General ideas for improvement of the script:
4. Remove pandoc dependency
5. Optionally support <https://github.com/mozilla/readability> instead of article-parser. Mozilla's engine is much better.
6. Add more EPUB metadata tag support
7. Pick up the language automatically
7. Pick up the language automatically
8. Better error message without pandoc


@ -3,6 +3,8 @@ const tempFile = require("tempfile");
const nodePandoc = require("node-pandoc-promise");
const fs = require("fs");
const { DownloaderHelper } = require("node-downloader-helper");
const path = require('path');
const slugify = require('slugify');
const getArticle = async url => {
try {
@ -13,24 +15,47 @@ const getArticle = async url => {
}
};
module.exports = (url, epubPath, language="en-US") => {
module.exports = (url, epubPath, title, coverURL, language="en-US") => {
getArticle(url).then(res => {
let xml = `<dc:title id="epub-title-1">${res.title}</dc:title>
<dc:date>${res.published}</dc:date>
title = title ? title : res.title;
epubPath = epubPath ? epubPath : slugify(path.basename(url)) + '.epub';
let date = new Date(Date.parse(res.published));
function pad(number) {
if (number < 10) {
return '0' + number;
}
return number;
}
// Using toISOString() trips Pandoc, which leaves an empty dc:date element instead.
let epubDate = date.getUTCFullYear() +
'-' + pad(date.getUTCMonth() + 1) +
'-' + pad(date.getUTCDate());
let xml = `<dc:title id="epub-title-1">${title}</dc:title>
<dc:date>${epubDate}</dc:date>
<dc:language>${language}</dc:language>
<dc:identifier>${url}</dc:identifier>
<dc:creator id="epub-creator-1" opf:role="aut">${res.author}</dc:creator>`;
<dc:source>${url}</dc:source>
<dc:description>${res.description}</dc:description>
<dc:publisher>${res.source}</dc:publisher>
<dc:creator opf:role="aut">${res.author}</dc:creator>`;
let html = tempFile(".html");
let metadata = tempFile(".xml");
fs.writeFileSync(html, res.content);
fs.writeFileSync(metadata, xml);
const imageUrl = coverURL ? coverURL : res.image;
const imageUrl = res.image;
const dl = new DownloaderHelper(res.image, "/tmp", {
fileName: "epub-to-image.jpg"
const dl = new DownloaderHelper(imageUrl, "/tmp", {
fileName: "epub-to-image.jpg",
override: true
});
dl.start();
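
The date handling added in this hunk can be tried in isolation. The sketch below mirrors the `pad()` logic and emits a `YYYY-MM-DD` string in UTC, which is the form the code comment says Pandoc accepts for `dc:date`. `toEpubDate` is a hypothetical helper name, and returning `null` for unparseable input is an assumption added here; the diff does not handle that case.

```javascript
// Hypothetical helper (not part of the diff): format a parseable date string
// as YYYY-MM-DD in UTC, mirroring the pad() logic in the hunk above.
function toEpubDate(published) {
  const date = new Date(Date.parse(published));
  if (Number.isNaN(date.getTime())) {
    // Assumption: callers decide what to do with an unparseable date.
    return null;
  }
  const pad = n => String(n).padStart(2, "0");
  return [
    date.getUTCFullYear(),
    pad(date.getUTCMonth() + 1), // getUTCMonth() is zero-based
    pad(date.getUTCDate())
  ].join("-");
}

console.log(toEpubDate("2019-02-06T14:00:00+00:00")); // prints "2019-02-06"
```

Because the UTC getters are used, a timestamp late in the evening in a negative-offset zone rolls over to the next calendar day.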

544
index.js

@ -3,269 +3,281 @@ var argv = require("yargs").argv;
const generateEPUB = require("./generate");
require("yargs") // eslint-disable-line
.usage("$0 --output [output] <url>")
.help("h")
.command(
"$0 <url>",
"Generate EPUB file from URL",
yargs => {
yargs
.positional("url", {
describe: "The URL to download",
type: "string"
})
.option("output", {
alias: "o",
type: "string",
default: "url-to-epub.epub",
description: "Output file to save EPUB"
})
.option("language", {
alias: "l",
type: "string",
default: "en-US",
description: "A valid language tag",
choices: [
"af",
"af-ZA",
"ar",
"ar-AE",
"ar-BH",
"ar-DZ",
"ar-EG",
"ar-IQ",
"ar-JO",
"ar-KW",
"ar-LB",
"ar-LY",
"ar-MA",
"ar-OM",
"ar-QA",
"ar-SA",
"ar-SY",
"ar-TN",
"ar-YE",
"az",
"az-AZ",
"az-Cyrl-AZ",
"be",
"be-BY",
"bg",
"bg-BG",
"bs-BA",
"ca",
"ca-ES",
"cs",
"cs-CZ",
"cy",
"cy-GB",
"da",
"da-DK",
"de",
"de-AT",
"de-CH",
"de-DE",
"de-LI",
"de-LU",
"dv",
"dv-MV",
"el",
"el-GR",
"en",
"en-AU",
"en-BZ",
"en-CA",
"en-CB",
"en-GB",
"en-IE",
"en-JM",
"en-NZ",
"en-PH",
"en-TT",
"en-US",
"en-ZA",
"en-ZW",
"eo",
"es",
"es-AR",
"es-BO",
"es-CL",
"es-CO",
"es-CR",
"es-DO",
"es-EC",
"es-ES",
"es-GT",
"es-HN",
"es-MX",
"es-NI",
"es-PA",
"es-PE",
"es-PR",
"es-PY",
"es-SV",
"es-UY",
"es-VE",
"et",
"et-EE",
"eu",
"eu-ES",
"fa",
"fa-IR",
"fi",
"fi-FI",
"fo",
"fo-FO",
"fr",
"fr-BE",
"fr-CA",
"fr-CH",
"fr-FR",
"fr-LU",
"fr-MC",
"gl",
"gl-ES",
"gu",
"gu-IN",
"he",
"he-IL",
"hi",
"hi-IN",
"hr",
"hr-BA",
"hr-HR",
"hu",
"hu-HU",
"hy",
"hy-AM",
"id",
"id-ID",
"is",
"is-IS",
"it",
"it-CH",
"it-IT",
"ja",
"ja-JP",
"ka",
"ka-GE",
"kk",
"kk-KZ",
"kn",
"kn-IN",
"ko",
"ko-KR",
"kok",
"kok-IN",
"ky",
"ky-KG",
"lt",
"lt-LT",
"lv",
"lv-LV",
"mi",
"mi-NZ",
"mk",
"mk-MK",
"mn",
"mn-MN",
"mr",
"mr-IN",
"ms",
"ms-BN",
"ms-MY",
"mt",
"mt-MT",
"nb",
"nb-NO",
"nl",
"nl-BE",
"nl-NL",
"nn-NO",
"ns",
"ns-ZA",
"pa",
"pa-IN",
"pl",
"pl-PL",
"ps",
"ps-AR",
"pt",
"pt-BR",
"pt-PT",
"qu",
"qu-BO",
"qu-EC",
"qu-PE",
"ro",
"ro-RO",
"ru",
"ru-RU",
"sa",
"sa-IN",
"se",
"se-FI",
"se-NO",
"se-SE",
"sk",
"sk-SK",
"sl",
"sl-SI",
"sq",
"sq-AL",
"sr-BA",
"sr-Cyrl-BA",
"sr-SP",
"sr-Cyrl-SP",
"sv",
"sv-FI",
"sv-SE",
"sw",
"sw-KE",
"syr",
"syr-SY",
"ta",
"ta-IN",
"te",
"te-IN",
"th",
"th-TH",
"tl",
"tl-PH",
"tn",
"tn-ZA",
"tr",
"tr-TR",
"tt",
"tt-RU",
"ts",
"uk",
"uk-UA",
"ur",
"ur-PK",
"uz",
"uz-UZ",
"uz-Cyrl-UZ",
"vi",
"vi-VN",
"xh",
"xh-ZA",
"zh",
"zh-CN",
"zh-HK",
"zh-MO",
"zh-SG",
"zh-TW",
"zu",
"zu-ZA"
]
})
.example(
'$0 -o articulated-restraint.epub "https://www.tor.com/2019/02/06/articulated-restraint-mary-robinette-kowal/"'
)
.demandOption(["url"]);
},
argv => {
generateEPUB(argv.url, argv.output);
}
).argv;
.usage("$0 --output [output] --title [title] <url>")
.help("h")
.command(
"$0 <url>",
"Generate EPUB file from URL",
yargs => {
yargs
.positional("url", {
describe: "The URL to download",
type: "string"
})
.option("output", {
alias: "o",
type: "string",
default: false,
description: "Output file to save EPUB"
})
.option("title", {
alias: "t",
type: "string",
default: null,
description:
"Title of the book, if not the same as the page title"
})
.option("cover-url", {
type: "string",
default: null,
description: "Image URL to download as cover"
})
.option("language", {
alias: "l",
type: "string",
default: "en-US",
description: "A valid language tag",
choices: [
"af",
"af-ZA",
"ar",
"ar-AE",
"ar-BH",
"ar-DZ",
"ar-EG",
"ar-IQ",
"ar-JO",
"ar-KW",
"ar-LB",
"ar-LY",
"ar-MA",
"ar-OM",
"ar-QA",
"ar-SA",
"ar-SY",
"ar-TN",
"ar-YE",
"az",
"az-AZ",
"az-Cyrl-AZ",
"be",
"be-BY",
"bg",
"bg-BG",
"bs-BA",
"ca",
"ca-ES",
"cs",
"cs-CZ",
"cy",
"cy-GB",
"da",
"da-DK",
"de",
"de-AT",
"de-CH",
"de-DE",
"de-LI",
"de-LU",
"dv",
"dv-MV",
"el",
"el-GR",
"en",
"en-AU",
"en-BZ",
"en-CA",
"en-CB",
"en-GB",
"en-IE",
"en-JM",
"en-NZ",
"en-PH",
"en-TT",
"en-US",
"en-ZA",
"en-ZW",
"eo",
"es",
"es-AR",
"es-BO",
"es-CL",
"es-CO",
"es-CR",
"es-DO",
"es-EC",
"es-ES",
"es-GT",
"es-HN",
"es-MX",
"es-NI",
"es-PA",
"es-PE",
"es-PR",
"es-PY",
"es-SV",
"es-UY",
"es-VE",
"et",
"et-EE",
"eu",
"eu-ES",
"fa",
"fa-IR",
"fi",
"fi-FI",
"fo",
"fo-FO",
"fr",
"fr-BE",
"fr-CA",
"fr-CH",
"fr-FR",
"fr-LU",
"fr-MC",
"gl",
"gl-ES",
"gu",
"gu-IN",
"he",
"he-IL",
"hi",
"hi-IN",
"hr",
"hr-BA",
"hr-HR",
"hu",
"hu-HU",
"hy",
"hy-AM",
"id",
"id-ID",
"is",
"is-IS",
"it",
"it-CH",
"it-IT",
"ja",
"ja-JP",
"ka",
"ka-GE",
"kk",
"kk-KZ",
"kn",
"kn-IN",
"ko",
"ko-KR",
"kok",
"kok-IN",
"ky",
"ky-KG",
"lt",
"lt-LT",
"lv",
"lv-LV",
"mi",
"mi-NZ",
"mk",
"mk-MK",
"mn",
"mn-MN",
"mr",
"mr-IN",
"ms",
"ms-BN",
"ms-MY",
"mt",
"mt-MT",
"nb",
"nb-NO",
"nl",
"nl-BE",
"nl-NL",
"nn-NO",
"ns",
"ns-ZA",
"pa",
"pa-IN",
"pl",
"pl-PL",
"ps",
"ps-AR",
"pt",
"pt-BR",
"pt-PT",
"qu",
"qu-BO",
"qu-EC",
"qu-PE",
"ro",
"ro-RO",
"ru",
"ru-RU",
"sa",
"sa-IN",
"se",
"se-FI",
"se-NO",
"se-SE",
"sk",
"sk-SK",
"sl",
"sl-SI",
"sq",
"sq-AL",
"sr-BA",
"sr-Cyrl-BA",
"sr-SP",
"sr-Cyrl-SP",
"sv",
"sv-FI",
"sv-SE",
"sw",
"sw-KE",
"syr",
"syr-SY",
"ta",
"ta-IN",
"te",
"te-IN",
"th",
"th-TH",
"tl",
"tl-PH",
"tn",
"tn-ZA",
"tr",
"tr-TR",
"tt",
"tt-RU",
"ts",
"uk",
"uk-UA",
"ur",
"ur-PK",
"uz",
"uz-UZ",
"uz-Cyrl-UZ",
"vi",
"vi-VN",
"xh",
"xh-ZA",
"zh",
"zh-CN",
"zh-HK",
"zh-MO",
"zh-SG",
"zh-TW",
"zu",
"zu-ZA"
]
})
.example(
'$0 --title "Articulated Restraint" -o articulated-restraint.epub "https://www.tor.com/2019/02/06/articulated-restraint-mary-robinette-kowal/"'
)
.demandOption(["url"]);
},
argv => {
generateEPUB(argv.url, argv.output, argv.title, argv['cover-url'], argv.language);
}
).argv;

2689
package-lock.json generated

File diff suppressed because it is too large


@ -1,18 +1,24 @@
{
"name": "url-to-epub",
"version": "1.0.0",
"version": "1.2.4",
"description": "A single tool to generate a standards-compliant EPUB from a webpage. Zero config. Requires pandoc",
"main": "index.js",
"bin": {
"url-to-epub": "./index.js"
},
"repository": {
"type": "git",
"url": "https://github.com/captn3m0/url-to-epub.git"
},
"funding": "https://paypal.me/captn3m0",
"author": "Nemo",
"license": "MIT",
"dependencies": {
"article-parser": "^4.1.1",
"node-downloader-helper": "^1.0.11",
"article-parser": "^4.2.1",
"node-downloader-helper": "^1.0.13",
"node-pandoc-promise": "0.0.6",
"slugify": "^1.4.4",
"tempfile": "^3.0.0",
"yargs": "^15.1.0"
"yargs": "^15.4.1"
}
}