Compare commits

29 Commits
v1.0.0 ... main

Author SHA1 Message Date
Nemo bb122223fd Create FUNDING.yml 2022-05-27 07:22:57 +00:00
Vonter 2c386a3f2f Add README badges 2022-01-26 11:10:00 +05:30
Nemo e62284f3b0 update changelog 2021-12-31 13:15:36 +05:30
Nemo 55bfb6e26b Merge pull request #20 from captn3m0/python-upgrade 2021-12-30 17:29:41 +05:30
Nemo be985dd40b [dep] switch from html5 to html5lib 2021-12-30 17:18:03 +05:30
Nemo c614de7efc [ci] Run tests on python3.10 2021-12-30 17:02:30 +05:30
Nemo f617c6fde5 Add Installation instructions 2021-07-21 18:17:00 +00:00
    Closes #19
Nemo 5167dd4c8a Merge pull request #18 from captn3m0/old-python 2021-07-16 17:07:24 +05:30
    Support older python releases
Nemo dd8129aa2d Fix for older Python 2021-07-16 17:05:27 +05:30
Nemo 3ea18ff01b [tests] Add tests for argument parser 2021-07-16 16:57:09 +05:30
Nemo 2db41250f6 docs: Update docs to mention remote URL support 2021-07-05 13:34:45 +05:30
Nemo cc2a58bddc Add Tests (#13) 2021-07-04 07:27:18 +00:00
    Basic functional tests that cover 90% of the usecases.
    Doesn't cover zoomlevel, remote fetch yet.
Vonter af4752bee1 Merge pull request #11 from captn3m0/feature/external_url 2021-06-27 20:51:10 +05:30
    Add basic implementation of external URL fetching of PDFs
Vonter 052060d256 Fix setup.cfg 2021-06-27 17:57:38 +05:30
    Included validators
Vonter e70166efc2 Fix logged filename for locally cached file 2021-06-27 17:43:09 +05:30
Vonter 31faa1a36c Add external URL fetching of PDFs 2021-06-27 17:33:49 +05:30
    Also changed import order according to PEP8
Vonter ebc9c1e0cf Update README.rst 2021-06-27 00:15:26 +05:30
    Fixed attribute table
Vonter 1324c2e4aa Merge pull request #10 from Vonter/feature/page_filter 2021-06-27 00:12:17 +05:30
    Add PDF page selection/filter
Vonter 487e1002d4 Make defaultEnd correspond to absolute page number 2021-06-27 00:03:57 +05:30
Vonter 096b1f6be2 Add PDF page selection/filter 2021-06-26 22:56:38 +05:30
Nemo 4f505efde2 Add link to wiki 2021-06-26 18:05:47 +05:30
Nemo ab4d938909 Release: v1.0.2 2021-06-26 17:57:51 +05:30
Vonter 1b8185cfd0 Added PDF rotation filter 2021-06-26 17:51:22 +05:30
Nemo c1a2926ce2 Update CHANGELOG 2021-05-29 02:26:11 +05:30
Nemo 3a9971c77a Fixes the case of heading less markdown 2021-05-29 02:18:43 +05:30
Nemo f61de619bc Move around test templates 2021-05-29 00:40:46 +05:30
Nemo d834964d4e Add --help output to README 2021-05-29 00:34:36 +05:30
Nemo a04625683c Add help for --cleanup 2021-05-29 00:32:56 +05:30
Nemo 860065d99b Remove tox again 2021-05-29 00:16:55 +05:30
26 changed files with 425 additions and 89 deletions

3
.github/FUNDING.yml vendored Normal file
View File

@ -0,0 +1,3 @@
ko_fi: captn3m0
liberapay: captn3m0
github: captn3m0

29
.github/workflows/tests.yml vendored Normal file
View File

@ -0,0 +1,29 @@
name: Run Tests
on: push
jobs:
  python:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python: ["3.7", "3.8", "3.9", "3.10"]
    env:
      PYTHON_VERSION: ${{matrix.python}}
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python ${{matrix.python}}
        uses: actions/setup-python@v2
        with:
          python-version: ${{matrix.python}}
      - name: Install deps
        run: |
          python -m pip install --upgrade pip
          pip install -e .[testing]
      - name: Run pytest
        run: |
          pytest --cache-clear --cov=./ --cov-report=xml --cov-report=html
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v1
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
          files: ./coverage.xml
          env_vars: RUNNER_OS,PYTHON_VERSION,CI,GITHUB_SHA,RUNNER_OS,GITHUB_RUN_ID

View File

@ -2,6 +2,28 @@
Changelog
=========
Version 1.0.4
=============
- Switched from `html5` to `html5lib` as a dependency, since the former is unmaintained
- Python 3.10 is now supported
- Python 3.6 is no longer supported
Version 1.0.3
=============
- Added tests and code coverage
- PDFs can be directly fetched from Remote URLs
- PDFs can be filtered to have start and end pages
- Support for Python 3.6-3.8
- Removed --cleanup argument, since that is default
Version 1.0.2
=============
- Adds support for rotating PDFs
Version 1.0.1
=============
- Fixes a bug where markdown files without any headings would not render.
Version 1.0.0
=============

View File

@ -2,15 +2,59 @@
pystitcher
==========
.. image:: https://img.shields.io/pypi/v/pystitcher
    :target: https://pypi.org/project/pystitcher/
    :alt: PyPI Version

.. image:: https://img.shields.io/pypi/l/pystitcher
    :target: LICENSE.txt
    :alt: Repository License

.. image:: https://img.shields.io/github/checks-status/captn3m0/pystitcher/main
    :target: https://github.com/captn3m0/pystitcher/actions?query=branch%3Amain
    :alt: GitHub branch checks status

.. image:: https://img.shields.io/codecov/c/gh/captn3m0/pystitcher
    :target: https://app.codecov.io/gh/captn3m0/pystitcher/
    :alt: Codecov

|

pystitcher stitches your PDF files together, generating nice customizable bookmarks for you using a declarative input in the form of a markdown file. It is written in pure python and uses `PyPDF3 <https://pypi.org/project/PyPDF3/>`_ for reading and writing PDF files.
Installation
============

You can install it easily using `pipx <https://pypa.github.io/pipx/>`_::

    pipx install pystitcher

The Wiki has `Alternative Installation Instructions <https://github.com/captn3m0/pystitcher/wiki/Installation>`_.

Description
===========

pystitcher is a command line tool, with very few cli options::

    usage: pystitcher [-h] [--version] [-v] [--cleanup | --no-cleanup] spine.md output.pdf

    Stitch PDF files together

    positional arguments:
      spine.md              Input markdown file
      output.pdf            Output PDF file

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      -v, --verbose         log more things
      --cleanup, --no-cleanup
                            Delete temporary files (default: True)

Given this input::
existing_bookmarks: flatten
existing_bookmarks: remove
title: Complete Guide to the Personal Data Protection Bill
author: Medianama
keywords: privacy, surveillance, personal data protection
@ -21,8 +65,8 @@ Given this input::
# The Bills
- [Personal Data Protection Bill, 2019](1.a.pdf)
- [Personal Data Protection Bill, 2018](1.b.pdf)
- [Personal Data Protection Bill, 2019](https://example.com/2019-bill.pdf)
- [Personal Data Protection Bill, 2018](https://example.com/2018-bill.pdf)
# Other key reading material
@ -70,3 +114,30 @@ Configuration options can be specified with Meta data at the top of the file.
| | `docs <https://github.com/captn3m0/pystitcher/wiki/Existing-Bookmarks>`_ |
| | for more details. |
+---------------------+--------------------------------------------------------------------------+
Additionally, PDF links specified in markdown can have attributes to alter the PDFs before merging. The attribute below rotates the second PDF file by 90 degrees clockwise before merging::

    [Part 1](1.pdf)
    [Part 2](2.pdf){: rotate="90"}

The attributes below merge only pages 2 to 5, both inclusive, from the second PDF file::

    [Part 1](1.pdf)
    [Part 2](2.pdf){: start=2 end=5}

The available attributes are:

+---------------------+-----------------------------------------------+
| Attribute | Notes |
+=====================+===============================================+
| rotate | Rotate the PDF. Valid values are 90, 180, 270 |
+---------------------+-----------------------------------------------+
| start | Start page number for PDF page selection |
+---------------------+-----------------------------------------------+
| end | End page number for PDF page selection |
+---------------------+-----------------------------------------------+
Documentation
=============
Additional documentation is maintained on the `project wiki <https://github.com/captn3m0/pystitcher/wiki>`_ on GitHub.

View File

@ -36,16 +36,18 @@ package_dir =
=src
# Require a min/specific Python version (comma-separated conditions)
python_requires = >=3.6
python_requires = >=3.7
# PyPDF3: Read and write PDF files
# Markdown: Render input markdown file to HTML
# html5: Parse HTML file to generate bookmarks
# html5lib: Parse HTML file to generate bookmarks
# validators: Validate URL for fetching external PDF
install_requires =
importlib-metadata; python_version<"3.8"
PyPDF3>=1.0.4
Markdown>=3.3.4
html5>=0.0.9
html5lib>=1.1
validators>=0.18.1
[options.packages.find]
where = src
@ -80,9 +82,9 @@ console_scripts =
# in order to write a coverage file that can be read by Jenkins.
# CAUTION: --cov flags may prohibit setting breakpoints while debugging.
# Comment those flags to avoid this py.test issue.
addopts =
--cov pystitcher --cov-report term-missing
--verbose
addopts = --verbose
# --cov pystitcher --cov-report term-missing
norecursedirs =
dist
build
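
Note on the `importlib-metadata; python_version<"3.8"` requirement kept in this hunk: `importlib.metadata` joined the standard library in Python 3.8, so with `python_requires = >=3.7` the backport is only needed on 3.7. A minimal sketch of the usual fallback import, assuming the package looks up its own version this way (the diff does not show that code, and `installed_version` is a hypothetical helper):

```python
try:
    # Standard library from Python 3.8 onward
    from importlib.metadata import PackageNotFoundError, version
except ImportError:
    # Backport installed only where python_version < "3.8"
    from importlib_metadata import PackageNotFoundError, version


def installed_version(dist_name, default="unknown"):
    """Return the installed version of a distribution, or a default if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return default
```

Calling `installed_version("pystitcher")` would return the installed release string when the package is present in the environment.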

View File

@ -40,7 +40,7 @@ def parse_args(args):
action="version",
version="pystitcher {ver}".format(ver=__version__),
)
parser.add_argument(dest="input", help="Input Spine markdown file", type=argparse.FileType('r', encoding='UTF-8'), metavar="spine.md")
parser.add_argument(dest="input", help="Input markdown file", type=argparse.FileType('r', encoding='UTF-8'), metavar="spine.md")
parser.add_argument(dest="output", help="Output PDF file", type=str, metavar="output.pdf")
parser.add_argument(
"-v",
@ -51,8 +51,13 @@ def parse_args(args):
const=logging.INFO,
)
parser.add_argument('--cleanup', action=argparse.BooleanOptionalAction, default=True)
# parser.parse_args(['--no-cleanup'])
parser.add_argument(
'--no-cleanup',
action='store_false',
default=True,
dest='cleanup',
help="Delete temporary files"
)
return parser.parse_args(args)
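
Note on the hunk above: `argparse.BooleanOptionalAction` only exists from Python 3.9 onward, which matches the "Fix for older Python" commit replacing it with a plain `store_false` flag. A minimal sketch of the resulting behaviour, using only the standard library (`build_parser` is a hypothetical helper rather than a pystitcher function, and the help string is reworded here for illustration):

```python
import argparse


def build_parser():
    """Build a parser with a --no-cleanup flag that works on Python 3.7+."""
    parser = argparse.ArgumentParser(prog="pystitcher")
    parser.add_argument(
        "--no-cleanup",
        action="store_false",  # passing the flag stores False in args.cleanup
        default=True,          # cleanup stays enabled when the flag is absent
        dest="cleanup",
        help="Keep temporary files instead of deleting them",
    )
    return parser


default_args = build_parser().parse_args([])
disabled_args = build_parser().parse_args(["--no-cleanup"])
```

One consequence of this design is that `--cleanup` is no longer accepted as an explicit flag, which is consistent with the changelog entry saying the argument was removed because cleanup is the default.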

View File

@ -1,12 +1,17 @@
import os
import markdown
from .bookmark import Bookmark
import logging
import shutil
import tempfile
import urllib.request
import validators
import html5lib
import markdown
from PyPDF3 import PdfFileWriter, PdfFileReader
from PyPDF3.generic import FloatObject
from pystitcher import __version__
import tempfile
import logging
from .bookmark import Bookmark
_logger = logging.getLogger(__name__)
@ -17,11 +22,17 @@ class Stitcher:
self.currentPage = 1
self.title = None
self.bookmarks = []
self.currentLevel = None
self.currentLevel = 0
self.oldBookmarks = []
self.dir = os.path.dirname(os.path.abspath(inputBuffer.name))
# Fit complete page width by default
DEFAULT_FIT = '/FitV'
# Do not rotate by default
DEFAULT_ROTATE = 0
# Start at page 1 by default
DEFAULT_START = 1
# End at the final page by default
DEFAULT_END = None
# TODO: This is a hack
os.chdir(self.dir)
@ -31,11 +42,28 @@ class Stitcher:
html = md.convert(text)
self.attributes = md.Meta
self.defaultFit = self._getAttribute('fit', DEFAULT_FIT)
self.defaultRotate = self._getAttribute('rotate', DEFAULT_ROTATE)
self.defaultStart = self._getAttribute('start', DEFAULT_START)
self.defaultEnd = self._getAttribute('end', DEFAULT_END)
document = html5lib.parseFragment(html, namespaceHTMLElements=False)
for e in document.iter():
self.iter(e)
"""
Check if file has been cached locally and if
not cached, download from provided URL. Return
download filename
"""
def _cacheURL(self, url):
if not os.path.exists(os.path.basename(url)):
_logger.info("Downloading PDF from remote URL %s", url)
with urllib.request.urlopen(url) as response, open(os.path.basename(url), 'wb') as downloadedFile:
shutil.copyfileobj(response, downloadedFile)
else:
_logger.info("Locally cached PDF found at %s", os.path.basename(url))
return os.path.basename(url)
"""
Get the number of pages in a PDF file
"""
@ -89,10 +117,17 @@ class Stitcher:
self.currentLevel = 3
elif(tag =='a'):
file = element.attrib.get('href')
if(validators.url(file)):
file = self._cacheURL(file)
fit = element.attrib.get('fit', self.defaultFit)
rotate = int(element.attrib.get('rotate', self.defaultRotate))
start = int(element.attrib.get('start', self.defaultStart))
end = int(element.attrib.get('end', self._get_pdf_number_of_pages(file)
if self.defaultEnd is None else self.defaultEnd))
filters = (rotate, start, end)
b = Bookmark(self.currentPage, element.text, self.currentLevel+1, fit)
self.files.append((file, self.currentPage))
self.currentPage += self._get_pdf_number_of_pages(file)
self.files.append((file, self.currentPage, filters))
self.currentPage += (end - start) + 1
if b:
self.bookmarks.append(b)
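
Note on `_cacheURL`: it uses `os.path.basename(url)` as the local file name, so a URL with a query string would carry that query string into the name. A minimal sketch of one alternative, assuming only the URL path should determine the cached name (`cache_filename` and `fetch_if_missing` are hypothetical names, not part of pystitcher):

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def cache_filename(url):
    """Derive a local file name from the path component of a URL."""
    # Using only the path keeps query strings such as ?download=1 out of the name.
    return os.path.basename(urlparse(url).path)


def fetch_if_missing(url, directory="."):
    """Download url into directory unless a file of that name already exists."""
    target = os.path.join(directory, cache_filename(url))
    if not os.path.exists(target):
        with urllib.request.urlopen(url) as response, open(target, "wb") as out:
            shutil.copyfileobj(response, out)
    return target
```

As in the diff, two different URLs that end in the same file name would still share one cache entry; this sketch does not address that.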
@ -107,7 +142,7 @@ class Stitcher:
return (self._existingBookmarkConfig() == 'flatten')
"""
Adds the existing bookmarks into the
Adds the existing bookmarks into the
self.bookmarks list
"""
def _add_existing_bookmarks(self):
@ -129,7 +164,7 @@ class Stitcher:
self.bookmarks = bookmarks
"""
Gets the last bookmkark level at a given page number
Gets the last bookmark level at a given page number
on the combined PDF
"""
def _get_level_from_page_number(self, page):
@ -186,14 +221,15 @@ class Stitcher:
"""
def _merge(self, output):
writer = PdfFileWriter()
for (inputFile,startPage) in self.files:
for (inputFile,startPage,filters) in self.files:
assert os.path.isfile(inputFile), ERROR_PATH.format(inputFile)
reader = PdfFileReader(open(inputFile, 'rb'))
# Recursively iterate through the old bookmarks
self._iterate_old_bookmarks(reader, startPage, reader.getOutlines())
for page in range(1, reader.getNumPages()+1):
writer.addPage(reader.getPage(page - 1))
rotate, start, end = filters
for page in range(start, end + 1):
writer.addPage(reader.getPage(page - 1).rotateClockwise(rotate))
writer.write(output)
output.close()
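
Note on the page arithmetic: `start` and `end` are 1-based and inclusive, so each link contributes `(end - start) + 1` pages, and `_merge` reads them with `getPage(page - 1)`. A minimal sketch of that mapping in plain Python (`selected_page_indices` is a hypothetical helper, and the range check is an addition that the diff does not contain):

```python
def selected_page_indices(total_pages, start=1, end=None):
    """Return 0-based page indices for an inclusive, 1-based [start, end] range."""
    if end is None:
        # Default to the final page, as DEFAULT_END = None does in the diff
        end = total_pages
    if not 1 <= start <= end <= total_pages:
        raise ValueError("expected 1 <= start <= end <= total_pages")
    return list(range(start - 1, end))


# The four links in tests/book-page-select.md, assuming both inputs have 3 pages
selections = [
    selected_page_indices(3, start=1, end=2),  # [Part 1](1.pdf){: start=1 end=2}
    selected_page_indices(3, start=2),         # [Part 2](2.pdf){: start=2}
    selected_page_indices(3, end=2),           # [Part 3](1.pdf){: end=2}
    selected_page_indices(3, start=1, end=3),  # [Part 4](2.pdf){: start=1 end=3 ...}
]
total = sum(len(pages) for pages in selections)
```

The 3-page assumption is inferred from the fixtures (`min` expects 3 pages from `1.pdf`, `clean` expects 6 from `1.pdf` plus `2.pdf`), and the resulting total of 2 + 2 + 2 + 3 pages agrees with the 9 pages expected for `page-select` in `TEST_DATA`.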

1
tests/.gitignore vendored Normal file
View File

@ -0,0 +1 @@
*.pdf

BIN
tests/1.pdf Normal file

Binary file not shown.

37
tests/2.md Normal file
View File

@ -0,0 +1,37 @@
<!-- Convert to PDF as per https://tex.stackexchange.com/a/553075 to ensure bookmarks -->
# Chapter 3
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus quis tristique velit, ut elementum velit. Vestibulum sit amet purus ac lacus pretium ornare. Integer aliquet finibus odio ac sagittis. Cras ultricies vitae mauris vitae facilisis. Fusce mollis nisl nulla, a blandit arcu consequat a. Morbi faucibus tincidunt orci a luctus. Curabitur at urna fringilla, vestibulum nunc at, molestie tortor. Nullam vestibulum lorem vitae lectus pellentesque, a blandit felis blandit. Cras sagittis ante sit amet finibus hendrerit. Pellentesque viverra sollicitudin neque, ac lacinia metus tincidunt ut. Duis non sollicitudin nibh, sed sodales quam. Proin nec risus ac est elementum dignissim eu at eros.
In et suscipit purus. Nullam tincidunt sit amet orci vitae consequat. In facilisis, quam non tincidunt rhoncus, orci eros porttitor magna, vitae mattis orci turpis euismod sapien. Praesent imperdiet quam at dui hendrerit suscipit. Vivamus fermentum massa metus, quis viverra libero gravida eu. Nunc non libero felis. Vestibulum scelerisque massa ac est sagittis, non mollis neque semper. Donec ultrices magna est, in cursus turpis efficitur vitae. Curabitur et dignissim lacus, non porta enim. Mauris vel aliquam libero, non venenatis elit. Aenean nisl ante, malesuada et massa et, elementum vehicula sapien. Maecenas egestas velit nec pulvinar pulvinar. Proin non sem imperdiet, porta nisl a, iaculis erat. Nunc egestas mattis felis, consequat mollis ligula euismod et.
In ut justo gravida, varius massa sit amet, vestibulum erat. Praesent lacinia odio viverra diam ornare sagittis. Maecenas ut nisl rutrum, auctor orci a, auctor odio. Integer tincidunt elementum enim, vel pretium augue ullamcorper et. Nunc scelerisque, quam et imperdiet faucibus, diam lorem tempus sem, sit amet eleifend orci libero at nulla. Integer luctus, purus non auctor eleifend, elit turpis condimentum enim, eget bibendum tortor risus nec ante. Pellentesque lobortis eleifend elementum. Mauris ut massa mauris. Nam lacinia lorem quis ante tristique suscipit. Donec libero quam, vulputate nec aliquam vitae, mollis in diam. Fusce sit amet fringilla turpis.
Mauris sit amet volutpat sem. Nullam viverra sagittis velit quis varius. Praesent rutrum quam quis mollis elementum. Sed in elementum neque. Praesent et volutpat nisi. Sed eu ante tincidunt, venenatis odio id, semper nisl. Mauris tincidunt posuere justo id convallis. Praesent faucibus, leo et pulvinar volutpat, quam sem feugiat diam, non molestie nibh justo eu lorem. Fusce id risus efficitur, tempus elit sit amet, sodales enim. Suspendisse ac lorem eu lorem bibendum ultrices at at ipsum. Vestibulum facilisis lorem sit amet consequat imperdiet. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos.
\pagebreak
# Chapter 4
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus quis tristique velit, ut elementum velit. Vestibulum sit amet purus ac lacus pretium ornare. Integer aliquet finibus odio ac sagittis. Cras ultricies vitae mauris vitae facilisis. Fusce mollis nisl nulla, a blandit arcu consequat a. Morbi faucibus tincidunt orci a luctus. Curabitur at urna fringilla, vestibulum nunc at, molestie tortor. Nullam vestibulum lorem vitae lectus pellentesque, a blandit felis blandit. Cras sagittis ante sit amet finibus hendrerit. Pellentesque viverra sollicitudin neque, ac lacinia metus tincidunt ut. Duis non sollicitudin nibh, sed sodales quam. Proin nec risus ac est elementum dignissim eu at eros.
In et suscipit purus. Nullam tincidunt sit amet orci vitae consequat. In facilisis, quam non tincidunt rhoncus, orci eros porttitor magna, vitae mattis orci turpis euismod sapien. Praesent imperdiet quam at dui hendrerit suscipit. Vivamus fermentum massa metus, quis viverra libero gravida eu. Nunc non libero felis. Vestibulum scelerisque massa ac est sagittis, non mollis neque semper. Donec ultrices magna est, in cursus turpis efficitur vitae. Curabitur et dignissim lacus, non porta enim. Mauris vel aliquam libero, non venenatis elit. Aenean nisl ante, malesuada et massa et, elementum vehicula sapien. Maecenas egestas velit nec pulvinar pulvinar. Proin non sem imperdiet, porta nisl a, iaculis erat. Nunc egestas mattis felis, consequat mollis ligula euismod et.
In ut justo gravida, varius massa sit amet, vestibulum erat. Praesent lacinia odio viverra diam ornare sagittis. Maecenas ut nisl rutrum, auctor orci a, auctor odio. Integer tincidunt elementum enim, vel pretium augue ullamcorper et. Nunc scelerisque, quam et imperdiet faucibus, diam lorem tempus sem, sit amet eleifend orci libero at nulla. Integer luctus, purus non auctor eleifend, elit turpis condimentum enim, eget bibendum tortor risus nec ante. Pellentesque lobortis eleifend elementum. Mauris ut massa mauris. Nam lacinia lorem quis ante tristique suscipit. Donec libero quam, vulputate nec aliquam vitae, mollis in diam. Fusce sit amet fringilla turpis.
Mauris sit amet volutpat sem. Nullam viverra sagittis velit quis varius. Praesent rutrum quam quis mollis elementum. Sed in elementum neque. Praesent et volutpat nisi. Sed eu ante tincidunt, venenatis odio id, semper nisl. Mauris tincidunt posuere justo id convallis. Praesent faucibus, leo et pulvinar volutpat, quam sem feugiat diam, non molestie nibh justo eu lorem. Fusce id risus efficitur, tempus elit sit amet, sodales enim. Suspendisse ac lorem eu lorem bibendum ultrices at at ipsum. Vestibulum facilisis lorem sit amet consequat imperdiet. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos.
## Scene 3
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed sollicitudin nunc et ligula tristique iaculis. Praesent malesuada velit a ipsum vehicula, vitae efficitur ex condimentum. Nunc non placerat tellus. Suspendisse nulla ex, semper eget lectus vel, aliquet blandit justo. Morbi quis elit vel leo auctor pharetra eu sit amet enim. Mauris lorem dui, ultrices bibendum cursus sed, gravida feugiat est. Praesent efficitur, tortor vel mollis sollicitudin, nisi neque placerat purus, maximus sodales dolor odio vitae eros. Suspendisse potenti. Etiam in augue nunc. Duis ac fermentum arcu. Suspendisse lacinia purus elit, ac rutrum massa pulvinar non. Cras at congue lectus. Praesent viverra elit at lacus fringilla cursus at eget dolor. Donec eget elit nec lacus pretium cursus. Mauris sagittis aliquet magna ut cursus. Nam non tellus dignissim, gravida felis tempor, ultricies elit.
Aliquam consequat varius euismod. Quisque tincidunt metus quis mi tempus, vel mattis nulla facilisis. Integer tortor nibh, lobortis a maximus non, feugiat cursus risus. Nulla ut condimentum massa. Nam varius velit efficitur, bibendum ligula quis, dictum massa. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Proin efficitur ornare ipsum, ac volutpat turpis fringilla a.
## Scene 4
Sed in lacinia risus, sit amet sollicitudin justo. Aliquam erat volutpat. Sed augue urna, aliquam non tellus in, facilisis viverra urna. Proin feugiat, ipsum eu euismod interdum, arcu augue blandit ex, a suscipit dui dolor ut eros. Maecenas tempor dignissim urna nec sollicitudin. Nam eget sodales lorem, nec varius felis. Morbi laoreet eros elit, in cursus ex tempus vitae. Etiam ultrices suscipit tellus sed sollicitudin. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Aliquam nec nisi non nisi vestibulum maximus. Suspendisse tristique eros eu metus laoreet, nec euismod mi elementum. Morbi dapibus ipsum non purus aliquet sollicitudin. Nullam ultricies commodo mauris ut tincidunt. Etiam felis est, fermentum a dui sit amet, maximus egestas tortor.
Nullam suscipit ut tortor vitae gravida. Nullam lobortis risus et nulla ultricies pretium. Ut a eros eros. In non sem ut orci lobortis laoreet a id ex. Quisque magna dolor, iaculis nec eros ut, auctor pulvinar lectus. Fusce elementum pretium auctor. Interdum et malesuada fames ac ante ipsum primis in faucibus. Nam facilisis nisi neque, eu tristique tortor accumsan lobortis.
Curabitur rutrum quam quis elit ultricies condimentum. Sed dui libero, vehicula malesuada aliquam vitae, aliquet ac tellus. Nullam a nibh nec nulla fringilla commodo. Aliquam venenatis lobortis enim. Aliquam sit amet sem quis nunc luctus congue ut a nunc. Donec in libero facilisis, consequat arcu ac, dignissim ex. Sed nec diam est. Pellentesque orci justo, vehicula quis nisi eu, varius molestie ligula.

BIN
tests/2.pdf Normal file

Binary file not shown.

5
tests/3.md Normal file
View File

@ -0,0 +1,5 @@
<!-- Convert to PDF as per https://tex.stackexchange.com/a/553075 to ensure bookmarks -->
# Cast of Characters
- Nemo, Master of Ceremonies

View File

@ -1,12 +1,14 @@
existing_bookmarks: remove
author: Wiki, the Cat
title: Super Jelly Book
subject: A book about adventures of Wiki, the cat.
keywords: wiki,potato,jelly
# Super Potato Book
# Volume 1
[Part 1](2page.pdf)
[Part 2](2page.pdf)
[Part 1](1.pdf)
# Volume 2
[Part 3](2page.pdf)
[Part 4](2page.pdf)
[Part 2](2.pdf)

View File

@ -0,0 +1,17 @@
existing_bookmarks: remove
author: Wiki, the Cat
subject: A book about adventures of Wiki, the cat.
keywords: wiki,potato,jelly
# Super Potato Book
# Volume 1
[Part 1](1.pdf)
# Volume 2
[Part 2](https://unec.edu.az/application/uploads/2014/12/pdf-sample.pdf)
# Volume 3
[Part 3](https://juventudedesporto.cplp.org/files/sample-pdf_9359.pdf)

View File

@ -1,12 +1,11 @@
existing_bookmarks: flatten
# Super Potato Book
# Volume 1
[Part 1](2page.pdf)
[Part 2](2page.pdf)
[Part 1](1.pdf)
# Volume 2
[Part 3](2page.pdf)
[Part 4](2page.pdf)
[Part 3](2.pdf)

10
tests/book-headings.md Normal file
View File

@ -0,0 +1,10 @@
# Heading 1
[Part 1](1.pdf)
## Heading 2
[Part 2](1.pdf)
### Heading 3
[Part 3](1.pdf)

View File

@ -1,16 +1,10 @@
existing_bookmarks: keep
title: Super Jelly Book
author: Wiki, the Cat
subject: A book about adventures of Wiki, the cat.
keywords: wiki,potato,jelly
# Super Potato Book
# Volume 1
[Part 1](2page.pdf)
[Part 2](2page.pdf)
[Part 1](1.pdf)
# Volume 2
[Part 3](2page.pdf)
[Part 4](2page.pdf)
[Part 3](2.pdf)

2
tests/book-min.md Normal file
View File

@ -0,0 +1,2 @@
existing_bookmarks: keep
[Part 1](1.pdf)

18
tests/book-page-select.md Normal file
View File

@ -0,0 +1,18 @@
existing_bookmarks: keep
# Super Potato Book
# Volume 1
[Part 1](1.pdf){: start=1 end=2}
# Volume 2
[Part 2](2.pdf){: start=2}
# Volume 3
[Part 3](1.pdf){: end=2}
# Volume 4
[Part 4](2.pdf){: start=1 end=3 rotate="90"}

14
tests/book-rotate.md Normal file
View File

@ -0,0 +1,14 @@
existing_bookmarks: remove
# Super Potato Book
# Volume 1
[Part 1](1.pdf)
# Volume 2
[Part 2](2.pdf){: rotate="90"}
# Volume 3
[Part 3](1.pdf){: rotate="180"}

6
tests/book-title.md Normal file
View File

@ -0,0 +1,6 @@
---
titlepage: true
---
# Super Potato Book
In memory of Starch

15
tests/test_cli.py Normal file
View File

@ -0,0 +1,15 @@
from pystitcher.skeleton import parse_args
import logging


def test_default_args():
    args = parse_args(['tests/book-clean.md', 'o.pdf'])
    assert args.loglevel == None
    assert args.cleanup == True


def test_loglevel():
    args = parse_args(['-v', 'tests/book-clean.md', 'o.pdf'])
    assert args.loglevel == logging.INFO


def test_cleanup():
    args = parse_args(['--no-cleanup', 'tests/book-clean.md', 'o.pdf'])
    assert args.cleanup == False

96
tests/test_integration.py Normal file
View File

@ -0,0 +1,96 @@
import os
import io
import PyPDF3
from pystitcher.stitcher import Stitcher
from pystitcher import __version__
import pytest
from contextlib import redirect_stdout
ROOT_DIR = os.path.dirname(os.path.abspath(__file__)) + "/../"
"""
Fixtures for the integration tests. Each test is a tuple consisting of 4 things:
- input name (used as book-{name}.md)
- total expected page count
- A dictionary of expected metadata. Leave empty if nothing is set
- A flattened list of expected bookmarks, with each bookmark as a tuple containing:
- Title
- Destination Page Number
- Bookmark Level (default = 0)
Each of the above 4 is passed to test_book as an argument
"""
TEST_DATA = [
("clean",6, {'Author': 'Wiki, the Cat', 'Title': 'Super Jelly Book', 'Subject': 'A book about adventures of Wiki, the cat.', 'Keywords': 'wiki,potato,jelly'}, [('Super Potato Book', 0, 0), ('Volume 1', 0, 0), ('Part 1', 0, 1), ('Volume 2', 3, 0), ('Part 2', 3, 1)]),
("keep",6, {'Title': 'Super Potato Book'}, [('Super Potato Book', 0, 0), ('Volume 1', 0, 0), ('Part 1', 0, 1), ('Chapter 1', 0, 2), ('Chapter 2', 1, 2), ('Scene 1', 1, 3), ('Scene 2', 2, 3), ('Volume 2', 3, 0), ('Part 3', 3, 1), ('Chapter 3', 3, 2), ('Chapter 4', 4, 2), ('Scene 3', 4, 3), ('Scene 4', 5, 3)]),
("flatten", 6, {}, [('Super Potato Book', 0, 0), ('Volume 1', 0, 0), ('Part 1', 0, 1), ('Chapter 1', 0, 2), ('Chapter 2', 1, 2), ('Scene 1', 1, 2), ('Scene 2', 2, 2), ('Volume 2', 3, 0), ('Part 3', 3, 1), ('Chapter 3', 3, 2), ('Chapter 4', 4, 2), ('Scene 3', 4, 2), ('Scene 4', 5, 2)]),
("rotate", 9, {}, [('Super Potato Book', 0, 0), ('Volume 1', 0, 0), ('Part 1', 0, 1), ('Volume 2', 3, 0), ('Part 2', 3, 1), ('Volume 3', 6, 0), ('Part 3', 6, 1)]),
("min",3, {}, [('Part 1', 0, 0), ('Chapter 1', 0, 1), ('Chapter 2', 1, 1), ('Scene 1', 1, 2), ('Scene 2', 2, 2)]),
("page-select", 9, {}, [('Super Potato Book', 0, 0), ('Volume 1', 0, 0), ('Part 1', 0, 1), ('Chapter 1', 0, 2), ('Chapter 2', 1, 2), ('Scene 1', 1, 3), ('Volume 2', 2, 0), ('Part 2', 2, 1), ('Scene 2', 2, 2), ('Chapter 3', 2, 2), ('Chapter 4', 3, 2), ('Scene 3', 3, 3), ('Volume 3', 4, 0), ('Part 3', 4, 1), ('Scene 4', 4, 2), ('Chapter 1', 4, 2), ('Chapter 2', 5, 2), ('Scene 1', 5, 3), ('Volume 4', 6, 0), ('Part 4', 6, 1), ('Scene 2', 6, 2), ('Chapter 3', 6, 2), ('Chapter 4', 7, 2), ('Scene 3', 7, 3), ('Scene 4', 8, 3)]),
("headings", 9, {'Title': 'Heading 1'}, [('Heading 1', 0, 0), ('Part 1', 0, 1), ('Heading 2', 3, 1), ('Part 2', 3, 2), ('Heading 3', 6, 2), ('Part 3', 6, 3)])
]
def pdf_name(name):
    return "tests/%s.pdf" % name


def render(name, cleanup=True):
    input_file = open("tests/book-%s.md" % name, 'r')
    output_file = "%s.pdf" % name
    stitcher = Stitcher(input_file)
    stitcher.generate(output_file, cleanup)
    # Switch back to main directory
    os.chdir(ROOT_DIR)
    return pdf_name(name)


def flatten_bookmarks(bookmarks, level=0):
    """Given a list, possibly nested to any level, return it flattened."""
    output = []
    for destination in bookmarks:
        if type(destination) == type([]):
            output.extend(flatten_bookmarks(destination, level+1))
        else:
            output.append((destination, level))
    return output


def get_all_bookmarks(pdf):
    """ Returns a list of all bookmarks with title, page number, and level in a PDF file"""
    bookmarks = flatten_bookmarks(pdf.getOutlines())
    return [(d[0]['/Title'], pdf.getDestinationPageNumber(d[0]), d[1]) for d in bookmarks]


@pytest.mark.parametrize("name,pages,metadata,bookmarks", TEST_DATA)
def test_book(name, pages, metadata, bookmarks):
    output_file = render(name)
    pdf = PyPDF3.PdfFileReader(output_file)
    assert pages == pdf.getNumPages()
    assert bookmarks == get_all_bookmarks(pdf)
    info = pdf.getDocumentInfo()
    identity = "pystitcher/%s" % __version__
    assert identity == info['/Producer']
    assert identity == info['/Creator']
    for key in metadata:
        assert info["/%s" % key] == metadata[key]


def test_rotation():
    """ Validates the book-rotate.pdf with pages rotated."""
    output_file = render("rotate")
    pdf = PyPDF3.PdfFileReader(output_file)
    # Note that inputs to getPage are 0-indexed
    assert 90 == pdf.getPage(3)['/Rotate']
    assert 90 == pdf.getPage(4)['/Rotate']
    assert 90 == pdf.getPage(5)['/Rotate']
    assert 180 == pdf.getPage(6)['/Rotate']
    assert 180 == pdf.getPage(7)['/Rotate']
    assert 180 == pdf.getPage(8)['/Rotate']


def test_cleanup_disabled():
    f = io.StringIO()
    with redirect_stdout(f):
        output_file = render("min", False)
    temp_filename = f.getvalue()[29:-1]
    assert os.path.exists(temp_filename)
    pdf = PyPDF3.PdfFileReader(temp_filename)
    assert 3 == pdf.getNumPages()
    assert [] == pdf.getOutlines()
    # Clean it up manually to avoid cluttering
    os.remove(temp_filename)
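
Note on `flatten_bookmarks`: the expected bookmark tuples in `TEST_DATA` depend on how nesting depth becomes a level number. A minimal sketch of the same recursion on plain strings, assuming nested lists stand in for the nested outline structure that `getOutlines()` returns (`flatten_with_levels` is a hypothetical name, and `isinstance` replaces the `type(...) == type([])` check):

```python
def flatten_with_levels(items, level=0):
    """Flatten a nested list into (item, depth) pairs, depth-first."""
    output = []
    for item in items:
        if isinstance(item, list):
            # A nested list holds the children of the preceding entry
            output.extend(flatten_with_levels(item, level + 1))
        else:
            output.append((item, level))
    return output


outline = ["Part 1", ["Chapter 1", "Chapter 2", ["Scene 1", "Scene 2"]]]
flat = flatten_with_levels(outline)
```

Here `flat` lists `Part 1` at level 0, both chapters at level 1, and both scenes at level 2, the same shape as the levels expected for the `min` fixture.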

View File

@ -1,25 +0,0 @@
import pytest
from pystitcher.skeleton import fib, main
__author__ = "Nemo"
__copyright__ = "Nemo"
__license__ = "MIT"
def test_fib():
"""API Tests"""
assert fib(1) == 1
assert fib(2) == 1
assert fib(7) == 13
with pytest.raises(AssertionError):
fib(-10)
def test_main(capsys):
"""CLI Tests"""
# capsys is a pytest fixture that allows asserts against stdout/stderr
# https://docs.pytest.org/en/stable/capture.html
main(["7"])
captured = capsys.readouterr()
assert "The 7-th Fibonacci number is 13" in captured.out

23
tox.ini
View File

@ -1,23 +0,0 @@
# Tox configuration file
# Read more under https://tox.readthedocs.org/
# THIS SCRIPT IS SUPPOSED TO BE AN EXAMPLE. MODIFY IT ACCORDING TO YOUR NEEDS!
[tox]
minversion = 3.15
envlist = default
[publish]
description =
Publish the package you have been developing to a package index server.
By default, it uses testpypi. If you really want to publish your package
to be publicly accessible in PyPI, use the `-- --repository pypi` option.
skip_install = True
changedir = {toxinidir}
passenv =
TWINE_USERNAME
TWINE_PASSWORD
TWINE_REPOSITORY
deps = twine
commands =
python -m twine check dist/*
python -m twine upload {posargs:--repository testpypi} dist/*