hn-classics/parse.py

11 lines
196 B
Python
Raw Permalink Normal View History

2018-03-03 09:35:28 +00:00
import sys
from newspaper import Article
url = sys.argv[1].strip()
a = Article(url, language='en', keep_article_html=True, http_success_only=False)
a.download()
a.parse()
print(a.article_html)