Information for build python-html-text-0.6.2-1.fc41

ID: 343319
Package Name: python-html-text
Version: 0.6.2
Release: 1.fc41
Epoch:
Summary: Extract text from HTML
Description: How is html_text different from .xpath('//text()') from LXML or .get_text() from Beautiful Soup?
- Text extracted with html_text does not contain inline styles, JavaScript, comments and other text that is not normally visible to users;
- html_text normalizes whitespace, but in a smarter way than .xpath('normalize-space()'), adding spaces around inline elements (which are often used as block elements in HTML markup) and trying to avoid adding extra spaces for punctuation;
- html_text can add newlines (e.g. after headers or paragraphs), so that the output text looks more like how it is rendered in browsers.
Built by: davidlt
State: complete
Volume: DEFAULT
Started: Tue, 12 Nov 2024 07:53:13 UTC
Completed: Tue, 12 Nov 2024 07:53:13 UTC
Tags: f41
RPMs:
src: python-html-text-0.6.2-1.fc41.src.rpm
noarch: python3-html-text-0.6.2-1.fc41.noarch.rpm
Changelog
* Fri Oct 18 2024 Benson Muite <benson_muite@emailplus.org> - 0.6.2-1
- Initial packaging
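
The package description contrasts html_text with a plain dump of every text node. As a rough illustration of that idea only, the sketch below uses Python's standard html.parser module rather than html_text itself: it skips script and style contents, drops comments, collapses whitespace, and starts a new line around a few block-level tags. The names VisibleTextParser, visible_text, HIDDEN_TAGS and BLOCK_TAGS are hypothetical, the tag sets are a small assumed subset, and the sketch does not reproduce html_text's handling of spaces around inline elements or punctuation.

```python
import re
from html.parser import HTMLParser

# Assumption: tags whose contents are never shown to users.
HIDDEN_TAGS = {"script", "style"}
# Assumption: a small, illustrative subset of block-level tags.
BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "h5", "h6"}


class VisibleTextParser(HTMLParser):
    """Collects text that a browser would normally display."""

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0  # > 0 while inside a hidden tag
        self.chunks = []       # text pieces and "\n" block separators

    def handle_starttag(self, tag, attrs):
        if tag in HIDDEN_TAGS:
            self.hidden_depth += 1
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in HIDDEN_TAGS:
            self.hidden_depth = max(0, self.hidden_depth - 1)
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        if self.hidden_depth == 0:
            # Newlines in the HTML source are plain whitespace, not line breaks.
            self.chunks.append(re.sub(r"\s+", " ", data))

    # Comments are dropped: the inherited handle_comment does nothing.


def visible_text(html):
    """Return the visible text of an HTML string, one line per block."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    lines = "".join(parser.chunks).split("\n")
    cleaned = (re.sub(r" +", " ", line).strip() for line in lines)
    return "\n".join(line for line in cleaned if line)


if __name__ == "__main__":
    sample = (
        "<html><head><style>p {color: red}</style></head><body>"
        "<h1>Title</h1><!-- hidden --><p>Hello,\n   <b>world</b>!</p>"
        "<script>var x = 1;</script></body></html>"
    )
    print(visible_text(sample))
```

With the packaged library installed, the equivalent call would be html_text.extract_text(html), which applies the more careful rules listed in the description.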