python-html-text-0.6.2-1.fc41.src.rpm | RPM Info

Information for RPM python-html-text-0.6.2-1.fc41.src.rpm

1473643

Name

python-html-text

Version

0.6.2

Release

1.fc41

Epoch

Arch

src

Summary

Extract text from HTML

Description

How is html_text different from .xpath('//text()') from LXML or .get_text() from Beautiful Soup?

- Text extracted with html_text does not contain inline styles, JavaScript, comments and other text that is not normally visible to users;
- html_text normalizes whitespace, but in a way smarter than .xpath('normalize-space()'), adding spaces around inline elements (which are often used as block elements in HTML markup), and trying to avoid adding extra spaces for punctuation;
- html_text can add newlines (e.g. after headers or paragraphs), so that the output text looks more like how it is rendered in browsers.
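The behaviour described above can be illustrated with a rough standard-library sketch. This is not html_text's implementation (the package builds on lxml, and its own entry point is html_text.extract_text); the class name, the helper sketch_extract_text, and the two tag sets are hypothetical and chosen only for illustration. The sketch drops text that is not normally visible, collapses runs of whitespace, and emits a newline after a few block-level elements.

```python
import re
from html.parser import HTMLParser

# Assumption: a small illustrative subset of tags whose text is not shown to users.
INVISIBLE_TAGS = {"script", "style", "head", "noscript", "template"}
# Assumption: a small illustrative subset of tags that end a line of output.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li"}


class VisibleTextSketch(HTMLParser):
    """Collects only the text a reader would normally see."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.hidden_depth = 0  # > 0 while inside an invisible element

    def handle_starttag(self, tag, attrs):
        if tag in INVISIBLE_TAGS:
            self.hidden_depth += 1
        elif tag == "br" and not self.hidden_depth:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in INVISIBLE_TAGS:
            if self.hidden_depth:
                self.hidden_depth -= 1
        elif tag in BLOCK_TAGS and not self.hidden_depth:
            self.chunks.append("\n")

    def handle_data(self, data):
        # Comments never reach this method: HTMLParser routes them to
        # handle_comment, which does nothing by default.
        if not self.hidden_depth:
            self.chunks.append(data)


def sketch_extract_text(html):
    """Return visible text, one output line per block element (hypothetical helper)."""
    parser = VisibleTextSketch()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    # Collapse horizontal whitespace inside each line and drop empty lines.
    lines = (re.sub(r"[ \t\r\f\v]+", " ", line).strip() for line in text.split("\n"))
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = (
        "<html><head><title>T</title><style>p {color: red}</style></head>"
        "<body><h1>Hello</h1><!-- note --><p>big   <b>world</b>!</p>"
        "<script>var x = 1;</script></body></html>"
    )
    print(sketch_extract_text(sample))
```

Unlike this sketch, html_text also decides where to add spaces around inline elements and punctuation; the sketch simply keeps inline text exactly as adjacent as it appears in the markup.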

Build Time

2024-10-25 03:31:48 GMT

Size

73.27 KB

SIGMD5

a35d76505236a2d99df3c9ce27e91560

License

MIT

Provides

python3-html-text = 0.6.2-1.fc41

Obsoletes

No Obsoletes

Conflicts

No Conflicts

Requires

pyproject-rpm-macros

python3-devel

python3dist(lxml)

python3dist(lxml-html-clean)

python3dist(packaging)

python3dist(pip) >= 19

python3dist(pytest)

python3dist(setuptools) >= 40.8

python3dist(wheel)

rpmlib(CompressedFileNames) <= 3.0.4-1

rpmlib(DynamicBuildRequires) <= 4.15.0-1

rpmlib(FileDigests) <= 4.6.0-1

Recommends

No Recommends

Suggests

No Suggests

Supplements

No Supplements

Enhances

No Enhances

Files

Name	Size
html-text-0.6.2.tar.gz	51.95 KB
python-html-text.spec	1.62 KB

Component of

No Buildroots
