Skip to content
/ ehp Public

Easy Html Parser is an AST generator for html/xml documents. You can easily delete/insert/extract tags in html/xml documents as well as look for patterns.

License

Notifications You must be signed in to change notification settings

iogf/ehp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

34 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Ehp

Easy Html Parser is an AST generator for html/xml documents. EHP is a nice tool to parse html content.
It has a short learning curve compared to other parsers. You don't need to lose time going through massive documentation to do simple stuff. EHP handles broken html nicely.

EHP has a short learning curve, you can go through some examples, in a few minutes you can implement cool stuff.

Create/Delete elements

from ehp import *

html = Html()

data = '''
<body> <em> foo </em> </body>
'''

dom = html.feed(data)

for root, item in dom.find_with_root('em'):
    root.remove(item)

print(dom)

<body >  </body>

Manipulate attributes

from ehp import *

data = '''<html><body> <p> It is simple.</p> </body></html>'''

dom = Html().feed(data)

for ind, name, attr in dom.walk():
    attr['size']  = '+2'

print(dom)
<html size="+2" ><body size="+2" > <p size="+2" > It is simple.</p> </body></html>

Install

Note: Ehp works on python3 only, python2 support is no longer available.

pip ehp install

Note: The module is quite well documented, you can find documentation there.

About

Easy Html Parser is an AST generator for html/xml documents. You can easily delete/insert/extract tags in html/xml documents as well as look for patterns.

Resources

License

Stars

Watchers

Forks

Packages

No packages published