Python lxml: syntax to selectively remove inline style attributes?

I’m using Python 3.4 with the lxml.html library.

I’m trying to remove the border-bottom in-line styling from HTML elements that I’ve targeted with a CSS selector.

Here’s a code fragment showing a sample td element and my selector:

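import lxml.html

html_snippet = lxml.html.fromstring("""<td style="background-color: azure;border-bottom:1px solid #000000">Estimated Future Payouts
Under Non-Equity Incentive
Plan Awards</td>""")

# cssselect() returns a list of matching elements
selection = html_snippet.cssselect('td[style*="border-bottom"]')
selection[0].attrib['style']
>>>>'background-color: azure;border-bottom:1px solid #000000'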

What’s the proper way to access the in-line style properties so I can remove the border-bottom property from any element I target with my selector?

You can approach it by splitting the style attribute value on ";", creating a map of CSS property name -> value, removing border-bottom from the map, and rebuilding the style attribute by joining the map's entries with ";". Sample implementation:

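# cssselect() returns a list, so take the first matching element
td = selection[0]
style = td.attrib['style']

# Split on ";" (skipping empty pieces), then split each declaration on the first ":" only
properties = dict(
    [part.strip() for part in item.split(":", 1)]
    for item in style.split(";") if item.strip()
)

# pop() avoids a KeyError when the property is absent
properties.pop('border-bottom', None)

td.attrib['style'] = "; ".join(key + ":" + value for key, value in properties.items())

print(lxml.html.tostring(td))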

I’m pretty sure you can break this solution easily, for example with a value that itself contains a semicolon (such as a data: URL).
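The split-and-rejoin idea above can also be wrapped in a small helper that works on the style string alone. This is only a sketch: the function name remove_style_property is made up for illustration, and it assumes simple declarations whose values contain no ";" of their own.

```python
def remove_style_property(style, name):
    """Return `style` with every declaration of property `name` removed.

    Hypothetical helper, not part of lxml. Assumes values contain no ";".
    """
    kept = []
    for declaration in style.split(";"):
        if not declaration.strip():
            continue  # skip empty pieces, e.g. after a trailing ";"
        # Property names are case-insensitive; split on the first ":" only
        prop = declaration.split(":", 1)[0].strip().lower()
        if prop != name.lower():
            kept.append(declaration.strip())
    return "; ".join(kept)


style = "background-color: azure;border-bottom:1px solid #000000"
print(remove_style_property(style, "border-bottom"))
# prints: background-color: azure
```

To apply it to every element matched by the selector, loop over the list that cssselect() returns and assign the result back, e.g. el.attrib['style'] = remove_style_property(el.attrib['style'], 'border-bottom') for each el.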

Alternatively, here is a rather “crazy” option: dump the data into an HTML file, open the file in a browser via selenium, remove the property via JavaScript, and print the HTML representation of the element afterwards:

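import os
from selenium import webdriver
from selenium.webdriver.common.by import By

data = """
<td style="background-color: azure;border-bottom:1px solid #000000">
Estimated Future Payouts
Under Non-Equity Incentive
Plan Awards
</td>
"""

# Wrap the cell in a table so the browser keeps the <td> when parsing the file
with open("index.html", "w") as f:
    f.write("<body><table><tr>%s</tr></table></body>" % data)

driver = webdriver.Chrome()
driver.get("file://" + os.path.abspath("index.html"))

td = driver.find_element(By.TAG_NAME, "td")
# Setting the property to an empty string removes it from the inline style
driver.execute_script("arguments[0].style['border-bottom'] = '';", td)

print(td.get_attribute("outerHTML"))

# quit() closes the browser and ends the driver session
driver.quit()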

Prints:

<td style="background-color: azure;">
Estimated Future Payouts
Under Non-Equity Incentive
Plan Awards
</td>