(Moved from the Sigil user forum, with apologies)
I'm currently writing an HTML conversion plugin for Sigil that runs on either an external Python 3.4 or Sigil's bundled Python 3.5+ (internal). As part of the HTML sanitizing process I use bs4 with Python 3.4, and this works fine.
But when I use sigil_bs4 or gumbo_bs4.parse from the bundled Python, I do not get the same results as with bs4: the code doesn't do the job, and it doesn't give any specific errors either.
I'm using Sigil 0.9.7 on Windows 8.
When I use this code with bs4 on my Python 3.4, it works fine:
Code:
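from bs4 import BeautifulSoup as bs
# 'file' is the path of the HTML file being sanitized
with open(file, 'rt', encoding='utf-8') as f:
    html = f.read()
soup = bs(html, 'html.parser')
# calling soup() with no arguments returns every tag in the document
for tag in soup():
    for attribute in ["lang", "id", "dir", "name", "link"]:
        del tag[attribute]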
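But with sigil_bs4 or gumbo_bs4.parse on the bundled Python (3.5+), the same code does not do the job:
Code: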
from sigil_bs4 import BeautifulSoup as bs
# or: import sigil_gumbo_bs4_adapter as gumbo_bs4
with open(file, 'rt', encoding='utf-8') as f:
    html = f.read()
soup = bs(html, 'html.parser')
# or: soup = gumbo_bs4.parse(html)
for tag in soup():
    for attribute in ["lang", "id", "dir", "name", "link"]:
        del tag[attribute]
It seems that neither sigil_bs4 nor gumbo_bs4.parse produces a callable BS object (one that takes no arguments and returns a list of all HTML tags), which is what I need for the above code to work. I've used sigil_bs4 quite successfully elsewhere in my plugin as a line-by-line parser for other formatting, but not as a callable object as above.
Any suggestions for making this code work with sigil_bs4 or gumbo_bs4.parse would be greatly appreciated.
This is my first Python plugin (or major Python app of any note).
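For reference, here is a minimal sketch of the behaviour I expect, using stock bs4 only. In stock bs4, calling the soup object with no arguments is shorthand for find_all(), so the sketch spells out find_all(True) explicitly. Whether sigil_bs4 or gumbo_bs4.parse behave the same way is exactly what is in question, so this illustrates the bs4 side only. The sample markup and the strip_attributes helper are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, for illustration only.
html = (
    '<div id="a" lang="en">'
    '<p dir="ltr" class="x">Hi <a name="n" href="#">link</a></p>'
    '</div>'
)

soup = BeautifulSoup(html, "html.parser")

# Attributes to strip; same list as in the plugin code.
UNWANTED = ["lang", "id", "dir", "name", "link"]


def strip_attributes(soup, names=UNWANTED):
    """Remove the named attributes from every tag, in place."""
    # find_all(True) matches every tag; in stock bs4, soup() is a shortcut for it.
    for tag in soup.find_all(True):
        for name in names:
            # pop() with a default does nothing when the attribute is absent.
            tag.attrs.pop(name, None)
    return soup


cleaned = str(strip_attributes(soup))
print(cleaned)
```

Spelling out find_all(True) and working on tag.attrs directly avoids relying on the soup object being callable, which might make the same loop easier to try against the bundled parsers.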