Hi. I recently ran into errors when using the Instapaper recipe with some longer articles, and traced them to the SplitError shown under "Code:" below.
I looked at the processed HTML of one of the articles, and found a huge section of hidden HTML in a <div id='speedRead'> element which replicated the text of the article with each word wrapped in individual <span> elements. I tried adding the "speedRead" div to the remove_tags attribute of the recipe, and that solved the problem.
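A minimal sketch of that change, for anyone who wants to try it before a fix lands. It assumes the usual list-of-dicts form for remove_tags; the class name and title are placeholders, and whatever entries the real Instapaper recipe already has in remove_tags are not shown here and should be kept.

```python
try:
    # Available when the recipe is run inside calibre
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Fallback so this sketch can be read and run outside calibre
    BasicNewsRecipe = object


class InstapaperSketch(BasicNewsRecipe):
    # Placeholder title, not the real recipe's metadata
    title = 'Instapaper (sketch)'

    # Strip the hidden <div id="speedRead"> block, which repeats the
    # article text with every word wrapped in its own <span>
    remove_tags = [
        dict(name='div', attrs={'id': 'speedRead'}),
    ]
```

In the actual recipe this would just be one more dict appended to the existing remove_tags list.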
I haven't contributed to Calibre recipes before, but I would like to contribute this fix back to the community, as it is probably affecting others. (It may be the cause of the issue reported in this thread.) What's the preferred way to do this? Should I submit a pull request to the GitHub project?
Thank you!
Code:
SplitError: Could not find reasonable point at which to split: feed_0/article_9/index_u33.html Sub-tree size: 606 KB
{http://www.w3.org/1999/xhtml}h3 /*/*[2]/*[4]/*/*[2]/*[6]/*[19]