html2text.php 1.1
Update: Use Markdownify; it’s the successor to html2text.
I changed my html2text.php function and it now supports non-markdownable elements better. Previously, something like `<p class="foobar">...</p>` would have resulted in `<p>...</p>`. Now such elements (which could be ported to Markdown) are left as plain HTML.
Additionally, I made some changes that should improve performance.
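The attribute rule described above can be illustrated with a minimal sketch. This is not the code from html2text.php: the function names `isMarkdownable` and `firstElement` and the tag list are assumptions made up for this example, and only PHP's standard DOM extension is used.

```php
<?php
// Hypothetical helper, not taken from html2text.php.
function isMarkdownable(DOMElement $el): bool
{
    // Illustrative subset of tags with a direct Markdown equivalent.
    $convertible = ['p', 'ul', 'ol', 'li', 'em', 'strong', 'blockquote'];
    if (!in_array(strtolower($el->tagName), $convertible, true)) {
        return false;
    }
    // Markdown has no syntax for attributes, so an element that carries any
    // is kept as plain HTML instead of being converted.
    return !$el->hasAttributes();
}

// Hypothetical helper: parse an HTML fragment and return its first element.
function firstElement(string $html): DOMElement
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // silence parser warnings on sloppy markup
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $node) {
            if ($node instanceof DOMElement) {
                return $node;
            }
        }
    }
    throw new InvalidArgumentException('No element found in fragment.');
}

var_dump(isMarkdownable(firstElement('<p>...</p>')));                // bool(true)
var_dump(isMarkdownable(firstElement('<p class="foobar">...</p>'))); // bool(false)
```

The second call returns false because of the `class` attribute, which is the case that 1.1 now leaves as plain HTML.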
Download
Get it while it’s hot: html2text.php 1.1 (.tar.gz ~ 120.9 KB)
Known Bugs
Yes, there are some, which I’ll try to fix in the next few days. (Note: to make the bugs easier to see, I describe what happens when you convert HTML to Markdown and back to HTML.)
- If the parent element (e.g. `<table>`) gets parsed and a child `<tr>`, `<td>` or `<th>` has attributes, those attributes will be ignored and dropped. Workaround: add an attribute to the parent element (e.g. a class or id).
- If you give a single `<li>` element in the middle of a list some attributes, it won’t lose them, but the output is not well-formed HTML: `<ul><li>abc</li> <li class="foo">bar</li> </ul>` will result in:

  ```
  <ul> <li>abc <li class="foo">bar</li></li> </ul>
  ```
- `<pre><code some="attrib">` will result in `<pre><code><code some="attrib">`.
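To find places where the table workaround from the first bug applies, a small pre-flight check along the following lines might help. It is only a sketch: `tablesNeedingWorkaround` is a hypothetical name, not a function of html2text.php, and it relies solely on PHP's standard `DOMDocument` and `DOMXPath` classes.

```php
<?php
// Hypothetical pre-flight check, not part of html2text.php: returns the HTML
// of every <table> that has no attributes of its own while a <tr>, <td> or
// <th> inside it does, i.e. the case where the child attributes get dropped.
function tablesNeedingWorkaround(string $html): array
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // silence parser warnings on sloppy markup
    $xpath = new DOMXPath($doc);

    // Tables without attributes containing a row or cell with attributes.
    $query = '//table[not(@*)][.//tr[@*] or .//td[@*] or .//th[@*]]';

    $hits = [];
    foreach ($xpath->query($query) as $table) {
        $hits[] = $doc->saveHTML($table);
    }
    return $hits;
}

$affected = '<table><tr><td class="x">a</td></tr></table>';
$patched  = '<table class="keep"><tr><td class="x">a</td></tr></table>';

var_dump(count(tablesNeedingWorkaround($affected))); // int(1)
var_dump(count(tablesNeedingWorkaround($patched)));  // int(0)
```

The second input already carries a class on the parent element, which is the workaround described above, so it is not reported.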
For more information, read the bottom of the html2text.php site.