Tag markdown

Recent Posts

Markdownify Tests and PEAR Text_Diff (April 01, 2011)

Yes, I’m finally gearing up for the release of my html2text.php successor, dubbed Markdownify. I’m testing extensively and use the MDTest suite to find potential regressions. I really enjoy writing little CLI scripts in PHP; it just works like a charm.

Here’s an example of what my test suite currently looks like:

[Screenshot of the test suite output]

On the left is the original input (HTML), in the middle is the generated Markdown, and on the right is HTML again, this time generated by Michel Fortin’s PHP Markdown. The pretty colors mark changes between the two HTML versions. I use PEAR Text_Diff for this, plus a little of my own code. But since all of the existing diff engines for Text_Diff took ages on the Markdown documentation (~400 lines, afair), I wrote a Text_Diff engine which uses [shell_exec()](http://www.php.net/shell_exec) and GNU diff. This is blazingly fast and works like a charm! You can get the source code over at pastebin.org. Also take a look at the feature request I made. Dunno if this was the correct place for that…
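The idea of delegating the diff to an external binary can be sketched as follows. This is a minimal illustration in Python, not the PHP engine from the post: the names `parse_normal_diff` and `gnu_diff` are hypothetical, and the sketch assumes a `diff` binary on the PATH that prints the default "normal" output format (hunk headers such as `2c2`, old lines prefixed with `< `, new lines with `> `).

```python
import re
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hunk header of diff's default ("normal") output, e.g. "2c2", "3a4,5", "1,2d0".
HUNK = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")
OPS = {"a": "add", "c": "change", "d": "delete"}


def parse_normal_diff(output):
    """Turn normal-format diff output into (op, old_lines, new_lines) tuples."""
    hunks = []
    for line in output.splitlines():
        match = HUNK.match(line)
        if match:
            hunks.append((OPS[match.group(3)], [], []))
        elif line.startswith("< ") and hunks:
            hunks[-1][1].append(line[2:])
        elif line.startswith("> ") and hunks:
            hunks[-1][2].append(line[2:])
        # "---" separators and "\ No newline ..." notes carry no content.
    return hunks


def gnu_diff(old_lines, new_lines):
    """Diff two lists of lines by shelling out to the external diff binary."""
    with tempfile.TemporaryDirectory() as tmp:
        old_path, new_path = Path(tmp, "old"), Path(tmp, "new")
        old_path.write_text("".join(l + "\n" for l in old_lines), encoding="utf-8")
        new_path.write_text("".join(l + "\n" for l in new_lines), encoding="utf-8")
        # diff exits with status 1 when the files differ, so check=True is not used.
        result = subprocess.run(
            ["diff", str(old_path), str(new_path)],
            capture_output=True,
            encoding="utf-8",
        )
    if result.returncode > 1:
        raise RuntimeError(result.stderr)
    return parse_normal_diff(result.stdout)


if __name__ == "__main__" and shutil.which("diff"):
    print(gnu_diff(["<p>a</p>", "<p>b</p>"], ["<p>a</p>", "<p>c</p>"]))
```

The speed-up comes from the external tool doing the line comparison; the script only writes two temporary files and parses a few lines of output.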

Hello Drupal World! (February 21, 2008)

Yes! I’ve finally done it. I’ve moved my website to Drupal, which is so much better than my old 3co stuff. There are tons of great modules out there, and those I couldn’t find for Drupal 6, which was recently released, I ported. Well, not all of them; there are still some I’m really looking forward to. At the top of my list are definitely the spam module and the Akismet module. Minutes after my move I got my first spam comment…

Well, let’s see how I might get involved in Drupal development. I have already filed some patches for the following modules:

Marksmarty: better GeSHi support and some other minor things, but it doesn’t seem to be what the maintainer wants, so I’ll have to move it into a separate module. Also some work to separate SmartyPants and Markdown into distinct modules. Furthermore, I’ve added support for PHP Markdown’s no-markup mode. This also needs some more work. Maybe it will be dropped again and the pristine Drupal HTML filter will be used; let’s see.

Second Markdownify Beta released (February 03, 2008)

I’ve just released a second Markdownify Beta with better PHP 4 support and some other small bug fixes. You can download it from SourceForge.

Markdownify Beta released (February 03, 2008)

Finally, I’ve completed the Markdownify website. I’ve also released the first beta; here is the news text from SourceForge:

This is the first beta release of Markdownify - the HTML to Markdown converter for PHP.

It is very stable and should handle nearly all features of Markdown and Markdown Extra syntax. Missing are only two things:

“Markdown inside block elements” for Markdownify Extra

word wrapping

These two things will be added before the first “stable” release. Additionally some performance improvements will hopefully be added.

You are encouraged to use this release in your web applications. Please let me know if you find any bugs. Also a code review by anyone would be very much appreciated!

Download it now

html2text rewrite (May 16, 2007)

A few days ago I started a complete rewrite of html2text. It now uses a new HTML parser (also written by me) which should make the whole HTML cleanup process obsolete. The generic XML parser which is currently used dies on invalid XHTML; with my parser it should be possible to handle errors and parse HTML 4.01 documents without any regex magic beforehand.
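The difference between a strict XML parser and a forgiving HTML parser can be shown with a small sketch. It uses Python's standard library as a stand-in, not the parser described in the post; `TagCollector`, `strict_parse_ok`, `lenient_tags` and the sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Invalid as XHTML: <br> and <b> are never closed.
BROKEN = "<p>unclosed paragraph<br><b>bold</p>"


class TagCollector(HTMLParser):
    """Records start and end tags in document order without validating them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def strict_parse_ok(markup):
    """Return True if a generic XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def lenient_tags(markup):
    """Return the tag events a forgiving HTML parser reports for the markup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.events


if __name__ == "__main__":
    print(strict_parse_ok(BROKEN))
    print(lenient_tags(BROKEN))
```

The strict parser rejects the whole document, while the lenient one still reports every tag it saw, which leaves the decision about how to recover to the converter.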

You’ll hear more of this in about a week as I’ll be on vacation until the 24th.

html2text.php version 1.3 released (December 24, 2006)

Update: Use Markdownify, it’s the successor to html2text.

I just released html2text version 1.3, which sports a ton of bug fixes. Most notably, all features of PHP Markdown Extra are now fully supported, including footnotes and abbreviations.

Also, wrapping should now work as intended, and inline links (like <foo@bar.com>) won’t be converted to block links (like [foo@bar.com](mailto:foo@bar.com)).

In the next version I’ll add some more options, especially one for disabling PHP Markdown Extra support. I’ll also clean up the code a bit.
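The link rule can be sketched roughly like this. It is a Python illustration under the assumption that a link qualifies for the short form when its text equals its target; `link_to_markdown` is a hypothetical helper, not a function from html2text.

```python
def link_to_markdown(href, text):
    """Render an HTML link as Markdown, preferring the autolink form."""
    # For mailto: links the visible text is compared with the bare address.
    target = href[len("mailto:"):] if href.startswith("mailto:") else href
    if text == target:
        # Text and target match, so the short <...> form loses nothing.
        return "<" + target + ">"
    return "[" + text + "](" + href + ")"


if __name__ == "__main__":
    print(link_to_markdown("mailto:foo@bar.com", "foo@bar.com"))
    print(link_to_markdown("http://example.com/", "Example"))
```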

Merry Christmas to you!

html2text.php 1.1 (July 23, 2006)

Update: Use Markdownify, it’s the successor to html2text.

I changed my html2text.php function, and it now handles non-markdownable elements better. Previously something like <p class="foobar">...</p> would have resulted in <p>...</p>. Now these elements (which could be ported to Markdown only by dropping their attributes) are left as plain HTML.

Additionally, I made some changes which should improve performance.
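A rough sketch of that rule, in Python and reduced to paragraphs only: an element whose attributes Markdown cannot express is emitted unchanged as HTML, and anything else becomes Markdown. `convert_paragraph` is a hypothetical name used for illustration and does not reflect the actual html2text.php code.

```python
from html import escape


def convert_paragraph(attrs, text):
    """Return Markdown for a plain <p>, or the original HTML if it has attributes."""
    if attrs:
        # Markdown has no syntax for class, id and the like, so keep the tag as is.
        rendered = "".join(
            ' {}="{}"'.format(name, escape(value, quote=True))
            for name, value in attrs.items()
        )
        return "<p{}>{}</p>".format(rendered, text)
    # A paragraph without attributes is just its text in Markdown.
    return text


if __name__ == "__main__":
    print(convert_paragraph({"class": "foobar"}, "..."))
    print(convert_paragraph({}, "Hello"))
```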

Download

Get it while it’s hot: html2text.php 1.1 (.tar.gz ~ 120.9 KB)

Known Bugs

Yes, there are some, which I’ll try to fix in the next few days (note: to better point out the bugs, I just describe what happens if you convert HTML to Markdown and back to HTML):

If the parent element (e.g. <table>) gets parsed and a child <tr>, <td> or <th> has attributes, they will be ignored and dropped. Workaround: add an attribute to the parent element (e.g. a class or id).
If you give a single <li> element in the middle of a list some attributes, it won’t lose them, but the output will not be well-formed HTML:
```
<ul><li>abc</li> <li class="foo">bar</li> </ul>
```