I'm trying to interpret responses from people via e-mail.
If this had been the year 1985 or something, it would be easy: I would just strip any line beginning with >, and that would be it.
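For reference, the naive approach I mean is roughly the following. This is only a sketch, written in Python for brevity rather than PHP, and the function name strip_quoted_lines is made up for illustration; it assumes the body is already decoded plaintext.

```python
def strip_quoted_lines(body: str) -> str:
    """Drop every line that begins with '>' and return what is left."""
    kept = [line for line in body.splitlines() if not line.startswith(">")]
    # Join the surviving lines and trim blank space left at either end.
    return "\n".join(kept).strip()


reply = "> I shall have the business proposal ready tomorrow.\nOK. Great."
print(strip_quoted_lines(reply))  # prints "OK. Great."
```

Nested quotes such as ">> ..." also begin with >, so they are dropped as well, but this does nothing for HTML bodies or for clients that don't prefix quoted lines with >.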
However, the year is 2020 and e-mail is an absolute mess of multiple layers of madness. For one thing, many e-mails aren't plaintext at all, but instead use HTML formatting, and I very strongly doubt that these consistently use <blockquote>s for quotes. I fear that there are numerous different styles of quotes and markup used for HTML e-mail quotes.
Even plaintext e-mails may not consistently use > quotes.
This immediately strikes me as something I do not wish to sit and attempt to code on my own. Is there some existing, reliable PHP library/function for this task?
I already use MailMimeParse, but it doesn't appear to have this feature. Its job appears to be all about parsing the MIME blobs into plaintext/HTML bodies, not doing anything further with them once they are properly extracted.
To make it crystal clear: I'm trying to turn this:
> I shall have the business proposal ready tomorrow.
OK. Great.
Into:
OK. Great.
And:
<whateverunknownmarkup>I shall have the business proposal ready tomorrow.</whateverunknownmarkup>
OK. Great.
Into:
OK. Great.
Of course, those are just basic examples. Quotes can be nested many levels deep, etc.
I don't know how the most popular e-mail clients and e-mail services do this, but it feels like yet another task which has been solved in private a million times but never released to the public.