I'm having an issue where the following HTML is stored in my database:

Carer £4.20 per person<br />

And is being output to XML with DOMDocument, as follows:

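// escape the HTML so it is stored as plain text inside the element
$content = htmlspecialchars($page->content);
$xmlDoc = new DOMDocument();
$xmlDoc->formatOutput = true;

// create the root element
$root = $xmlDoc->appendChild(
    $xmlDoc->createElement("document"));

// append the content element to the root ($page is the data object, not a DOM node)
$root->appendChild(
    $xmlDoc->createElement("content", $content));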

Resulting in

Carer &#xA3;4.20 per person&lt;br /&gt;

However, instead of the hex encoding, is it possible to have the named HTML entities, e.g. &pound;?

bsod99

5 Answers

1

However, instead of the hex encoding, is it possible to have the named HTML entities, e.g. &pound;?

Yes and no. No, because you are outputting XML, and XML has no named entity &pound; by default.

Yes, because you can output HTML instead ;) Let's look at an example (online-demo):

$content = htmlspecialchars('Carer £4.20 per person<br />');

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->formatOutput = true;


//create the root element
$root = $doc->appendChild(
    $doc->createElement("document")
);

$root->appendChild(
    $doc->createElement("content", $content)
);

echo "Save XML:\n", $doc->saveXML();
echo "\n\nSave HTML:\n", $doc->saveHTML();

And the output:

Save XML:
<?xml version="1.0"?>
<document>
  <content>Carer &#xA3;4.20 per person&lt;br /&gt;</content>
</document>


Save HTML:
<document><content>Carer &pound;4.20 per person&lt;br /&gt;</content></document>

So remember: XML predefines only a very limited set of named entities (&lt;, &gt;, &amp;, &apos; and &quot;), while HTML has many more. You can also add more named entities to XML by declaring them in a DTD. If you're interested, please see
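
As a rough illustration of adding a named entity to XML, here is a minimal sketch that declares &pound; in an internal DTD subset. The document string is made up for this example, and the sketch assumes loadXML() is called without LIBXML_NOENT, so that entity references are kept as references rather than replaced by their value:

```php
<?php
// Hypothetical input: an XML document that declares its own &pound; entity
// in the internal DTD subset.
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE document [
  <!ENTITY pound "&#xA3;">
]>
<document><content>Carer &pound;4.20 per person</content></document>
XML;

$doc = new DOMDocument();
// No LIBXML_NOENT flag: entity references stay in the tree as reference
// nodes instead of being substituted during parsing.
$doc->loadXML($xml);

// The serialized document keeps both the declaration and the reference.
$out = $doc->saveXML();
echo $out;
```

Any consumer of that XML has to read the DTD to resolve &pound;, so this only helps if you control both ends.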

hakre
0

You get this error if your XML contains non-ASCII characters and the file was saved as single-byte ANSI (or ASCII) with no encoding specified.
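
To show what specifying an encoding changes, here is a small sketch comparing a DOMDocument created without an encoding against one created with 'UTF-8'. The variable names are illustrative only, and the input string is assumed to be valid UTF-8:

```php
<?php
// "\u{A3}" is the pound sign; this escape syntax needs PHP 7.0 or later.
$text = "Carer \u{A3}4.20 per person";

// No encoding declared: non-ASCII characters are written as numeric
// character references such as &#xA3;.
$plain = new DOMDocument('1.0');
$plain->appendChild($plain->createElement('content'))
      ->appendChild($plain->createTextNode($text));
$withoutEncoding = $plain->saveXML();

// UTF-8 declared: the pound sign is written as a literal character.
$utf8 = new DOMDocument('1.0', 'UTF-8');
$utf8->appendChild($utf8->createElement('content'))
     ->appendChild($utf8->createTextNode($text));
$withEncoding = $utf8->saveXML();

echo $withoutEncoding, "\n", $withEncoding;
```

Neither form gives you &pound;, but the UTF-8 version avoids the hex reference if that is the real objection.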

0

Try removing the htmlspecialchars and see what happens?

http://php.net/manual/en/function.htmlspecialchars.php
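
If you do drop htmlspecialchars(), one option is to pass the raw string to createTextNode() and let DOM do the escaping. This is only a sketch: the string below stands in for $page->content, and the UTF-8 encoding is an assumption about your data:

```php
<?php
// Stand-in for $page->content; "\u{A3}" is the pound sign (PHP 7.0+).
$html = "Carer \u{A3}4.20 per person<br />";

$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true;

$root = $doc->appendChild($doc->createElement('document'));
$content = $root->appendChild($doc->createElement('content'));

// createTextNode() takes the string literally; the markup characters are
// escaped when the document is serialized, not before.
$content->appendChild($doc->createTextNode($html));

$xml = $doc->saveXML();
echo $xml;
```

Reading $content->textContent back returns the original string, so nothing is escaped twice.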

H2ONOCK
0

Very easy - just use htmlentities() instead of htmlspecialchars().

See http://de2.php.net/manual/en/function.htmlentities.php

But be warned - XML does not know HTML entities like &pound;! If you output XML rather than HTML, you are limited to numeric character references (or the literal character in a declared encoding).
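
For comparison, here is a short sketch of what the two functions produce for the string from the question. The flags and the explicit 'UTF-8' charset are my choice for the example, not something the question specifies:

```php
<?php
// "\u{A3}" is the pound sign; this escape syntax needs PHP 7.0 or later.
$input = "Carer \u{A3}4.20 per person<br />";

// htmlspecialchars() only touches the markup characters & < > " and '.
$special = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// htmlentities() also converts every character that has a named HTML entity.
$entities = htmlentities($input, ENT_QUOTES, 'UTF-8');

echo $special, "\n";   // Carer £4.20 per person&lt;br /&gt;
echo $entities, "\n";  // Carer &pound;4.20 per person&lt;br /&gt;
```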

Alex Shesterov
0

Yes, it should be possible - but it depends.

Look at the pound from an "IT view":

  • £ - pound sign

  • pound - name of the currency

  • &pound; - entity name

  • &#163; - entity code (numeric character reference)

Now, let's write all of the items above without marking them as code. What is the result?

£, pound, £, £ - as you can see, the 3rd and 4th both render as £, but only because this is HTML. Believe me, I'm not lying :P

But I strongly recommend you use &#163; in XML!
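
To check that these forms really are the same character, here is a quick sketch using html_entity_decode(). It also covers the hex form &#xA3; from the question; the variable names are just for illustration:

```php
<?php
$flags = ENT_QUOTES | ENT_HTML5;

// Decode the named, decimal and hexadecimal forms to UTF-8.
$named   = html_entity_decode('&pound;', $flags, 'UTF-8');
$decimal = html_entity_decode('&#163;', $flags, 'UTF-8');
$hex     = html_entity_decode('&#xA3;', $flags, 'UTF-8');

// All three decode to U+00A3, which is the two bytes C2 A3 in UTF-8.
var_dump($named === $decimal, $decimal === $hex);
echo bin2hex($named), "\n"; // c2a3
```

Only the two numeric forms are valid in XML without a DTD, which is why &#163; is the safe choice there.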

If you want more info, you can visit:

Stepo