
I wrote a C++ XPath parser with the libxml++ library, which is built on the C library libxml2. It works great when no xmlns is present in the XML, but it breaks when that namespace is added.

Sample xml:

<A xmlns="http://some.url/something">
  <B>
    <C>hello world</C>
  </B>
</A>

Sample XPath:

string xpath = "/A/B/C"; // returns nothing when xmlns is present in the XML

I found this answer and tried adjusting my XPath to the following, which does work but it makes the XPath kind of obnoxious to read and write.

string xpath = "/*[name()='A']/*[name()='B']/*[name()='C']";

Ideally I want to register the namespace so I can use normal XPaths. I've also searched through the libxml++ documentation and found Node::set_namespace, but it just throws an exception when I try to use it.

root_node->set_namespace("http://some.url/something");
// exception: The namespace (http://some.url/something) has not been declared.

However, the root_node is definitely aware of the namespace when it parses the XML document:

cout << "namespace uri: " << root_node->get_namespace_uri();
// namespace uri: http://some.url/something

At this point I am out of ideas so help is greatly appreciated.

EDIT: I also tried:

Element *root_node = parser->get_document()->get_root_node();
root_node->set_namespace_declaration("http://some.url/something","x");
cout << "namespace uri: " << root_node->get_namespace_uri() << endl;
cout << "namespace prefix: " << root_node->get_namespace_prefix() << endl;
// namespace uri: http://some.url/something
// namespace prefix: 

This does not complain, but it doesn't appear to register the namespace.

Mike S
  • I think you need to use `void xmlpp::Element::set_namespace_declaration(const std::string& ns_uri, const std::string& ns_prefix = std::string())`, defined [here](http://libxmlplusplus.sourceforge.net/docs/reference/1.0/html/classxmlpp_1_1Element.html#a2) – SomeDude Apr 11 '16 at 15:34
  • Well my variable is a `Node`, but `Element` is a child of `Node` so I might be able to make it work. I'll give it a try. – Mike S Apr 11 '16 at 15:40
  • A Node can be of the Element node type, and in fact 'A' here is an element node, I presume. – SomeDude Apr 11 '16 at 16:07
  • My code is happy with `root_node` being an `Element`, and I call `root_node->set_namespace_declaration("http://some.url/something","x");`. No issues with that, but when I try to evaluate the XPath `"/x:A/x:B/x:C"` I get this error: `XPath error : Undefined namespace prefix`. Any thoughts on that? I tried reversing the arguments of my `set_namespace_declaration` call just in case that was wrong, but no luck. – Mike S Apr 11 '16 at 16:25
  • Have a look at `xmlXPathRegisterNs` – hr_117 Apr 11 '16 at 18:05

2 Answers


The online documentation for libxml++ does not mention how to use namespaces with XPath expressions. But as you pointed out, libxml++ is a wrapper around libxml2.

For libxml2, have a look at xmlXPathRegisterNs.

As always with wrappers, they hide complexity and (most likely) even some functionality.

A look at the libxml++ source code shows that there are find overloads which make use of xmlXPathRegisterNs.

using PrefixNsMap = std::map<Glib::ustring, Glib::ustring>;
NodeSet find(const Glib::ustring& xpath, const PrefixNsMap& namespaces);

Therefore, try calling find with a PrefixNsMap that uses the prefix as the key.

Update:

 xmlpp::Node::PrefixNsMap nsmap;
 nsmap["x"] = "http://some.url/something";
 auto set = node->find(xpath, nsmap);
 std::cout << set.size() << " nodes have been found:" << std::endl;

A comment on the strange discussion about namespaces:

  • A default namespace is often used in XML documents.
  • The default namespace in an XML document can be changed on any node and stays in effect until the next change.
  • A namespace with a prefix applies only to nodes that use this prefix.
  • From the XPath point of view, the prefix used in the XML does not really matter. You need to know which namespace (URI) the nodes are in. Each namespace needs to be registered for use in XPath with a unique namespace prefix.
  • Avoid using the *[name()='A'] or *[local-name()='A'] stuff.
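
The points above can be illustrated with a small standard-library-only sketch. It is not how libxml2 is implemented; it only models the rule that a name is compared as a (namespace URI, local name) pair, and that an unprefixed name picks up the default namespace in a document but no namespace in an XPath 1.0 expression. All names in it are made up for illustration.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

using ExpandedName = std::pair<std::string, std::string>;  // (URI, local name)
using PrefixMap = std::map<std::string, std::string>;      // prefix -> URI

// Splits "prefix:local" and looks the prefix up in bindings.
// Names without a prefix get unprefixed_uri as their namespace.
ExpandedName expand(const std::string& qname, const PrefixMap& bindings,
                    const std::string& unprefixed_uri) {
    const std::string::size_type colon = qname.find(':');
    if (colon == std::string::npos) {
        return ExpandedName(unprefixed_uri, qname);
    }
    const PrefixMap::const_iterator it = bindings.find(qname.substr(0, colon));
    if (it == bindings.end()) {
        throw std::runtime_error("undefined namespace prefix");
    }
    return ExpandedName(it->second, qname.substr(colon + 1));
}

// Element name inside an XML document: an unprefixed name belongs to
// the default namespace currently in scope.
ExpandedName expand_in_document(const std::string& qname,
                                const PrefixMap& declared,
                                const std::string& default_ns) {
    return expand(qname, declared, default_ns);
}

// Name test inside an XPath 1.0 expression: an unprefixed name is in
// no namespace, regardless of the document's default namespace.
ExpandedName expand_in_xpath(const std::string& qname,
                             const PrefixMap& registered) {
    return expand(qname, registered, "");
}
```

Under this model, the element `A` in the question's document expands to the same pair as the XPath name `x:A` once `x` is registered for `http://some.url/something`, whereas the XPath name `A` expands to a pair with an empty URI and therefore matches nothing.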
hr_117
  • Thanks, but I'm not sure how I would call that from libxml++ without downloading the source code, implementing it, and recompiling the library. – Mike S Apr 11 '16 at 18:51
  • Thanks! I'm currently on libxml++ 2.6 but I could upgrade that to 3.0. I will give this a try and report back. – Mike S Apr 12 '16 at 15:21
  • Did you try it with 2.6? I did not, so I do not know whether it will work there as well. – hr_117 Apr 12 '16 at 15:40
  • Not yet but I'll give it a try and report back, may not have time today though. – Mike S Apr 12 '16 at 17:14
  • @Mike Do you have any results in the meantime? – hr_117 Apr 18 '16 at 20:37
  • Sorry, busy week, but yes, this does work! I got a C++11 error when I used `auto set` from your answer, but changing it to `NodeSet set` fixes it for me. And the XPath `/x:A/x:B/x:C` successfully returns the result with your code. Thanks a lot! – Mike S Apr 18 '16 at 21:57
  • Now I have a question for the **downvoters**. Why the downvotes? Perhaps the initial answer was a little short, but it was nevertheless already right. The key was and is that libxml2 can handle default namespaces properly, and knowing about xmlXPathRegisterNs helps (and helped) to find the right way with the libxml++ wrapper. – hr_117 Apr 19 '16 at 09:57
  • Unfortunately, I assume they downvoted before you updated, then never checked this question again. But future visitors who need to register namespaces with libxml++ should upvote this. – Mike S Apr 20 '16 at 15:28

When you use a prefix for your xmlns, I believe your XML should be:

<x:A xmlns:x="http://some.url/something">
  <x:B>
    <x:C>hello world</x:C>
  </x:B>
</x:A>

and the XPath expression /x:A/x:B/x:C/text() would yield 'hello world'.

SomeDude
  • Sorry, this seems to be wrong. Have a look at default namespaces. – hr_117 Apr 11 '16 at 18:26
  • Could you please explain why this is wrong if the user wants to use an XML namespace that is not the default? Take a look at http://www.w3schools.com/xml/xml_namespaces.asp. There is a reason why prefixes are used: to avoid name conflicts. – SomeDude Apr 11 '16 at 18:32
  • No, `x` is the namespace prefix here and `http://some.url/something` is the namespace. The prefix can be empty; then you have a default namespace for all elements without a prefix. – hr_117 Apr 11 '16 at 18:38
  • In that case, how would you make sure that elements belong to that namespace? Do you want all elements to refer to that default namespace? For example, if I have other elements with the same name B that I don't want looked up under that namespace but under another one, what will you get for the XPath /A/B/C? – SomeDude Apr 11 '16 at 18:41
  • This does seem to help, because after doing this the code in my question prints out `"namespace prefix: x"`. So libxml++ definitely understands it. But there are two issues: 1) I can't change how the XML is being sent to me, so I would have to add the "x:" to millions of text lines if this worked, and 2) for some reason, when I try to run that XPath I strangely still get `XPath error : Undefined namespace prefix`. – Mike S Apr 11 '16 at 18:50
  • @Mike: If you don't need a prefix, I think you can just pass an empty string "" to set_namespace_declaration(), but I wonder why you would have a private namespace without a prefix, which is usually the norm. – SomeDude Apr 11 '16 at 18:53
  • Okay I'm trying that right now. Would you expect `/A/B/C` to return "hello world" then? I'm getting nothing. – Mike S Apr 11 '16 at 18:56
  • So it is bizarre for the data to have the "xmlns" part without also having a prefix? – Mike S Apr 11 '16 at 18:58
  • Is A your root? If you have the xml hello world then the XPath /root/*[name()='A']/*[name()='B']/*[name()='C']/text() should yield 'hello world' – SomeDude Apr 11 '16 at 19:11
  • XML with a default namespace is quite common. But to use it in XPath you always need to register a prefix. – hr_117 Apr 11 '16 at 19:12
  • Yes it does work when I add the extra "*[name()='A']" stuff (it's in my question) but I was hoping to avoid using that by registering the namespace. – Mike S Apr 11 '16 at 20:10
  • When you register a namespace, you need a prefix; otherwise I think there is no way to tell XPath to look under the namespace you want. Maybe you can use namespace-uri() in the XPath to identify the namespace, but that will complicate your XPath even more. With a prefix, your XPath will be simpler, like x:A/x:B/x:C – SomeDude Apr 11 '16 at 20:16
  • But unfortunately it seems like I would have to edit my XML in order to register the namespace prefix. – Mike S Apr 11 '16 at 20:21
  • I believe that if you want a simpler XPath expression, you need to add prefixes to the XML elements; otherwise go with your current XPath – SomeDude Apr 11 '16 at 20:46