Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nodeSet returned by xmlXPathEvalExpression may be freed later than a document #35

Open
ZigzagAK opened this issue Dec 14, 2023 · 5 comments

Comments

@ZigzagAK
Copy link
Contributor

ZigzagAK commented Dec 14, 2023

function libxml2.xmlXPathEvalExpression(expression, context)
  local object = xml2.xmlXPathEvalExpression(expression, context)
  if object == ffi.NULL then
    return nil
  end
  return ffi.gc(object, xml2.xmlXPathFreeObject)
end

This is cause of valgrind error:

==130538== Invalid read of size 4
==130538== at 0xDB89132: xmlXPathFreeNodeSet (xpath.c:4199)
==130538== by 0xDB89219: xmlXPathFreeObject (xpath.c:5519)
==130538== by 0x948AFC5: lj_vm_ffi_call (in /opt/api_gateway_3/apigw/lib/libluajit-5.1.so.2.1.0)
==130538== by 0x94D5A67: lj_ccall_func (lj_ccall.c:1382)
==130538== by 0x94EB9BA: lj_cf_ffi_meta___call (lib_ffi.c:230)
==130538== by 0x9488BA5: lj_BC_FUNCC (in /opt/api_gateway_3/apigw/lib/libluajit-5.1.so.2.1.0)
==130538== by 0x948B323: gc_call_finalizer.isra.3 (lj_gc.c:520)
==130538== by 0x948B4B5: gc_finalize (lj_gc.c:555)
==130538== by 0x948BF4D: gc_onestep (lj_gc.c:706)
==130538== by 0x948C62C: lj_gc_step (lj_gc.c:738)
==130538== by 0x949C8AC: lua_pushlstring (lj_api.c:652)
==130538== by 0x1CDCE59F: json_parse_object_context (lua_cjson.c:1429)
==130538== Address 0x1103be98 is 8 bytes inside a block of size 120 free'd
==130538== at 0x4C3AC2B: free (vg_replace_malloc.c:974)
==130538== by 0x948AFC5: lj_vm_ffi_call (in /opt/api_gateway_3/apigw/lib/libluajit-5.1.so.2.1.0)
==130538== by 0x94D5A67: lj_ccall_func (lj_ccall.c:1382)
==130538== by 0x94EB9BA: lj_cf_ffi_meta___call (lib_ffi.c:230)
==130538== by 0x9488BA5: lj_BC_FUNCC (in /opt/api_gateway_3/apigw/lib/libluajit-5.1.so.2.1.0)
==130538== by 0x948B323: gc_call_finalizer.isra.3 (lj_gc.c:520)
==130538== by 0x948B4B5: gc_finalize (lj_gc.c:555)
==130538== by 0x948BF4D: gc_onestep (lj_gc.c:706)
==130538== by 0x948C62C: lj_gc_step (lj_gc.c:738)
==130538== by 0x949C8AC: lua_pushlstring (lj_api.c:652)
==130538== by 0x1CDCE59F: json_parse_object_context (lua_cjson.c:1429)
==130538== by 0x1CDCDED4: json_process_value (lua_cjson.c:1585)
==130538== Block was alloc'd at
==130538== at 0x4C38185: malloc (vg_replace_malloc.c:431)
==130538== by 0xDB54ED4: xmlNewNodeEatName (tree.c:2281)
==130538== by 0xDB592A1: xmlNewDocNodeEatName (tree.c:2356)
==130538== by 0xDBFF8BB: xmlSAX2StartElementNs (SAX2.c:2278)
==130538== by 0xDB4896D: xmlParseStartTag2 (parser.c:9645)
==130538== by 0xDB4C66E: xmlParseElement (parser.c:9992)
==130538== by 0xDB4BCD5: xmlParseContent (parser.c:9910)
==130538== by 0xDB4C588: xmlParseElement (parser.c:10078)
==130538== by 0xDB4BCD5: xmlParseContent (parser.c:9910)
==130538== by 0xDB4C588: xmlParseElement (parser.c:10078)
==130538== by 0xDB4BCD5: xmlParseContent (parser.c:9910)
==130538== by 0xDB4C588: xmlParseElement (parser.c:10078)

To prevent it object returned from libxml2.xmlXPathEvalExpression MUST be freed immediatelly after usage or nodeNr field must be set to 0 before call xmlXPathFreeObject.

void
xmlXPathFreeNodeSet(xmlNodeSetPtr obj) {
    if (obj == NULL) return;
    if (obj->nodeTab != NULL) {
        int i;

        /* @@ with_ns to check whether namespace nodes should be looked at @@ */
        for (i = 0;i < obj->nodeNr;i++)
            if ((obj->nodeTab[i] != NULL) &&
                (obj->nodeTab[i]->type == XML_NAMESPACE_DECL))
                xmlXPathNodeSetFreeNs((xmlNsPtr) obj->nodeTab[i]);
        xmlFree(obj->nodeTab);
    }
    xmlFree(obj);
}
function libxml2.xmlXPathEvalExpression(expression, context)
  local object = xml2.xmlXPathEvalExpression(expression, context)
  if object == ffi.NULL then
    return nil
  end
  return ffi.gc(object, function(pobject)
    if pobject.nodesetval ~= ffi.NULL then
      pobject.nodesetval.nodeNr = 0
    end
    xml2.xmlXPathFreeObject(pobject)
  end)
end
@ZigzagAK ZigzagAK changed the title nodeSet returned by xmlXPathEvalExpression man be freed later that document nodeSet returned by xmlXPathEvalExpression may be freed later that document Dec 14, 2023
@ZigzagAK ZigzagAK changed the title nodeSet returned by xmlXPathEvalExpression may be freed later that document nodeSet returned by xmlXPathEvalExpression may be freed later than a document Dec 14, 2023
@kou
Copy link
Member

kou commented Dec 14, 2023

Could you provide a reproducible Lua script with sample data?

@ZigzagAK
Copy link
Contributor Author

Actually this solution will provide memory leak if obj->nodeTab[i]->type == XML_NAMESPACE_DECL. Correct way is freeing after usage before freening document. GC may produce unpredictable results. I can't make simple small testcase but in large program this happens constantly.

@ZigzagAK
Copy link
Contributor Author

May be:

function libxml2.xmlXPathEvalExpression(expression, context)
  local object = xml2.xmlXPathEvalExpression(expression, context)
  if object == ffi.NULL then
    return nil
  end
  return object
end
jit.off(libxml2.xmlXPathEvalExpression)

function libxml2.xmlXPathFreeObject(object)
  xml2.xmlXPathFreeObject(object)
end
jit.off(libxml2.xmlXPathFreeObject)

and

  local object = libxml2.xmlXPathEvalExpression(xpath, context)
  if not object then
    local raw_error_message = context.lastError.message
    if raw_error_message == ffi.NULL then
      local xpath_error_code =
        context.lastError.code - ffi.C.XML_XPATH_EXPRESSION_OK
      error(ERROR_MESSAGES[xpath_error_code])
    else
      error(ffi.string(context.lastError.message))
    end
  end

  local type = tonumber(object.type)
  if type == ffi.C.XPATH_NODESET then
    local found_node_set = object.nodesetval
    if found_node_set == ffi.NULL then
      libxml2.xmlXPathFreeObject(object)
      return NodeSet.new({})
    end
    local raw_node_set = {}
    for i = 1, found_node_set.nodeNr do
      local node = found_node_set.nodeTab[i - 1]
      local node_type = tonumber(node.type)
      if node_type == ffi.C.XML_ELEMENT_NODE then
        table.insert(raw_node_set, Element.new(document, node))
      elseif node_type == ffi.C.XML_TEXT_NODE then
        table.insert(raw_node_set, Text.new(document, node))
      elseif node_type == ffi.C.XML_COMMENT_NODE then
        table.insert(raw_node_set, Comment.new(document, node))
      elseif node_type == ffi.C.XML_PI_NODE then
        table.insert(raw_node_set,
                     ProcessingInstruction.new(document, node))
      elseif node_type == ffi.C.XML_ATTRIBUTE_NODE then
        table.insert(raw_node_set, Attribute.new(document, node))
      elseif node_type == ffi.C.XML_CDATA_SECTION_NODE then
        table.insert(raw_node_set, CDATASection.new(document, node))
      else
        -- TODO: Support more nodes such as text node
        -- table.insert(raw_node_set, node)
      end
    end
    libxml2.xmlXPathFreeObject(object)
    return NodeSet.new(raw_node_set)
  else
    libxml2.xmlXPathFreeObject(object)
  end

@ZigzagAK
Copy link
Contributor Author

Fixed in #34

@kou
Copy link
Member

kou commented Dec 16, 2023

We need to write a test for this case. So we need a reproducible Lua script.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants