node.hasNodes(), node.getNodes(), and removing nodes with node.remove()

9 messages

node.hasNodes(), node.getNodes(), and removing nodes with node.remove()

Peter Harrison-2
Over the last few days I've come across a problem while trying to recover
from a runaway script that created tens of thousands of nodes under a
single node.

When I get the parent node of this large number of new nodes and call
hasNodes(), things lock up and the Mongo query times out. The same problem
occurs when calling getNodes() to return a NodeIterator.

I know that one of the key selling points of Oak was meant to be the
ability to handle a large number of child nodes.



The second problem I have is in removing these nodes. While I was able to
find the node paths without the above calls and get each node by path, when
I call node.remove() it takes about 20-30 seconds to delete each node. I
wanted to remove about 300,000 nodes, but at 20 seconds a node that works
out to roughly 69 days. It took no more than 2 days to add them, probably
much less.
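As a sanity check on the arithmetic above (a throwaway calculation, not part of any repository code):

```java
// Quick sanity check on the deletion estimate:
// 300,000 nodes at 20 seconds per remove.
public class DeleteEstimate {
    static double daysToDelete(int nodes, double secondsPerNode) {
        // total seconds divided by seconds per day
        return nodes * secondsPerNode / (60 * 60 * 24);
    }

    public static void main(String[] args) {
        System.out.printf("%.1f days%n", daysToDelete(300_000, 20.0)); // 69.4 days
    }
}
```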

While I'm working on ways around these problems - essentially by rebuilding
the repo - it would be good to see if these problems are known or whether
there is something I'm doing wrong.

Re: node.hasNodes(), node.getNodes(), and removing nodes with node.remove()

Clay Ferguson
Two thoughts:

1) It's a known issue (severe weakness) in the design of Jackrabbit/Oak
that it chokes like a dog on large numbers of child nodes all under the
same node. Many users have struggled with this, and imo it has been one of
the massive flaws that has kept the JCR from really taking off. I mean,
probably still only 1% of developers have ever heard of the JCR.

2) About cleaning up the massive child list, be sure you aren't doing a
commit (save) after each node. Try to run commits after 100 to 500 deletes
at a time.
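A minimal sketch of that batching pattern; the repository calls are stubbed out here so the shape is visible, and in real code save() would be session.save() on a javax.jcr.Session (names and paths are illustrative):

```java
import java.util.List;

// Sketch of batched deletion: instead of saving after every remove(),
// accumulate removals and save once per batch. Repository interaction is
// stubbed so the pattern itself is the focus.
public class BatchedDelete {

    // Stand-in for session.getNode(path).remove()
    static void removeNode(String path) { /* no-op in this sketch */ }

    // Returns the number of save() calls a batched run makes.
    static int deleteInBatches(List<String> paths, int batchSize) {
        int saves = 0;
        int pending = 0;
        for (String path : paths) {
            removeNode(path);
            if (++pending >= batchSize) {
                saves++;        // session.save() would go here in real code
                pending = 0;
            }
        }
        if (pending > 0) {
            saves++;            // flush the final partial batch
        }
        return saves;
    }

    public static void main(String[] args) {
        List<String> paths = new java.util.ArrayList<>();
        for (int i = 0; i < 1_050; i++) paths.add("/bulk/n" + i);
        // 1,050 removals in batches of 500 -> 3 saves (500 + 500 + 50)
        System.out.println(deleteInBatches(paths, 500));
    }
}
```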

Good luck. That scalability issue is a pretty big problem. I sure wish
Adobe would find some people with the requisite skill to get that fixed.
Every serious user runs into this problem. I mean, the Derby DB is
literally 100x more powerful, and most people consider Derby a toy.


Best regards,
Clay Ferguson
[hidden email]



Re: node.hasNodes(), node.getNodes(), and removing nodes with node.remove()

Clay Ferguson
In reply to this post by Peter Harrison-2
Peter,
Also, as a last resort, if absolutely nothing else is workable, you could
theoretically run an export to XML, process that XML with custom code you
write, and THEN re-import into a new, empty repo.
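That workaround might look roughly like the following with the standard JCR API. This is a sketch only: the paths and the intermediate filtering step are illustrative, it assumes a live Session against each repository, and error handling is omitted.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import javax.jcr.ImportUUIDBehavior;
import javax.jcr.Session;

// Rough sketch of the export/filter/re-import workaround using the
// standard JCR API. Paths and file names are illustrative.
public class ExportReimport {

    static void export(Session session) throws Exception {
        try (FileOutputStream out = new FileOutputStream("/tmp/dump.xml")) {
            // System-view XML of the subtree; skipBinary=false, noRecurse=false
            session.exportSystemView("/content/bulk", out, false, false);
        }
    }

    static void reimport(Session session) throws Exception {
        // ...after filtering the XML with custom code, import into a fresh repo
        try (FileInputStream in = new FileInputStream("/tmp/dump-filtered.xml")) {
            session.importXML("/content", in,
                    ImportUUIDBehavior.IMPORT_UUID_CREATE_NEW);
            session.save();
        }
    }
}
```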

Please share your solution with the group if you would, once found. Adobe
might benefit from seeing what problems they are creating and how people
are working around those problems. Hopefully that's a legit use of this
email list also.

Best regards,
Clay Ferguson
[hidden email]



Re: node.hasNodes(), node.getNodes(), and removing nodes with node.remove()

Peter Harrison-2
In reply to this post by Clay Ferguson
1) I knew that many nodes under one node was an issue with Jackrabbit 2.x,
but I thought Oak was going to address it.

To get a better grasp of what is going on, I took a look at the data
structure in Mongo. It appears to be a 'flat' node collection: there is a
collection called 'nodes', and a document in this collection represents a
node. Inside the document is a list of the IDs of the child nodes. Every
addition of a child node implies a change to the parent node's document,
and each revision of the child list stores a complete new copy of the list.
This means the document becomes more unmanageable the more nodes are added
directly under it. When you get the node you MUST also get the entire list
of child IDs! Not only that, but for every modification a full list of all
the children is stored. Thus removing a child of a node with lots of other
children actually adds a huge amount of data.
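To illustrate the shape of that claim (this is just arithmetic on the reading above, not a statement about Oak internals): if every change rewrites the complete child list, then n additions write 1 + 2 + ... + n child-ID entries in total, which is quadratic in n.

```java
// Illustration of the growth claim: if every change to the child list
// rewrites the complete list, the total IDs written across n additions
// is 1 + 2 + ... + n = n(n+1)/2, i.e. quadratic in n.
public class ChildListGrowth {
    static long totalIdsWritten(long n) {
        return n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        // 300,000 children -> ~45 billion child-ID entries written in total
        System.out.println(totalIdsWritten(300_000));
    }
}
```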

This is *insane*. No, seriously, this is nuts. If I'm reading this right,
it means that if you have, say, 10 children, you have 10 revisions, each
with its own copy of the child list, all in the one document.


2) I experimented with the number of removes before a save. If you try to
put too many under a single commit it blows up. The API I wrote had a
parameter you could override to control the number of removes done per
commit. It didn't look like the commit size was making much difference in
terms of performance, though I might be wrong on that one - see below.

Now that I know how things work under the covers, I have some idea of the
scope of the problem. Each remove can actually add a HUGE volume of data
to the parent node: a copy of all the previous child IDs, less the removed
children.

Am I getting all this wrong?



A sane implementation would have a separate collection for the links
between nodes, or each node would store its parent, so that finding the
children would involve a simple query returning all nodes with a specific
parent. This would be easy and fast, since you can have an index on the
parent_id. It would also mean you can perform a query and iterate the list
without fetching all the children at once, so hasNodes() and getNodes()
would only need to fetch the first record. I'm sure there are reasons for
all this, but near as I can tell this is a pretty fatal flaw.



Looks like that Cassandra spike is closer than I thought.



Re: node.hasNodes(), node.getNodes(), and removing nodes with node.remove()

chetan mehrotra
> Every addition of a child node implies a change to the parent node Document

Looks like the parent node type is nt:unstructured, which requires
orderable children. If you do not require that, use a node type like
oak:Unstructured. See [1] for some background.
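A minimal sketch of that suggestion, using the standard JCR `Node.addNode(relPath, primaryNodeTypeName)` call; the paths here are illustrative:

```java
import javax.jcr.Node;
import javax.jcr.Session;

// Sketch: create the container with an unordered node type so Oak does
// not have to maintain child ordering on the parent document.
public class UnorderedContainer {
    static Node createContainer(Session session) throws Exception {
        Node parent = session.getNode("/content");
        // oak:Unstructured behaves like nt:unstructured but without
        // orderable child nodes, so large flat child lists scale better.
        Node container = parent.addNode("bulk", "oak:Unstructured");
        session.save();
        return container;
    }
}
```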

Chetan Mehrotra
[1] https://jackrabbit.apache.org/oak/docs/dos_and_donts.html#Large_number_of_direct_child_node


Re: node.hasNodes(), node.getNodes(), and removing nodes with node.remove()

Peter Harrison-2
Thanks for the reference. Much appreciated.


Re: node.hasNodes(), node.getNodes(), and removing nodes with node.remove()

Julian Reschke
In reply to this post by Clay Ferguson
On 2017-08-07 03:39, Clay Ferguson wrote:
> Two thoughts:
>
> 1) It's a known issue (severe weakness) in the design of Jackrabbit/Oak
> that it chokes like a dog on large numbers of child nodes all under the
> same node. Many users have struggled with this, and imo it has been one of
> the massive flaws that has kept the JCR from really taking off. I mean,
> probably still only 1% of developers have ever heard of the JCR.
> ...

Jackrabbit yes, Oak no.

Oak has been designed to handle bigger flat collections, but it does
require a container node type that doesn't need to maintain the
collection ordering.

Best regards, Julian

Re: node.hasNodes(), node.getNodes(), and removing nodes with node.remove()

Clay Ferguson
In reply to this post by chetan mehrotra
I was unaware that simply making nodes unorderable would allow good
scalability. Good to know! I guess we could always experiment with using
a nextNode property to allow iterating in order while also getting good
scalability for inserting/deleting, but that linked-list approach would be
slow at iterating, because each node retrieved would have to come from a
lookup of its nextNode property. The only thing (afaik) that could
significantly improve that performance would be if each node's children
happened to be in contiguous storage, so that disk caching at the hardware
layer played a role in the speedup.

Is using this nextNode (linked list built on Node properties) the best
practice for when ordering AND large numbers of children are an absolute
requirement? What do you guys think? Crazy idea or reasonable?
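The linked-list idea can be sketched with a plain Map standing in for each node's nextNode property (purely illustrative; a real version would read the property from a javax.jcr.Node, and "nextNode" is a hypothetical property name):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the nextNode linked-list idea. A Map stands in for reading
// each node's "nextNode" property; traversal needs one lookup per node,
// which is why iteration would be slow compared to a range scan.
public class NextNodeChain {

    // next.get(id) -> id of the following sibling, or null at the end
    static List<String> iterateInOrder(String firstId, Map<String, String> next) {
        List<String> order = new ArrayList<>();
        for (String id = firstId; id != null; id = next.get(id)) {
            order.add(id);
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, String> next = new HashMap<>();
        next.put("a", "b");
        next.put("b", "c");          // c has no successor: end of the chain
        System.out.println(iterateInOrder("a", next)); // [a, b, c]
    }
}
```

Note that inserting or deleting in the middle only touches two sibling nodes, which is the scalability win; the cost is the pointer-chase on every ordered read.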

-Clay



Re: node.hasNodes(), node.getNodes(), and removing nodes with node.remove()

chetan mehrotra
> Is using this nextNode (linked list built on Node properties) the best
> practice for when ordering AND large numbers of children are an absolute
> requirement? What do you guys think? Crazy idea or reasonable?

That can be an option. However, concurrent updates would result in
conflicts that would need to be resolved. Also, for inserting, finding the
last entry would involve iterating over all previous entries. In general,
having an orderable list of large size poses problems and should be
avoided.

Chetan Mehrotra