Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issue for catch all routes in SiteMap.FindSiteMapNodeFromControllerAction #264

Closed
bkryl opened this issue Jan 8, 2014 · 5 comments

Comments

@bkryl
Copy link

bkryl commented Jan 8, 2014

I'll try to create a separate reproducible sample project but want to at least log the issue. I have a sitemap with a dynamicNodeProvider that generates a lot of nodes (maybe 30k-100k?) The last route on the site is a catch all route for 404 errors:

routes.MapRoute("NotFound", "{*url}", new { controller = "NotFound", action = "Http404" });

When the sitemap menu tries to render for a page for the 404 catch all route, it spends a really long time and a lot of CPU cycles recursively looping inside all the nodes in SiteMap.FindSiteMapNodeFromControllerAction without finding a match.

To try to fix the issue, I added the catch all route to the top of sitemap file:
<mvcSiteMapNode key="notFound" title="Not Found" controller="NotFound" action="Http404" />

However, it doesn't find a match because node.MatchesRoute(values) returns false because one of the RouteData values is the catch all "url" parameter which doesn't match the incoming request url that got mapped to that value.

When spam bots flood the site with 404 requests, the server crashes. As a workaround, I created a custom route that inherits from Route and removes the url route value but its a hack.

@NightOwl888
Copy link
Collaborator

This is an exact duplicate of #241.

The solution to your problem is straightforward - use preservedRouteParameters to force it to match any value for URL:

<mvcSiteMapNode key="notFound" title="Not Found" controller="NotFound" action="Http404" preservedRouteParameters="url" />

For an explanation of how this works, please read How to make MvcSiteMapProvider remember a User's Position.

You can use the same technique for your data administration pages so you don't have so many nodes (read the above post). Realistically, because this design stores all nodes in memory at once (a carryover from Microsoft's original design), we have a hard limit in the 10s of thousands of nodes depending on the available hardware. I am hoping to address the scalability limitations in v5.

Now, that doesn't mean you are off the hook. We have had about a half dozen complaints about this issue, yet nobody seems willing to provide a way to reproduce the error. I have tried without success to reproduce this and without some idea of how to do it there is little hope of ever fixing this problem.

Since this is likely a misconfiguration problem, I would like to have MvcSiteMapProvider throw an exception with a clear message of how to fix the configuration. But there is no chance of ever doing that if I cannot reproduce the issue. I would really appreciate if you could put together a demo to show what you are doing or at the very least post some of your existing code and configuration.

@bkryl
Copy link
Author

bkryl commented Jan 10, 2014

I'm working on a sample, but if you have a simple DynamicNodeProviderBase that generates tens of thousands of nodes, performance starts degrading significantly in many places such as rendering @Html.MvcSiteMap().SiteMapPath() or @Html.MvcSiteMap().Menu(), even if all the dynamic nodes and their parent are marked as not visible. Unfortunately, I need all the database-driven nodes added in order for all their URL's to appear individually on the sitemap.xml. I suppose an option until v5 comes out is separating out the sitemap file for navigation menus from the sitemap file for sitemap.xml.

@NightOwl888
Copy link
Collaborator

Yes, that is an option. However, it is still not a pretty one - see #258 (I also posted some more suggestions there on reducing the problems caused by too many nodes). Note that I was able to get a SiteMap to run with reasonable performance (aside from a really long load time) with 30,000 nodes on a machine with 8GB of RAM.

The main problem is that once the OS switches over to using swap space instead of RAM, performance drops off considerably. And at present, all of the nodes are physically stored in RAM so keeping them to a reasonable number is important.

The only realistic option to support more than 10s of thousands of URLs at the time is to roll your own SiteMaps XML implementation and use < 10,000 physical nodes in the SiteMap. Keep in mind, you could potentially have millions of "fake" breadcrumbs by using preservedRouteParameters with only a handful of physical nodes to support them.

I am thinking of making v5 work more like a search engine, and deviating somewhat from Microsoft's design. That is, have a high performance index used to lookup the nodes, but store the actual payload of the nodes in a stream. That would give you the option to store those streams on disk or in blob storage to minimize the amount of RAM required. They could still be stored in RAM by default for maximum performance. I would also like to segment the data so it can be cached (and loaded) in smaller chunks rather than having to construct the entire SiteMap in one piece. Of course, this is going to take some R&D and might take more than one prototype to get working right, but I think if it can scale to > 1 million nodes without taking up unreasonable amounts of RAM it should suit just about anyone.

Presently I am working on other projects (I am currently sidetracked from being sidetracked from being sidetracked from what I am supposed to be working on), so unfortunately I can't give you a timeframe for when development on v5 will start or be completed. I have posted an unofficial roadmap in one of the issues with some of the random thoughts I have about what should go into the new version, which may or may not actually make it in there. Eventually, when I have time I hope to make this into a WIKI post and solicit some input from the community.

@bkryl
Copy link
Author

bkryl commented Jan 10, 2014

This issue can be closed as you are correct it was an exact duplicate. To fix 404 page performance, I needed to add a node entry to the top of my .sitemap file with the preservedRouteParameters="url" property and that was it. With this simple fix, average 404 page request time went from 2.5 seconds to about .2 seconds and took a huge load off the server. I found out I only have ~3500 total nodes, not the tens of thousands like I suspected. I initially never thought to add a 404 route to my sitemap though.

For future reference, here is my .sitemap entry:

<mvcSiteMapNode key="notFound" title=" " controller="NotFound" action="Http404" preservedRouteParameters="url" clickable="false" visibility="!*" />

@bkryl bkryl closed this as completed Jan 10, 2014
@bkryl
Copy link
Author

bkryl commented Oct 4, 2015

At some point again with another version this performance issue cropped up again where 404 page request would take 3+ seconds and causing CPU to spike on the server. Removing clickable="false" on the 404 node fixed this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants