Reducing load of generating sitemap #258
If I understand you correctly, you are referring to the /sitemap.xml endpoint that generates an XML structure for search engines, correct? Currently, there are no built-in options, but there are plans to add some. The code that generates the XML is not that complex, and if you are interested in modifying it for yourself and/or contributing, it could be done with some effort. The primary bottleneck is that it uses an in-memory collection to generate the XML, no matter how large it is. What we really need is an option to specify an external folder where the results of a FileStream can be stored and then read back to the response stream. This would avoid using as much memory as the size of the data structure itself, since the structure would never be loaded into memory all at once. Of course, this means that we need to have 3 things:
While we could just use a hard-coded location, I think it would be better to specify a folder via configuration and just default to using an in-memory scheme if it is not provided. Ideally, we would also be able to specify that the file is not regenerated for some period (maybe defaulting to 12 hours). This is because if the result exceeds 35,000 nodes, it will need to be generated once for the index and again for each XML page as it is accessed, because it is broken up into multiple parts for the search engine. It makes little sense to generate the structure more than once each time the search engine robot requests the set of XML files, and I think 12 hours should be enough time for the search engine to read all of them.

Furthermore, it would make sense to make the file writing interface pluggable (via DI) so it can be written to blob storage and other locations, but default to writing to memory and include an implementation that writes to the Windows file system (and possibly other implementations as well).

My suggestion is to break the streaming into two stages - the first one would serialize and stream the entire structure to a single location (through an injected stream service) and the second one would use the output stream of that service (whether it is in memory or on disk) to generate each one of the XML structures and compress them accordingly. Ideally, the primary stream would contain the raw data elements (in this case, just the information that is necessary to generate each of the url XML elements) and the output stream would be generated on the fly based on this data. The visibility/accessibility of nodes should be processed before doing this storage.

The key is to make a binary formatter and serialize all of the objects as they are created so you free up the memory immediately instead of holding onto it until the entire operation is complete.
System.Runtime.Serialization.Formatters.Binary.BinaryFormatter formatter =
    new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();

// Use an injected streamFactory to create a stream for the memory or disk storage system.
// The streamFactory will get configuration (such as folder location) through its constructor.
using (Stream stream = streamFactory.Create())
{
    foreach (var item in flattenedHierarchy)
    {
        // Convert the data into a structure that requires the bare minimum of space
        ISiteMapXmlItem newItem = GetRequiredDataOnly(item);
        formatter.Serialize(stream, newItem);
    }
}

The same goes for when reading the data back from the stream - you must ensure you only read one element at a time and write the result to the output stream immediately, allowing the memory it occupies to be freed. There must not be any intermediate collection holding the deserialized items.

using (var inputStream = streamService.GetStoredStream())
{
    using (var outputStream = RetrieveOutputStream(context))
    {
        // Write the opening XML elements

        do
        {
            var item = (ISiteMapXmlItem)formatter.Deserialize(inputStream);
            outputStream.Write(item.GetUrlXmlElement());
        } while (inputStream.Position < inputStream.Length - 1);

        // Write the closing XML elements
        outputStream.Flush();
    }
}

If you decide to make a contribution, please base your changes on the "dev" branch and submit the pull request back to the "dev" branch. Also, feel free to post any follow-up questions here. |
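The pluggable stream factory described above could look something like this. This is a minimal sketch; `ISiteMapStreamFactory` and both implementations are illustrative names invented for this example, not part of MvcSiteMapProvider:

```csharp
using System;
using System.IO;

// Hypothetical abstraction for the injected stream service.
public interface ISiteMapStreamFactory
{
    Stream Create();
}

// Default: in-memory, matching the current behavior.
public class MemoryStreamFactory : ISiteMapStreamFactory
{
    public Stream Create()
    {
        return new MemoryStream();
    }
}

// Disk-backed: the folder comes in through the constructor (e.g. from configuration).
public class FileStreamFactory : ISiteMapStreamFactory
{
    private readonly string folder;

    public FileStreamFactory(string folder)
    {
        if (string.IsNullOrEmpty(folder))
            throw new ArgumentNullException("folder");
        this.folder = folder;
    }

    public Stream Create()
    {
        // One uniquely named file per generated stream.
        var path = Path.Combine(this.folder, Guid.NewGuid().ToString("N") + ".bin");
        return new FileStream(path, FileMode.Create, FileAccess.ReadWrite);
    }
}
```

The DI container would then bind `ISiteMapStreamFactory` to `FileStreamFactory` only when a folder is configured, falling back to `MemoryStreamFactory` otherwise.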
Thanks for your extensive answer.
A bit of both I think. I was under the impression that the first request after iisreset triggers the generation of the whole sitemap structure in order to show the proper breadcrumbs.
Yes, and that would make it possible to store it all in AppFabric cache, right? But why via DI? The Output Cache is easier to extend; you just have to create a derived class and configure it in the web.config. This is probably another question, but why can't MvcSiteMapProvider make use of the same mechanism? I guess under the hood there is some built-in DI in the works anyway, but it makes it really easy to extend.
I'd like to, but I don't know if I have the skill and the time... |
Actually, both the SiteMap structure and /sitemap.xml endpoint currently have some scalability issues/limits because of the way they use memory. You are correct that the SiteMap is cached at application startup and shared between users and the default is to expire the cache every 5 minutes. This is the unfortunate result of basing its original design on Microsoft's ASP.NET sitemap functionality which also stored the entire cache in memory at once. 15,000 nodes shouldn't be too much of an issue for the SiteMap unless your server is starved for memory. And if it is, you can implement System.Runtime.Caching.ObjectCache and provide a file caching mechanism (or there is an open source file cache). See this article for info about extending the cache. If you do that and it is enough to fix the SiteMap performance problem, then the only part you will need to worry about is the /sitemap.xml endpoint (the XML for search engines) because of the issues I mentioned before. Note that if you do not need to index some of your pages in search engines, you can work around this problem by using preservedRouteParameters to force multiple URLs to match a single node and then using a custom visibility provider and [SiteMapTitleAttribute] to fix the display of your menu and breadcrumbs. See How to Make MvcSiteMapProvider Remember a User's Position for working examples and documentation about how to do that.
Correct.
You can think of DI like a super-web.config file. When following a few design principles and using constructor injection, building an application with DI means you don't have to carefully plan out every single extension point and painstakingly (and unwittingly) tie it to the web.config file. Virtually any interface or abstract class turns into an extension point that can be replaced with another implementation. It puts the application in charge of which implementation (or implementations) of ISomeService to plug in where, which is much more powerful and less effort than what you can do using web.config. It also makes writing automated tests for your code easier.

Several problems appear when you tie it to the web.config file. One of them is that the implementation becomes more complex - a lot of code needs to be written to read the values from the config file. Also, the author of the code might have gotten the multiplicity wrong for your particular scenario - what if you have a multi-tenant application and each tenant needs a separate implementation? Suddenly that configurable web.config seems quite limiting. Finally, if you have to put settings in a web.config for your test, you have gone from a unit test to a more complicated integration test because you cannot isolate that piece to test it independently.

To get up to speed with DI, I recommend reading Dependency Injection in .NET. There aren't many books I have read that I can honestly say changed the way I program - this is one of them.
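As a minimal illustration of constructor injection turning an interface into an extension point (all names here are hypothetical, invented for this example):

```csharp
using System;

// Any interface becomes a swap point without web.config plumbing.
public interface IUrlResolver
{
    string Resolve(string key);
}

public class SiteMapService
{
    private readonly IUrlResolver urlResolver;

    // The composition root decides which IUrlResolver this instance gets -
    // per tenant, per test, per environment - without touching this class
    // or writing any config-file reading code.
    public SiteMapService(IUrlResolver urlResolver)
    {
        if (urlResolver == null)
            throw new ArgumentNullException("urlResolver");
        this.urlResolver = urlResolver;
    }

    public string GetUrl(string key)
    {
        return this.urlResolver.Resolve(key);
    }
}
```

In a unit test, you hand the constructor a fake `IUrlResolver` directly; no configuration file is involved, which is exactly the isolation the paragraph above describes.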
By default, MvcSiteMapProvider uses System.Runtime.Caching.MemoryCache, a subclass of System.Runtime.Caching.ObjectCache. This is the low-level caching extension point available since .NET 4.0, which is not tied in any way to ASP.NET. As previously mentioned, through the use of DI this automatically becomes an extension point where you can plug in another implementation of ObjectCache. There are also several other uses of caching throughout MvcSiteMapProvider, but currently page output caching is not one of them, and until now it hasn't been something I have considered. I did some checking and it doesn't seem like there is a way to use it on a templated HTML helper, though - it looks like it is only an option for controller actions (which MvcSiteMapProvider only uses on the /sitemap.xml endpoint, but nowhere else). I think for /sitemap.xml it would make more sense to use a low-level cache and store only the count of visible nodes in it, with a 12-hour timeout. From that number, the rest of the logic can be driven to determine how many pages need to be generated and whether or not an index page applies. You really don't gain much from changing the caching around because the main problem is that it loads the entire data structure into RAM before it starts to build the output stream, every time it is built.
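A sketch of that low-level cache idea, storing only the visible-node count with a 12-hour expiration. The cache key and the `countVisibleNodes` delegate are illustrative assumptions, not existing MvcSiteMapProvider names:

```csharp
using System;
using System.Runtime.Caching;

public static class SitemapNodeCountCache
{
    // Hypothetical key; only a single int ever lives in the cache.
    private const string Key = "SitemapXml_VisibleNodeCount";

    public static int GetOrAdd(Func<int> countVisibleNodes)
    {
        var cached = MemoryCache.Default.Get(Key);
        if (cached != null)
            return (int)cached;

        // Counting is the only expensive step; the result drives the page math.
        int count = countVisibleNodes();
        MemoryCache.Default.Add(Key, count, DateTimeOffset.UtcNow.AddHours(12));
        return count;
    }
}
```

From the cached count, the number of sitemap pages is just `ceil(count / 35000.0)`, and an index page applies whenever that is greater than 1.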
No problem. But unfortunately, that means you will either have to wait until I have some time to work on it or else build your own implementation that generates the sitemaps XML in a way that scales to 15,000 nodes. |
FYI - I am working on another project that also has its own sitemap and I did a test with output caching. As it turns out, when writing to a stream the output caching functionality doesn't work. When I tried enabling it, it hit the database on every page request. Of course, my test was with ASP.NET, not MVC, but I suspect the result will be the same because the output caching is based on buffered rather than streamed output. So, the caching needs to be done at a lower level. And to both make it scale and adjust to using a sitemap index in the case where the node count exceeds 35,000, it will need to be cached using a stream (or probably a set of streams - one for each 35,000 nodes) that writes to disk. |
I did some experimenting and was able to get MvcSiteMapProvider to work with adequate performance (aside from a delayed initial load) with 15,000 nodes. I am using a 5-year-old laptop with 8GB of RAM. However, I was able to reproduce the serious performance drain when doubling that to 30,000 nodes. The bottleneck is reading the SiteMap, not actually creating the nodes. Basically, after reaching a certain point, the OS starts using disk swap space instead of RAM and things really slow down. Due to a recent contribution, though, I discovered a bug that was causing each node to take up about double the memory that it should have. This has now been fixed in v4.4.7. After the fix, I was also able to get adequate performance with 30,000 nodes, although building the /sitemap.xml took several seconds of CPU time. If you upgrade and find that you are still having problems, there are some things you can try to reduce the load.
|
Thank you for your effort and following up.
I need to make time to learn and contribute something... If anyone's interested, I have a few providers, and they look like this:

using System.Collections.Generic;
using System.Globalization;
using Client.Project.Facade;
using Client.Project.Models;
using Client.Project.Site.Controllers;
using MvcSiteMapProvider;
using Category = Client.Project.Models.View.Category;

namespace Client.Project.Site.Extensions
{
    public class CategoryDynamicNodeProvider : DynamicNodeProviderBase
    {
        public override IEnumerable<DynamicNode> GetDynamicNodeCollection(ISiteMapNode node)
        {
            List<DynamicNode> dynamicNodes = new List<DynamicNode>();

            // Get the currently active cultures
            List<CultureInfo> languages = new Localization().GetCultures();
            foreach (CultureInfo cultureInfo in languages)
            {
                // Switch to culture based on display name, eg: EN
                Language.Switch(cultureInfo.DisplayName);

                // Initialize a new node
                DynamicNode categoriesNode = new DynamicNode
                {
                    Key = "Categories_" + cultureInfo.DisplayName,
                    ParentKey = "language" + cultureInfo.DisplayName, // Same as the key in the node provider for the language
                    Title = new BaseController().GetString("home_categories"), // Get the translated title
                    Controller = "Categories",
                    Action = "Index"
                };
                categoriesNode.RouteValues.Add("language", cultureInfo.DisplayName);
                dynamicNodes.Add(categoriesNode);

                List<Category> categories = new Categories().GetCategoryTree();
                foreach (Category category in categories)
                {
                    DynamicNode categoryNode = new DynamicNode
                    {
                        Key = "Category_" + category.Id + "_" + cultureInfo.DisplayName,
                        ParentKey = categoriesNode.Key,
                        Title = category.Name, // Translated name of category
                        Controller = "Categories",
                        Action = "Category"
                    };
                    categoryNode.RouteValues.Add("id", category.Id);
                    categoryNode.RouteValues.Add("description", category.Name.ToSeoEncoded()); // Encode the name
                    categoryNode.RouteValues.Add("language", cultureInfo.DisplayName);
                    dynamicNodes.Add(categoryNode);

                    if (category.Children != null)
                    {
                        foreach (Category subcategory in category.Children)
                        {
                            DynamicNode subcategoryNode = new DynamicNode
                            {
                                Key = "SubCategory_" + subcategory.Id + "_" + cultureInfo.DisplayName,
                                ParentKey = categoryNode.Key,
                                Title = subcategory.Name,
                                Description = subcategory.Name,
                                Controller = "Categories",
                                Action = "SubCategory"
                            };
                            subcategoryNode.RouteValues.Add("language", cultureInfo.DisplayName);
                            subcategoryNode.RouteValues.Add("catid", category.Id);
                            subcategoryNode.RouteValues.Add("name", category.Name.ToSeoEncoded());
                            subcategoryNode.RouteValues.Add("subcatid", subcategory.Id);
                            subcategoryNode.RouteValues.Add("subName", subcategory.Name.ToSeoEncoded());
                            dynamicNodes.Add(subcategoryNode);
                        }
                    }
                }
            }
            return dynamicNodes;
        }
    }
} |
Another thing I just thought of to help you reduce the memory footprint (assuming you are already using the latest version of MvcSiteMapProvider that was released yesterday) is to disable caching of the resolved URLs.

<appSettings>
    <add key="MvcSiteMapProvider_EnableResolvedUrlCaching" value="false"/>
</appSettings>

When true, this setting causes the SiteMap to resolve the URLs at the point the SiteMap is built and store them in the shared cache. If false, the URLs are resolved on demand for each user, so they are not stored in the cache. Doing the math, if your URLs are an average of 30 bytes long and you have 15,000 of them, disabling this setting should let go of roughly 440 KB of RAM. It should also decrease the load time of the SiteMap, but it remains to be seen by how much. On the other hand, it will probably slow down the loading of the pages slightly because each SiteMap node that is visible on the page will have to resolve its URL on each request.

You can also disable MvcSiteMapProvider localization, since you are not using it. This may also help to speed things up a little.

<appSettings>
    <add key="MvcSiteMapProvider_EnableLocalization" value="false"/>
</appSettings>

Question: Are you using the exact same node structure for each locale, or do they differ? |
Is there any progress on this problem? I have a website with 600k pages that need to be indexed... What should I do? |
Unfortunately, no. There is a limitation of around 10,000-20,000 nodes in the sitemap with the current design because it loads all nodes into memory and keeps them there. My suggestion is to use preservedRouteParameters to force a match on the breadcrumb trail and add just enough nodes to create a high-level menu. Using this technique, you will get a breadcrumb trail for every page, so that takes care of the Menu and SiteMapPath at least. Depending on your requirements, this might also take care of a user sitemap page using the SiteMap HTML helper.

Then you can roll your own sitemaps XML endpoint that uses paging from your data source so you only pull the data you need into memory. The code that generates the endpoint is not very complex and would not be that difficult to duplicate. The sitemaps XML protocol dictates that no XML file should contain more than 50,000 entries or be larger than 10 MB. To be on the safe side, we use 35,000 entries as the maximum page size. The tricky part will be implementing the paging mechanism if your pages are not all stored in the same table. You might think about using a union query to stack the tables before implementing the paging, but you need to be careful not to load all of that data into memory at once.

As for the URLs, depending on how you store the information in your data source (if you store controller and action names), you might be able to take what you need from the SiteMapNodeUrlResolver to create them, or implement ISiteMapNode and use the class directly to generate them. Basically, the SiteMapNodeUrlResolver is a thin wrapper around the MVC UrlHelper class, but we use a fake HTTP context object to eliminate the possibility of values from the current context being inserted into the URL (see #213). On the other hand, if you store the URL slugs in your data source, you probably don't need to do this - some table joining and string concatenation might be all you need to generate the URLs.
This post might be helpful with generating your own sitemaps XML, but you will definitely need to take into consideration the fact that you don't want to load all of your nodes at once into memory. I suggest that once you have figured out what is required that you create a blog post of your own to demonstrate to others how this can be achieved. |
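The union-query paging mentioned above can be sketched like this, letting SQL Server do the paging so only one page of URLs ever crosses the wire (this is illustrative only: the table and column names are made up, and OFFSET/FETCH requires SQL Server 2012 or later):

```csharp
using System.Data.SqlClient;

public static class SitemapUrlQuery
{
    // Builds a command that pages across two hypothetical tables stacked with UNION ALL.
    public static SqlCommand BuildPagedUrlCommand(SqlConnection connection, int skip, int take)
    {
        var command = connection.CreateCommand();
        command.CommandText = @"
            SELECT Url FROM (
                SELECT Url FROM dbo.Articles
                UNION ALL
                SELECT Url FROM dbo.Products
            ) AS AllUrls
            ORDER BY Url
            OFFSET @skip ROWS FETCH NEXT @take ROWS ONLY;";
        command.Parameters.AddWithValue("@skip", skip);
        command.Parameters.AddWithValue("@take", take);
        return command;
    }
}
```

Because the UNION ALL sits inside a subquery that the server pages over, the application never materializes the full 600k-row set; each request reads at most `take` rows.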
FYI - I have created a new issue #345 to discuss requirements to address adding an extension point that can be used to provide URLs that are not registered in the SiteMap object in the sitemaps XML. Your feedback would be appreciated. |
Thanks for the help. I read #345 and I look forward to seeing the new features. I have started working on your advice from the previous comment. I force-matched my breadcrumbs and populated the information up to the root node. This works perfectly. |
The XmlSiteMapResult is injected into a plain old MVC controller called XmlSiteMapController. My suggestion is to completely replace this controller with your own (including your own route registration) and make a clean break from this design, utilizing only the bits you need - that would be the easiest approach (and what I am planning to do in #345). Ideally, the ActionResult-derived class will only be a shell to make it interact with MVC, delegating its work out to other services. Another option is to implement IXmlSiteMapResultFactory, but unfortunately it has a lot of Create overloads to implement. You could copy much of the logic from the existing XmlSiteMapResultFactory to satisfy these overloads and then inject your own services into that factory (which you can pass into your custom XmlSiteMapResult class constructor in the main Create method(s)). The factory can then be injected into XmlSiteMapController directly, providing it an instance of your custom implementation.

You will either need to implement ISiteMapNode for your custom URL entries so you can utilize the built-in SiteMapNodeUrlResolver, or build your own simplified URL resolver and interface to replace it. Most of what is going on in that class is there to prevent MVC from inadvertently adding "ambient values" (query string parameters and route values from the current request) into the URL. Since the request for your sitemaps page is likely to be simple (/sitemaps.xml), you probably don't need all of that extra code; just utilizing the services IUrlHelper and possibly IUrlPath will suffice.

IMHO, the SRP applies here - make a separate class (service) for each responsibility (paging, XML formatting, XML index creation, URL entry providers, compression, URL resolving, etc.). For optimal performance, you should pass a System.IO.Stream (abstract class) as the parameter to populate from service to service, not an interface or collection. If necessary, serialize and stream individual objects (not just XML text) from one service to the next. If pulling the URL entry data from a database, you should also be sure to use streamed and paged data access so that only one record is in memory at a time and you keep a small memory footprint. Use an abstract factory to create and properly dispose of the streams as shown in DI Friendly Framework. |
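The streamed data access idea above can be sketched as follows, writing one record at a time straight to an XmlWriter so no collection of nodes is ever held in memory (the "Url" column name is an assumption for illustration):

```csharp
using System.Data;
using System.Xml;

public static class SitemapUrlWriter
{
    // Assumes the writer is already positioned inside an open <urlset> element.
    public static void WriteUrlEntries(IDataReader reader, XmlWriter writer)
    {
        int urlOrdinal = reader.GetOrdinal("Url");
        while (reader.Read())
        {
            // Each row is written out and released before the next is read.
            writer.WriteStartElement("url");
            writer.WriteElementString("loc", reader.GetString(urlOrdinal));
            writer.WriteEndElement();
        }
    }
}
```

When the reader comes from SQL Server, obtaining it via `command.ExecuteReader(CommandBehavior.SequentialAccess)` keeps ADO.NET from buffering whole rows, which complements the one-record-at-a-time pattern here.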
Thanks for your advice. I've implemented it that way and it works perfectly. But I have a little problem - on certain sitemap pages (sitemap-X) I get the following exception:

Do you have any ideas? |
Without seeing your code, I can only guess. But it looks like you are getting the results from a View instead of the results from an XmlWriter injected into your OutputStream. The error message is a result of the configured layout page trying to inject itself into that view. To prevent this from happening, make sure you are returning an ActionResult-derived object that only writes XML; never return a View from your action method. There is a simple example here, although I would recommend using an XmlWriter to write the output instead of a serializer as was done in that post.

I now have a working prototype that allows providers to supply URLs 1 at a time as well as record counts, and it will automatically page across the providers regardless of how many URLs each one uses. The implementation is incomplete, but there is enough there to see it working. Most of the new code is in the Xml/Sitemaps directory and subdirectories. Here is some sample code on how to wire it up. Note that the API is guaranteed to change between now and the final release, but you can borrow pieces of this to get your implementation running if you need to.

The Controller

public class SitemapController : Controller
{
    public ActionResult Index(int page = 0)
    {
        return new SitemapXml(page);
    }
}

The ActionResult

This wires up 2 URL entry providers to demonstrate how the automatic paging will work. We override the default page size of 35,000 and make it return just 4 records per page.

using System;
using System.Web;
using System.Web.Mvc;
using MvcSiteMapProvider;
using MvcSiteMapProvider.IO;
using MvcSiteMapProvider.Web.Mvc;
using MvcSiteMapProvider.Xml.Sitemaps;
using MvcSiteMapProvider.Xml.Sitemaps.Paging;
using MvcSiteMapProvider.Xml.Sitemaps.Specialized;
public class SitemapXml : ActionResult
{
    public SitemapXml(int page)
    {
        this.page = page;
    }

    private readonly int page;

    public override void ExecuteResult(ControllerContext context)
    {
        // Output content type
        context.HttpContext.Response.ContentType = "text/xml";

        // Wire up our services
        var urlEntryProviders = new IUrlEntryProvider[] {
            new UrlEntryProvider1(),
            new UrlEntryProvider2()
        };
        var mvcContextFactory = new MvcContextFactory();
        var httpStreamFactory = new CompressibleHttpResponseStreamFactory(mvcContextFactory);
        var urlEntryProviderPagingInstructionFactory = new UrlEntryProviderPagingInstructionFactory();
        var sitemapsPagingStrategy = new AutomaticSitemapsPagingStrategy(urlEntryProviders, urlEntryProviderPagingInstructionFactory);
        sitemapsPagingStrategy.MaximumPageSize = 4;
        var urlEntryHelperFactory = new UrlEntryHelperFactory();
        var specializedContentXmlWriterFactoryStrategy = new SpecializedContentXmlWriterFactoryStrategy(new ISpecializedContentXmlWriterFactory[] {
            new ImageContentXmlWriterFactory(),
            new VideoContentXmlWriterFactory()
        });
        var sitemapsXmlWriterFactory = new SitemapsXmlWriterFactory(specializedContentXmlWriterFactoryStrategy);
        var sitemapsIndexXmlWriterFactory = new SitemapsIndexXmlWriterFactory();
        var sitemapsService = new SitemapsService(sitemapsPagingStrategy, httpStreamFactory, urlEntryHelperFactory, sitemapsXmlWriterFactory, sitemapsIndexXmlWriterFactory);
        sitemapsService.Execute(this.page);
    }
}

The Providers

For a demo, we are just wiring up 10 hard-coded entries in memory with no specialized content. The idea here is that we will typically create a LINQ query in both the GetTotalRecordCount and GetEntries methods.

using System.Collections.Generic;
using System.Linq;
using MvcSiteMapProvider;
using MvcSiteMapProvider.Xml.Sitemaps;
using MvcSiteMapProvider.Xml.Sitemaps.Specialized;
public class UrlEntryProvider1 : IUrlEntryProvider
{
    public int GetTotalRecordCount()
    {
        return 10;
    }

    public void GetEntries(IUrlEntryHelper helper)
    {
        var specializedContent = new List<ISpecializedContent>();
        var entries = new List<IUrlEntry>
        {
            new UrlEntry("http://www.somewhere.com/", specializedContent),
            new UrlEntry("http://www.somewhere.com/someotherpage.html", specializedContent),
            new UrlEntry("http://www.somewhere.com/oops.html", specializedContent),
            new UrlEntry("http://www.somewhere.com/hello.html", specializedContent),
            new UrlEntry("http://www.somewhere.com/goodbye.html", specializedContent),
            new UrlEntry("http://www.somewhere.com/contact.html", specializedContent),
            new UrlEntry("http://www.somewhere.com/about.html", specializedContent),
            new UrlEntry("http://www.somewhere.com/search.html", specializedContent),
            new UrlEntry("http://www.somewhere.com/testing.html", specializedContent),
            new UrlEntry("http://www.somewhere.com/lastpage.html", specializedContent),
        };
        var results = entries.Skip(helper.Skip).Take(helper.Take);
        foreach (var url in results)
        {
            helper.AddUrlEntry(url);
        }
    }
}

public class UrlEntryProvider2 : IUrlEntryProvider
{
    public int GetTotalRecordCount()
    {
        return 10;
    }

    public void GetEntries(IUrlEntryHelper helper)
    {
        var specializedContent = new List<ISpecializedContent>();
        var entries = new List<IUrlEntry>
        {
            new UrlEntry("http://www.nowhere.com/", specializedContent),
            new UrlEntry("http://www.nowhere.com/someotherpage.html", specializedContent),
            new UrlEntry("http://www.nowhere.com/oops.html", specializedContent),
            new UrlEntry("http://www.nowhere.com/hello.html", specializedContent),
            new UrlEntry("http://www.nowhere.com/goodbye.html", specializedContent),
            new UrlEntry("http://www.nowhere.com/contact.html", specializedContent),
            new UrlEntry("http://www.nowhere.com/about.html", specializedContent),
            new UrlEntry("http://www.nowhere.com/search.html", specializedContent),
            new UrlEntry("http://www.nowhere.com/testing.html", specializedContent),
            new UrlEntry("http://www.nowhere.com/lastpage.html", specializedContent),
        };
        var results = entries.Skip(helper.Skip).Take(helper.Take);
        foreach (var url in results)
        {
            helper.AddUrlEntry(url);
        }
    }
} |
Actually, I'm not using Views anywhere in the sitemap XML generation. I just took your old XmlSiteMapResult class and extended it to use the database. I made some views in the database that return the URL properties of all pages, and I'm using XmlWriter. I load only one URL at a time. It is throwing this exception only on particular sitemap pages - sitemap-1.xml, sitemap-7.xml, sitemap-17.xml. The occurrence of the problem is fixed to these pages; the other pages are working well. Here is the code, which is almost the same as yours.

XmlSiteMapController.cs

public class XmlSiteMapController : Controller
{
    public XmlSiteMapController(IUowData data,
        ISiteMapLoader siteMapLoader,
        IUrlPath urlPath,
        ICultureContextFactory cultureContextFactory)
    {
        if (siteMapLoader == null)
            throw new ArgumentNullException("siteMapLoader");
        if (urlPath == null)
            throw new ArgumentNullException("urlPath");
        if (cultureContextFactory == null)
            throw new ArgumentNullException("cultureContextFactory");

        this.siteMapLoader = siteMapLoader;
        this.urlPath = urlPath;
        this.cultureContextFactory = cultureContextFactory;
        this.data = data;
    }

    protected readonly ISiteMapLoader siteMapLoader;
    protected readonly IUrlPath urlPath;
    protected readonly ICultureContextFactory cultureContextFactory;
    protected readonly IUowData data;

    public ActionResult Index(int page = 0)
    {
        return new XmlSiteMapResult(
            page,
            this.DefaultRootNode,
            this.DefaultSiteMapCacheKeys,
            this.DefaultBaseUrl,
            this.DefaultSiteMapUrlTemplate,
            this.siteMapLoader,
            this.urlPath,
            this.cultureContextFactory,
            this.data);
    }

    //... the same properties as in your XmlSitemapController
}

XmlSiteMapResults.cs

public class XmlSiteMapResult : ActionResult
{
    public XmlSiteMapResult(
        int page,
        ISiteMapNode rootNode,
        IEnumerable<string> siteMapCacheKeys,
        string baseUrl,
        string siteMapUrlTemplate,
        ISiteMapLoader siteMapLoader,
        IUrlPath urlPath,
        ICultureContextFactory cultureContextFactory,
        IUowData data)
    {
        if (siteMapLoader == null)
            throw new ArgumentNullException("siteMapLoader");
        if (urlPath == null)
            throw new ArgumentNullException("urlPath");
        if (cultureContextFactory == null)
            throw new ArgumentNullException("cultureContextFactory");

        this.Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
        this.Page = page;
        this.RootNode = rootNode;
        this.SiteMapCacheKeys = siteMapCacheKeys;
        this.BaseUrl = baseUrl;
        this.SiteMapUrlTemplate = siteMapUrlTemplate;
        this.siteMapLoader = siteMapLoader;
        this.urlPath = urlPath;
        this.cultureContextFactory = cultureContextFactory;
        this.db = data;
    }

    protected readonly ISiteMapLoader siteMapLoader;
    protected readonly IUrlPath urlPath;
    protected readonly ICultureContextFactory cultureContextFactory;
    protected readonly HashSet<string> duplicateUrlCheck = new HashSet<string>();
    protected readonly IUowData db = null;

    //.. some other properties

    protected virtual void ExecuteSitemapIndexResult(ControllerContext context)
    {
        // Count the number of pages
        double numPages = Math.Ceiling((double)this.allUrlsCount / MaxNumberOfLinksPerFile);

        // Output content type
        context.HttpContext.Response.ContentType = "text/xml";

        // Generate sitemap index
        var sitemapIndex = new XElement(Ns + "sitemapindex");
        sitemapIndex.Add(GenerateSiteMapIndexElements(Convert.ToInt32(numPages), this.BaseUrl, SiteMapUrlTemplate).ToArray());

        // Generate sitemap
        var xmlSiteMap = new XDocument(
            new XDeclaration("1.0", "utf-8", "true"),
            sitemapIndex);

        // Write XML
        using (Stream outputStream = RetrieveOutputStream(context))
        {
            using (var writer = XmlWriter.Create(outputStream))
            {
                xmlSiteMap.WriteTo(writer);
            }
            outputStream.Flush();
        }
    }
    protected virtual void ExecuteSitemapResult(ControllerContext context, IEnumerable<ISiteMapNode> flattenedHierarchy, int page)
    {
        // Output content type
        context.HttpContext.Response.ContentType = "text/xml";

        // Write XML
        using (Stream outputStream = RetrieveOutputStream(context))
        {
            XmlWriterSettings settings = new XmlWriterSettings();
            settings.Encoding = Encoding.UTF8;
            settings.Indent = true;
            using (var writer = XmlWriter.Create(outputStream, settings))
            {
                writer.WriteStartDocument(true);
                writer.WriteStartElement("urlset", Ns.NamespaceName);
                if (page == 1)
                {
                    var localUrls = GenerateUrlElements(context, flattenedHierarchy, BaseUrl);
                    foreach (var url in localUrls)
                    {
                        url.WriteTo(writer);
                    }
                }
                var currentUrlsCommand = this.db.SitemapUrlList.BuildGetRecordsCommand((page - 1) * MaxNumberOfLinksPerFile, MaxNumberOfLinksPerFile);
                try
                {
                    using (SqlDataReader dr = currentUrlsCommand.ExecuteReader())
                    {
                        while (dr.Read())
                        {
                            var siteMapNode = buildSiteMapNode(context,
                                dr["Type"].ToString(), long.Parse(dr["Id"].ToString()),
                                dr["MusicRecordUri"].ToString(), dr["ArtistUri"].ToString());
                            var xmlUrlElement = GenerateUrlElement(siteMapNode);
                            xmlUrlElement.WriteTo(writer);
                        }
                    }
                }
                finally
                {
                    this.db.SitemapUrlList.ReleaseCommand(currentUrlsCommand);
                }
                writer.WriteEndElement();
                writer.WriteEndDocument();
            }
            outputStream.Flush();
        }
    }
    public override void ExecuteResult(ControllerContext context)
    {
        var flattenedHierarchy = new HashSet<ISiteMapNode>();

        // Flatten link hierarchy
        if (SiteMapCacheKeys.Any())
        {
            foreach (var key in SiteMapCacheKeys)
            {
                var siteMap = siteMapLoader.GetSiteMap(key);
                if (siteMap == null)
                {
                    // throw new UnknownSiteMapException(Resources.Messages.UnknownSiteMap);
                }
                this.RootNode = siteMap.RootNode;
                foreach (var item in FlattenHierarchy(this.RootNode, context, siteMap.VisibilityAffectsDescendants))
                {
                    flattenedHierarchy.Add(item);
                }
            }
        }
        else
        {
            foreach (var item in FlattenHierarchy(this.RootNode, context, this.RootNode.SiteMap.VisibilityAffectsDescendants))
            {
                flattenedHierarchy.Add(item);
            }
        }
        this.flattenedHierarchyCount = flattenedHierarchy.LongCount();
        this.urlsFromDatabaseCount = this.db.SitemapUrlList.GetRecordsCount();

        // Determine type of sitemap to generate: sitemap index file or sitemap file
        if (this.allUrlsCount > MaxNumberOfLinksPerFile && Page == 0)
        {
            // Sitemap index file
            ExecuteSitemapIndexResult(context);
        }
        else if (this.allUrlsCount > MaxNumberOfLinksPerFile && Page > 0)
        {
            // Sitemap file for links of page X
            ExecuteSitemapResult(context, flattenedHierarchy, Page);
        }
        else
        {
            // Sitemap file for all links
            ExecuteSitemapResult(context, flattenedHierarchy, 1);
        }
    }

    //.. the same old methods..
} |
I haven't spotted the problem in your code, but the error message clearly indicates that MVC is trying to return a View and attach the Layout.cshtml page to it. I suspect this has to do with the way your routes are registered. If these particular URLs match a different route than the intended one, they may be dispatched to a different part of your application than this XmlSiteMapController.

As an experiment, remove all route registrations that deal with the XmlSiteMapController and move them into your RouteConfig.cs file. Make sure the one in the MvcSiteMapProviderConfig.cs file is removed, as shown below.

internal class MvcSiteMapProviderConfig
{
    public static void Register(IDependencyInjectionContainer container)
    {
        // Setup global sitemap loader
        MvcSiteMapProvider.SiteMaps.Loader = container.GetInstance<ISiteMapLoader>();

        // Check all configured .sitemap files to ensure they follow the XSD for MvcSiteMapProvider
        var validator = container.GetInstance<ISiteMapXmlValidator>();
        validator.ValidateXml(HostingEnvironment.MapPath("~/Mvc.sitemap"));

        // Register the Sitemaps routes for search engines
        // XmlSiteMapController.RegisterRoutes(RouteTable.Routes); <-- Remove this line
    }
}

Also remove any registration code that you have added to the startup code for your new controller. Then add the route registration directly to your RouteConfig.cs (or wherever your application's routes are registered).

public class RouteConfig
{
    public static void RegisterRoutes(RouteCollection routes)
    {
        routes.IgnoreRoute("{resource}.axd/{*pathInfo}");

        routes.MapRoute(
            name: "SitemapDefault",
            url: "sitemap.xml",
            defaults: new { controller = "XmlSiteMap", action = "Index", page = 0 }
        );
        routes.MapRoute(
            name: "SitemapPage",
            url: "sitemap-{page}.xml",
            defaults: new { controller = "XmlSiteMap", action = "Index", page = 0 }
        );

        // Your application's other routes should be here
        routes.MapRoute(
            name: "Default",
            url: "{controller}/{action}/{id}",
            defaults: new { controller = "Home", action = "Index", id = UrlParameter.Optional }
        );
    }
}

As long as you add these 2 routes before any other routes in your application, that should rule out whether your existing routes are interfering with these specific URLs. The XmlSiteMapController's route registration tries to add its routes just above the catch-all route, which may not be the right place for your application; you may have to order them manually in your configuration. You might also be getting a naming conflict with the existing XmlSiteMapController. If changing the routes doesn't work, try changing the name to something else (both in the class and in the routes).
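If you do rename it, a minimal sketch of what needs to change (the name `MySiteMapController` is hypothetical, and the result type is a stand-in for whatever custom XmlSiteMapResult subclass you are returning):

```csharp
// Hypothetical rename to avoid colliding with the built-in XmlSiteMapController.
// The route "controller" default and the class name must agree
// ("MySiteMap" resolves to MySiteMapController by MVC convention).
public class MySiteMapController : Controller
{
    public ActionResult Index(int page = 0)
    {
        // Return your custom paged sitemap result here;
        // the constructor shown is illustrative, not the library's API.
        return new MyPagedXmlSiteMapResult(page);
    }
}

// ...and in RouteConfig.cs, update both sitemap routes to match, e.g.:
// defaults: new { controller = "MySiteMap", action = "Index", page = 0 }
```

The key point is that both sitemap routes and the class name change together; if only one side changes, MVC will either 404 or fall through to a different controller.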
Yes, I've already done all the things you said. I deleted the controller's RegisterRoutes method and moved the sitemap route registration to RouteConfig.cs. I'm sure my controller is being called: I tried returning a string with the controller's Content() method and it works as expected. I also renamed the controller (and renamed it again when I posted the code, to be clearer).
We generate a sitemap with 15,000 URLs, most of which have localized variants. It takes a while to enumerate them all from the database and add the corresponding localizations. Both of our web servers run 4 instances of the site, and when the sitemap must be generated the servers lock up completely. We use the breadcrumb feature, so I think the servers try to generate the sitemap in order to serve pages that include breadcrumbs.
For our regular caching we use AppFabric. I haven't implemented a DI configuration to make the sitemap cache use AppFabric yet, but I think my problem is more related to the generation of the sitemap.
Is there a way around this performance problem?
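One mitigation worth trying is to lengthen the sitemap's cache lifetime so the expensive build runs rarely instead of every few minutes. A sketch, assuming an external DI setup along the lines of the MvcSiteMapProvider DI modules; the types are from the MvcSiteMapProvider.Caching namespace, and the 12-hour value is an example, not a recommendation from the library:

```csharp
// Sketch: lengthen the absolute cache expiration where your composition root
// builds its CacheDetails, so the sitemap is rebuilt at most twice a day
// instead of on a short default expiration. Adapt to your own DI wiring.
var cacheDetails = new CacheDetails(
    absoluteCacheExpiration: TimeSpan.FromHours(12),
    slidingCacheExpiration: TimeSpan.MinValue,   // sliding expiration disabled
    cacheDependency: new NullCacheDependency()); // no file dependency
```

If you are using the internal DI container instead, the equivalent knob is the `MvcSiteMapProvider_CacheDuration` appSetting (in minutes) in web.config. This does not reduce the cost of a single build, but it prevents every instance from rebuilding the structure on each cache miss.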