Towards the end of yesterday an interesting bug started to show up whenever anyone went into the Modules section of the CMS: "Stack Trace: /r/nError: The server method 'GetContentItems' failed./r/nStatus Code: 500/r/nException Type: /r/nTimed Out: false". This was not good news. So I opened up Firebug and, sure enough, the call to that web service was coming back with a 500 error status. The Response tab told me nothing more than that: the server was returning a 500.
First things first, I tried to reproduce the problem locally but without success; everything was working fine for me. I checked the builds for Live, Staging and Development and they were all the same, but I couldn’t reproduce the error on Dev or Staging. I had put the live server’s machine.config into production mode with <deployment retail="true" />, so I went in and turned that off. That is not the best thing to do on a production server, because detailed errors could now be visible to visitors. However, as I was unable to reproduce the error locally or on either the development or staging servers, I didn’t have much choice.
Annoyingly, Firebug was still giving me the generic error page, with nothing I could use to debug or see what was happening. I even gave Fiddler the chance to shine, but again all I could see was the generic error. Even though I’d disabled the retail deployment setting for the server and set customErrors to off in the application’s web.config, I wasn’t seeing the real error. I tried to browse the application locally on the production web server itself, but that box is locked down quite tightly and I couldn’t get it to run (even as a virtual directory within the Default Website).
Enter ELMAH
This wasn’t going well. I needed to see the real error but couldn’t get to it, and the logs weren’t being very helpful either. I’d been meaning to install ELMAH globally on the web server for a while; I had even started but couldn’t get it to run, so I figured I’d just quickly install ELMAH for this site only. For anyone who doesn’t know, ELMAH rocks! This is what the site says about it:
Once ELMAH has been dropped into a running web application and configured appropriately, you get the following facilities without changing a single line of your code:
- Logging of nearly all unhandled exceptions.
- A web page to remotely view the entire log of recorded exceptions.
- A web page to remotely view the full details of any one logged exception.
- In many cases, you can review the original yellow screen of death that ASP.NET generated for a given exception, even with customErrors mode turned off.
- An e-mail notification of each error at the time it occurs.
- An RSS feed of the last 15 errors from the log.
- A number of backing storage implementations for the log, including in-memory, Microsoft SQL Server and several contributed by the community.
Setting it up for the site was really easy: I dropped the dll into the bin folder and added the relevant lines to web.config, setting it to send me emails for the errors. Then I went back to the website and triggered the problem. Within seconds the first email arrived in my inbox, and there was an interesting bit of information in it. The web service method was being called as “getcontentitems” and not “GetContentItems”; the error was even kind enough to explain that method names are case sensitive.
The penny drops
At this point I remembered that we’d installed the URL Rewrite module for IIS at work earlier this week in order to fix some issues the SEOToolkit was reporting. As well as the canonical URL issue of the missing ‘www’ in our URLs, we’d wanted to enforce lowercase URLs. Personally I prefer seeing SentenceCase in URLs, but if this causes SEO issues then, as it is only a matter of taste, I saw no reason not to let the URL Rewrite module lowercase everything; that way it doesn’t matter what file names authors give their pages. We had added a rule at server level to lowercase everything, set up rules at application level to enforce the www in URLs, and got on with other things. All was good when we did this and no one was complaining about broken functionality. Clearly this was the cause of our problems, and it explained why we weren’t seeing the error elsewhere.
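To illustrate the failure mode, here is a minimal Python sketch. It is not how ASP.NET resolves web methods internally; the `METHODS` table, the `dispatch` and `lowercase_rewrite` functions, and the `/Services/Content.asmx` path are all made up for the example. It only shows how a blanket lowercase rewrite can turn a valid call into a 500 when the method lookup is case sensitive.

```python
from urllib.parse import urlsplit

# Hypothetical method table standing in for a case-sensitive web service.
METHODS = {"GetContentItems": lambda: ["item-1", "item-2"]}


def lowercase_rewrite(url: str) -> str:
    """Mimic a server-level rule that lowercases the whole URL."""
    return url.lower()


def dispatch(url: str):
    """Treat the last path segment as the method name and look it up exactly."""
    method_name = urlsplit(url).path.rsplit("/", 1)[-1]
    handler = METHODS.get(method_name)  # dict lookup is case sensitive
    if handler is None:
        return 500, f"Unknown web method {method_name}"
    return 200, handler()


# Assumed URL shape, for illustration only.
original = "/Services/Content.asmx/GetContentItems"

print(dispatch(original))                     # method name matches exactly
print(dispatch(lowercase_rewrite(original)))  # "getcontentitems" is not found
```

The first call finds the method; the second fails because the rewritten URL asks for “getcontentitems”, which is exactly what the ELMAH email reported.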
So we deleted the rule and refreshed the CMS: no more errors, problem solved! I’m still getting emails quite often and am tempted to divert these to the Helpdesk system so it can log them and we can work on them as and when there’s time. I also need to look into ELMAH a bit more to enable filtering, so that it only sends an email if a certain type of error occurs more than, say, 10 times in 10 minutes and we’re not bombarded with emails.
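The “more than 10 times in 10 minutes” idea can be sketched as a sliding-window counter. This is a rough Python illustration of the logic only, not ELMAH’s filtering API; the class name `ErrorMailThrottle` and its parameters are hypothetical, and I have assumed that the counter resets after a notification so a long burst produces one email per batch instead of one per error.

```python
from collections import defaultdict, deque


class ErrorMailThrottle:
    """Decide whether an error occurrence should trigger an email."""

    def __init__(self, threshold=10, window_seconds=600):
        self.threshold = threshold            # notify when count exceeds this
        self.window_seconds = window_seconds  # 600 seconds = 10 minutes
        self._seen = defaultdict(deque)       # error type -> recent timestamps

    def should_notify(self, error_type, now):
        """Record one occurrence at time `now` (seconds) and report the decision."""
        times = self._seen[error_type]
        times.append(now)
        # Forget occurrences that have fallen out of the sliding window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        if len(times) > self.threshold:
            times.clear()  # assumption: start counting afresh after notifying
            return True
        return False


throttle = ErrorMailThrottle()
# Eleven occurrences of the same error type, one second apart.
decisions = [throttle.should_notify("HttpException", t) for t in range(11)]
print(decisions)
```

With the defaults, the first ten occurrences are swallowed and only the eleventh within the window asks for an email.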
Warning
For anyone rewriting their URLs: be careful, as it may break things that are case sensitive!