Performance Testing the Dynamics NAV Middle Tier


How does the Dynamics NAV Server perform?

The delivery of the 3-tier architecture in Microsoft Dynamics NAV has opened up the possibility of developing applications that utilise the NAV 2009 middle tier web services. Performance testing the middle tier from the Role Tailored Client (RTC) is difficult to do, but testing the performance of the middle tier web services is much easier.

Recently we were tasked with completing some work for an existing client deployed with NAV 2009, allowing 450 users to connect to some specific functionality in NAV. They didn't want to purchase full NAV RTC users, so instead we opted to develop a web based solution where the per user cost would be a lot cheaper. Perfect! This is exactly what the middle tier allows us to do. From here on in this article I may refer to the middle tier services as the NAV Service Tier or NST.

Let the learning curve commence!

3 tiers on 3 computers

First off, we decided early on that there was no way this solution could be installed with the NSTs on the SQL Server box, so we dived into the solution with the tiers spanned across different servers. We knew we'd need the following (although at this stage we hadn't decided how many of each we'd need):

  • Client machines connecting to NAV through the web application, which was designed for Internet Explorer.
  • A web server (IIS) hosting our web application.
  • A middle tier server hosting our NST instances.
  • A SQL Server.

Service Principal Names

Challenge number one (and there were many!): Service Principal Names (SPNs). For most classic NAV installers this is new territory, and to be honest the NAV documentation doesn't explain SPNs that well. However, there is plenty of information out there explaining what they are and why they need to be set up. Our install also required some additional SPNs because we had a web server sitting in front of the NST layer.
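To give a flavour of what's involved, the registrations look something like this (run as a domain admin). The hostnames, ports and service account names here are illustrative placeholders rather than anything from the real install:

    :: HTTP SPNs for the account running the IIS application pool
    setspn -A HTTP/webserver DOMAIN\svc-web
    setspn -A HTTP/webserver.domain.local DOMAIN\svc-web

    :: HTTP SPNs for the account running the NST instances (the web services listen over HTTP)
    setspn -A HTTP/nstserver DOMAIN\svc-nst
    setspn -A HTTP/nstserver.domain.local DOMAIN\svc-nst

    :: SPN for the SQL Server service account
    setspn -A MSSQLSvc/sqlserver.domain.local:1433 DOMAIN\svc-sql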

The next decision we made was to run the different services under dedicated service accounts, including the NST instances and the SQL Server service. Ok, so with everything set up, we tried a test call to the base "services" web service in NAV to see what we could get returned from the system service.
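As a minimal sketch, such a test call can look like the following, assuming a proxy class generated with Visual Studio's "Add Web Reference" against the system service endpoint (placed here in a NavSystem namespace; the host, port and instance name are illustrative):

    // A minimal smoke test against the NAV 2009 system web service.
    using System;

    class NstSmokeTest
    {
        static void Main()
        {
            var service = new NavSystem.SystemService();   // generated proxy class
            service.Url = "http://nstserver:7047/DynamicsNAV1/WS/SystemService";
            service.UseDefaultCredentials = true;          // Windows auth - Kerberos, if the SPNs are right

            // Companies() is the simplest round trip: it lists the companies in the database.
            foreach (string company in service.Companies())
                Console.WriteLine(company);
        }
    }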

Challenge number two: we were getting "403 unauthorised" errors. Oh no...! Ok, so a lot of Wireshark and Kerbtray use later, we realised that we had a duplicate SPN set up somewhere. Unfortunately, if you look at the message returned at the network layer, you get the same error whether you have no SPN set up or you have duplicates. This isn't great, as I can imagine a lot of NAV installers would try adding an SPN on one layer, then another, then take something away, and before they know it... it's in a right mess. Eventually we got this working, and with a little tweaking for fully qualified domain names, all was up and running.
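If you hit the same wall, listing what is actually registered against each account is the quickest way to spot the problem (account names again being placeholders):

    :: List every SPN registered against each service account and compare by eye
    setspn -L DOMAIN\svc-nst
    setspn -L DOMAIN\svc-web

    :: Newer versions of setspn (Windows Server 2008 and later) can hunt for duplicates directly
    setspn -X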



The Web Server

Next we needed to look at how we would install and set up the web server on IIS. For the purposes of our tests we had one web server available with a quad core processor and 4GB of RAM, running Windows Server 2003 (32-bit). We decided to configure a web garden with 4 worker processes, one for each core. We allowed IIS to handle the state management on the web server. The site was set up not to allow impersonation and to use Windows Authentication (important if we're going to get Kerberos to work!).
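For reference, the relevant part of the site's web.config looked something along these lines (a minimal sketch; the rest of the file, and the web garden setting itself, which lives on the application pool, are omitted):

    <system.web>
      <!-- Windows Authentication so the user's Kerberos ticket reaches the site -->
      <authentication mode="Windows" />
      <!-- no impersonation: the application pool identity makes the web service calls -->
      <identity impersonate="false" />
    </system.web>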

The Middle Tier Server

The middle tier server also had a quad core processor and 4GB of RAM, running Windows Server 2003 (32-bit). For the purposes of the tests we installed 10 NST instances, simply named DynamicsNAV1 through to DynamicsNAV10.
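Assuming the usual NAV 2009 multi-instance approach of one service folder and one CustomSettings.config per instance, the per-instance settings to vary are along these lines (the ports shown are illustrative, not our actual values):

    <appSettings>
      <!-- the instance name forms part of the web service URL, e.g. .../DynamicsNAV2/WS/... -->
      <add key="ServerInstance" value="DynamicsNAV2" />
      <!-- each instance needs its own ports -->
      <add key="ServerPort" value="7146" />
      <add key="WebServicePort" value="7147" />
      <add key="WebServicesEnabled" value="true" />
    </appSettings>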

Load Balancing the NST Middle Tier

Challenge number 3 - or is this 4... I'd lost count... As we all know, automatically load balancing the middle tier is not possible through the RTC, and more on that will follow in a separate blog post... However, it is possible to do this from a web app. We built the load balancing into our web app in a simple round robin fashion, with a web.config where we could set up the generic web service addresses as well as the number of NST instances available. However, there is also no failover of the NSTs in the middle tier (and a blog post on that will follow shortly too), and we needed to ensure we had that working for sign off. So how did we achieve it? Quite simply, one of our developers decided to make HTTP requests routinely to the middle tier services and analyse their responses. This way we can keep track of which ones are running and shift users around accordingly in the data layer of our web app. A sketch of the idea follows below.
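Roughly, the two pieces hang together like this (a sketch only; the web.config keys NstBaseUrl and NstInstanceCount and the URL layout are invented for illustration, not lifted from the actual implementation):

    // Round robin selection over instances DynamicsNAV1..DynamicsNAVn, with a
    // background timer probing each instance by HTTP and skipping dead ones.
    using System;
    using System.Configuration;
    using System.Net;
    using System.Threading;

    public static class NstPool
    {
        static int next = -1;
        static readonly bool[] alive;
        static readonly string baseUrl =
            ConfigurationManager.AppSettings["NstBaseUrl"];    // e.g. "http://nstserver:7047/DynamicsNAV"
        static readonly int count =
            int.Parse(ConfigurationManager.AppSettings["NstInstanceCount"]);
        static readonly Timer probe;                           // kept as a field so it isn't collected

        static NstPool()
        {
            alive = new bool[count];
            for (int i = 0; i < count; i++) alive[i] = true;
            probe = new Timer(CheckHealth, null, 0, 30000);    // re-probe every 30 seconds
        }

        // Hand out the next live instance URL in round robin order.
        public static string NextInstanceUrl()
        {
            for (int attempt = 0; attempt < count; attempt++)
            {
                int i = (Interlocked.Increment(ref next) & int.MaxValue) % count;
                if (alive[i]) return baseUrl + (i + 1) + "/WS/SystemService";
            }
            throw new InvalidOperationException("No NST instance is responding.");
        }

        // Mark each instance alive or dead depending on whether its endpoint answers at all.
        static void CheckHealth(object state)
        {
            for (int i = 0; i < count; i++)
            {
                try
                {
                    var request = (HttpWebRequest)WebRequest.Create(
                        baseUrl + (i + 1) + "/WS/SystemService");
                    request.UseDefaultCredentials = true;
                    request.Timeout = 5000;
                    using (request.GetResponse()) { alive[i] = true; }
                }
                catch (WebException)
                {
                    alive[i] = false;
                }
            }
        }
    }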

The Tests


Before the details of the tests, a few more facts about the web app. It was built using .NET Framework 3.5 and the web pages were ASPX based. The site utilises AJAX and JavaScript quite heavily. Certain "look up" data is cached and serialised on the web server for increased performance, along the lines sketched below.
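The caching itself is standard ASP.NET fare; a minimal sketch, with the cache key, the loader delegate and the 10 minute window invented for illustration:

    // Serve "look up" data from the ASP.NET cache, hitting the NST only on a miss.
    using System;
    using System.Web;
    using System.Web.Caching;

    public static class LookupCache
    {
        public static string[] GetLookupData(string key, Func<string[]> loadFromNav)
        {
            var cached = (string[])HttpRuntime.Cache[key];
            if (cached != null) return cached;

            string[] fresh = loadFromNav();                    // one web service round trip
            HttpRuntime.Cache.Insert(key, fresh, null,
                DateTime.Now.AddMinutes(10), Cache.NoSlidingExpiration);
            return fresh;
        }
    }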

The test performed was a "worst case" scenario. We wanted to get comfort around large loads on both our web server and the NSTs. The test simulated a user entering a job journal line and submitting it to the database, performing a number of lookups along the way and maximising the lookup data being retrieved. The test scenario simulated the user closing down the browser after the successful insertion of the job journal line. This ensured that we were loading the web server far more heavily than it would be loaded in the real world scenario.

We were licensed to test a maximum of 250 concurrent users with our performance testing software.

The test was run for 10 minutes and up to 215 concurrent users were tested. The load was generated from 3 client machines running the test case scenario detailed above. The baseline size of the data involved in the test scenario was 548.8KB. Random "think time" was used during test case scenario generation. Over the 10 minute testing cycle, 2,147 test case scenarios were completed - a lot of job journal lines entered in 10 minutes!

A screen shot showing a summary of the test results is shown below.



There were 7 tests performed, covering:

  • Page Completion Rate
  • Page Duration
  • Transaction (URL) Completion Rate
  • Page Failures
  • Bandwidth Consumption
  • Test Scenarios per Minute
  • Waiting Users

All 7 of the goals analysed were passed. The graphs produced (shown below) scaled linearly, which implies that the limitations are likely to be hardware based rather than in the software itself.

The page duration goal was set to an average page duration of less than 4,000ms.

There were no page failures during the test.



The average page duration was well below the 4,000ms limit set, though there were some spikes, as shown below. Also, as the number of users increased we did see some waiting users, as expected; however, the results were reasonably encouraging.




The Page Completion Rate chart shows the total number of pages completed per second during the sample periods summarised for each user level. In a well-performing system, this number should scale linearly with the applied load (number of users).



The Transaction (URL) Completion Rate chart shows the total number of HTTP transactions (URLs) completed per second during the sample periods summarised for each user level. In a well-performing system, this number should scale linearly with the applied load (number of users).


The Bandwidth chart shows the total bandwidth consumed by traffic generated directly by the load test engines during the sample periods summarised for each user level. In a system that is not constrained by bandwidth, this number should scale linearly with the applied load (number of users). Note that other sources of bandwidth may be active during the test and may even be caused indirectly by the load test but may not be included in this metric.


The Testcase Completion Rate chart shows the total number of testcases completed per minute during the sample periods summarised for each user level. In a well-performing system, this number should scale linearly with the applied load (number of users).




The Waiting Users and Average Wait Time metrics help diagnose certain types of performance problems. For example, they can help determine what pages users have stopped on when a server becomes non-responsive. The 'Waiting Users' metric counts the number of users waiting to complete a web page at the end of the sample periods summarised for each user level. The 'Average Wait Time' describes the amount of time, on average, that each of those users has been waiting to complete the page.


Sorted by the elapsed test time, this table shows some of the key metrics that reflect the performance of the test as a whole. Note times are in milliseconds and bandwidth in bytes...


[Table: key metrics by user level - Users, Pages per sec, Hits per sec, Bandwidth Out (bytes), Min/Avg/Max Page Duration (ms), Waiting Users and Average Wait Time (ms) - for load levels ramping from 1 user up to the 215-user peak and back down.]


Conclusions
Apart from the conclusions already drawn above, it is worth noting some additional observations.

We found that allocating about 20 users per NST is a realistic figure for performance purposes.

Each NST consumed about 50MB of RAM.

Consumption of RAM and processing power on our SQL Server box was not an issue.

IIS was not fully utilised, sitting at about 60% of the total CPU available.

Given the scaled testing results, IIS was likely to fall over before the NSTs would. Scaling the NSTs further would simply have been a matter of installing another instance and changing the web.config.

We performed similar tests virtualising the IIS layer and the NST layer and noticed some performance drops. Further testing showed that the drop was probably due to virtualisation of the NST layer rather than the IIS layer.

Given the figures generated in our test, we were able to scale to support 450 users. The eventual setup was hosted with both the web server and the NST layer virtualised.

Some Points to Remember

  1. Read up and learn about SPNs.
  2. Get familiar with Wireshark and Kerbtray.
  3. Use fully qualified domain names on the SPNs.
  4. If developing something like a web app, try to get everything working properly first with the RTC connecting from a client PC. Then you can tell whether an issue is in your setup or in your external app.
  5. Have a domain admin present to assist you. Best not to go playing around on the client's domain!
  6. Go 64-bit if you can.
  7. Always use the latest releases of the NSTs if you can get them from Microsoft, as performance improvements are ongoing.
  8. Forget about trying to do this on NAV 2009 if you aren't using SP1. We noticed significant performance increases between the non-SP1 service tiers and the SP1 equivalents.
  9. If setting up a large install of the RTC, remember there is no automatic load balancing or automatic failover of the NSTs.
  10. If your server allows it, install more NSTs than realistically needed.
  11. Ensure you have a suitable failover position in place should one NST fail - watch this space for a future blog post on this subject...
  12. Be prepared for a few headaches, as the 3-tier technology now available in Microsoft Dynamics NAV is still quite new.

I hope you have found this post helpful and of some use. If you have read all the way to here then THANK YOU! It did take me some time to put this blog post together...

14 Responses to Performance Testing the Dynamics NAV Middle Tier

  1. PDJ says:

    Great blog entry - thanks for sharing!
    Any idea why this entry does not show up when the RSS feed is used in Google Reader? (Does it work in other RSS readers?)

  2. Hi PDJ.

    Thank you for the feedback. Much appreciated. Glad to know that people are actually reading it...

    I have a colleague that is getting updates via RSS and I think he is using Google Reader so I will check with him.

    It was set up using Feedburner so it should be absolutely fine.

    I will test it myself as well...

  3. Hi PDJ. I have checked it in Google Reader and it appears to be working fine. Also a colleague is subscribed to the posts through Google Reader and can confirm it is ok. I just searched for "sobersmarties" and subscribed to it all ok?

  4. Tim says:

    Really interesting article. Thanks for sharing.
    I did smile to myself over the SPNs. It reminded me of my initial problems getting the delegation working on a client site after the consultant had installed it incorrectly. Duplicate SPNs. I only found the duplication because I deleted the SPNs for the service account and tried to connect. You get an error telling you that the SPNs are set up against a different user.

    You're right about the documentation; it is especially bad for web services, so you really have to understand Kerberos and not simply follow the guides.

  5. Thanks Tim. I still wake up in the middle of the night screaming "Duplicate SPNs!!!"... After some counselling and treatment the frequency of these occurrences is now less and less... ;o)

    Seriously though, I do think that NAV now requires more and more varied skill sets. My observations are that traditional NAValists are struggling to keep up... Or are just resistant to change...

    Microsoft needs to provide more and more support if future implementations are to be successful and hassle free; in turn, the uptake of the RTC should increase...

  6. Hi,

    Can you let me know which tool you used for testing the application? It would be of great help.

  7. Hi Dharmesh.

    We used some great software called Web Performance Load Tester. A link to their website can be found here.

  8. Hi,

    Thanks a lot for the info. I have come to know that the Navision client uses HTTP and RDP protocols to communicate. I wonder whether Web Performance Load Tester records RDP communication as well.

  9. Great post!

  10. Really interesting article. Thanks for sharing.
    I am confused with the R2 installation. We are trying to provide NAV as SaaS and planning the architecture is a real pain.

    Again,great blog entry

  11. Anonymous says:

    Excellent article, thank you. It shows how NAV web services scale up to support many web users! One question... have you found any issues with small SOAP packets requested in quick succession and a minimum 200ms delay per call? I would appreciate your thoughts on http://www.mibuso.com/forum/viewtopic.php?f=32&t=48167 and whether you have experienced this.

  12. I have not seen problems with small packet sizes, but what do you define as small? I tried to follow your link to mibuso but could not find the article - perhaps you could re-post?

    Thanks

    SS

  13. Scott says:

    MTU size, i.e. less than 1460 bytes... Someone replied to my Mibuso post and we have managed to resolve it. If you copy the whole url 'http://www.mibuso.com/forum/viewtopic.php?f=32&t=48167' hopefully you can get to it as it could be of interest! Thanks for replying :)

  14. Anonymous says:

    Even here in 2013 it is a great blog to learn from - but I haven't been able to find your blog post about the "Load Balancing" functionality you developed.
    Could you post a link about this?

    Thank you very much