A Fortune 500 Company, a Performance Tuner, Microsoft, and Slow Dynamics AX Performance

I’ve tuned dozens of implementations with all kinds of strange issues, but last week I encountered an especially challenging case involving a Fortune 500 company, Microsoft, and slow Dynamics AX performance. Multiple members of Microsoft’s Premier Support (the client had a dedicated support contract) had been on the case for months with no progress when I was brought in to see if I could make any headway in one week. The issue was perplexing, and I feared that I wouldn’t be able to solve it in that time, but a hint from an employee came through and saved the day. If you tune Dynamics AX performance, I recommend that you read this case carefully. It will open your eyes to something that I’ve never seen documented, though I’m sure it happens far more often than it gets identified.

 

In an earlier post, I talked about standard performance tuning, but I did mention that there are special cases. They almost always involve performance on the AOS side and not the SQL side.

The Managers Performance Tuning Survival Guide for Day 1 Slow Dynamics AX Performance

 

Case of the Slow Dynamics AX Performance for the Fortune 500 Company

The client had purchased brand new, fast servers (the kind that put Azure to shame), but AOS processes were running unusually slowly. For example, in one case, confirming a group of sales orders took nearly 3 minutes. The same confirmation took 1 minute and 15 seconds on the slower, 5-year-old servers. It didn’t make sense. Microsoft had been brought in to solve the issue, but several Premier Support people had struggled with it for several months without finding a cause. Ultimately, they had concluded that the hardware vendor was at fault. The hardware vendor had concluded that Microsoft was at fault. Adding to the complexity, SQL showed up as being just fine. In fact, this was arguably the best SQL tuning and setup that I had seen. The client had gone completely outside the box and employed a technique that made their SQL box faster than that of any other AX company I’ve seen. It was truly amazing to watch how it sped up the database layer.

 

What could it be?

 

 

FACTS
The case had baffled even Microsoft’s most advanced premier support people who had made no progress on the issue
SQL was not the problem
Hardware could not be ruled out
Dynamics AX was extremely slow
The problem wasn’t consistent. It didn’t always exist
Power Setup and other things were just fine according to Microsoft and the hardware vendor

 

An Employee shows me an interesting tool called System Information Viewer

I’m surprised that I had never tried out this free tool. For a free tool, it was awesome. I’m used to Windows Performance Monitoring, but it can be a hassle to get it installed and set up, and it usually requires going through lots of approvals. This tool was straight to the point.

I opened it up and clicked on the “Memory Speed” option.


 
So, after running the memory speed test, the performance architect for the company showed me that the result was only 2400 MB/s. That was slow for the hardware and way less than expected. That was a strong hint: the RAM transfer rate was down.
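If you can’t get a tool approved, a crude before-and-after comparison is still possible with a few lines of script. The sketch below is my own illustration, not how System Information Viewer measures memory speed, so its numbers are not comparable with the 2400 MB/s above. It simply times a large in-memory copy, and the function name and buffer size are assumptions picked for the example. It is only useful for comparing the same machine against itself, for instance before and after a fix.

```python
import time


def copy_throughput_mb_s(size_mb: int = 64, repeats: int = 5) -> float:
    """Return the best observed copy rate in MB/s for a buffer of size_mb megabytes.

    Rough indicator only: interpreter overhead and CPU caches both influence
    the figure, so compare runs on the same machine, never across tools.
    """
    src = bytearray(size_mb * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one full read of src plus one full write of the copy
        elapsed = time.perf_counter() - start
        best = min(best, max(elapsed, 1e-9))  # guard against a zero reading
        del dst
    return size_mb / best


if __name__ == "__main__":
    print(f"Best copy rate: {copy_throughput_mb_s():.0f} MB/s")
```

Taking the best of several runs filters out one-off pauses from other processes, which matters on a busy AOS box.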


 
Note: I love this tool. It’s a nice little quick way to compare your Azure performance to your base performance in terms of speed.

 

So, I suspected that there was something to the RAM performance, and so did the internal performance architect. He recommended that we contact the Premier Support representative and get an opinion.

 
So, on the performance architect’s advice, we went back to Premier Support and told them what we had found. They responded that it was nothing and that we should ignore it.
 

I didn’t agree with Premier Support’s assessment and asked if we could look at the AOS directly. I just wanted to look at it more closely.

Keep in mind that at many big companies, getting permission to troubleshoot is often a challenge. Security has to be involved, and this always means going through multiple approvals. So, I asked whether it would be okay for me to look at the AOS. I was granted permission, and I just watched the AOS. I saw something interesting.

Why would an AOS instance for a Fortune 500 company on a mega powerful server only be using 1.3 GB of RAM?

What I saw was very interesting. The AOS was only using 1.3 GB of RAM, which is unusually low. I wanted to run an incremental compile on it to see if the usage went up (the easiest way to partially stress test an AOS), but couldn’t quite get permission to do that. Still, I felt like the answer was there. I just watched it, and it wasn’t claiming memory the way it should.


 

I remembered a case of slow Dynamics AX Performance that I encountered two years ago where AX ran as if it had no memory

Dynamics AX can run in low memory conditions. I’ve seen it before on VMs that had dynamic memory or failing RAM. AX will simply throttle itself into much slower performance. I nicknamed the condition “Restrictive Mode.” “Scarcity mode” may have been a better term, but you get the point.
 

So, we again went back to the Dedicated Premier Support Engineer and shared our findings, but he told us it was meaningless

 

Remember that troubleshooting with big organizations often means troubleshooting by committee. Each step carries the overhead of getting approvals. The company owned the system, but a contract with Microsoft gave Microsoft first jurisdiction for keeping performance up. I was second. So, we went back to the Microsoft representative and shared my findings. The Microsoft Premier Support representative listened to my suspicions about the RAM, but came back with a very strong answer: the condition I had noticed was absolutely meaningless. Furthermore, he explained that there was no such thing as a restrictive performance condition and that the behavior I saw as abnormal was normal.
 

I didn’t trust Microsoft’s opinion here and asked if it would be possible to make my own assessment

 

The internal performance architect agreed and got an expedited ticket done for me to run my old favorite memory management tool, RAMMap. So, we ran RAMMap and saw something very interesting. You may remember my first mention of RAMMap a while ago:
 

https://www.instructorbrandon.com/troubleshooting-ax-performance-case-of-a-cloud-vm-running-out-of-memory/
 

Lo and Behold, RAMMap showed an unusual amount of RAM in Standby Mode

 
Look at that. Mapped File, which corresponds to file data mapped into memory from disk, is holding an unusual amount of RAM in standby mode. It’s holding almost 15 GB of RAM.
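To put a reading like that in proportion, it helps to express standby memory as a share of installed RAM. The sketch below is only an illustration: the 15 GB comes from the RAMMap reading above, but the 32 GB total and the 40% warning threshold are hypothetical values I picked for the example, not figures from this case. In practice you would feed in numbers read from RAMMap or from the Windows Memory performance counters.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def standby_share(standby_bytes: int, total_bytes: int) -> float:
    """Return standby memory as a fraction of total physical memory."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return standby_bytes / total_bytes


def is_suspicious(standby_bytes: int, total_bytes: int, threshold: float = 0.4) -> bool:
    """Flag the reading when the standby share reaches the threshold.

    The 0.4 default is an arbitrary assumption for illustration,
    not a documented Windows or Dynamics AX limit.
    """
    return standby_share(standby_bytes, total_bytes) >= threshold


if __name__ == "__main__":
    standby = 15 * GIB  # "almost 15 GB" from the RAMMap reading
    total = 32 * GIB    # HYPOTHETICAL: the server's installed RAM is not stated
    print(f"Standby share: {standby_share(standby, total):.0%}")
    print(f"Worth a closer look: {is_suspicious(standby, total)}")
```

A large standby list is not a problem by itself, since Windows normally repurposes those pages on demand. The flag is only a prompt to check whether the application is actually getting the memory it asks for.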
 

 
Interestingly enough, a support department at Microsoft other than the Dynamics team provides an explanation of the issue here, along with its known effect of slowness:
 

https://answers.microsoft.com/en-us/windows/forum/windows_10-performance/windows-10-not-releasing-standby-memory-when/874484bc-3c4d-4f0f-83ed-000e9dab971b?auth=1
 
This actually made a lot of sense. In a classic blog post, the Dynamics AX support team explains the memory mechanism behind Dynamics AX performance. While it doesn’t cover standby memory restricting free memory, it has some useful tips for tuning and for coding with RAM in mind.
 
https://blogs.msdn.microsoft.com/axsupport/2012/04/30/memory-usage-in-xppil-code/
 

Nobody knows everything. Why had this issue gotten past 3 representatives at Dynamics AX Premier Support?

The Dynamics Support team has provided an excellent introductory post on Dynamics AX performance tuning. These are essential skills that everyone needs for running an implementation day to day.
 
https://blogs.msdn.microsoft.com/axinthefield/dynamics-ax-performance-step/
 
But those steps would never have covered an issue like this: AX running in a low memory state due to standby memory usage. So, a restrictive RAM condition was covered by the skill set of the Windows Performance team but not the Dynamics support team. This is the typical big company problem of getting things to the right people. Sometimes, it can be a challenge.

 

Now for the final Cool Part. Watch what happens next.

 

So, I gave the internal performance architect a theory. I said that if we cleared the standby RAM and restarted Dynamics AX, we should see Dynamics speed up. We couldn’t get approval to restart Dynamics AX, but we did get approval to clear the standby RAM. So, once again using our handy dandy tool RAMMap, I cleared the standby memory.

 

Dynamics AX Performance was actually fixed after the Standby Memory was Cleared

 
So, after it was cleared, I was surprised. RAM throughput doubled to 4900 MB/s. The sales order confirmations went from nearly 3 minutes to 24 seconds once the RAM performance recovered! The AOS also began to creep up in memory and reached 2.4 GB of RAM usage within a few minutes.

 

After nearly 50 implementations, I was shocked at what I learned!

 

I knew that low RAM could affect Dynamics AX performance, but I never knew that RAM speed or standby RAM could have such an effect. The results of this case provide several new directions for performance tuners wishing to optimize AX after getting SQL running well. RAM throughput, as measured by System Information Viewer, doubled. The sales order confirmation on the AOS ran roughly 7.5 times faster (nearly 3 minutes down to 24 seconds)! Wow. Now, if we could just get a way to see standby memory as one of the monitoring conditions on Azure VMs, that would be really nice.
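For anyone who wants to check the ratios, here is the arithmetic on the timings and throughput figures reported in this post. It is just a small sketch. The variable names are mine, and “nearly 3 minutes” is rounded to 180 seconds.

```python
def speedup(before: float, after: float) -> float:
    """Return how many times faster 'after' is than 'before' (for durations)."""
    return before / after


slow_confirm_s = 180.0       # "nearly 3 minutes" on the new servers, rounded
fast_confirm_s = 24.0        # after clearing standby memory
old_server_confirm_s = 75.0  # 1 minute 15 seconds on the 5-year-old servers

ram_before_mb_s = 2400.0     # memory speed test before the clear
ram_after_mb_s = 4900.0      # memory speed test after the clear

print(f"Confirmation vs. slow state: {speedup(slow_confirm_s, fast_confirm_s):.1f}x")          # 7.5x
print(f"Confirmation vs. old servers: {speedup(old_server_confirm_s, fast_confirm_s):.3f}x")  # 3.125x
print(f"RAM throughput gain: {ram_after_mb_s / ram_before_mb_s:.2f}x")                         # 2.04x
```

So the fixed system also beats the old servers by about 3x on the same confirmation, which is much closer to what you would expect from the new hardware.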

 

In summary, this is where it ends

 

So, standby memory dragging down RAM performance was quite a find. I really enjoyed troubleshooting it, even though I wasn’t certain we could get to the root cause of the issue in a week. The cause of standby memory behaving like that is often at the application level, but I was asked to let the Microsoft Windows team find the problematic application. I wrote a document describing how to limit the mapped memory, along with an easy way to clear the standby RAM whenever the condition is encountered. I told the company to call me if they ever needed me to go deeper and find the root cause. That was that, and it was a good week with plenty of other wins. However, this was the one that really stood out. I’ll be sure to check the memory state more often.

SPECIAL THANKS TO THE ANONYMOUS FORTUNE 500 COMPANY: It was a blast working with all of you. You guys really knew your stuff!

3 thoughts on “A Fortune 500 Company, a Performance Tuner, Microsoft, and Slow Dynamics AX Performance”

  1. Mathieu says:

    Interesting situation. Did you have the same problem with all AOS servers? If not, did you notice the performance issue was only on a specific AOS? It seems that this issue could have been solved just by restarting the server, am I wrong?

    • Brandon Ahmad says:

      Sadly, we never got to check the other AOS servers due to problems with getting approval. Security for these big publicly traded companies is always tough because even the internal departments have to go through external auditors. You are absolutely correct. Restarting the server will force a clear of all RAM. And Lord knows, I’ve been in several situations where something strange was happening and we restarted the server and watched everything just start working. I now wonder if some of the problems I fixed with those restarts were actually caused by this condition. For this issue, I realize why it was intermittent. The issue wouldn’t be induced until enough RAM was occupied as standby to cause a memory performance issue. Any periodic restart would have fixed the issue until the misbehaving application eventually compromised the RAM again.
