About the performance issue

I’m Shangwei Yin, the tech lead of the ThoughtWorks SIGLUS team.
We are customizing OpenLMIS v3.6 for Mozambique, and there are more than 1,300 products (orderables) in our system.

As I mentioned at the last OpenLMIS Product Committee, we found some performance issues when we ran a test with these products. Two of them are listed below:

  • Requisition initiate performance issue.

When we call the initiate requisition API, it takes more than 3 minutes to get a response. The reasons it is so slow are as follows:

  • While initiating a requisition, we need to call another service, the stock management service, to get stock on hand, beginning balances, and the ideal stock amounts for the average; the screen capture below shows the details. What these calls have in common is that the request parameters contain the list of approved product IDs, which we retrieved earlier by facility and program. The BaseCommunicationService then splits the request URL into many sub-requests if the URL is longer than the max URL length.

  • image1.png

  • The captures below show the BaseCommunicationService URL split logic. If we have too many approved products, for example 2,000, one sub-URL contains only 20 products. That means a single request is divided into 100 sub-requests, and they run serially, not in parallel. That is the main root cause of the performance issue.

  • image2.png

  • image3.png
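Since the screen captures may not load for everyone, here is a minimal Python sketch of the kind of URL splitting described above. It is not the actual BaseCommunicationService code: the function names, the `orderableId` parameter name, the base URL, and the length limit are all assumptions chosen so that 2,000 IDs end up as 20 per sub-request.

```python
import uuid


def split_ids_by_url_length(base_url, param, ids, max_url_length):
    """Greedily pack ids into chunks so that no resulting URL exceeds max_url_length."""
    chunks, current = [], []
    current_len = len(base_url)
    for item in ids:
        # Each id costs one separator ('?' or '&'), the parameter name, '=', and the id itself.
        cost = 1 + len(param) + 1 + len(item)
        if current and current_len + cost > max_url_length:
            chunks.append(current)
            current, current_len = [], len(base_url)
        current.append(item)
        current_len += cost
    if current:
        chunks.append(current)
    return chunks


def build_url(base_url, param, chunk):
    """Build one sub-request URL with a repeated query parameter."""
    return base_url + "?" + "&".join(f"{param}={i}" for i in chunk)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    base = "http://example.invalid/api/stockCardSummaries"
    ids = [str(uuid.UUID(int=i)) for i in range(2000)]
    limit = len(base) + 20 * (len("orderableId") + 2 + 36)  # room for exactly 20 UUIDs
    chunks = split_ids_by_url_length(base, "orderableId", ids, limit)
    print(len(chunks))  # 100 sub-requests
```

If the sub-requests are issued one after another, the total time is roughly the sum of the individual response times, so the cost grows linearly with the number of approved products.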

  • Stock on hand performance issue.

    • In OpenLMIS v3's new logic, we save only the movement's quantity for each stock movement. When we need the latest stock on hand, it is calculated by going through every movement quantity from the first record. With only a few months of movement data this does not matter, but with several years of data it becomes a performance issue. Consider real data: every facility has on average 300 products whose stock on hand must be calculated when submitting a requisition, every product has on average 5 lots, and every lot has several years of stock movements. Calculating the stock on hand with the logic shown below is then very slow. In Mozambique, we have such data in the production environment.

    • image4.png

    • image5.png
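To make the cost concrete, here is a small Python sketch of the pattern described above. It is an illustration only, not the OpenLMIS stock management code: the `Movement` type, the function names, and the `RunningBalance` alternative are assumptions used to contrast replaying all movements with keeping a stored balance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movement:
    product: str
    lot: str
    quantity: int  # signed: receipts are positive, issues are negative


def soh_by_replay(movements):
    """Recompute stock on hand per (product, lot) by summing every movement from the first record."""
    totals = {}
    for m in movements:
        key = (m.product, m.lot)
        totals[key] = totals.get(key, 0) + m.quantity
    return totals


def replay_reads(products, lots_per_product, movements_per_lot):
    """Number of movement records touched by one full replay for a facility."""
    return products * lots_per_product * movements_per_lot


class RunningBalance:
    """Hypothetical alternative: keep the latest balance and update it once per new movement."""

    def __init__(self):
        self.totals = {}

    def record(self, m):
        key = (m.product, m.lot)
        self.totals[key] = self.totals.get(key, 0) + m.quantity
        return self.totals[key]
```

With 300 products and 5 lots each, a full replay touches 1,500 times the number of movements per lot on every calculation, while a stored balance only needs one update per new movement.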

Please help us review the issues mentioned above, and let me know if you have any questions.

Hello Shangwei,

Thank you for reaching out to us with those concrete problems and your analysis of the issues. The images do not load for me, but I was able to make sense of it based on your write-up alone.

Concerning the general performance of requisitions and the retrieval of orderables and facility type approved products: improving those areas is our main goal for OpenLMIS 3.8. The first step, made in version 3.7, was to introduce caching, but the performance gain was not yet satisfactory, and caching does not improve the performance of service-to-service communication. The biggest bottleneck currently is the FacilityTypeApprovedProducts endpoints. They are quite slow with a large number of products, and they are used in many places.

Concerning the SoH recalculation with each movement - this is a good observation. We have already fixed that in 3.7 as this was also identified as a high-risk area for other implementations, like Angola.

I also wanted to mention that we discuss the improvements we make after each sprint at the Sprint Showcases (every other Wednesday, 4pm CAT). If you are interested in hearing more details about specific fixes we make, you are welcome to join us! The same applies if you would like to point out specific areas that still need improvement, or if you have ideas or feedback on what to focus on.

Thanks,
Sebastian
