Sat. Feb 24th, 2024

In today’s world of high-performance computing, we no longer have the luxury of measuring our time in milliseconds; we need to move past that into nanoseconds. When I first started programming in the ’90s, we measured our code in seconds. That quickly gave way to milliseconds, even before the ’90s were over. Computing power has reached new and incredible thresholds. Today’s systems need to be evaluated at the nanosecond level, not the millisecond level.

We have systems that need to perform not millions, but billions or trillions of transactions. We are not yet at a point where we can expect a single service to perform a trillion transactions in a minute, so we have to look at our systems as a whole to get there. This article discusses some of the steps needed to reach those levels.

Estimating Load

Estimating the load your application must handle might be the most difficult part of any system. Is there a right way or a wrong way? The only wrong way is to just guess, and unfortunately I see that happening far too often. You will need to tweak the estimate for your environment and design. For the sake of this discussion, I’m going to use nice simple numbers.

Let’s assume we have 5 microservices, and each microservice has 2 REST APIs. Let’s say I sell a product internationally; however, my largest market is the United States, with 60% of sales. My volume is 1,000,000 units a month, and sales are pretty evenly distributed over the month. To complete its setup, each unit needs to hit all 10 APIs in my system. My product is mainly used in the office.

To estimate the number of calls my microservices need to handle at a time, I look at peak time.

The peak time will be 11 AM EST to 5 PM EST; these office hours overlap across the entire continental United States. That gives me 60% of 1,000,000, or 600,000 units, that could be loading during peak hours each month. With 20 business days in a month, that is 600,000 / 20 = 30,000 units per business day. Then 30,000 / (6 hours * 60 minutes * 60 seconds) = ~1.4 unit setups per second. Because each setup hits all 10 APIs, each API sees about 1.4 calls per second, or roughly 14 calls per second across the whole system.

In this scenario, if each API call can execute in 500ms or less, we not only meet the rate we need but have a buffer. Is this fully accurate? No, it’s not going to be. Does it give you a ballpark to aim for when designing? Yes, it does. If you want a more accurate way to estimate your needs, hire a statistician to calculate it for you.
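The arithmetic above can be sketched in a few lines of Java. This is only a rough illustration using the example numbers from this article; the class and method names are made up for the sketch, and the capacity figure assumes a single worker handling calls one after another.

```java
// Hedged sketch of the load estimate; every input is one of the article's example numbers.
final class LoadEstimate {

    /** Average unit setups per second during the daily peak window. */
    static double setupsPerSecond(long unitsPerMonth, double regionShare,
                                  int businessDays, int peakHours) {
        double unitsPerDay = unitsPerMonth * regionShare / businessDays;
        double peakSeconds = peakHours * 60.0 * 60.0;
        return unitsPerDay / peakSeconds;
    }

    /** Calls per second that one sequential worker can serve at the given latency. */
    static double capacityPerWorker(double latencyMs) {
        return 1000.0 / latencyMs;
    }

    public static void main(String[] args) {
        double setups = setupsPerSecond(1_000_000, 0.60, 20, 6);
        int apisPerSetup = 10; // each setup touches all 10 APIs
        System.out.printf("setups per second (also calls per API): %.2f%n", setups);
        System.out.printf("calls per second across all APIs: %.1f%n", setups * apisPerSetup);
        System.out.printf("one worker at 500 ms serves %.1f calls per second%n",
                capacityPerWorker(500));
    }
}
```

Changing the region share, the peak window, or the latency target shows quickly how much buffer is left before a single worker per API is no longer enough.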

Microservices vs. Monolith Systems

Let’s discuss these two system designs. Each has a number of pros and cons.


Microservices

An approach to software architecture that divides a large, complex system into smaller individual business processes that run independently but work together.

Pros

  • Independent load balancing
  • Scale business processes horizontally
  • Small codebase, easier for onboarding and learning
  • Easy to update

Cons

  • Interservice communication adds latency

If I spend more time communicating with other microservices than in the microservice originally called, my architecture is wrong.

Monolith

A single system that contains all the business processes of an entire system. One large-scale system combined into a single process that performs all of the work necessary to complete the processing required.

Pros

  • No interprocess communication required
  • Can be scaled horizontally or vertically
  • Single database

Cons

  • A system failure brings the entire system down.
  • Requires bringing down the entire system to update.

If more than 10% of my API calls are around a single business process, then I may need to split that out as its own microservice.

Monolith with microservices

I, for one, never settle for saying one method is better than another. I often find that a compromise between different approaches is the better solution. Maybe a portion of your application can run as a monolith, with certain business processes split off into separate microservices. Or maybe you have one business process that is particularly large for a microservice, but splitting it up is going to cost you in service-to-service communication. You then end up with a Monoservice: a single business process, but one that is extremely large for a microservice.

Microservices Fanatics

You all will say that a Monoservice just was not properly divided into business processes. On the surface, this would appear true. However, there are business processes that are extremely large and do not split easily without generating a vast number of microservice-to-microservice calls, or a mass of messaging for downstream services, which can cause their own issues. I’ve worked on a system that we tried many times to split, looking for a dividing line. While it looked good on paper, in practice it did not work. The dependence was too high: we spent more time calling other microservices than we spent in our original service.

Horizontal vs. Vertical Scaling

Vertical Scaling

Adds computing resources such as CPUs, memory, network, and disk space. Nothing changes in the code; it simply runs on a system with higher specs.

Horizontal Scaling

Adds instances of the system to run in parallel. Code may have to be modified to support synchronization between the processes on shared resources.

Both methods are valid, and both will work. However, sticking to a single strategy without considering the other is not a good option. I have something I call Diagonal Scaling.

Diagonal Scaling

This is looking at your cost analysis for scaling. Adding resources to an instance incurs one cost, while adding more instances incurs another. Diagonal scaling means weighing the two against each other and choosing the mix that meets your load at the lowest cost.
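One way to picture that cost analysis is as a small comparison in Java. This is only a sketch of one possible reading of Diagonal Scaling: the instance names, prices, and capacities are hypothetical, and it assumes capacity scales linearly with instance count, which real systems rarely do exactly.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class DiagonalScaling {

    /** One instance size: hourly cost and the calls per second it can sustain. */
    static final class InstanceType {
        final String name;
        final double costPerHour;
        final double callsPerSecond;

        InstanceType(String name, double costPerHour, double callsPerSecond) {
            this.name = name;
            this.costPerHour = costPerHour;
            this.callsPerSecond = callsPerSecond;
        }
    }

    /** A candidate deployment: some number of instances of one size. */
    static final class Plan {
        final InstanceType type;
        final int instances;

        Plan(InstanceType type, int instances) {
            this.type = type;
            this.instances = instances;
        }

        double costPerHour() {
            return type.costPerHour * instances;
        }
    }

    /** Sizes each instance type to the target load and returns the cheapest plan. */
    static Optional<Plan> cheapest(List<InstanceType> types, double targetCallsPerSecond) {
        return types.stream()
                .map(t -> new Plan(t, (int) Math.ceil(targetCallsPerSecond / t.callsPerSecond)))
                .min(Comparator.comparingDouble(Plan::costPerHour));
    }

    public static void main(String[] args) {
        // Hypothetical prices and capacities, for illustration only.
        List<InstanceType> types = List.of(
                new InstanceType("small", 0.10, 100),
                new InstanceType("medium", 0.18, 220),
                new InstanceType("large", 0.40, 450));
        cheapest(types, 400).ifPresent(p ->
                System.out.printf("%d x %s at %.2f per hour%n",
                        p.instances, p.type.name, p.costPerHour()));
    }
}
```

With these made-up numbers, neither the purely vertical choice (one large instance) nor the purely horizontal one (four small instances) is the cheapest; the middle option wins.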

Practicality vs. Design

This is one of my favorite topics. I have seen many designs on paper that look amazing, even beautiful, in their flow, layout, and synchronization. Some I would rate a solid 15 on a scale of 10. These designs were so well thought out, with attention to detail beyond anything you could imagine. Honestly, I have seen some that I am jealous of, wishing I could say I designed them.

The problem with these is that they are not practical. On paper they work, but in implementation they lead to many issues.

  1. Complexity: when everything is interconnected, a minor change introduced at one level can cause an error or exception elsewhere, in a seemingly unrelated piece of code.
  2. Flexibility: it is often given up when a system is specified to such a level of detail. It becomes difficult to modify something as simple as a data structure without the change affecting a large number of components.


Object Reuse

Whether it’s a database connection, an HttpClient, or dozens of other possibilities, you need to find a way to reuse as many components as you possibly can. Reusing objects is a significant key to reducing the milliseconds necessary for processing.

Network connections take time to establish and, depending on what you’re connecting to, can tie up a limited pool of connections. Keeping one around for reuse removes that setup time and significantly speeds up your code. Properly closing a connection when you are entirely done with it is just as important.

HttpClient under Spring can, under the right circumstances, take a significant performance hit on initialization. I’m not sure of the exact underpinnings, but I have seen it take minutes for an HttpClient to be created. Crazy? I thought so as well; it took me a week of investigating to find the exact reason it was failing. Whether it’s an HttpClient or a RestTemplate, make reuse something you do!
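A minimal sketch of that reuse, using the JDK’s own java.net.http.HttpClient (Java 11+) as a stand-in for whichever client your project actually uses. The holder class name and the 5-second timeout are assumptions made for the example, not recommendations.

```java
import java.net.http.HttpClient;
import java.time.Duration;

final class SharedHttp {

    // Built once when the class loads. An HttpClient is immutable once built
    // and can be used to send many requests, so every caller shares this one.
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5)) // assumed value for the sketch
            .build();

    private SharedHttp() {
        // no instances; callers go through client()
    }

    /** Returns the shared client instead of constructing a new one per request. */
    static HttpClient client() {
        return CLIENT;
    }

    public static void main(String[] args) {
        // Both calls hand back the same object, so setup cost is paid only once.
        System.out.println(client() == client());
    }
}
```

In a Spring application the same idea is usually expressed by declaring the client as a singleton bean and injecting it, rather than building a new one inside each method.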


Closing Resources

Properly closing resources when they are no longer needed can significantly reduce resource usage. Leaving them open can lead to problems instantiating new objects that need those limited resources. Make use of try-with-resources (Tip 17); it ensures close() is called whether the block completes normally or throws.
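Here is a small sketch of try-with-resources doing that cleanup. The Tracked class is a hypothetical resource invented for the example so that the order of events can be recorded; a real connection or statement would take its place.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseDemo {

    /** Hypothetical resource that records when it is opened and closed. */
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    /** Opens two resources, fails inside the block, and returns what happened. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (Tracked conn = new Tracked("connection", log);
             Tracked stmt = new Tracked("statement", log)) {
            throw new IllegalStateException("work failed");
        } catch (IllegalStateException e) {
            // Both resources are already closed by the time we get here.
            log.add("caught " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The resources are closed in the reverse of the order they were opened, and that happens before the catch block runs, so no cleanup code is needed in the handler.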

Hibernate / Database

Most projects will be using Hibernate to talk to their databases. It’s a great tool, but how we use it can make a significant difference in how well it performs. So let’s take a quick look at a few things with it.

Here are some details on my Java Tips that go over ways to boost performance:

Java Tips 3

  • Tip 11: Use Hibernate Statistics
  • Tip 12: Hibernate Slow Query Log
  • Tip 15: Pad Hibernate Parameters in “IN” clause

Java Tips 4

  • Tip 16: Perform Bulk operations with Native Queries, or Stored Procedure

Java Tips 5

  • Tip 21: Use Prepared Statements

These are simple steps you can take to help identify and fix slow database actions. Make use of them to improve your performance.
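To give a feel for the idea behind Tip 15, here is a plain-Java sketch of padding an IN clause: the parameter list is grown to the next power of two by repeating the last value, so far fewer distinct SQL statements are generated and statement caches are hit more often. This is an illustration only, not Hibernate’s code; as far as I know Hibernate does the equivalent itself when hibernate.query.in_clause_parameter_padding is enabled. The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class InClausePadding {

    /** Pads the values to the next power of two by repeating the last value. */
    static <T> List<T> pad(List<T> values) {
        if (values.isEmpty()) {
            return values;
        }
        int target = Integer.highestOneBit(values.size());
        if (target < values.size()) {
            target <<= 1; // size was not a power of two, so round up
        }
        List<T> padded = new ArrayList<>(values);
        T last = values.get(values.size() - 1);
        while (padded.size() < target) {
            padded.add(last); // duplicates do not change the result of IN (...)
        }
        return padded;
    }

    /** Builds the JDBC placeholder list for the given number of parameters. */
    static String placeholders(int count) {
        return String.join(", ", Collections.nCopies(count, "?"));
    }

    public static void main(String[] args) {
        List<Integer> ids = pad(List.of(11, 12, 13, 14, 15));
        System.out.println("WHERE id IN (" + placeholders(ids.size()) + ")");
        System.out.println("bound values: " + ids);
    }
}
```

Lists of 5, 6, 7, or 8 ids all end up sharing the same eight-placeholder statement instead of producing four different ones.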


Logging

This is often overlooked, and many developers don’t understand the cost. While the extra processing of a single log print may not seem like much, it’s the cumulative effect that adds up. Just a few steps can make a significant improvement overall. Please see Tip 4: Logging with Formatting on how to use formatting to your advantage. Also see the often-overlooked Tip 6: Don’t call .toString() on slf4j logging calls; calling it costs the same extra work, and people don’t realize it’s not necessary.
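The cost can be made visible with a small experiment. To keep the sketch self-contained it uses java.util.logging from the JDK rather than slf4j; the slf4j equivalent of the deferred form is a call such as log.debug("processing {}", order). The Order class and the logger name are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class LazyLogging {

    private static final Logger LOG = Logger.getLogger("lazy.demo");

    static {
        LOG.setLevel(Level.INFO); // FINE messages are disabled
    }

    /** Hypothetical payload that counts how often it is rendered to text. */
    static final class Order {
        int renders = 0;

        @Override
        public String toString() {
            renders++;
            return "Order#42";
        }
    }

    /** Eager: the message is concatenated before the logger checks the level. */
    static int rendersWithConcatenation() {
        Order order = new Order();
        LOG.fine("processing " + order);
        return order.renders;
    }

    /** Deferred: the argument is only formatted if the record is published. */
    static int rendersWithParameter() {
        Order order = new Order();
        LOG.log(Level.FINE, "processing {0}", order);
        return order.renders;
    }

    public static void main(String[] args) {
        System.out.println("concatenation renders: " + rendersWithConcatenation());
        System.out.println("parameter renders: " + rendersWithParameter());
    }
}
```

Neither call writes anything, because FINE is disabled, yet the concatenated version still pays for toString() every time. Multiply that by every disabled log statement on a hot path and the cumulative effect becomes clear.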

Java Programming Tips

I’m not going to go over tips on programming Java for performance here; instead, please see my Java Tips series, an ever-growing list of tips on Java programming for both performance and bug avoidance.

Performance Testing

If you’re building large transaction systems, then nothing is more important than implementing performance testing from the start. This begins with the developer, and the tool that I recommend is Apache JMeter. REST APIs are the cornerstone of large transaction systems. JMeter provides a system for making those API calls with multiple threads as if they were multiple clients. Being able to run dozens, hundreds, or thousands of clients against your system to test stability and performance is priceless. Begin with this tool from the start and keep extending it as you develop your application.

Often not considered for performance testing is JUnit/Mockito. Have your JUnit tests in place to perform testing of the code, but take it to the next level by adding in performance testing. Use Mockito to mock remote services, so you only test locally.
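A rough sketch of what such a local timing check might look like. To stay self-contained it uses a hand-written stub where a real project would use a Mockito mock, and a plain method where a real project would use a JUnit test; PriceService, total, and the iteration counts are all hypothetical. For rigorous microbenchmarks a dedicated harness such as JMH is the safer choice, since the JIT can distort simple loops like this one.

```java
final class PerfCheck {

    /** Hypothetical remote dependency; a Mockito mock would normally stand in for it. */
    interface PriceService {
        long priceInCents(String sku);
    }

    /** Code under test: totals the price of each SKU via the remote service. */
    static long total(PriceService service, String... skus) {
        long sum = 0;
        for (String sku : skus) {
            sum += service.priceInCents(sku);
        }
        return sum;
    }

    /** Average nanoseconds per run, measured after a warm-up phase. */
    static long averageNanos(Runnable work, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            work.run(); // let the JIT compile the hot path before timing
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            work.run();
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        PriceService stub = sku -> 250; // no network: only local code is measured
        long avg = averageNanos(() -> total(stub, "a", "b", "c"), 10_000, 100_000);
        System.out.println("average ns per call: " + avg);
    }
}
```

In a test, the measured average would be compared against a budget chosen for the project, so a regression in the local code fails the build before it ever reaches a load test.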

There are plenty of methods and frameworks for testing performance. Find one, and make use of it in your project.


Conclusion

Twenty years ago, processing a million transactions a day seemed unlikely. Ten years ago, a billion transactions weren’t likely. Today, trillions of transactions are happening. Application is no longer the correct term for these systems. They are systems: very complex, high capacity, and requiring extreme performance. I hope I have shown you how the small choices we make can impact our performance.

Just think: across a trillion transactions, losing 1 nanosecond per transaction costs 16.67 minutes. Keep that in mind: every nanosecond we lose at a trillion transactions is nearly 17 minutes of time. Let me say that one more time:

1 Nanosecond @ 1 Trillion Transactions = 16.67 minutes

1 Millisecond @ 1 Trillion Transactions = 31.7 years

Let that sink in. 1ms given up over 1 trillion transactions = 31.7 years; we can no longer give up a millisecond. We have to measure at the nanosecond level and optimize like never before. That means we have to look for every way we can imagine to boost performance. Everything has to be watched.
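For anyone who wants to check those two figures, here is a short sketch using java.time.Duration. The class and method names are made up, and the year length of 365.25 days is an assumption of the sketch.

```java
import java.time.Duration;

final class LostTime {

    /** Total time lost when every transaction gives up the same small amount. */
    static Duration totalCost(Duration perTransaction, long transactions) {
        return perTransaction.multipliedBy(transactions);
    }

    public static void main(String[] args) {
        long trillion = 1_000_000_000_000L;

        Duration fromNanos = totalCost(Duration.ofNanos(1), trillion);
        Duration fromMillis = totalCost(Duration.ofMillis(1), trillion);

        // 1,000 seconds, expressed in minutes
        System.out.printf("1 ns x 1 trillion = %.2f minutes%n",
                fromNanos.getSeconds() / 60.0);

        // 1,000,000,000 seconds, expressed in years of 365.25 days
        double secondsPerYear = 365.25 * 24 * 3600;
        System.out.printf("1 ms x 1 trillion = %.1f years%n",
                fromMillis.getSeconds() / secondsPerYear);
    }
}
```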

By Jeffery Miller

I am known for being able to quickly decipher difficult problems to assist development teams in producing a solution. I have been called upon to be the Team Lead for multiple large-scale projects. I have a keen interest in learning new technologies, always ready for a new challenge.
