Phi 21 Technologies

Architecture Consulting to address performance and scalability challenges in a Fleet Management Platform

Overview

The customer is a leading provider of a Fleet Management Solution that fleet operators use to track, monitor and optimize their vehicles, comply with regulations and ensure fleet safety. As the customer base grew over the years and customers' fleet sizes increased, the performance and scalability of the platform came under stress. The client’s development team was looking at ways to stabilize the current platform. Phi 21 was engaged to perform an architectural assessment of the current platform, identify the root causes of the performance issues and recommend solutions for overcoming them, and to review and provide recommendations on the solutions already under consideration by the client’s development team.

Approach

The project was executed in three key phases over a period of four months, as shown below.

Phase 1 - Discovery

This phase involved assessing the current-state architecture of the fleet management platform using the Phi 21 Architecture Assessment Methodology. Our methodology reviews the system from five different perspectives, as shown below.

As a result of this assessment, Phi 21 gained insight into the platform's use cases, logical and deployment architecture, interactions between components and summary data model, and identified the top 10 use cases with performance challenges.

Phase 2 - Deep Dive

Each of the top 10 use cases with performance challenges was analysed in depth by reviewing its data model, entity relationships, queries, ingress/egress communication endpoints and past performance data. In addition, the analysis covered profiling of business transactions, code-level diagnostics, concurrency-related bottlenecks, the efficiency of cache and connection pool utilization, resource leaks and bottlenecks at the infrastructure layer. The outcome of this exercise was a Root Cause Analysis Report that summarized the learnings from the deep-dive analysis and outlined potential root causes.

Phase 3 - Solutioning

The learnings from the previous phase were used to:

Outline key improvements to stabilize the current platform, as listed below.

Implement data partitioning.
Denormalize key tables to improve read performance.
Introduce batch processing of data instead of processing one record at a time.
Use distributed locking instead of table-based locking.
Improve the query plan caching mechanism.
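
To illustrate the batch processing recommendation above, the following is a minimal sketch, not the client's implementation. It assumes a hypothetical vehicle_position table and uses Python's built-in sqlite3 module purely so the example is self-contained; the case study does not name the platform's database or schema. The idea it shows is general: group records and write each group with one statement call and one commit, rather than paying that overhead for every record.

```python
import sqlite3

# Hypothetical table and column names, chosen only for illustration.
INSERT_SQL = (
    "INSERT INTO vehicle_position (vehicle_id, recorded_at, lat, lon) "
    "VALUES (?, ?, ?, ?)"
)


def make_connection():
    """Create an in-memory database with the illustrative table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vehicle_position ("
        "vehicle_id INTEGER, recorded_at TEXT, lat REAL, lon REAL)"
    )
    return conn


def insert_one_at_a_time(conn, rows):
    """Baseline: one INSERT and one commit per record."""
    for row in rows:
        conn.execute(INSERT_SQL, row)
        conn.commit()


def insert_in_batches(conn, rows, batch_size=500):
    """Batched: one executemany call and one commit per batch.

    Returns the number of batches written.
    """
    batches = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany(INSERT_SQL, batch)
        conn.commit()
        batches += 1
    return batches
```

The batch size is a tuning parameter: larger batches reduce per-commit overhead but hold transactions open longer, so the right value depends on the workload and would need to be measured.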

Process improvements

Define a Performance, Scalability and Reliability (PSR) testing strategy to ensure that every new release of the platform meets the speed, responsiveness, throughput, scalability and stability requirements under the workloads expected in production.
Archive data periodically so that the volume of data the platform manages does not keep growing over the years.
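
The periodic archiving step can be sketched as follows. This is an illustrative example under stated assumptions, not the process adopted by the client: the table names (vehicle_position, vehicle_position_archive), the ISO-8601 text timestamps and the use of Python's built-in sqlite3 module are all hypothetical choices made to keep the sketch self-contained. It moves rows older than a cutoff from the live table to an archive table inside a single transaction, so a failure leaves both tables unchanged.

```python
import sqlite3


def make_tables(conn):
    """Create the illustrative live and archive tables (hypothetical schema)."""
    for name in ("vehicle_position", "vehicle_position_archive"):
        conn.execute(
            f"CREATE TABLE {name} ("
            "vehicle_id INTEGER, recorded_at TEXT, lat REAL, lon REAL)"
        )


def archive_older_than(conn, cutoff_iso):
    """Move rows recorded before cutoff_iso into the archive table.

    Assumes recorded_at holds ISO-8601 strings in one consistent format,
    so plain string comparison matches chronological order.
    Returns the number of rows removed from the live table.
    """
    # The connection context manager commits on success and rolls back
    # if an exception is raised, so the copy and delete succeed together.
    with conn:
        conn.execute(
            "INSERT INTO vehicle_position_archive "
            "SELECT * FROM vehicle_position WHERE recorded_at < ?",
            (cutoff_iso,),
        )
        cur = conn.execute(
            "DELETE FROM vehicle_position WHERE recorded_at < ?",
            (cutoff_iso,),
        )
        return cur.rowcount
```

A scheduled job could call archive_older_than with a rolling cutoff (for example, a fixed retention window) so that the live table stays at a roughly constant size.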

Outcome

The recommended solution identified ways to significantly improve performance over the existing platform.
The recommended solution identified ways to significantly reduce the load on the system and thereby improve the scalability of the platform.
A blueprint for a PSR testing strategy to meet performance and scalability needs.