The Role of Data Science in Transforming Trucking
By Deepak Warrier
With approximately 75% of goods transported by truck and over 7 million trucks on the road, the logistics economy in India is dominated by road freight. Small fleet operators who own fewer than 5 trucks are the backbone of this economy. However, trucking is a complex landscape that comes with its own challenges and inefficiencies.
On average, most trucks run only 15-16 of the 30 billable days. Full utilization of assets is practically impossible in this segment, primarily because finding a load is difficult. It takes 1 to 3 days to get a load, and many trucks end up running empty with no return load. In addition, unpredictable waiting times during loading and unloading and volatile pricing pose fundamental challenges. Technology can play an immense role in addressing the challenges and inefficiencies plaguing the ecosystem.
Data Science, Machine Learning and AI have evolved significantly in the last decade, witnessing widespread mainstream deployments across industries and delivering true ROI. In logistics in particular, Data Science has emerged as a force multiplier in organizing and transforming the trucking domain in India.
To understand the impact of Data Science in trucking, let us look at two scenarios. A trucker carrying a Full Truck Load (FTL) typically travels an average of 600 km per trip, moving goods between states over 2-4 days. In scenario 1, a trucker is moving coal from Kandla, Gujarat to Muzaffarnagar, UP, covering 1,200 km over 4 days. On arrival at Muzaffarnagar, unloading takes anywhere between 3 and 12 hours, depending on queues at the destination warehouse. After unloading, the trucker starts calling his or her contacts to find a good return load for the next trip. In a traditional, unorganized setting, it is normal for a trucker to wait 12-48 hours to confirm the next load and then travel another 50-150 km empty to pick it up, which is known as a ‘dry run’. For example, after delivering at Muzaffarnagar, a typical return trip might start from Karnal, Haryana, 80 km away.
Let’s contrast this with the ideal behavior on a technology platform powered by data and algorithms. With Data Science, we can build models that anticipate when a trucker will finish a delivery and need another load on the return route. The platform can recommend relevant loads before the trucker even completes unloading, whereas in the offline world the trucker has to call around for the next load only after unloading is done.
Now, in scenario 2, suppose the same truck traveling from Kandla to Muzaffarnagar is fitted with a GPS device that sends a data ping every 10 seconds, feeding a comprehensive suite of Machine Learning algorithms. The Data Science effort begins with giving meaning to the raw GPS trajectory streamed from the device into the digital platform’s data lake. As the platform observes the trajectory of pings moving away from Kandla and approaching Muzaffarnagar, it can continuously predict the state of the truck. It can deploy deep-learning-based spatio-temporal sequence models to determine whether the truck is still ‘in-transit’, has entered the ‘unloading-complete’ state, or has moved into the ‘is-available for the next load’ state. Based on the model's confidence that the truck will be available soon, an efficient match-making engine can proactively recommend new loads according to their proximity and relevance to the truck.
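The state-tracking idea above can be sketched in a few lines. This is only an illustration: it swaps the deep learning sequence model for a simple geofence rule, and the `Ping` type, the `truck_state` function, the state label 'at-destination', the radius thresholds and the coordinates are all hypothetical.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Ping:
    t: float    # seconds since the trip started
    lat: float  # degrees
    lon: float  # degrees


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def truck_state(pings, dest, arrive_km=1.0, depart_km=2.0):
    """Rule-based stand-in for a learned sequence model (assumption).

    'in-transit'     : no ping yet within arrive_km of the destination
    'at-destination' : arrived and still within depart_km
    'available'      : arrived earlier and has since moved beyond depart_km
    """
    state = "in-transit"
    for p in pings:
        d = haversine_km(p.lat, p.lon, dest[0], dest[1])
        if state == "in-transit" and d <= arrive_km:
            state = "at-destination"
        elif state == "at-destination" and d > depart_km:
            state = "available"
    return state


# Illustrative coordinates only, not a surveyed warehouse location.
dest = (29.47, 77.70)
trip = [Ping(0, 29.37, 77.70), Ping(10, 29.471, 77.70), Ping(20, 29.50, 77.70)]
print(truck_state(trip[:1], dest), truck_state(trip[:2], dest), truck_state(trip, dest))
```

A learned model would replace the fixed thresholds with probabilities over states, which is what lets the match-making engine act on confidence rather than on a hard rule.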
These recommended loads are generated by relevance models trained on truckers’ historical click-stream data. Machine learning models pick up signals every time a trucker engages with the platform to search for loads. The models learn to decode these signals to build a profile of each trucker’s latent preferences across various entities: geographical lanes, product types, price points, shippers, ratings and so on.
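As a rough illustration of profile building, the sketch below counts how often a trucker engages with each attribute value and turns the counts into shares. It is a deliberately simple frequency model, not the relevance model described above; the event format, `build_profile` and `relevance` are hypothetical names chosen for this example.

```python
from collections import Counter, defaultdict


def build_profile(events):
    """Turn click events into per-trucker preference shares.

    events: iterable of (trucker_id, field, value) tuples,
            e.g. ("t1", "lane", "Kandla->Muzaffarnagar").
    Returns {trucker_id: {field: {value: share of clicks}}}.
    """
    counts = defaultdict(lambda: defaultdict(Counter))
    for trucker, field, value in events:
        counts[trucker][field][value] += 1
    profile = {}
    for trucker, fields in counts.items():
        profile[trucker] = {
            field: {v: n / sum(c.values()) for v, n in c.items()}
            for field, c in fields.items()
        }
    return profile


def relevance(profile, trucker, load):
    """Average the trucker's shares over the load's attributes (unseen = 0)."""
    if not load:
        return 0.0
    prefs = profile.get(trucker, {})
    return sum(prefs.get(f, {}).get(v, 0.0) for f, v in load.items()) / len(load)


events = [
    ("t1", "lane", "Kandla->Muzaffarnagar"),
    ("t1", "lane", "Kandla->Muzaffarnagar"),
    ("t1", "lane", "Kandla->Muzaffarnagar"),
    ("t1", "lane", "Karnal->Kandla"),
    ("t1", "product", "coal"),
    ("t1", "product", "coal"),
]
profile = build_profile(events)
print(relevance(profile, "t1", {"lane": "Kandla->Muzaffarnagar", "product": "coal"}))
```

A production model would likely weight recent clicks more heavily and learn interactions between attributes, but the output has the same shape: a score per trucker and load pair.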
This profile helps allocate relevant loads to the right type of trucker. Optimization algorithms run at frequent intervals to distribute fresh loads among truckers, subject to availability, proximity and relevance constraints.
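The allocation step can be viewed as an assignment problem: give each available truck at most one load so that total relevance is as high as possible. The sketch below solves a tiny instance by exhaustive search, which is an assumption made for readability; a real platform would use a scalable solver. The `allocate` function and the score matrix are hypothetical.

```python
from itertools import permutations


def allocate(scores):
    """Find the truck-to-load assignment with the highest total score.

    scores[i][j] is the score for giving load j to truck i, or None when
    the pair is infeasible (e.g. truck unavailable or too far away).
    Returns (best_total, assignment), where assignment[i] is the load
    index given to truck i, or (-inf, None) if nothing is feasible.
    Brute force: only suitable for very small instances.
    """
    n_trucks = len(scores)
    n_loads = len(scores[0]) if scores else 0
    if n_trucks > n_loads:
        raise ValueError("this sketch assumes at least as many loads as trucks")
    best_total, best = float("-inf"), None
    for perm in permutations(range(n_loads), n_trucks):
        pairs = [scores[i][j] for i, j in enumerate(perm)]
        if any(s is None for s in pairs):
            continue  # skip assignments that use an infeasible pair
        total = sum(pairs)
        if total > best_total:
            best_total, best = total, perm
    return best_total, best


# Truck 0 slightly prefers load 0, but truck 1 can only do well on load 0.
scores = [[0.90, 0.80],
          [0.85, 0.10]]
total, assignment = allocate(scores)
print(assignment)  # (1, 0): truck 0 takes load 1, truck 1 takes load 0
```

The example shows why allocation is done jointly rather than truck by truck: handing each truck its own top pick first would leave truck 1 with a poor match and a lower total.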
The end goal is that any trucker engaging on the platform already knows his next load before he completes unloading. These recommendations are optimized to improve discovery of loads while significantly reducing a truck's idle time and cutting potential dry-run kilometres. The aim is to improve revenues as well as reduce operating costs for the truck operator. A well-optimized digital freight network can empower a truck operator to better utilize his assets, reduce idle time by 40%-50%, and improve earnings by 20%-30%.
However, Data Science and machine learning modules need to evolve constantly. Algorithms must be continuously re-calibrated and re-optimized based on both implicit and explicit feedback from customers, in this case the truckers who respond to the algorithms' outputs. This ensures that the match-making engine gets consistently better at matching the right trucker with the right load at the right time, improving the rate of deal closures on the platform.
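One simple way to picture this re-calibration is an incremental update that nudges a preference score toward each new piece of feedback, trusting explicit feedback (such as accepting or rejecting a load) more than implicit feedback (such as viewing or ignoring it). This is a minimal sketch under those assumptions; the `recalibrate` function, the feedback labels and the learning rates are hypothetical.

```python
# Assumed learning rates: explicit feedback moves the score more than implicit.
RATES = {"explicit": 0.3, "implicit": 0.1}


def recalibrate(score, feedback):
    """Update a preference score in [0, 1] from a stream of feedback.

    feedback: iterable of (kind, positive) pairs, where kind is
    'explicit' or 'implicit' and positive is True or False.
    Each event moves the score a fraction of the way toward 1 or 0.
    """
    for kind, positive in feedback:
        target = 1.0 if positive else 0.0
        score += RATES[kind] * (target - score)
    return score


# A trucker accepts a load on a lane, then later ignores a similar suggestion.
print(recalibrate(0.5, [("explicit", True), ("implicit", False)]))
```

Because each update is a weighted average of the old score and a value in [0, 1], the score stays within that range however much feedback arrives.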
Blog design: Puranjai Pratap
This article was originally published in ET CIO