Hey, I am Ari

Arihant Surana

Principal Data Systems Engineer at Nearmap

I am a Principal Machine Learning Engineer and Technical Leader with over 12 years of experience building large-scale data products and machine learning and deep learning platforms. My expertise lies in developing global engineering teams and driving tactical and strategic roadmaps that deliver cutting-edge machine learning products. Throughout my career, I have honed my skills in quality software engineering, designing and implementing complex algorithms, data modeling, and data analysis to provide robust and scalable solutions.

Experiences

Principal Machine Learning Engineer
Nearmap

Feb 2021 - Present, Sydney, Australia

Nearmap is a technology company that provides high-resolution aerial imagery, city-scale 3D content, and software as a service (SaaS) solutions to businesses and government entities. The company was founded in 2007 in Australia and has since expanded to the United States, Canada and New Zealand.

Responsibilities:
  • Led the design and implementation of the data engine platform, which automates the accumulation, refinement, and processing of training datasets for Nearmap’s AI product. The platform improves system reproducibility, observability, and reliability by providing a unified data pipeline that integrates heterogeneous data sources such as shape, GIS, imagery, and digital surface models.
  • Developed and automated end-to-end training and inference feedback systems that leverage machine learning algorithms to improve data quality and labeller efficiency. The systems facilitate continuous model improvement through automated feedback loops on model performance and training data quality.
  • Led the design and implementation of a custom labelling data capture system that maximizes training data quality and capability. The system provides an intuitive interface for expert human labellers to label complex GIS and image datasets, resulting in improved accuracy of deep learning models.
  • Developed and executed a multi-year strategic roadmap for Nearmap’s AI data platforms, which included initiatives such as data pipeline automation, continuous model improvement, and data management and discovery. The roadmap drove revenue growth and operational efficiency.
  • Championed a culture of technical excellence and machine learning engineering best practices across the organisation. Led the adoption of software engineering standards, resulting in increased team efficiency and delivery of high-quality solutions.
  • Developed and automated large-scale sampling systems that effectively sampled petabytes of GIS and image data from Nearmap’s imagery archive. The systems ensure data accuracy and integrity by providing a statistically representative sample of the data for model training and validation.

hipages group

February 2018 - January 2021, Sydney, Australia

hipages (ASX:HPG) is Australia’s largest online tradie marketplace, connecting homeowners and businesses with trusted tradies.

Principal Data Architect

September 2020 - January 2021

  • Spearheaded the technical direction for Data Engineering at hipages, managing the Data Platform and its delivery roadmap, and driving the transformation of data engineering capabilities towards a comprehensive data lake and machine learning platform that delivered business-critical analytics and machine learning products.
  • Identified areas of the product that were viable for implementing bespoke machine learning solutions, resulting in better conversion rates and an improved user experience.
  • Designed and built machine learning products that delivered automation and experience improvement, including a machine learning inference API using Python, Hyperopt, and Flask. This solution provided a horizontally scalable REST API that supported A/B test capabilities and integrated model training with a distributed hyperparameter optimizer.
  • Open-sourced the hyperparameter optimization tools used in the machine learning inference API, including the hyperkops and chart-hyperopt repositories on GitHub.
  • Developed a modular event pipeline based on Snowplow technology using Scala, Python, Spark, Hive, and Kafka. This resulted in a significant reduction of the clickstream analytics latency from 24 hours to just 60 seconds, enabling real-time insights and better decision-making.
  • Applied evolutionary architecture principles and delivered agile solutions that impact the business.

Data Architect

February 2018 - August 2020

  • Modernised data infrastructure and BI capabilities and set up the data engineering practice at hipages.
  • Designed and delivered machine learning models that integrate with the product, delivering automation and experience improvement.
  • Built highly scalable machine learning model inference services with Python and Kubernetes.
  • Delivered a scalable, distributed, and configuration-driven data acquisition framework. This framework allows the organisation to process vast amounts of relational data into a scalable data lake ecosystem.
  • Developed a modular event pipeline based on FOSS Snowplow technology using Scala, Python, Spark, Hive, and Kafka. This reduced clickstream analytics latency from 24 hours to 60 seconds. Also contributed to the Snowplow project.
  • Developed a suite of data acquisition and transformation frameworks written in Scala and Python. As a result, total data ingestion increased from less than 5 GB/month to more than 1 TB/month, and analytics adoption grew, with daily active users rising from 5% to 30% of the organisation.

Senior Data Engineer
BigCommerce

August 2016 - February 2018, Sydney, Australia

BigCommerce (NASDAQ:BIGC) is a leading Open SaaS solution that empowers merchants to build, innovate and grow their businesses online. Simply put, we focus on being the best commerce platform so our customers can focus on what matters most: growing their businesses.

Responsibilities:
  • Implemented a real-time stream processing engine that delivered a high-quality consumer data product by leveraging Apache Beam, Cloud Dataflow, BigTable, PubSub, and Cloud Spanner. The system was capable of auto-scaling from 10 events/sec to 100 thousand events/sec in under 5 minutes, providing customers with real-time data insights.
  • Designed and deployed the next generation of merchant analytics and insights products, resulting in over $1 million in revenue on the platform. Built a real-time event analytics platform with Scala, Kafka, HBase, and Phoenix that enables merchants to gain real-time insight into their sales and customers.
  • Implemented an autonomous data audit and anomaly correction system that significantly improved data quality by reducing data anomalies from 5% to less than 0.01%. This system allowed the team to detect and correct data anomalies in real-time, improving overall data accuracy and integrity.
  • Integrated and developed internal data sources into a single source of truth using Python, Java, Spark, and Apache Airflow, resulting in a more resilient system with a significant reduction in failure and incident rates. This improved the overall stability of the data platform and provided better support to internal stakeholders.

Deloitte

Nov 2013 - July 2016, NCR, India & Singapore

Deloitte Touche Tohmatsu Limited, commonly referred to as Deloitte, is a multinational professional services network with offices in over 150 countries and territories around the world. Deloitte is one of the Big Four accounting organizations and the largest professional services network in the world by revenue and number of professionals.

Senior Consultant - Data Science

April 2016 - July 2016

  • As a Senior Consultant for the Technology and Analytics consulting practice at Deloitte, I managed relationships and drove data analytics and engineering projects for clients across multiple industries. Some of my notable contributions include the following.
  • Monetary Authority of Singapore - Data warehouse and financial compliance system
  • As part of the project team for Monetary Authority of Singapore, I delivered a financial audit and warehousing system. This system reduced the annual audit time from 2-3 months down to just 15 days, resulting in increased operational efficiency.
  • GE Healthcare, US, India - Digital transformation and data governance
  • I played a key role in the digital transformation and data governance project for GE Healthcare across the US and India.
  • VMware, India - Data engineering and analytics technology transformation
  • I designed and developed a real-time stream processing platform for VMware’s BI data streams using Java, Hadoop, Spark, Kafka, and Elasticsearch. The implementation of this platform significantly improved the efficiency and effectiveness of VMware’s data engineering and analytics operations.
  • United States Golf Association, US - Digital transformation, cloud modernisation and data governance
  • For United States Golf Association, I led a team of 10 backend engineers to design and implement the backend for a real-time handicap calculation and analytics tool. This was achieved using MS SQL Server, SSIS, and C#.
  • During my time at Deloitte, I collaborated with diverse teams and received two promotions within two years, eventually managing 15 people in engineering teams located across India, Singapore, and the US.

Consultant

April 2015 - April 2016

Analyst

Nov 2013 - April 2015

Software Engineer
BirlaSoft

July 2011 - Oct 2013, NCR, India

Birlasoft (NSE:BSOFT) is a multinational Information Technology services provider. It has a global footprint and best-in-class delivery centers.

Responsibilities:
  • As a backend and data warehouse developer at UTC, I successfully managed and integrated 400 heterogeneous data sources across three continents. Through optimization of critical ETL processes, I was able to reduce processing time by 20x.
  • At GE Healthcare, I developed a data synchronization tool that tracked and predicted maintenance cycles for medical equipment, resulting in a 30% decrease in mean time to failure rate.

Education

Bachelor of Technology in Computer Science (Hons.)
