HINAL DESAI
DATA ENGINEER
Email: [email protected] Contact: 667-379-5616 GitHub LinkedIn
SUMMARY
Data Engineer with 6+ years of experience building and optimizing scalable data models, pipelines, and architectures. Expertise in
data analysis, statistical modeling, and predictive analytics to support business decision-making. Proven ability to improve data quality,
accessibility, and insights while reducing operational costs by 30%. Adept at collaborating with cross-functional teams to deliver data-
driven solutions that enhance business efficiency, improve forecasting accuracy, and drive actionable insights.
EDUCATION
Master's in Data Science Dec 2023
Post-Baccalaureate Certification in Data Science CGPA: 3.93/4
University of Maryland Baltimore County
Bachelor of Computer Science May 2016
Sardar Patel Institute of Technology, University of Mumbai CGPA: 8.53/10
SKILLS
• Methodologies: SDLC, Agile, Waterfall
• Languages & Frameworks: SQL, Python, Scala, Java, JavaScript, C#, Apache Spark, Azure ML
• Data Engineering & Cloud Tools: Apache Kafka, Apache Airflow, Azure Data Factory, AWS Glue, AWS DynamoDB, Microsoft Azure
(Data Lake, Synapse, CosmosDB, Blob Storage, EventHub), AWS (S3, Lambda, Redshift)
• Databases: SQL Server, PostgreSQL, MySQL, SingleStore, CosmosDB
• Data Warehousing & ETL: Azure Synapse Analytics, Snowflake, Talend
• Data Visualization: Power BI, Tableau, Dash
• Machine Learning & Data Analysis: Scikit-learn, TensorFlow, Pandas, NumPy
• Version Control & CI/CD: Git, Jenkins, VSTS, Docker
EXPERIENCE
Xcaliber Health, Boston, MA Aug 2023 – Present
Sr. Data Engineer
• Developed scalable data architecture solutions using AWS Glue, improving data processing speed by 40%.
• Led a team in designing real-time data pipelines, processing over 10 million events per day using Apache Kafka and Airflow.
• Performed data mining and predictive analysis with Python and SQL, identifying trends that improved operational decision-making
by 15%.
• Reduced deployment times by 60% through CI/CD pipelines using Git, increasing system reliability and minimizing manual errors.
• Collaborated with stakeholders to develop custom dashboards in Power BI, enhancing decision-making speed by 20%.
• Developed machine learning models to predict patient outcomes, improving healthcare efficiency and the accuracy of patient care
predictions by 18%.
• Automated infrastructure provisioning using Terraform, improving scalability and reducing setup time by 30%.
• Designed and implemented a highly available AWS DynamoDB solution for real-time data access, reducing data retrieval times by
50% and supporting high-volume transactions.
• Generated insights from complex healthcare datasets through statistical analysis, supporting business strategy formulation and
increasing operational efficiency by 10%.
• Optimized AWS DynamoDB read/write capacity settings to handle fluctuating workloads, improving cost efficiency by 20% while
maintaining low latency for data access.
• Conducted A/B testing on feature deployments, improving user engagement and reducing churn rate by 5%.
• Implemented data quality checks and validation processes, reducing data errors by 25% and ensuring accuracy for reporting.
Microsoft, Hyderabad, India Jun 2016 – Dec 2021
Data Engineer
• Improved platform reliability by 5.4% by stabilizing and optimizing systems using Azure Data Factory and Azure Data Lake for data
ingestion and transformation.
• Performed exploratory data analysis (EDA) to uncover actionable insights from large datasets, contributing to a 10% increase in team
productivity.
• Reduced processing times by 7-8 hours by leveraging Azure Functions for automated job execution and Apache Kafka for real-time
streaming.
• Built machine learning models with Python and Azure ML to predict user behavior and optimize platform performance, resulting in
a 12% improvement in service reliability.
• Created telemetry dashboards and automated monitoring solutions using Azure Synapse, cutting manual oversight time by 4 hours
weekly.
• Designed data visualizations in Power BI to communicate key findings to senior management, improving strategic decision-making.
• Streamlined CI/CD pipelines using VSTS, reducing deployment times from 2-3 hours to 15 minutes across multiple teams.
• Migrated legacy processes to Azure Databricks, reducing execution time from 2 days to 3 hours and boosting system scalability by
6.8%.
• Utilized SQL and Python for ad hoc data analysis, providing timely and accurate reporting that reduced operational inefficiencies by
7%.
• Collaborated with data scientists to develop and deploy predictive models that forecast user demand, leading to a 5% cost reduction
in resource allocation.