I'm a data scientist and machine learning enthusiast passionate about transforming complex data into actionable insights. With extensive experience deploying scalable AI solutions on cloud platforms and a strong background in statistical analysis and data visualization, I'm dedicated to delivering data-driven solutions that improve business outcomes.
- 📍 I'm based in Canada (open to remote work and relocation)
- 🌱 I’m currently advancing my knowledge in Generative AI, Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG).
- 👯 I’m looking to collaborate on open-source projects and innovative AI solutions.
- 💬 I'm happy to chat about R, Python, Machine Learning, Data Analysis, Chatbots, LLMs, Natural Language Processing, Deep Learning, MLOps and Cloud Computing.
- 📫 You can reach me via email: [email protected]
- ⚡ Fun fact: I enjoy travelling, hiking and exploring new technologies in my free time.
- Duration: Aug '24 — Present
- Location: Montreal, Canada
- Package Development: Built and deployed custom Large Language Model (LLM) software for research synthesis and high-level text summarization, using open-source LLMs (Llama) to streamline systematic literature reviews.
- Automated Summarization: Developed advanced summarization capabilities, enabling researchers to extract key insights from large volumes of literature efficiently, thereby accelerating systematic reviews.
- Scalability and Modularity: Structured the package with a modular design to support flexibility in research applications, enabling easy scaling and integration with diverse research workflows.
- Documentation and Usability: Curated detailed documentation, including example workflows and best practices, to assist users in quickly adopting and customizing the tool for their specific research needs.
- Version Control and Community Engagement: Leveraged GitHub for version control, encouraging collaboration and community contributions to continuously improve functionality and user experience.
- Duration: Jul '24 — Present
- Location: Montreal, Canada
- Package Development: Developed a comprehensive Python package named GDriveOps to streamline interactions with Google Drive, enabling seamless file operations such as download, upload, and conversion.
- File Management Functions: Implemented core functions for downloading PDFs from Google Drive, converting PDFs to text, and uploading text files back to Google Drive, ensuring efficient file handling and processing.
- Error Handling and Optimization: Integrated robust error handling and optimization techniques so the package performs reliably under a range of conditions and minimizes disruptions during file operations.
- Unit Testing: Designed and implemented extensive unit tests to ensure the reliability and correctness of the package's functionalities, achieving high code coverage and maintaining code quality.
- Documentation and Usability: Provided comprehensive documentation, including usage examples and detailed instructions, to facilitate ease of use and quick integration for end-users and developers.
- Modular Design: Adopted a modular design approach, allowing for easy maintenance, extension, and customization of the package, catering to different user requirements and evolving project needs.
- Version Control and Collaboration: Utilized GitHub for version control and collaborative development, maintaining a well-structured and organized repository to support community contributions and continuous improvement.
- API Integration: Implemented flexible API integration capabilities to enhance the package's interoperability with other tools and services, providing users with a versatile and powerful solution for Google Drive operations.
- Duration: Jun '24 — Present
- Location: Montreal, Canada
- Developed an AI-powered chatbot (using large language models) that provides users with detailed information about scholarships, including eligibility criteria, application processes, and deadlines. Hosted the chatbot on Heroku, allowing users to ask questions and receive personalized scholarship guidance.
- Design and Implementation: Designed the chatbot’s architecture and developed it using large language models. Created a detailed system prompt to guide the chatbot's interactions, ensuring it provides accurate and relevant information.
- Integration: Integrated LangChain for natural language processing and used OpenAI and Groq for the AI model.
- Backend Development: Utilized Flask to build the backend infrastructure, including creating APIs for handling user requests and responses. Implemented routes and request handlers to manage the chatbot’s functionalities.
- Deployment: Deployed the chatbot on Heroku, making it accessible to users online. Documented the entire project on GitHub, including setup instructions and usage details.
- User Interaction and Experience: Implemented a conversation memory feature to maintain context over multiple interactions. Enhanced user experience by applying clickable link formatting to responses.
- Achievements: Successfully created a functional chatbot that assists users in finding and applying for scholarships. Enhanced user experience by implementing clickable link formatting and a structured interaction process.
- Technologies: Python, LangChain, OpenAI, Groq, Flask, Heroku, GitHub
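The clickable link formatting mentioned above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `linkify` helper and `URL_PATTERN` are hypothetical names, and it assumes responses are plain text that gets rendered as HTML in the chat window.

```python
import html
import re

# Hypothetical pattern: an http(s) URL running up to whitespace, quotes, or angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def linkify(text: str) -> str:
    """Escape text for HTML and wrap each URL in an anchor tag."""
    parts = []
    last = 0
    for match in URL_PATTERN.finditer(text):
        # Keep trailing sentence punctuation outside the link.
        url = match.group(0).rstrip(".,;:!?)")
        end = match.start() + len(url)
        # Escape the plain text before the URL so it cannot inject markup.
        parts.append(html.escape(text[last:match.start()]))
        safe = html.escape(url, quote=True)
        parts.append(f'<a href="{safe}" target="_blank" rel="noopener">{safe}</a>')
        last = end
    parts.append(html.escape(text[last:]))
    return "".join(parts)


if __name__ == "__main__":
    print(linkify("Apply at https://example.org/apply."))
```

Escaping before wrapping is a deliberate choice here: model output is untrusted text, so only the anchor tags added by the helper should reach the page as markup.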
- Duration: Apr '24 — Jul '24
- Location: Toronto (Remote), Canada
- Developed an AI-driven chatbot to provide information on diabetes to patients. The chatbot offers users concise and relevant answers to their queries about diabetes management.
- Design and Implementation: Designed and developed the chatbot’s architecture using large language models.
- Integration: Integrated LangChain for natural language processing and Voyage AI for vector embeddings. Used Pinecone for vector storage and retrieval, enhancing the chatbot's response accuracy. Employed the Groq platform to enhance the chatbot’s AI capabilities.
- Deployment: Deployed the chatbot on Streamlit, making it accessible to users online. Managed data storage and retrieval using AWS services and documented the entire project on GitHub.
- Retrieval-Augmented Generation (RAG) Approach: Implemented a RAG approach to improve the quality and relevance of the chatbot’s responses. Combined information retrieval and generation techniques to provide comprehensive and precise answers.
- User Interaction and Experience: Enhanced user experience by applying clickable link formatting to responses and providing pre-signed URLs for additional information. Implemented conversation memory to maintain context over multiple interactions.
- Achievements: Successfully created a functional chatbot that assists patients in understanding and managing diabetes. Improved the chatbot’s user interaction flow, resulting in a more intuitive and helpful user experience.
- Technologies: Python, Streamlit, LangChain, OpenAI, Pinecone, Voyage AI, Groq, AWS (S3, EC2 & Lambda), GitHub
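As a rough illustration of the retrieval step in a RAG pipeline like the one described above, here is a minimal, dependency-free sketch. It is not the project's implementation: the bag-of-words `embed` function stands in for a real embedding model (such as Voyage AI), the in-memory list stands in for a vector store (such as Pinecone), and all function names and sample documents are hypothetical placeholders.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Assemble the retrieved context and the question into one LLM prompt."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    docs = [
        "Placeholder note about blood glucose monitoring.",
        "Placeholder note about meal planning.",
        "Placeholder note about foot care.",
    ]
    print(build_prompt("How often should glucose monitoring happen?", docs, k=1))
```

The generation step is left out on purpose: in a full pipeline the prompt returned by `build_prompt` would be sent to the chosen LLM.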
- Duration: Jan '23 — Apr '23
- Location: Kingston (Remote), Canada
- Developed machine learning models to forecast greenhouse gas emissions in Canada and assess how Canada can meet its climate targets as part of the Global Methane Pledge.
- Data Analysis: Collected and preprocessed historical (30 years) greenhouse gas emission data for Canada. Conducted exploratory data analysis to identify trends and patterns in the data.
- Model Development: Developed time series forecasting models using SARIMA and Prophet to predict future emissions. Validated and optimized the models to ensure accurate forecasts.
- Evaluation and Reporting: Evaluated Canada’s climate policies and their effectiveness in meeting emission targets. Projected future emission scenarios based on different policy interventions. Created visualizations and interactive Jupyter Notebooks to communicate findings.
- Documentation and Collaboration: Documented the entire project process and findings on GitHub. Collaborated with the team to refine models and improve analysis.
- Achievements: Successfully developed accurate forecasting models that provide valuable insights into Canada’s progress toward emission reduction targets. Contributed to policymaking by offering data-driven recommendations and projections.
- Technologies: Python, SARIMA, Prophet, Pandas, Matplotlib, Jupyter Notebooks, GitHub
- Duration: Jun '24 - Jan '25
- Location: Montreal, Canada (Remote)
- Developed an AI voice agent to revolutionize user interactions using Large Language Models (LLMs), Natural Language Processing (NLP), and Generative AI.
- Conducted in-depth research on various LLMs to identify the most suitable models for the AI voice agent and implemented them to optimize performance for voice recognition and response generation.
- Developed and integrated advanced NLP techniques to enhance the AI voice agent's ability to understand and respond to spoken language accurately.
- Designed, trained, and fine-tuned machine learning models to improve the accuracy and responsiveness of the AI voice agent, utilizing various machine learning algorithms and deep learning methods.
- Collected and processed voice data to ensure high-quality inputs for model training and validation and conducted exploratory data analysis to identify patterns and insights.
- Collaborated with cross-functional teams to integrate the AI voice agent with other systems and services, and performed rigorous testing and validation to ensure effective operation in real-world scenarios.
- Continuously monitored the performance of the AI voice agent, implemented updates and optimizations as needed, and stayed current with the latest advancements in LLMs, NLP, and AI technologies.
- Duration: Sep '20 — Jan '25
- Location: Montreal, QC, Canada
- Built and deployed custom Large Language Model (LLM) software for research synthesis and high-level text summarization, using open-source LLMs (Llama) to streamline systematic literature reviews.
- Conducted data-driven research using extensive datasets on species distributions and climate change impacts, applying statistical tools using Python, R, and Bayesian inference for in-depth analysis.
- Developed machine learning models to forecast species distribution under climate change, leveraging regression techniques and random forests to model habitat suitability and migration patterns.
- Applied Bayesian Statistics to model species responses to environmental changes, focusing on how migration patterns influence specialization.
- Collaborated with cross-disciplinary teams (from Canada, USA, UK, and Africa) to integrate ecological forecasting models with real-world environmental data, improving conservation strategies through predictive analytics.
- Duration: May '24 — Oct '24
- Location: Iowa, United States (Remote)
- Built a chatbot using Retrieval-Augmented Generation (RAG) and the LangChain framework to enhance user interaction and support. The project integrates advanced NLP techniques to create a conversational AI capable of understanding and responding to complex queries, improving customer service and engagement.
- Worked on data collection, cleaning, and processing to ensure high-quality data for analysis.
- Assisted in software engineering tasks, contributing to the overall development process.
- Duration: Apr '24 — Present
- Location: Toronto, Canada (Remote)
- Developed AI-driven chatbots to revolutionize patient education using Generative AI and Retrieval-Augmented Generation (RAG) techniques.
- Leveraged Large Language Models (LLMs) and LangChain to create intelligent, responsive conversational agents.
- Utilized vector databases and AWS services (EC2, Lambda, S3) for scalable and efficient chatbot deployment.
- Collaborated with engineers and project leads to design and implement AI solutions that address key challenges in the healthcare sector.
- Identified and analyzed business problems to develop solutions that enhance patient education and engagement.
- Gained comprehensive knowledge of the technical and business aspects of healthcare AI applications.
- Successfully developed and deployed a chatbot that improved patient interaction and education, showcasing practical applications of AI in healthcare.
- Enhanced chatbot performance by integrating advanced RAG techniques, reducing response times and increasing accuracy.
- Demonstrated strong project management skills to ensure timely delivery of AI solutions.
- Implemented and developed machine learning (ML) models and initiatives to drive business value and innovation.
- Stayed current with emerging technologies and integrated them into solutions.
- Duration: Sep '20 — Apr '24
- Location: Montreal, QC, Canada
- Assisted the Professor in delivering course tutorials and laboratory sessions on various biostatistics topics (including Regression Analysis, Correlation Analysis, Analysis of Variance, and Principal Component Analysis) to over 30 students.
- Conducted weekly classes to reinforce students' understanding of statistical concepts and answer questions.
- Provided guidance and feedback to students on weekly assignments, reports, and projects, promoting their comprehension and academic success.
- Graded assignments, exams, and projects in a timely and fair manner, providing constructive feedback to help students improve their skills and interest in R programming.
- Interpreted complex statistical concepts and communicated them effectively to students.
- Encouraged intellectual curiosity and creativity among students through hands-on experimentation with data.
- Duration: Sep '23 — Dec '23
- Location: Montreal, QC, Canada
- Assisted the Professor in teaching Quantitative Research Methods (Statistical Analysis for Social Science Research) to about 30 PhD students using SPSS software. Topics in this course include Hypothesis testing, Analysis of Variance, Correlation, Regression Analysis, Hierarchical Regression, Mixed models, Factor analysis, and Principal Component Analysis.
- Prepared and presented the practical sessions associated with the course and graded submissions.
- Met with students to present practical aspects of the course, including using SPSS to run statistical analyses.
- Demonstrated strong communication skills by explaining complex statistical methods clearly.
- Promoted collaboration and teamwork among students during projects and assignments.
- PhD in Biology (Quantitative Ecology), Concordia University (Sep '20 — Apr '25)
- Professional Certificate in Data Science and Machine Learning, McGill University (Apr '22 — Dec '24)
- Master of Science in IWRM, McGill University (Aug '18 — Oct '19)
- Bachelor of Science in Wildlife & Environmental Management, Osun State University (Sep '11 — Sep '16)
- Oracle Cloud Data Management 2023 Certified Foundations Associate
- Oracle Cloud Infrastructure 2023 Certified Foundations Associate
- Data Science: Machine Learning (edX)
- Introduction to Machine Learning (Vector Institute)
- Excel to Python (Vector Institute)
Feel free to connect with me or explore my projects. I'm always open to new opportunities and collaborations!