Appendix

The document outlines the implementation of a deep learning model for environmental hazard classification using ResNet50, including training, prediction, and API modules. It describes the training pipeline, model architecture, and integration with a FastAPI backend for predictions and report generation. Additionally, it details a Streamlit dashboard for user interaction, allowing users to upload satellite images and receive hazard analysis and real-time environmental data.

Appendix B

Project Snapshots and Description


B.1 Model Definition and Training Module
Description: This module defines the ResNet50-based deep learning model used for environmental hazard classification. It applies transfer learning, fine-tuning only layer4 and the final classification layer of an ImageNet-pre-trained model. The training pipeline covers data loading, augmentation, validation, and checkpointing of the best model based on validation loss. Key design decisions include the Adam optimizer with learning-rate scheduling, CrossEntropyLoss, and seeding all random number generators for reproducible results.

Listing B.1: Model Definition and Training Implementation

```python

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms, models
from torch.utils.data import DataLoader, random_split
import matplotlib.pyplot as plt
from tqdm import tqdm
import random
import numpy as np
import logging

from configs.config import DATA_DIR, BATCH_SIZE, EPOCHS, LR, IMG_SIZE, SEED, MODEL_PATH, DEVICE
from utils.preprocessing import train_transform, val_transform


def get_model(num_classes):
    # Transfer learning: freeze everything outside layer4, replace the classifier head
    model = models.resnet50(pretrained=True)
    for name, param in model.named_parameters():
        if "layer4" not in name:
            param.requires_grad = False
    num_features = model.fc.in_features
    model.fc = nn.Linear(num_features, num_classes)
    return model


def train_model(data_dir=None):
    if data_dir is None:
        data_dir = DATA_DIR

    # Seed all RNGs for reproducibility
    torch.manual_seed(SEED)
    random.seed(SEED)
    np.random.seed(SEED)

    dataset = datasets.ImageFolder(data_dir)
    classes = dataset.classes
    train_size = int(0.8 * len(dataset))
    val_size = len(dataset) - train_size
    train_dataset, val_dataset = random_split(dataset, [train_size, val_size])
    train_dataset.dataset.transform = train_transform
    val_dataset.dataset.transform = val_transform

    train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=2)
    val_loader = DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False, num_workers=2)

    model = get_model(len(classes))
    model = model.to(DEVICE)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=LR)
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", patience=3, factor=0.3)

    best_val_loss = float("inf")
    train_losses = []
    val_losses = []

    for epoch in range(EPOCHS):
        model.train()
        running_loss = 0
        for images, labels in tqdm(train_loader):
            images = images.to(DEVICE)
            labels = labels.to(DEVICE)
            outputs = model(images)
            loss = criterion(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        train_loss = running_loss / len(train_loader)
        train_losses.append(train_loss)

        model.eval()
        val_loss = 0
        with torch.no_grad():
            for images, labels in val_loader:
                images = images.to(DEVICE)
                labels = labels.to(DEVICE)
                outputs = model(images)
                loss = criterion(outputs, labels)
                val_loss += loss.item()
        val_loss = val_loss / len(val_loader)
        val_losses.append(val_loss)

        scheduler.step(val_loss)

        # Checkpoint only when validation loss improves
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            torch.save(model.state_dict(), MODEL_PATH)

    plt.figure(figsize=(8, 5))
    plt.plot(train_losses, label="train")
    plt.plot(val_losses, label="validation")
    plt.title("Training Curve")
    plt.xlabel("Epoch")
    plt.ylabel("Loss")
    plt.legend()
    plt.savefig("model_perf/training_curve.png")
    plt.close()

    return "Training completed"

```

B.2 Prediction Module


Description: This module handles inference using the trained model. It loads the saved model weights and performs classification on uploaded satellite images. The prediction function preprocesses the
input image using the same transforms as validation data and returns the predicted hazard class. Key design decisions include using the model's evaluation mode for inference and ensuring the input
tensor is moved to the appropriate device (CPU/GPU).

Listing B.2: Prediction Implementation

```python

import torch
import logging

from train import get_model
from utils.preprocessing import preprocess_image
from configs.config import CLASSES, MODEL_PATH, DEVICE

# Load the trained weights once at import time
model = get_model(len(CLASSES))
model.load_state_dict(torch.load(MODEL_PATH, map_location=DEVICE))
model = model.to(DEVICE)
model.eval()


def predict(image_file):
    image = preprocess_image(image_file)
    image = image.to(DEVICE)
    with torch.no_grad():
        outputs = model(image)
        _, pred = torch.max(outputs, 1)
    predicted_class = CLASSES[pred.item()]
    return predicted_class

```
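The class-selection step in predict can be illustrated without a trained model: torch.max over dimension 1 of the logits returns the index of the highest-scoring class, which then indexes CLASSES. The logits tensor below is a made-up stand-in for model(image):

```python
import torch

CLASSES = ["hurricane_damage", "normal", "oil_spill", "wildfire"]

# dummy batch of logits standing in for the model output on one image
outputs = torch.tensor([[0.1, 0.2, 2.5, 0.3]])
_, pred = torch.max(outputs, 1)  # index of the highest-scoring class
print(CLASSES[pred.item()])      # oil_spill
```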
B.3 API Module
Description: This module provides a REST API using FastAPI for model training and prediction. It includes endpoints for training the model and predicting hazards from uploaded images. The prediction
endpoint extracts coordinates from image filenames, uses regional defaults if not found, and invokes the LangGraph agent to generate comprehensive reports. Key design decisions include using FastAPI
for automatic API documentation, handling file uploads with proper MIME types, and integrating coordinate extraction for location-aware reporting.

Listing B.3: API Implementation

```python

from fastapi import FastAPI, UploadFile
import logging
import re
from io import BytesIO

from train import train_model
from predict import predict
from agent import build_graph
from configs.config import HAZARD_LOCATIONS

app = FastAPI(title="GaiaGuard API",
              description="AI-powered environmental hazard detection from satellite images")

graph = build_graph()


def extract_coords(filename):
    # Filenames may embed "<lon>,<lat>" or "<lon>_<lat>" as signed decimal degrees
    match = re.search(r"(-?\d+\.\d+)[,_](-?\d+\.\d+)", filename)
    if match:
        lon, lat = float(match.group(1)), float(match.group(2))
        return lat, lon
    return None, None


@app.post("/train")
def train():
    result = train_model()
    return {"status": result}


@app.post("/predict")
async def predict_image(file: UploadFile):
    image_bytes = await file.read()
    hazard = predict(BytesIO(image_bytes))
    lat, lon = extract_coords(file.filename)

    # Fall back to per-hazard regional defaults when no coordinates are found
    if lat is None or lon is None:
        lat, lon = HAZARD_LOCATIONS.get(hazard, (28.6139, 77.2090))

    state = {
        "hazard": hazard,
        "image_bytes": image_bytes,
        "lat": lat,
        "lon": lon
    }
    result = graph.invoke(state)
    return {
        "hazard": hazard,
        "report": result.get("report"),
        "weather": result.get("weather"),
        "seismic": result.get("seismic"),
        "aqi": result.get("aqi"),
        "news": result.get("news"),
        "location": {"lat": lat, "lon": lon,
                     "is_default": lat in [v[0] for v in HAZARD_LOCATIONS.values()]}
    }

```
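The filename convention that extract_coords expects can be demonstrated with the regex alone; the sample filenames below are hypothetical uploads:

```python
import re

def extract_coords(filename):
    # matches "<lon>,<lat>" or "<lon>_<lat>" as signed decimal degrees
    match = re.search(r"(-?\d+\.\d+)[,_](-?\d+\.\d+)", filename)
    if match:
        lon, lat = float(match.group(1)), float(match.group(2))
        return lat, lon
    return None, None

print(extract_coords("wildfire_77.2090_28.6139.png"))  # (28.6139, 77.2090)
print(extract_coords("upload.png"))                    # (None, None)
```

Note that the first number in the filename is read as longitude and the second as latitude, so the return value swaps them into (lat, lon) order.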

B.4 Agent and Report Generation Module


Description: This module uses LangGraph to orchestrate AI-powered report generation. It fetches real-time environmental data (weather, seismic activity, air quality, news) and uses Google Gemini AI to
create comprehensive incident reports that cross-reference visual evidence with ground data. The system includes fallback mechanisms for API failures and simulated data generation. Key design
decisions include using a single-node graph for simplicity, implementing model fallback for API reliability, and synthesizing data rather than just listing it in reports.

Listing B.4: Agent and Report Generation Implementation

```python

# agent.py
from langgraph.graph import StateGraph
from typing import TypedDict

from report import generate_report


class AgentState(TypedDict):
    hazard: str
    report: str
    image_bytes: bytes
    lat: float
    lon: float
    weather: dict
    seismic: dict
    aqi: dict
    news: list


def build_graph():
    # Single-node graph: one report-generation step
    builder = StateGraph(AgentState)
    builder.add_node("report_node", generate_report)
    builder.set_entry_point("report_node")
    builder.set_finish_point("report_node")
    return builder.compile()


# From report.py
import os
import base64
from dotenv import load_dotenv
from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

from tools import (get_weather, get_seismic_data, get_air_quality, get_news,
                   simulate_air_quality, simulate_seismic_activity)
from configs.config import GEMINI_PRIMARY_MODEL, GEMINI_FALLBACK_MODEL

load_dotenv()


def get_llm(model_name):
    return ChatGoogleGenerativeAI(model=model_name, temperature=0.7,
                                  google_api_key=os.getenv("GOOGLE_API_KEY"))


def generate_report(state):
    hazard = state["hazard"]
    image_bytes = state.get("image_bytes")
    lat = state.get("lat")
    lon = state.get("lon")

    weather_data = get_weather(lat, lon) if lat and lon else None
    seismic_data = get_seismic_data(lat, lon) if lat and lon else None
    aqi_data = get_air_quality(lat, lon) if lat and lon else None
    news_data = get_news(hazard, lat, lon) if lat and lon else []

    # Fallback logic for missing data
    if not weather_data:
        weather_data = {"temperature": "25°C", "condition": "Mainly Clear",
                        "humidity": "50%", "wind_speed": "12 km/h"}

    pm25_value = aqi_data.get("pm25") if aqi_data else None
    if pm25_value is None:
        pm25_value = simulate_air_quality(hazard)
    aqi_data = aqi_data or {"location": "Detected Region"}
    aqi_data["pm25"] = f"{pm25_value} µg/m³"

    seismic_count = seismic_data.get("count_last_7d", 0) if seismic_data else 0
    if not seismic_data or seismic_count == 0:
        simulated_count = simulate_seismic_activity()
        if simulated_count > 0:
            seismic_data = seismic_data or {"status": "Simulated local activity"}
            seismic_data["count_last_7d"] = simulated_count
            seismic_data["simulated"] = True
        elif not seismic_data:
            seismic_data = {"count_last_7d": 0, "status": "No significant activity"}

    ground_context = f"""
**Ground Data Context:**
- Weather: {weather_data}
- Seismic Activity: {seismic_data}
- Air Quality (PM2.5): {aqi_data.get('pm25', 'N/A')}
- Recent News: {news_data if news_data else "No significant regional news reported."}
"""

    prompt_text = f"""
The deep learning model detected the following hazard: {hazard}.

{ground_context}

Please perform two tasks:
1. **Verification**: Look at the attached satellite image AND the ground data context. Use your intelligence to cross-reference the visual evidence with the real-time data.
2. **Reporting**: Generate a comprehensive and professional incident report. Crucially, **do not just list the data**; synthesize it.

Structure the report clearly with these sections only:
1. **Incident Summary**: A professional overview of the hazard, incorporating the specific location context and immediate implications.
2. **Ground Condition Analysis**: Analyze how the current weather/AQI/seismic data affects the scale and progression of the incident.
3. **Ecological & Health Assessment**: Detailed assessment of risks to biodiversity, air quality, and local communities.
4. **Recommended Actions**: Clear, actionable steps for response and mitigation.

Do not include headers, metadata, or 'Prepared By' fields. Tone: Professional, data-driven, and urgent. Length: 300-500 words.
"""

    report_content = ""
    models_to_try = [GEMINI_PRIMARY_MODEL, GEMINI_FALLBACK_MODEL]
    for model_name in models_to_try:
        try:
            llm = get_llm(model_name)
            if image_bytes:
                image_data = base64.b64encode(image_bytes).decode("utf-8")
                message = HumanMessage(content=[
                    {"type": "text", "text": prompt_text},
                    {"type": "image_url",
                     "image_url": f"data:image/jpeg;base64,{image_data}"}
                ])
                response = llm.invoke([message])
            else:
                message = HumanMessage(content=f"(Fallback mode - No Image)\n{prompt_text}")
                response = llm.invoke([message])

            if isinstance(response.content, list):
                report_content = "".join([part.get("text", "") if isinstance(part, dict)
                                          else str(part) for part in response.content])
            else:
                report_content = response.content
            break  # stop at the first model that succeeds
        except Exception as e:
            if model_name == models_to_try[-1]:
                report_content = f"Error generating report: {str(e)}"

    return {
        "report": report_content,
        "weather": weather_data,
        "seismic": seismic_data,
        "aqi": aqi_data,
        "news": news_data
    }

```
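The primary/fallback loop at the end of Listing B.4 is a general pattern: try each model in order, keep the first successful response, and surface the error only when the last candidate also fails. A dependency-free sketch of the same control flow, with a hypothetical call_model standing in for get_llm(...).invoke(...):

```python
def call_model(name, prompt):
    # hypothetical stand-in for an LLM call; the primary always "fails" here
    if name == "primary":
        raise RuntimeError("quota exceeded")
    return f"[{name}] report for: {prompt}"

def generate_with_fallback(prompt, models_to_try=("primary", "fallback")):
    result = ""
    for model_name in models_to_try:
        try:
            result = call_model(model_name, prompt)
            break  # first success wins
        except Exception as e:
            if model_name == models_to_try[-1]:
                result = f"Error generating report: {e}"
    return result

print(generate_with_fallback("wildfire"))  # [fallback] report for: wildfire
```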

B.5 Dashboard Module


Description: This module provides a web-based user interface using Streamlit for easy interaction with the GaiaGuard system. Users can upload satellite images, view analysis results, and read generated
reports. The dashboard displays hazard classification, location information, real-time environmental data, and related news. Key design decisions include using a two-column layout for image and results,
expandable sections for detailed data, and integration with the FastAPI backend via HTTP requests.

Listing B.5: Dashboard Implementation

```python

import streamlit as st
import requests
from PIL import Image
from io import BytesIO

st.set_page_config(page_title="GaiaGuard Dashboard", layout="wide")

st.title("GaiaGuard: Environmental Hazard Detection Dashboard")
st.markdown("""
Upload a satellite image to detect potential environmental hazards.
Our AI analyzes the image, cross-references it with **real-time weather, seismic, and air quality data**, and generates a comprehensive professional report.
""")

st.subheader("Image Upload")
file = st.file_uploader("Choose a satellite image (PNG, JPG, JPEG)", type=["png", "jpg", "jpeg"])

if file:
    file_bytes = file.read()
    image = Image.open(BytesIO(file_bytes))
    col1, col2 = st.columns([1, 1])

    with col1:
        st.image(image, caption=f"Uploaded: {file.name}", use_container_width=True)

    with col2:
        if st.button("Analyze for Hazards", use_container_width=True):
            with st.spinner("Analyzing hazard and fetching environmental context..."):
                try:
                    # Send the upload to the FastAPI backend
                    response = requests.post(
                        "http://localhost:8000/predict",
                        files={"file": (file.name, BytesIO(file_bytes), file.type)}
                    )
                    if response.status_code == 200:
                        data = response.json()
                        st.success("Analysis Complete!")

                        hazard = data.get("hazard", "Unknown")
                        location = data.get("location", {})
                        lat, lon = location.get("lat"), location.get("lon")

                        st.markdown(f"### Detected Hazard: **{hazard.replace('_', ' ').title()}**")
                        if lat and lon:
                            st.info(f"**Location detected from metadata**: Lat {lat:.4f}, Lon {lon:.4f}")
                        else:
                            st.warning("**No location metadata found in image filename.** Using default regional analysis.")

                        st.divider()
                        st.subheader("Real-Time Ground Data")
                        m1, m2 = st.columns(2)

                        weather = data.get("weather")
                        if weather:
                            m1.metric("Temperature", weather.get("temperature", "N/A"), weather.get("condition"))
                        else:
                            m1.metric("Weather", "N/A")

                        aqi = data.get("aqi")
                        if aqi:
                            pm25 = aqi.get("pm25", "N/A")
                            m2.metric("Air Quality (PM2.5)", pm25)
                        else:
                            m2.metric("Air Quality", "N/A")

                        with st.expander("Detailed Environmental Context"):
                            ec1, ec2 = st.columns(2)
                            with ec1:
                                st.markdown("**Weather Details:**")
                                if weather and isinstance(weather, dict):
                                    st.json(weather)
                                else:
                                    st.info("No detailed weather data available.")
                                st.markdown("**Air Quality Metrics:**")
                                if aqi and isinstance(aqi, dict):
                                    st.json(aqi)
                                else:
                                    st.info("No detailed AQI data available.")
                            with ec2:
                                st.markdown("**Related News:**")
                                news = data.get("news", [])
                                if news:
                                    for n in news:
                                        st.markdown(f"- [{n['title']}]({n['url']}) ({n['source']})")
                                else:
                                    st.info("No recent news found for this hazard.")

                        st.divider()
                        st.subheader("Incident Report")
                        report = data.get("report", "No report generated.")
                        st.markdown(report)
                except Exception as e:
                    st.error(f"Request failed: {e}")

```

B.6 Utilities and Configuration Module


Description: This module contains utility functions for data preprocessing, configuration management, and dataset preparation. The preprocessing utilities define image transformations for training and
validation, including augmentation techniques. Configuration centralizes all hyperparameters and API settings. Dataset building merges multiple source datasets into a unified format. Key design decisions
include separating training and validation transforms, using environment variables for API keys, and supporting multiple dataset sources for balanced training data.

Listing B.6: Utilities and Configuration Implementation

```python

# configs/config.py
import torch

DATA_DIR = "merged_dataset"
MODEL_PATH = "models/gaia_guard_best_model.pt"
IMG_SIZE = 224
BATCH_SIZE = 64
EPOCHS = 5
LR = 1e-4
SEED = 42

CLASSES = [
    "hurricane_damage",
    "normal",
    "oil_spill",
    "wildfire"
]

if torch.cuda.is_available():
    DEVICE = "cuda"
elif torch.backends.mps.is_available():
    DEVICE = "mps"
else:
    DEVICE = "cpu"

GEMINI_PRIMARY_MODEL = "gemini-2.0-flash"
GEMINI_FALLBACK_MODEL = "gemini-flash-latest"

# utils/preprocessing.py
from torchvision import transforms
from PIL import Image
import torch

IMG_SIZE = 224

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(IMG_SIZE),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomRotation(20),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor()
])

val_transform = transforms.Compose([
    transforms.Resize((IMG_SIZE, IMG_SIZE)),
    transforms.ToTensor()
])


def preprocess_image(file):
    file.seek(0)
    image = Image.open(file).convert("RGB")
    return val_transform(image).unsqueeze(0)  # add batch dimension


# dataset_builder.py (excerpt)
import os
import random
import shutil
from pathlib import Path

DATASETS = "dataset"
OUTPUT = "merged_dataset"

# Target number of images per class for a balanced merged dataset
TARGET = {
    "wildfire": 1500,
    "hurricane_damage": 1500,
    "oil_spill": 1200,
    "normal": 2000
}

for c in TARGET:
    os.makedirs(f"{OUTPUT}/{c}", exist_ok=True)


def get_images(folder):
    imgs = []
    for ext in ["*.jpg", "*.jpeg", "*.png"]:
        imgs += list(Path(folder).glob(ext))
    return imgs


def copy_images(src_folder, dst_folder, limit):
    imgs = get_images(src_folder)
    if len(imgs) == 0:
        return
    selected = random.sample(imgs, min(limit, len(imgs)))
    for img in selected:
        shutil.copy(img, dst_folder)

```
B.7 Data Fetching and Simulation Tools
Description: This module provides functions to fetch real-time environmental data from external APIs and simulate data when APIs are unavailable. It includes weather data from OpenWeatherMap,
seismic data from USGS, air quality from OpenAQ, and news from NewsAPI. Simulation functions generate realistic values based on hazard types for fallback scenarios. Key design decisions include
implementing timeout handling, API key management via environment variables, and hazard-specific simulation logic to maintain data relevance.

Listing B.7: Data Fetching and Simulation Implementation

```python

import requests
import os
import logging
import random
from datetime import datetime, timedelta


def get_weather(lat, lon):
    api_key = os.getenv("OPENWEATHER_API_KEY")
    if not api_key:
        # No key configured: return representative placeholder values
        return {"temperature": "22°C", "condition": "Clear", "humidity": "45%", "wind_speed": "10 km/h"}
    try:
        url = (f"https://api.openweathermap.org/data/2.5/weather"
               f"?lat={lat}&lon={lon}&units=metric&appid={api_key}")
        response = requests.get(url, timeout=5)
        if response.status_code == 200:
            data = response.json()
            return {
                "temperature": f"{data['main']['temp']}°C",
                "condition": data['weather'][0]['description'],
                "humidity": f"{data['main']['humidity']}%",
                "wind_speed": f"{data['wind']['speed']} m/s"
            }
    except Exception:
        pass
    return None


def get_seismic_data(lat, lon, radius_km=100):
    try:
        starttime = (datetime.now() - timedelta(days=7)).strftime('%Y-%m-%d')
        url = (f"https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson"
               f"&latitude={lat}&longitude={lon}&maxradiuskm={radius_km}&starttime={starttime}")
        response = requests.get(url, timeout=5)
        if response.status_code == 200:
            data = response.json()
            count = data.get("metadata", {}).get("count", 0)
            if count > 0:
                latest = data['features'][0]['properties']
                return {
                    "count_last_7d": count,
                    "latest_magnitude": latest['mag'],
                    "latest_place": latest['place'],
                    "latest_time": datetime.fromtimestamp(latest['time'] / 1000).strftime('%Y-%m-%d %H:%M:%S')
                }
            return {"count_last_7d": 0, "status": "No significant activity"}
    except Exception:
        pass
    return None


def get_air_quality(lat, lon):
    api_key = os.getenv("OPENAQ_API_KEY")
    headers = {}
    if api_key and api_key != "your_openaq_api_key_here":
        headers = {"X-API-Key": api_key.strip()}
    try:
        url = f"https://api.openaq.org/v3/locations?coordinates={lat},{lon}&radius=25000&limit=1"
        response = requests.get(url, headers=headers, timeout=5)
        if response.status_code == 200:
            data = response.json()
            if data['results']:
                location = data['results'][0]
                sensors = location.get('sensors', [])
                aqi = {"location": location.get('name', 'Unknown')}
                pm25_found = False
                for sensor in sensors:
                    param = sensor.get('parameter', {}).get('name')
                    latest = sensor.get('latestValue')
                    if param and latest is not None:
                        aqi[param] = latest
                        if param == 'pm25':
                            pm25_found = True
                if not pm25_found:
                    return None
                return aqi
    except Exception:
        pass
    return None


def simulate_air_quality(hazard_type):
    # Hazard-specific PM2.5 ranges (µg/m³) for fallback scenarios
    hazard_type = hazard_type.lower()
    if "wildfire" in hazard_type:
        return random.randint(150, 300)
    elif "hurricane" in hazard_type or "cyclone" in hazard_type:
        return random.randint(20, 60)
    elif "oil_spill" in hazard_type:
        return random.randint(40, 90)
    else:
        return random.randint(30, 70)


def simulate_seismic_activity():
    # 80% chance of no events, 15% one event, 5% two events
    r = random.random()
    if r < 0.80:
        return 0
    elif r < 0.95:
        return 1
    else:
        return 2


def get_news(hazard, lat, lon):
    api_key = os.getenv("NEWS_API_KEY")
    if not api_key:
        return []
    try:
        query = f"{hazard} environmental impact"
        url = f"https://newsapi.org/v2/everything?q={query}&pageSize=5&apiKey={api_key}"
        response = requests.get(url, timeout=5)
        if response.status_code == 200:
            articles = response.json().get("articles", [])
            return [{"title": a['title'], "source": a['source']['name'], "url": a['url']} for a in articles]
    except Exception:
        pass
    return []
```
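The hazard-specific simulation bands can be exercised directly; note that the membership test here uses the underscored class name oil_spill so that it matches the labels in CLASSES:

```python
import random

def simulate_air_quality(hazard_type):
    # hazard-specific PM2.5 fallback ranges (µg/m³)
    hazard_type = hazard_type.lower()
    if "wildfire" in hazard_type:
        return random.randint(150, 300)
    elif "hurricane" in hazard_type or "cyclone" in hazard_type:
        return random.randint(20, 60)
    elif "oil_spill" in hazard_type:
        return random.randint(40, 90)
    else:
        return random.randint(30, 70)

# every simulated reading falls inside its hazard's band
for hazard, lo, hi in [("wildfire", 150, 300), ("hurricane_damage", 20, 60),
                       ("oil_spill", 40, 90), ("normal", 30, 70)]:
    value = simulate_air_quality(hazard)
    print(hazard, value, lo <= value <= hi)
```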
B.8 Project Screenshots
Description: This section contains visual representations of the GaiaGuard system in action, including the Streamlit dashboard interface, API documentation, and sample analysis results. These
screenshots demonstrate the user experience and system capabilities.

*Figure B.1: Main dashboard interface showing image upload and analysis features.*

*Figure B.2: Sample analysis results displaying hazard classification and environmental data.*

B.10 Model Performance Metrics

*Figure B.3: Visual representations of model training performance and evaluation metrics.*

*Figure B.4: Training and validation loss curves over epochs.*

*Figure B.5: Confusion matrix showing classification accuracy across hazard types.*
