mr-sarthakgupta/deep-paper-search
Research Paper Agent with Local LLM

An intelligent research agent that finds academic papers, discovers related works through Connected Papers, and identifies relevant GitHub repositories. Works completely offline with local LLMs - no API keys required!

πŸš€ Quick Start

Get started in two minutes with minimal setup:

# 1. Install dependencies
pip install -r requirements.txt

# 2. Run immediately with rule-based processing
python main.py "transformer neural networks"

✨ Features

  • πŸ” Multi-source Paper Search: arXiv, Google Scholar integration
  • πŸ•ΈοΈ Connected Papers: Discover related works and paper networks
  • πŸ“¦ GitHub Discovery: Find code repositories for research papers
  • πŸ€– Local LLM Support: Ollama, Transformers, or rule-based processing
  • πŸ”’ Privacy-First: Everything runs locally, no data sent to external APIs
  • πŸ“Š Structured Output: Clean JSON results with relevance scoring

πŸ› οΈ Setup Options

Option 1: Rule-Based (Instant Start) ⚑

No configuration needed - works immediately:

python main.py "machine learning"

Uses intelligent keyword matching and scoring algorithms.
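As a rough illustration, a keyword-overlap scorer along these lines ranks text by how many query terms it contains. This is a hedged sketch, not the project's actual implementation:

```python
import re

def relevance_score(query: str, text: str) -> float:
    """Fraction of query keywords (3+ characters) that appear in the text.
    A simplified stand-in for the rule-based scorer, not the real code."""
    keywords = {w for w in re.findall(r"[a-z0-9]+", query.lower()) if len(w) > 2}
    if not keywords:
        return 0.0
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return len(keywords & words) / len(keywords)

print(relevance_score("machine learning", "A survey of machine learning methods"))  # → 1.0
```

A real scorer would also weight fields (title vs. abstract) and handle word variants, but the core idea is the same.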

Option 2: Ollama (Recommended for AI) 🧠

Best local AI experience:

# 1. Install Ollama from https://ollama.ai
# 2. Pull a model
ollama pull llama2:7b

# 3. Configure the agent
cp .env.example .env
# Edit .env and set: LLM_PROVIDER=ollama

# 4. Run with AI enhancement
python main.py "deep learning"

Option 3: Transformers (Alternative AI) πŸ€—

Offline Hugging Face models:

# Configure for Transformers
cp .env.example .env
# Edit .env and set: LLM_PROVIDER=transformers

python main.py "neural networks"

Option 4: OpenAI (If you have API key) πŸ’°

# Edit .env and set:
# LLM_PROVIDER=openai
# OPENAI_API_KEY=your_key_here

πŸ“– Usage

Command Line Interface

# Basic search
python main.py "attention mechanisms in transformers"

# Interactive mode
python main.py --interactive

# Limit results for faster testing
python main.py "computer vision" --max-papers 5

# Export results to JSON
python main.py "reinforcement learning" --export results.json

Python API

from research_agent import ResearchAgent

# Initialize agent (auto-detects configuration)
agent = ResearchAgent()

# Search for papers
results = agent.search("graph neural networks")

# Access results
print(f"Found {len(results['papers'])} papers")
print(f"Found {len(results['related_papers'])} related papers")
print(f"Found {len(results['github_repos'])} GitHub repos")

# Print summary
if results.get('llm_summary'):
    print(f"Summary: {results['llm_summary']}")

agent.close()

πŸ“ Project Structure

connected-papers/
β”œβ”€β”€ main.py              # CLI interface
β”œβ”€β”€ research_agent.py    # Core agent orchestrator
β”œβ”€β”€ paper_searcher.py    # arXiv/Scholar search
β”œβ”€β”€ connected_papers.py  # Related papers discovery
β”œβ”€β”€ github_searcher.py   # GitHub repository finder
β”œβ”€β”€ llm_processor.py     # Local LLM integration
β”œβ”€β”€ config.py            # Configuration settings
β”œβ”€β”€ utils.py             # Helper functions
β”œβ”€β”€ requirements.txt     # Dependencies
└── .env.example         # Environment template

βš™οΈ Configuration

Copy .env.example to .env and customize:

# LLM Provider: "ollama", "transformers", "rule_based", or "openai"
LLM_PROVIDER=ollama

# Ollama settings (if using Ollama)
OLLAMA_MODEL=llama2:7b
OLLAMA_BASE_URL=http://localhost:11434

# Transformers settings (if using Transformers)
TRANSFORMERS_MODEL=microsoft/DialoGPT-small

# Optional: GitHub token for higher rate limits
GITHUB_TOKEN=your_github_token

# Optional: OpenAI key (if using OpenAI)
OPENAI_API_KEY=your_openai_key
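For reference, a minimal sketch of how config.py might read these variables. The defaults shown here are assumptions for illustration, not the project's actual values, and a .env loader such as python-dotenv would populate the environment first:

```python
import os

# Read settings from the environment; defaults below are illustrative only.
LLM_PROVIDER = os.getenv("LLM_PROVIDER", "rule_based")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama2:7b")
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
TRANSFORMERS_MODEL = os.getenv("TRANSFORMERS_MODEL", "microsoft/DialoGPT-small")
GITHUB_TOKEN = os.getenv("GITHUB_TOKEN")  # optional; None if unset
```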

πŸ“Š Output Format

The agent returns structured JSON with:

{
  "query": "transformer neural networks",
  "papers": [
    {
      "title": "Attention Is All You Need",
      "authors": ["Vaswani, Ashish", "..."],
      "abstract": "...",
      "url": "https://arxiv.org/abs/1706.03762",
      "published": "2017-06-12",
      "relevance_score": 0.95
    }
  ],
  "related_papers": [...],
  "github_repos": [
    {
      "name": "pytorch/pytorch",
      "description": "Tensors and Dynamic neural networks...",
      "url": "https://github.com/pytorch/pytorch",
      "stars": 70000,
      "relevance_score": 0.88
    }
  ],
  "llm_summary": "Found 10 papers on transformer networks..."
}
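The exported JSON can be post-processed directly. For example, a small helper (hypothetical, but using only the field names shown above) that lists the top-scoring papers:

```python
import json

def top_papers(results: dict, n: int = 5) -> list:
    """Return the n highest-scoring paper titles from a results dict."""
    ranked = sorted(results["papers"], key=lambda p: p["relevance_score"], reverse=True)
    return [p["title"] for p in ranked[:n]]

# Typical use with a file written by `--export results.json`:
# with open("results.json") as f:
#     print(top_papers(json.load(f)))
```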

πŸ”§ Troubleshooting

Ollama Issues

# Check if Ollama is running
ollama list

# Start Ollama service
ollama serve

# Pull a smaller model if needed
ollama pull llama2:7b-chat

Transformers Issues

# Install PyTorch if needed
pip install torch torchvision torchaudio

# Try a smaller model
# Edit .env: TRANSFORMERS_MODEL=distilbert-base-uncased

Search Issues

# If getting rate limited, add GitHub token to .env
GITHUB_TOKEN=your_token_here

# For Google Scholar blocks, the agent will fall back to arXiv

πŸ“ Examples

Research Query Examples

# Find papers and code for specific topics
python main.py "graph neural networks for drug discovery"
python main.py "attention mechanisms in computer vision"
python main.py "reinforcement learning robotics"
python main.py "natural language processing transformers"

# Academic paper investigation
python main.py "BERT model architecture"
python main.py "GPT transformer decoder"
python main.py "ResNet convolutional networks"

Advanced Usage

from research_agent import ResearchAgent

# Custom configuration
agent = ResearchAgent(max_papers=20, max_repos=10)

# Multiple queries
queries = [
    "federated learning privacy",
    "quantum machine learning",
    "explainable AI methods"
]

for query in queries:
    results = agent.search(query)
    print(f"Query: {query}")
    print(f"Papers: {len(results['papers'])}")
    print(f"Repos: {len(results['github_repos'])}")
    print("-" * 40)

agent.close()

πŸ”„ How It Works

  1. Query Processing: LLM or rule-based system expands and refines the research query
  2. Paper Search: Searches arXiv and Google Scholar for relevant papers
  3. Connected Papers: Uses paper IDs to find related works and citations
  4. GitHub Search: Searches for repositories using paper titles and keywords
  5. Relevance Scoring: Ranks results by relevance to the original query
  6. Summary Generation: LLM provides insights and summary of findings
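The six steps above can be sketched as a single pipeline function. The stage functions here are trivial stand-ins to show the data flow; in the project the real logic lives in llm_processor.py, paper_searcher.py, connected_papers.py, and github_searcher.py:

```python
def expand_query(query):                 # 1. query processing (stub)
    return query.lower().strip()

def search_papers(query):                # 2. arXiv / Scholar search (stub)
    return [{"title": f"Paper on {query}", "relevance_score": 1.0}]

def find_related(papers):                # 3. connected-papers lookup (stub)
    return []

def search_github(query):                # 4. GitHub repository search (stub)
    return []

def rank(items):                         # 5. relevance scoring
    return sorted(items, key=lambda p: p["relevance_score"], reverse=True)

def summarize(results):                  # 6. summary generation (stub)
    return f"Found {len(results['papers'])} paper(s) for '{results['query']}'"

def run_pipeline(query):
    q = expand_query(query)
    papers = rank(search_papers(q))
    return {
        "query": q,
        "papers": papers,
        "related_papers": find_related(papers),
        "github_repos": search_github(q),
        "llm_summary": summarize({"query": q, "papers": papers}),
    }
```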

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Make changes and test thoroughly
  4. Submit a pull request

πŸ“„ License

MIT License - see LICENSE file for details.

πŸš€ Future Enhancements

  • Support for more paper databases (PubMed, IEEE, ACM)
  • Citation network visualization
  • Paper PDF analysis and extraction
  • More local LLM providers (LM Studio, GPT4All)
  • Web interface for non-technical users
  • Paper recommendation system
  • Research trend analysis

πŸ™ Acknowledgments


Ready to explore research? Start with:

python main.py "your research topic here"
