StuLearn AI
Research Workspace
The Problem
Researchers and students struggle to efficiently extract insights from lengthy PDF documents. Traditional search only finds exact keywords, missing conceptual relationships and making it difficult to synthesize information across multiple documents.
My Role
Full-Stack Developer & ML Engineer
The Solution
Developed an AI-powered research workspace that transforms PDFs into queryable knowledge graphs. The system uses semantic search, automatic summarization, and topic clustering with local LLMs from HuggingFace, providing an intelligent research assistant.
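The ranking step behind semantic search can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `embed`, `cosine_similarity`, and `search` are hypothetical, and `embed` is a toy bag-of-words stand-in that only captures word overlap. In a real pipeline a transformer embedding model would replace it, which is what makes the matching conceptual rather than keyword-based. The cosine-similarity ranking itself stays the same.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a transformer embedding (assumption for illustration).

    Returns a sparse word-count vector. A real embedding model would return
    a dense vector that also places related concepts close together.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, chunks: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    """Rank document chunks by similarity to the query, best match first."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Calling `search("attention in transformers", chunks, top_k=2)` returns the two highest-scoring chunks as `(score, text)` pairs. Embedding the chunks once at upload time and storing the vectors, rather than re-embedding on every query as this sketch does, would likely be the more practical design.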
Technology Stack
Next.js
Modern React framework for the frontend interface
TypeScript
Type safety across the entire stack
Shadcn UI
Accessible and customizable component library
Node.js & Express
Backend API for document processing and queries
MongoDB
Document storage for PDFs and extracted content
Python & FastAPI
ML inference server for NLP operations
HuggingFace Models
Local LLMs for summarization and embeddings
Key Features
- PDF upload and automatic text extraction with structure preservation
- Semantic search across documents using transformer-based embeddings
- Automatic summarization of lengthy documents or specific sections
- Topic clustering to identify themes and relationships
- Knowledge graph visualization showing connections between concepts
- Query interface for natural language questions about uploaded documents
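One simple way to derive connections between concepts is to link two concepts whenever they appear in the same text chunk. The sketch below assumes that approach; the page does not say how the project actually builds its graph. The function names and the pre-extracted `concepts` list are hypothetical, and concepts are matched by plain substring search for brevity.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(chunks: list[str], concepts: list[str]) -> dict[tuple[str, str], int]:
    """Build weighted edges between concepts that share a chunk.

    Returns {(concept_a, concept_b): weight}, where each pair is in
    alphabetical order and weight is the number of chunks containing both.
    """
    normalized = sorted({c.lower() for c in concepts})
    edges: dict[tuple[str, str], int] = defaultdict(int)
    for chunk in chunks:
        text = chunk.lower()
        # Naive substring matching; a real system would use proper term extraction.
        present = [c for c in normalized if c in text]
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return dict(edges)


def neighbors(edges: dict[tuple[str, str], int], concept: str) -> list[tuple[str, int]]:
    """List concepts linked to `concept`, strongest connection first."""
    linked = []
    for (a, b), weight in edges.items():
        if a == concept:
            linked.append((b, weight))
        elif b == concept:
            linked.append((a, weight))
    return sorted(linked, key=lambda pair: (-pair[1], pair[0]))
```

The resulting edge weights could drive both the visualization (thicker lines for stronger links) and a "related concepts" lookup through `neighbors`.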
Impact & Results
Significantly reduced research time by enabling conceptual search and automatic summarization. Users can now query their entire research library semantically rather than relying on keyword search.
Lessons Learned
- Learned efficient PDF parsing techniques that preserve document structure
- Gained experience running local LLMs for privacy-focused applications
- Understood the challenges of constructing knowledge graphs from unstructured text
- Learned to balance accuracy against performance in NLP pipelines