Fyp25031: Generative AI for Researching Archaeology and Teaching

Project Objective

Deliver a web-based Retrieval-Augmented Generation (RAG) assistant tailored to archaeology. The assistant provides accurate, citable answers to classroom and research questions, supports user document uploads, and operates on a curated domain corpus, so new knowledge is added by updating the corpus rather than by retraining the base model. Every answer is grounded in retrieved evidence and cited at the document, section, page, and sentence levels.
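To make the four citation levels concrete, the following is a minimal sketch of how one citation could be represented and how an answer could be checked for grounding. The names Citation, GroundedSentence, and is_grounded are hypothetical and chosen for illustration only; the project does not prescribe this schema, and the sample values are placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Citation:
    """Hypothetical record pointing at one supporting sentence in the corpus."""
    document: str            # document title or ID
    section: Optional[str]   # section heading, if known
    page: Optional[int]      # 1-based page number, if the source is paginated
    sentence: int            # 0-based sentence index within that page or section

    def label(self) -> str:
        """Render the citation from the coarsest level to the finest."""
        parts = [self.document]
        if self.section:
            parts.append(f"§ {self.section}")
        if self.page is not None:
            parts.append(f"p. {self.page}")
        parts.append(f"sent. {self.sentence}")
        return ", ".join(parts)


@dataclass(frozen=True)
class GroundedSentence:
    """One generated sentence together with the evidence that supports it."""
    text: str
    citations: tuple[Citation, ...]


def is_grounded(answer: list[GroundedSentence]) -> bool:
    """True only if the answer is non-empty and every sentence has a citation."""
    return bool(answer) and all(s.citations for s in answer)


# Placeholder values, not taken from any real corpus.
example = Citation("Example Site Report", "3.2 Stratigraphy", 14, 2)
print(example.label())
```

A check like is_grounded could run before an answer is shown, so that any sentence without supporting evidence is caught rather than displayed uncited.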

Project Background

Retrieval-Augmented Generation (RAG) reduces hallucinations and improves traceability in large language models (LLMs), but naive RAG struggles in specialized fields such as archaeology. It can miss domain terminology and cross-lingual variants, rank results inaccurately because it relies on a single similarity signal, and lack both answerability checks and sentence-level citations. These issues are amplified by archaeology’s heterogeneous sources (scanned PDFs, bilingual texts, inconsistent formatting) and by its knowledge constraints (period, region, site).
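One common way to avoid depending on a single similarity signal is to fuse several ranked lists. The sketch below uses reciprocal rank fusion (RRF) as an illustrative stand-in; this document does not specify which fusion method the project uses. The chunk IDs, the two example rankings, and the constant k=60 (a conventional default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one list.

    Each list contributes 1 / (k + rank) for every ID it contains, so an ID
    ranked well by several signals rises above an ID ranked well by only one.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical rankings from a lexical retriever and a dense retriever.
lexical = ["c1", "c2", "c3"]
dense = ["c3", "c1", "c4"]
print(reciprocal_rank_fusion([lexical, dense]))  # ['c1', 'c3', 'c2', 'c4']
```

Chunks c1 and c3 appear in both lists and therefore outrank c2 and c4, which each appear in only one.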

General-purpose LLMs also hallucinate and over-generalize in archaeology coursework, while large-scale fine-tuning is costly. This project builds on Prof. Cobb’s prior archaeology RAG baseline, which needs improvement in areas such as domain understanding and evidence diversity, and develops a strengthened RAG pipeline. The pipeline optimizes corpus construction, retrieval, ranking, and generation for archaeology, supporting accurate, citable answers for teaching (alignment with course materials, academic integrity) and research (evidence coverage, strict attribution).

Project Schedule

Phase                                      | Dates       | Goal
P0: Kickoff & Environment                  | 10/01–10/07 | Repo and infrastructure ready
P1: Corpus & Chunking                      | 10/01–10/31 | Heading/paragraph chunking and ingestion
P2: Query Preprocessing & Routing          | 11/01–11/15 | Normalization + rule-based routing
P3: Initial Retrieval & Candidate Slimming | 11/16–12/10 | Stable candidate pool
P4: Re-ranking & Fusion                    | 12/11–01/10 | Cross-Encoder ranking integrated
P5: Generation (3-step + citations)        | 01/11–02/05 | Three-step generation with strict citations
P6: Evaluation & Tuning                    | 02/06–03/01 | Metrics & hyperparameter tuning
P7: Hardening & Deliverables               | 03/02–03/20 | Stabilization & artifacts
P8: Final Polish & Submission              | 03/21–03/30 | Final handoff
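
As an illustration of the heading/paragraph chunking planned for P1, the sketch below splits text into paragraph chunks and attaches the most recent heading to each. It assumes, purely for the example, that headings have already been normalized to lines starting with "#" and that paragraphs are separated by blank lines; the real corpus (scanned PDFs, bilingual texts) would need its own heading detection. The names Chunk and chunk_by_heading are hypothetical.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Assumption: headings look like "# Title" or "## Title" after preprocessing.
HEADING = re.compile(r"^#{1,6}\s+(.*)$")


@dataclass
class Chunk:
    heading: str  # nearest preceding heading ("" if none seen yet)
    text: str     # one paragraph with internal line breaks collapsed


def chunk_by_heading(text: str) -> list[Chunk]:
    """Split text into paragraph chunks tagged with their current heading."""
    chunks: list[Chunk] = []
    heading = ""
    # Blank lines separate blocks; a block is a heading, a paragraph, or both.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        match = HEADING.match(lines[0])
        if match:
            heading = match.group(1).strip()
            lines = lines[1:]  # the rest of the block, if any, is paragraph text
        if lines:
            chunks.append(Chunk(heading, " ".join(lines)))
    return chunks


sample = "# Intro\n\nFirst para\ncontinues.\n\nSecond para.\n\n## Methods\nThird para."
for chunk in chunk_by_heading(sample):
    print(f"[{chunk.heading}] {chunk.text}")
```

Keeping the heading on every chunk is one way to preserve the section-level information that later citation steps would need.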