Repository Record

Health Analytics

GeneAgent

by NCBI NLP Research

95 stars
17 forks
Python

About

GeneAgent is a first-of-its-kind language agent built upon GPT-4 to automatically interact with domain-specific databases for gene set annotation. It generates interpretable and contextually accurate biological process names for user-provided gene sets, either aligning with significant enrichment analyses or introducing novel terms. At its core is a self-verification mechanism that autonomously interacts with various expert-curated biological databases through Web APIs, performing fact verification and providing objective evidence to support or refute raw LLM output, reducing hallucination and enabling reliable evidence-based insights.
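The self-verification idea can be illustrated with a minimal sketch. This is not GeneAgent's actual code: the names `Verdict`, `verify_claims`, `toy_lookup`, and `TOY_DB` are hypothetical, the gene and process names are placeholders rather than real annotations, and an in-memory dictionary stands in for the expert-curated databases that GeneAgent reaches through Web APIs. Each claim taken from raw LLM output is checked against the evidence returned for that gene and marked as supported or refuted.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple


@dataclass
class Verdict:
    gene: str
    process: str
    supported: bool  # True if the evidence source lists this process for the gene


def verify_claims(
    claims: List[Tuple[str, str]],
    lookup: Callable[[str], Set[str]],
) -> List[Verdict]:
    """Check each (gene, process) claim against an evidence lookup.

    `lookup` is a hypothetical stand-in for a database query; it returns
    the set of process names recorded for a gene.
    """
    verdicts = []
    for gene, process in claims:
        # Compare case-insensitively so formatting differences do not matter.
        evidence = {p.lower() for p in lookup(gene)}
        verdicts.append(Verdict(gene, process, process.lower() in evidence))
    return verdicts


# Toy evidence source with placeholder entries (assumption, not real data).
TOY_DB = {"GENE_A": {"process x"}, "GENE_B": {"process y"}}


def toy_lookup(gene: str) -> Set[str]:
    return TOY_DB.get(gene, set())


# Claims as they might be extracted from raw LLM output (placeholders).
claims = [("GENE_A", "Process X"), ("GENE_B", "process x")]
verdicts = verify_claims(claims, toy_lookup)

for v in verdicts:
    status = "supported" if v.supported else "refuted"
    print(f"{v.gene} -> {v.process}: {status}")
```

Passing the lookup as a function keeps the checking logic independent of the evidence source, so a real HTTP-backed query could replace `toy_lookup` without changing `verify_claims`.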

Tech Stack

Python, GPT-4, Azure OpenAI, PyTorch, Pandas, Requests

Quick Start

git clone git@github.com:ncbi-nlp/GeneAgent.git
cd GeneAgent
conda create -n geneagent python=3.11
conda activate geneagent
pip install openai torch numpy pandas requests
python main_cascade.py