Introduces src/tools.ts with functions and OpenAI tool specs for searching biological and chemical databases (ChEBI, PDB, PubChem, UniProt). Enables AI assistants to query molecular and protein information via LLM function calling.
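For illustration, a function-calling spec for such a search tool might look like the sketch below. The tool name `search_database`, its parameter names, the `FunctionToolSpec` type, and the `isDatabase` helper are assumptions made for this example and are not taken from src/tools.ts; only the four database names come from this PR.

```typescript
// Hypothetical sketch: the real src/tools.ts may use different names and shapes.
const DATABASES = ["chebi", "pdb", "pubchem", "uniprot"] as const;
type Database = (typeof DATABASES)[number];

// Minimal local description of a function tool spec (assumed shape).
interface FunctionToolSpec {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: {
      type: "object";
      properties: Record<
        string,
        { type: string; description: string; enum?: readonly string[] }
      >;
      required: string[];
    };
  };
}

// Assumed tool name and parameter names, chosen for illustration only.
const searchDatabaseSpec: FunctionToolSpec = {
  type: "function",
  function: {
    name: "search_database",
    description: "Search a biological or chemical database for a query term.",
    parameters: {
      type: "object",
      properties: {
        database: {
          type: "string",
          description: "Database to search.",
          enum: DATABASES,
        },
        query: { type: "string", description: "Free-text search term." },
      },
      required: ["database", "query"],
    },
  },
};

// Narrow an untrusted string (for example, from model output) to a known database.
function isDatabase(value: string): value is Database {
  return (DATABASES as readonly string[]).indexOf(value) !== -1;
}
```

Restricting `database` to an enum lets the handler reject unsupported names before any request is sent.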
The PDFUpload class now supports uploading specific pages from a PDF file using the pdf-lib library. New methods allow page selection, cleanup of temporary files, and improved error handling for invalid page numbers.
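The page-number validation described here could be sketched roughly as follows. This is not the PDFUpload implementation: the helper name `toPageIndices` is hypothetical, and it assumes callers pass 1-based page numbers, which the PR does not state. The conversion to zero-based indices matters because pdf-lib's `copyPages` expects zero-based page indices.

```typescript
// Hypothetical helper: convert user-facing 1-based page numbers into
// de-duplicated, ascending 0-based indices. Throws a RangeError for
// non-integer or out-of-range pages.
function toPageIndices(pages: number[], pageCount: number): number[] {
  const indices: number[] = [];
  for (const page of pages) {
    if (!Number.isInteger(page) || page < 1 || page > pageCount) {
      throw new RangeError(
        `Invalid page ${page}: document has ${pageCount} page(s)`
      );
    }
    const index = page - 1;
    // Skip duplicates so each page is copied at most once.
    if (indices.indexOf(index) === -1) indices.push(index);
  }
  return indices.sort((a, b) => a - b);
}
```

Validating before touching the PDF means an invalid selection fails fast with a clear message instead of surfacing as a lower-level library error.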
Added pdf-lib, p-queue, p-retry, and p-timeout dependencies for PDF manipulation and async control. Removed dotenv from dependencies.
Added ToolChainEvent to the list of exported types from the llm module to make it available for use in other parts of the application.
Introduces tool chain execution with automatic retry, timeout, and concurrency handling for OpenAI streaming responses. Adds types and event callbacks for progress tracking, updates input processing to support tool outputs, and refactors extractData to support async tool execution before streaming. Includes comprehensive documentation and examples for new features.
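The combination of retry, timeout, concurrency limiting, and progress events can be sketched as below. The PR uses p-queue, p-retry, and p-timeout for this; the sketch deliberately substitutes small hand-rolled helpers so it runs without dependencies. All names here (`withTimeout`, `withRetry`, `runWithConcurrency`, `executeToolCalls`, `ProgressEvent`) are hypothetical and do not reflect the actual code in src/llm.ts.

```typescript
// Reject if `task` does not settle within `ms` milliseconds.
function withTimeout<T>(task: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    task.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); }
    );
  });
}

// Call `fn` up to `retries + 1` times and return the first success.
async function withRetry<T>(fn: () => Promise<T>, retries: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// Run tasks with at most `limit` in flight; results keep input order.
async function runWithConcurrency<T>(
  tasks: Array<() => Promise<T>>,
  limit: number
): Promise<T[]> {
  const results = new Array<T>(tasks.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }
  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

interface ToolCall { name: string; run: () => Promise<string>; }
type ProgressEvent = { name: string; status: "started" | "succeeded" | "failed" };

// Execute every tool call with retry + timeout, reporting progress via `onEvent`.
async function executeToolCalls(
  calls: ToolCall[],
  onEvent: (event: ProgressEvent) => void,
  options = { retries: 2, timeoutMs: 5000, concurrency: 3 }
): Promise<string[]> {
  const tasks = calls.map((call) => async () => {
    onEvent({ name: call.name, status: "started" });
    try {
      const output = await withRetry(
        () => withTimeout(call.run(), options.timeoutMs),
        options.retries
      );
      onEvent({ name: call.name, status: "succeeded" });
      return output;
    } catch (error) {
      onEvent({ name: call.name, status: "failed" });
      throw error;
    }
  });
  return runWithConcurrency(tasks, options.concurrency);
}
```

In this sketch the timeout applies per attempt, so a slow call can take up to `timeoutMs * (retries + 1)` in total, and a call that exhausts its retries rejects the whole batch. Whether the PR makes the same choices is not stated.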
Updated documentation and type definitions to include PubChem as a supported database in the EnzymeML search tools. The changes clarify usage and parameters for searching multiple databases, including ChEBI, PDB, PubChem, and UniProt.
Adds the 'tool_choice: "required"' parameter to the planToolCalls function's OpenAI API call to ensure that a tool is always selected during reasoning.
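As a rough sketch of the effect, a planning request with this setting might be built as follows. The `PlanRequest` type and the `buildPlanRequest` helper are hypothetical and use a simplified request shape; the actual planToolCalls call and its other parameters are not shown in this PR summary.

```typescript
// Minimal local types (assumed); a real project would use the OpenAI SDK's types.
interface PlanRequest {
  model: string;
  messages: { role: "system" | "user"; content: string }[];
  tools: { type: "function"; function: { name: string; parameters: object } }[];
  tool_choice: "auto" | "required" | "none";
}

// Hypothetical helper that builds the request body for the planning step.
function buildPlanRequest(
  model: string,
  prompt: string,
  tools: PlanRequest["tools"]
): PlanRequest {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    tools,
    // "required" makes the model call at least one tool rather than
    // replying with plain text only.
    tool_choice: "required",
  };
}
```

With the default `"auto"` the model may skip tools entirely, which would leave the planning step with nothing to execute.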
Pull Request Overview
This PR adds enhanced database search functionality to LLM tools and improves PDF upload capabilities with page selection. The main purpose is to enable AI assistants to search biological and chemical databases and allow users to upload specific pages from PDF documents.
- Added new database search tools supporting ChEBI, PDB, PubChem, and UniProt with integrated LLM function calling
- Enhanced PDF upload functionality with page selection and validation using pdf-lib
- Implemented comprehensive tool chain execution with parallel processing, retries, and progress tracking
Reviewed Changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/tools.ts | New module providing database search tools and OpenAI function specifications |
| src/llm.ts | Enhanced with tool chain execution, event tracking, and integrated database search capabilities |
| src/input-types.ts | Added page selection support to PDFUpload class with validation and cleanup methods |
| src/index.ts | Exported new ToolChainEvent type for external consumption |
| package.json | Added dependencies for queue management, retry logic, timeouts, and PDF processing |
Adds exports for SearchDatabaseTool and SearchDatabaseToolSpecs from the tools module to make them available for external usage.
Eliminated the cleanup() method and related documentation from the PDFUpload class, as temporary file management is no longer required. Also refactored stream import to use ES module syntax.
Replaces the default value of the 'tools' parameter in extractData with a configurable value, allowing callers to specify their own tool list instead of always using SearchDatabaseToolSpecs.
Cleaned up import statements in llm.ts by removing unused UserQuery and SearchDatabaseToolSpecs imports to improve code clarity.
Replaces the previous Tool array with a new ToolDefinition type that pairs tool specs with their handler functions. Updates the tool chain logic and SearchDatabaseTool implementation to use this structure, improving extensibility and clarity in tool management.
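One way to picture the pairing of a spec with its handler is the sketch below. The field names `spec` and `handler`, the `ToolSpec` and `ToolHandler` types, and the `dispatchToolCall` helper are assumptions for illustration; the actual ToolDefinition type in this PR may be shaped differently.

```typescript
// Hypothetical shapes; the actual ToolDefinition in this PR may differ.
interface ToolSpec {
  name: string;
  description: string;
}
type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

interface ToolDefinition {
  spec: ToolSpec;       // what the model sees
  handler: ToolHandler; // what runs when the model calls the tool
}

// Find the definition whose spec matches the requested tool and run its handler.
async function dispatchToolCall(
  definitions: ToolDefinition[],
  name: string,
  args: Record<string, unknown>
): Promise<string> {
  const definition = definitions.find((d) => d.spec.name === name);
  if (!definition) throw new Error(`Unknown tool: ${name}`);
  return definition.handler(args);
}

// Example definition with a stub handler (no network access).
const echoTool: ToolDefinition = {
  spec: { name: "echo", description: "Return the query unchanged." },
  handler: async (args) => String(args.query),
};
```

Keeping the handler next to its spec means adding a tool is a single array entry, with no separate name-to-function lookup table to keep in sync.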
The SearchDatabaseToolSpecs export was removed from src/index.ts as it is no longer needed.
Updated the package version to 1.5.0 and removed the 'example' script from the npm scripts section.
This pull request introduces several improvements and new features to the codebase, focusing on enhanced PDF upload functionality, new database search tools, and dependency updates. The most significant changes are the addition of page selection for PDF uploads, a new tool for searching biological and chemical databases, and the inclusion of supporting dependencies.
PDF Upload Enhancements
- Added a `pages` parameter to the `PDFUpload` class, including validation, processing, and cleanup methods. This allows users to select and upload only the relevant pages, improving efficiency and control.
- Added the `pdf-lib` library to enable PDF manipulation for page extraction.
Database Search Tools
- Added `src/tools.ts`, which provides the `SearchDatabaseTool` and its OpenAI function specification, enabling LLM-powered searches of the ChEBI, PDB, PubChem, and UniProt databases.
Dependency Updates
- Added dependencies to `package.json` for queueing (`p-queue`), retry logic (`p-retry`), timeouts (`p-timeout`), and PDF processing (`pdf-lib`).
Type Exports
- Exported the `ToolChainEvent` type from `src/llm.ts` in `src/index.ts` to support toolchain event handling.