
Commit 0e61654

LocalLab Server Package v0.1.0 Released and Updated Docs

1 parent: a8bd2a3

7 files changed: +148 -102 lines

.github/workflows/publish.yml

Lines changed: 32 additions & 0 deletions (new file)

```yaml
name: Publish Python Package

on:
  push:
    branches:
      - main

jobs:
  build-and-publish:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.x"

      - name: Install build dependencies
        run: |
          python -m pip install --upgrade pip
          pip install build twine

      - name: Build package
        run: python -m build --sdist --wheel

      - name: Publish package to PyPI
        uses: pypa/[email protected]
        with:
          user: __token__
          password: ${{ secrets.PYPI_API_TOKEN }}
```
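Only the build step runs directly in the workflow; `twine` is installed but the upload itself is delegated to the pypa publish action. To rehearse the packaging locally before pushing to main, a minimal sketch, assuming `build` and `twine` are installed in the current environment:

```python
# Local dry run of the workflow's packaging step (illustrative, not part of the commit).
import glob
import subprocess
import sys

# Mirrors "python -m build --sdist --wheel" from the workflow above.
subprocess.run([sys.executable, "-m", "build", "--sdist", "--wheel"], check=True)

# Validate the built archives' metadata before any upload.
subprocess.run([sys.executable, "-m", "twine", "check", *glob.glob("dist/*")], check=True)
```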

LICENSE

Lines changed: 21 additions & 0 deletions (new file)

```text
MIT License

Copyright (c) 2023 Developer Utkarsh

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

README.md

Lines changed: 34 additions & 26 deletions
````diff
@@ -2,6 +2,7 @@
 
 [![Build Status](https://github.com/Developer-Utkarsh/LocalLab/workflows/CI/badge.svg)](https://github.com/Developer-Utkarsh/LocalLab/actions)
 [![Coverage Status](https://coveralls.io/repos/github/Developer-Utkarsh/LocalLab/badge.svg?branch=main)](https://coveralls.io/github/Developer-Utkarsh/LocalLab?branch=main)
+[![LocalLab Version](https://img.shields.io/pypi/v/locallab.svg)](https://pypi.org/project/locallab/)
 [![Python Version](https://img.shields.io/pypi/pyversions/locallab.svg)](https://pypi.org/project/locallab/)
 [![License](https://img.shields.io/github/license/Developer-Utkarsh/LocalLab.svg)](https://github.com/Developer-Utkarsh/LocalLab/blob/main/LICENSE)
 
@@ -17,22 +18,24 @@ LocalLab is a powerful, lightweight AI inference server designed to deliver cutt
 ## System Requirements
 
 ### Minimum Requirements
-| Component | Local Deployment | Google Colab |
-|-----------|-----------------|--------------|
-| RAM | 4GB | Free tier (12GB) |
-| CPU | 2 cores | 2 cores |
-| Python | 3.8+ | 3.8+ |
-| Storage | 2GB free | - |
-| GPU | Optional | Available in free tier |
+
+| Component | Local Deployment | Google Colab           |
+| --------- | ---------------- | ---------------------- |
+| RAM       | 4GB              | Free tier (12GB)       |
+| CPU       | 2 cores          | 2 cores                |
+| Python    | 3.8+             | 3.8+                   |
+| Storage   | 2GB free         | -                      |
+| GPU       | Optional         | Available in free tier |
 
 ### Recommended Requirements
-| Component | Local Deployment | Google Colab |
-|-----------|-----------------|--------------|
-| RAM | 8GB+ | Pro tier (24GB) |
-| CPU | 4+ cores | Pro tier (4 cores) |
-| Python | 3.9+ | 3.9+ |
-| Storage | 5GB+ free | - |
-| GPU | CUDA-compatible | Pro tier GPU |
+
+| Component | Local Deployment | Google Colab       |
+| --------- | ---------------- | ------------------ |
+| RAM       | 8GB+             | Pro tier (24GB)    |
+| CPU       | 4+ cores         | Pro tier (4 cores) |
+| Python    | 3.9+             | 3.9+               |
+| Storage   | 5GB+ free        | -                  |
+| GPU       | CUDA-compatible  | Pro tier GPU       |
 
 ## Key Features
 
@@ -48,7 +51,7 @@ Below is an illustration of LocalLab's architecture:
 
 ```mermaid
 graph TD;
-    A[User] --> B[LocalLab Client (Python/Node.js)];
+    A[User] --> B[LocalLab Client (Python and Node.js)];
     B --> C[LocalLab Server];
     C --> D[Model Manager];
     D --> E[Hugging Face Models];
@@ -57,6 +60,7 @@ graph TD;
 ```
 
 ### Model Loading & Optimization Flow
+
 ```mermaid
 graph TD;
     A[Load Model Request] --> B{Check Resources};
@@ -70,6 +74,7 @@ graph TD;
 ```
 
 ### Resource Management Flow
+
 ```mermaid
 graph TD;
     A[Client Request] --> B[Resource Monitor];
@@ -98,35 +103,38 @@ sequenceDiagram
 
 ## Documentation
 
-📚 [Read the full documentation](./docs/README.md)
+📚 [Read the full documentation](https://github.com/Developer-Utkarsh/LocalLab/blob/main/docs/README.md)
 
-- [Getting Started Guide](./docs/guides/getting-started.md)
-- [Python Client](./docs/clients/python/README.md)
-- [Node.js Client](./docs/clients/nodejs/README.md)
-- [Client Comparison](./docs/clients/comparison.md)
-- [Google Colab Guide](./docs/colab/README.md)
-- [API Reference](./docs/guides/api.md)
+- [Getting Started Guide](https://github.com/Developer-Utkarsh/LocalLab/blob/main/docs/guides/getting-started.md)
+- [Python Client](https://github.com/Developer-Utkarsh/LocalLab/blob/main/docs/clients/python/README.md)
+- [Node.js Client](https://github.com/Developer-Utkarsh/LocalLab/blob/main/docs/clients/nodejs/README.md)
+- [Client Comparison](https://github.com/Developer-Utkarsh/LocalLab/blob/main/docs/clients/comparison.md)
+- [Google Colab Guide](https://github.com/Developer-Utkarsh/LocalLab/blob/main/docs/colab/README.md)
+- [API Reference](https://github.com/Developer-Utkarsh/LocalLab/blob/main/docs/guides/api.md)
 
 ## Usage Guides
 
-- Detailed setup and usage instructions can be found in our [Documentation](./docs/README.md).
-- For Python-specific details, check out our [Python Client Guide](./docs/clients/python.md).
-- For Node.js-specific details, refer to our [Node.js Client Guide](./docs/clients/nodejs.md).
+- Detailed setup and usage instructions can be found in our [Documentation](https://github.com/Developer-Utkarsh/LocalLab/blob/main/docs/README.md).
+- For Python-specific details, check out our [Python Client Guide](https://github.com/Developer-Utkarsh/LocalLab/blob/main/docs/clients/python.md).
+- For Node.js-specific details, refer to our [Node.js Client Guide](https://github.com/Developer-Utkarsh/LocalLab/blob/main/docs/clients/nodejs.md).
 
 ## Get Started
 
 1. **Installation:**
+
    ```bash
    pip install locallab
    ```
 
 2. **Starting the Server Locally:**
+
   ```python
   from locallab import start_server
   start_server()
   ```
 
 3. **Starting the Server on Google Colab:**
+
   ```python
   !pip install locallab
   import os
@@ -145,7 +153,7 @@ sequenceDiagram
 
 - Report issues on our [GitHub Issues](https://github.com/Developer-Utkarsh/LocalLab/issues).
 - Participate in discussions on our [Community Forum](https://github.com/Developer-Utkarsh/LocalLab/discussions).
-- Learn how to contribute by reading our [Contributing Guidelines](./docs/guides/contributing.md).
+- Learn how to contribute by reading our [Contributing Guidelines](https://github.com/Developer-Utkarsh/LocalLab/blob/main/docs/guides/contributing.md).
 
 ---
````
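Step 3's Colab snippet is cut off in this hunk. Going by the `start_server(use_ngrok=...)` signature touched in `locallab/main.py` below, a hypothetical completion might look like:

```python
# Hypothetical Colab cell, inferred from start_server's signature in this commit.
from locallab import start_server

# use_ngrok exposes the Colab-hosted server through a public ngrok tunnel.
start_server(use_ngrok=True)
```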

docs/colab/README.md

Lines changed: 11 additions & 0 deletions
````diff
@@ -3,6 +3,7 @@
 ## Quick Start
 
 The fastest way to get started is to use our [Interactive Colab Guide](./locallab_colab_guide.ipynb). This notebook provides:
+
 - Step-by-step setup
 - Interactive configuration
 - Usage examples
@@ -36,3 +37,13 @@ The fastest way to get started is to use our [Interactive Colab Guide](./localla
 - Check our [Troubleshooting Guide](./troubleshooting.md)
 - Visit our [FAQ](./faq.md)
 - Open an [Issue](https://github.com/Developer-Utkarsh/LocalLab/issues)
+
+```mermaid
+graph TD;
+    A[User] --> B[LocalLab Client (Python and Node.js)];
+    B --> C[LocalLab Server];
+    C --> D[Model Manager];
+    D --> E[Hugging Face Models];
+    C --> F[Optimizations];
+    C --> G[Resource Monitoring];
+```
````

locallab/__init__.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -2,7 +2,7 @@
 LocalLab - A lightweight AI inference server
 """
 
-__version__ = "0.0"
+__version__ = "0.1.0"
 
 from typing import Dict, Any, Optional
 
```
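With the "0.0" placeholder replaced, installed copies now report a real release number; a quick check, assuming `locallab` 0.1.0 is installed:

```python
# Verify the installed package carries the new version string.
import locallab

print(locallab.__version__)  # expected: 0.1.0
```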

locallab/main.py

Lines changed: 48 additions & 74 deletions
```diff
@@ -22,6 +22,16 @@
 import requests
 import multiprocessing
 
+# New: Define a LogQueueWriter to redirect writes to a multiprocessing.Queue
+class LogQueueWriter:
+    def __init__(self, queue):
+        self.queue = queue
+    def write(self, msg):
+        if msg.strip() != "":
+            self.queue.put(msg)
+    def flush(self):
+        pass
+
 from . import __version__  # Import version from package
 from .model_manager import ModelManager
 from .config import (
```
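A minimal, standalone illustration of what this writer does: while the stream is redirected, anything printed lands on the queue instead of the console. The class body is copied from the hunk above; the driver code is only a sketch:

```python
# Demo of the LogQueueWriter pattern (driver code is illustrative).
import multiprocessing
import sys

class LogQueueWriter:
    """File-like shim that forwards non-empty writes to a queue."""
    def __init__(self, queue):
        self.queue = queue

    def write(self, msg):
        if msg.strip() != "":
            self.queue.put(msg)

    def flush(self):
        pass  # nothing is buffered locally

if __name__ == "__main__":
    q = multiprocessing.Queue()
    sys.stdout = LogQueueWriter(q)
    print("hello from the redirected stream")  # text goes to q; the bare "\n" write is dropped
    sys.stdout = sys.__stdout__  # restore the real stdout
    print("queued:", repr(q.get(timeout=1)))  # -> 'hello from the redirected stream'
```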
```diff
@@ -669,13 +679,20 @@ def setup_ngrok(port: int = 8000, max_retries: int = 3) -> Optional[str]:
         logger.error("Failed to establish ngrok tunnel after all retries")
         raise
 
-def run_server_proc():
-    try:
-        # Redirect stdout and stderr to 'server.log' to capture all printed output
-        log_file = "server.log"
-        sys.stdout = open(log_file, "a", buffering=1)
-        sys.stderr = open(log_file, "a", buffering=1)
+# Modify run_server_proc to accept a log_queue and redirect stdout/stderr
 
+def run_server_proc(log_queue):
+    try:
+        # Redirect stdout and stderr to the log queue
+        log_writer = LogQueueWriter(log_queue)
+        sys.stdout = log_writer
+        sys.stderr = log_writer
+
+        # Attach a logging handler to send log messages to the queue
+        handler = logging.StreamHandler(log_writer)
+        handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
+        logger.addHandler(handler)
+
         if "COLAB_GPU" in os.environ:
             import nest_asyncio
             nest_asyncio.apply()
@@ -691,15 +708,21 @@ def run_server_proc():
         logger.error(f"Server startup failed: {str(e)}")
         raise
 
-def start_server(use_ngrok: bool = False):
+# Modify start_server to accept a log_queue parameter and pass it to the child process
+
+def start_server(use_ngrok: bool = False, log_queue=None):
     import time
     import requests
-
+
+    # If no log_queue provided, create one (though normally parent supplies it)
+    if log_queue is None:
+        log_queue = multiprocessing.Queue()
+
     # Start the server in a separate process using spawn context with module-level run_server_proc
     ctx = multiprocessing.get_context("spawn")
-    p = ctx.Process(target=run_server_proc)
+    p = ctx.Process(target=run_server_proc, args=(log_queue,))
     p.start()
-
+
     # Wait until the /health endpoint returns 200 or timeout
     timeout = 30
     start_time_loop = time.time()
@@ -714,16 +737,16 @@ def start_server(use_ngrok: bool = False):
         except Exception:
             pass
         time.sleep(1)
-
+
     if not server_ready:
         raise Exception("Server did not become healthy in time.")
-
+
     if use_ngrok:
         public_url = setup_ngrok(port=8000)
         ngrok_section = f"\n{Fore.CYAN}┌────────────────────────── Ngrok Tunnel Details ─────────────────────────────┐{Style.RESET_ALL}\n\n│ 🚀 Ngrok Public URL: {Fore.GREEN}{public_url}{Style.RESET_ALL}\n\n{Fore.CYAN}└──────────────────────────────────────────────────────────────────────────────┘{Style.RESET_ALL}\n"
         logger.info(ngrok_section)
         print(ngrok_section)
-
+
     # Wait indefinitely until a KeyboardInterrupt is received
     try:
         while True:
```
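Most of the readiness loop is unchanged context, so only its comments show up in this hunk. A sketch of the pattern those comments describe, assuming the server exposes `GET /health` on port 8000:

```python
# Readiness poll sketch (assumes a GET /health endpoint on port 8000).
import time
import requests

def wait_for_health(url="http://localhost:8000/health", timeout=30):
    """Return True once the endpoint answers 200, False if the timeout elapses."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if requests.get(url, timeout=2).status_code == 200:
                return True
        except requests.RequestException:
            pass  # server not accepting connections yet
        time.sleep(1)
    return False
```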
```diff
@@ -733,68 +756,17 @@
             p.terminate()
             p.join()
 
-@app.middleware("http")
-async def log_requests(request: Request, call_next):
-    """Log API requests with detailed information"""
-    start_time = time.time()
-    response = await call_next(request)
-    process_time = (time.time() - start_time) * 1000
-
-    # Get current resource usage
-    cpu_percent = psutil.cpu_percent()
-    memory = psutil.virtual_memory()
-    gpu_mem = torch.cuda.memory_allocated(0) / 1024**2 if torch.cuda.is_available() else 0
-
-    # Create status color based on response code
-    status_color = Fore.GREEN if response.status_code < 300 else Fore.RED if response.status_code >= 400 else Fore.YELLOW
-
-    # Get endpoint description and other details
-    endpoint_desc = {
-        "/generate": "Text Generation",
-        "/chat": "Chat Completion",
-        "/generate/batch": "Batch Generation",
-        "/models/current": "Current Model Info",
-        "/models/available": "Available Models",
-        "/system/info": "System Status"
-    }.get(request.url.path, "API Request")
-
-    # Format request log with additional active model info
-    log_message = f"""
-{Fore.CYAN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━{Style.RESET_ALL}
-{Fore.CYAN}{endpoint_desc}{Style.RESET_ALL}
-{Fore.CYAN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━{Style.RESET_ALL}
-
-• Time: {Fore.GREEN}{time.strftime('%H:%M:%S')}{Style.RESET_ALL}
-• Method: {Fore.MAGENTA}{request.method}{Style.RESET_ALL}
-• Path: {Fore.YELLOW}{request.url.path}{Style.RESET_ALL}
-• Status: {status_color}{response.status_code}{Style.RESET_ALL}
-• Duration: {Fore.GREEN}{process_time:.2f}ms{Style.RESET_ALL}
-
-{Fore.YELLOW}Resource Usage:{Style.RESET_ALL}
-• CPU: {Fore.YELLOW}{cpu_percent}%{Style.RESET_ALL}
-• RAM: {Fore.YELLOW}{memory.percent}%{Style.RESET_ALL}
-• Active Model: {Fore.YELLOW}{model_manager.current_model}{Style.RESET_ALL}
-"""
-    logger.info(log_message)
-    return response
+# Define a log listener function in the parent to print messages from the log queue
 
-# Define a helper function to tail the log file and print its content to stdout
-import time
-import threading
-
-def tail_log_file(filepath):
-    last_pos = 0
+def log_listener(queue):
     while True:
         try:
-            with open(filepath, 'r') as f:
-                f.seek(last_pos)
-                new_content = f.read()
-                if new_content:
-                    print(new_content, end='')
-                last_pos = f.tell()
+            msg = queue.get()
+            if msg is None:
+                break
+            print(msg, end='')
         except Exception as e:
             pass
-        time.sleep(1)
 
 if __name__ == "__main__":
     import multiprocessing
@@ -804,7 +776,9 @@ def tail_log_file(filepath):
         logger.warning("multiprocessing start method already set: " + str(e))
 
     import threading
-    tail_thread = threading.Thread(target=tail_log_file, args=("server.log",), daemon=True)
-    tail_thread.start()
-
-    start_server(use_ngrok=True)
+    # Create a log queue and start the listener thread
+    log_queue = multiprocessing.Queue()
+    listener_thread = threading.Thread(target=log_listener, args=(log_queue,), daemon=True)
+    listener_thread.start()
+
+    start_server(use_ngrok=True, log_queue=log_queue)
```
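Taken together, the commit swaps file tailing for a queue: the spawned server process writes into a `multiprocessing.Queue`, and a daemon thread in the parent drains it. A standalone sketch of that wiring, adding a `None` sentinel shutdown (which `log_listener` already checks for):

```python
# Standalone sketch of the parent/child logging wiring (illustrative).
import multiprocessing
import threading

def log_listener(queue):
    # Drain messages until the None sentinel arrives.
    while True:
        msg = queue.get()
        if msg is None:
            break
        print(msg, end="")

def child(queue):
    # Stands in for run_server_proc, which writes into the queue via LogQueueWriter.
    queue.put("child: starting\n")
    queue.put("child: done\n")

if __name__ == "__main__":
    ctx = multiprocessing.get_context("spawn")  # same start method start_server uses
    q = ctx.Queue()
    t = threading.Thread(target=log_listener, args=(q,), daemon=True)
    t.start()

    p = ctx.Process(target=child, args=(q,))
    p.start()
    p.join()

    q.put(None)  # stop the listener once the child has exited
    t.join()
```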
