Building a Document AI Assistant with the OpenAI Assistants API

The Assistants API empowers developers to build AI-powered assistants within their applications. These assistants can leverage OpenAI’s advanced models, tools, and files to handle complex tasks, process user queries, and enhance productivity.

Currently in beta, the Assistants API supports multiple tools and functionalities, enabling developers to create intelligent workflows and integrate AI features seamlessly.

Introduction

The OpenAI Assistants API allows developers to build AI-powered assistants that can process queries, analyze documents, and deliver context-aware responses. This API supports advanced features, such as file handling, code interpretation, and thread management, making it easier to integrate AI capabilities into applications.

This blog post will guide you through:

  • Key features of the OpenAI Assistants API.
  • Cost structure and optimization strategies.
  • A real-world example: building a Document AI Assistant app with SharePoint Online and Azure Functions.

Let’s dive into how you can use this API to create smarter workflows and automate tasks!


What Is the OpenAI Assistants API?

The OpenAI Assistants API enables developers to create AI-powered agents that can:

  • Answer queries based on uploaded files.
  • Perform calculations and interpret data using the Code Interpreter tool.
  • Maintain context in conversations using threads.
  • Stream responses for real-time interactions.

It is designed for building workflows such as question-answering, data processing, and document reviews.
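
To make the moving parts more concrete, here is a minimal sketch of the HTTP requests behind a basic flow: create an assistant, start a thread, add a user message with an attached file, and start a run. The endpoint paths and the `OpenAI-Beta: assistants=v2` header follow the public v2 REST API, where the retrieval tool is named `file_search`. The helper names, the assistant name, and the instructions text are illustrative assumptions, and the code only builds request descriptions; it does not send anything.

```typescript
// Minimal description of an HTTP request; nothing here touches the network.
interface ApiRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

const BASE_URL = "https://api.openai.com/v1";

// Build a JSON POST request with the headers the Assistants API (v2 beta) expects.
function buildRequest(apiKey: string, path: string, payload: unknown): ApiRequest {
  return {
    method: "POST",
    url: `${BASE_URL}${path}`,
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
      "OpenAI-Beta": "assistants=v2",
    },
    body: JSON.stringify(payload),
  };
}

// Step 1: an assistant that can search uploaded files and run code.
function createAssistantRequest(apiKey: string, model: string): ApiRequest {
  return buildRequest(apiKey, "/assistants", {
    model,
    name: "Document AI Assistant", // hypothetical display name
    instructions: "Answer questions using only the attached document.", // hypothetical
    tools: [{ type: "file_search" }, { type: "code_interpreter" }],
  });
}

// Step 2: an empty thread that will hold the conversation history.
function createThreadRequest(apiKey: string): ApiRequest {
  return buildRequest(apiKey, "/threads", {});
}

// Step 3: a user message that attaches a previously uploaded file.
function addMessageRequest(
  apiKey: string,
  threadId: string,
  question: string,
  fileId: string
): ApiRequest {
  return buildRequest(apiKey, `/threads/${threadId}/messages`, {
    role: "user",
    content: question,
    attachments: [{ file_id: fileId, tools: [{ type: "file_search" }] }],
  });
}

// Step 4: a run asks the assistant to respond to the thread so far.
function createRunRequest(apiKey: string, threadId: string, assistantId: string): ApiRequest {
  return buildRequest(apiKey, `/threads/${threadId}/runs`, { assistant_id: assistantId });
}
```

Reusing the same thread ID for follow-up questions is what keeps the conversation context: each new message is appended to the thread, and each new run sees the earlier turns.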

Key Features

  1. Context-Aware Conversations: Maintain context through threads for multi-turn interactions.
  2. File Support: Upload files for AI processing and retrieval-based answers.
  3. Flexible Scalability: Easily integrates with existing enterprise systems.
  4. Rich Toolset: Includes retrieval tools, code interpreters, and data analysis capabilities.
  5. Custom Workflows: Build assistants for document reviews, compliance checks, and knowledge exploration.



Supported File Types and Limits

Cost Considerations

  • Usage Pricing: Costs depend on the number of tokens processed (input and output) and on the size of the files you store.
  • Model Pricing:
    • GPT-4 is more expensive than GPT-3.5.
    • Additional charges apply for the retrieval tool and the Code Interpreter tool.

Cost Breakdown for GPT-4 Model

Key Cost Factors:

  1. Token Usage: Both input (prompts) and output (responses) are charged based on token count.
  2. Retrieval Tool Costs: Charges apply for data storage and retrieval if files are used.
  3. Code Interpreter Fees: Costs are incurred when executing code for data analysis.

Cost Optimization Tips:

  • Efficient Prompts: Write concise prompts to minimize token usage.
  • Optimize Data Storage: Upload only necessary files in optimized formats.
  • Monitor Usage: Regularly review API usage to control costs.
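
To see why concise prompts matter, it helps to put rough numbers on token usage. The sketch below computes token cost as tokens divided by one million, multiplied by a per-million price, summed over input and output. The prices used here are made-up placeholders rather than actual OpenAI rates, and the calculation ignores tool and file storage fees, so treat it as a way to compare prompts and not as a billing estimate.

```typescript
// Per-million-token prices in USD. These are placeholders, not real OpenAI prices.
interface TokenPrices {
  inputPerMillion: number;
  outputPerMillion: number;
}

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Token cost = tokens / 1,000,000 * price per million, summed over input and output.
function estimateTokenCost(usage: Usage, prices: TokenPrices): number {
  const inputCost = (usage.inputTokens / 1e6) * prices.inputPerMillion;
  const outputCost = (usage.outputTokens / 1e6) * prices.outputPerMillion;
  return inputCost + outputCost;
}

// Sum the cost of many requests, for example one entry per run taken from usage logs.
function estimateTotalCost(requests: Usage[], prices: TokenPrices): number {
  return requests.reduce((total, usage) => total + estimateTokenCost(usage, prices), 0);
}

// Example with made-up prices: $10 per million input tokens, $30 per million output tokens.
const examplePrices: TokenPrices = { inputPerMillion: 10, outputPerMillion: 30 };
const longPrompt: Usage = { inputTokens: 8000, outputTokens: 500 };
const concisePrompt: Usage = { inputTokens: 2000, outputTokens: 500 };

console.log(estimateTokenCost(longPrompt, examplePrices).toFixed(3));    // prints 0.095
console.log(estimateTokenCost(concisePrompt, examplePrices).toFixed(3)); // prints 0.035
```

With these placeholder prices, trimming the prompt from 8,000 to 2,000 input tokens cuts the per-request token cost by more than half, even though the response length is unchanged.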

The Demo App: Document AI Assistant

Overview

The Document AI Assistant is a SharePoint Framework (SPFx) Command Set extension that lets users interact with AI directly within SharePoint Online. It uses the OpenAI Assistants API to process and analyze the document selected in a SharePoint library.

Architecture

Frontend: React-based chat UI built with SPFx Command Set extensions.
Backend: Azure Functions for handling Graph API and OpenAI Assistants API calls.
Graph API: Converts documents to PDF format so that their content can be streamed to the Assistants API.
AI Services: OpenAI Assistants API for question-answering based on the selected document.
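
As a rough sketch of the Graph API part of this architecture, the code below builds the two requests a backend like this would typically need: an app-only token request using the OAuth 2.0 client credentials flow, and a download URL that asks Microsoft Graph to convert the file to PDF via `format=pdf`. The token endpoint and the `/content?format=pdf` path are standard Microsoft identity platform and Graph v1.0 features; the function names and the idea that the demo does it exactly this way are assumptions, since the post does not show the backend code.

```typescript
const GRAPH_BASE = "https://graph.microsoft.com/v1.0";

interface TokenRequest {
  url: string;
  contentType: string;
  body: string;
}

// Encode key/value pairs for an application/x-www-form-urlencoded body.
function formEncode(fields: Array<[string, string]>): string {
  return fields
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
}

// Client credentials token request for the Entra ID app (app-only access to Graph).
function buildTokenRequest(tenantId: string, clientId: string, clientSecret: string): TokenRequest {
  return {
    url: `https://login.microsoftonline.com/${encodeURIComponent(tenantId)}/oauth2/v2.0/token`,
    contentType: "application/x-www-form-urlencoded",
    body: formEncode([
      ["grant_type", "client_credentials"],
      ["client_id", clientId],
      ["client_secret", clientSecret],
      ["scope", "https://graph.microsoft.com/.default"],
    ]),
  };
}

// Download URL that asks Graph to return the drive item converted to PDF.
function buildPdfDownloadUrl(driveId: string, itemId: string): string {
  const drive = encodeURIComponent(driveId);
  const item = encodeURIComponent(itemId);
  return `${GRAPH_BASE}/drives/${drive}/items/${item}/content?format=pdf`;
}
```

The token request is sent as a POST, and the PDF URL is called with a GET and an `Authorization: Bearer <token>` header. Graph answers the conversion request with a redirect to the converted file, so the HTTP client has to follow redirects. Conversion is only supported for certain source formats, such as common Office documents.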

Architecture Diagram


Key Features

  • Interactive Chat Interface: Query the AI assistant directly in SharePoint.
  • AI-Powered Responses: Receive context-aware answers grounded in the selected document.
  • Azure Function Backend: Secure processing through Azure Functions.
  • Graph API Integration: Fetch documents from SharePoint libraries.
  • Multi-Format Support: Analyze PDFs and other file types.
  • Code and Data Interpretation: Perform calculations and data analysis within documents.

Demo

  1. Select a Document: Navigate to the SharePoint document library and select a document.
  2. Activate Assistant: Click the Document AI Assistant button in the toolbar.
  3. Ask Questions: Use the chat interface to query the assistant about document content.

Demo Screenshot


Step-by-Step Guide: Build Your AI Assistant

  1. Set Up Development Environment

    • Install Visual Studio.
    • Create an Office 365 Developer Tenant with a modern site collection.
  2. Register an Entra ID Application

    • Create an Entra ID app.
    • Assign Graph API permissions: Files.Read.All.
    • Note down ClientId, ClientSecret, and TenantID.
  3. Get OpenAI API Key

    • Obtain an API key from OpenAI.
  4. Clone the Repository

    • Clone the project from the repository.
  5. Configure Azure Functions

    • Update local.settings.json with Azure and OpenAI credentials.
    • Deploy or test the Azure Function locally.
  6. Install Dependencies

    • Run npm install in the project folder.
  7. Build and Package Solution

    • Execute:
      gulp bundle --ship  
      gulp package-solution --ship
  8. Deploy to SharePoint

    • Deploy the generated package to the SharePoint app catalog.
    • Add the app to a SharePoint site.
  9. Use the Application

    • Navigate to a document library, select a document, and use the Document AI Assistant command to interact with the assistant.
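
For step 5, a common source of confusing runtime errors is a missing or empty credential in local.settings.json. The sketch below shows one way the backend could validate its settings at startup and fail early with a clear message. The setting names (`OPENAI_API_KEY`, `TENANT_ID`, `CLIENT_ID`, `CLIENT_SECRET`) are hypothetical and may differ from the names used in the repository, so check the project's own settings file before relying on them.

```typescript
interface AssistantSettings {
  openAiApiKey: string;
  tenantId: string;
  clientId: string;
  clientSecret: string;
}

// Shape of an environment-variable map such as process.env.
type Env = { [name: string]: string | undefined };

// Hypothetical setting names; the actual repository may use different keys.
const REQUIRED_SETTINGS = ["OPENAI_API_KEY", "TENANT_ID", "CLIENT_ID", "CLIENT_SECRET"];

// Read and validate all required settings, failing early with a clear message.
function loadSettings(env: Env): AssistantSettings {
  // Treat undefined and whitespace-only values as missing.
  const read = (name: string): string => (env[name] || "").trim();

  const missing = REQUIRED_SETTINGS.filter((name) => read(name) === "");
  if (missing.length > 0) {
    throw new Error(`Missing required settings: ${missing.join(", ")}`);
  }

  return {
    openAiApiKey: read("OPENAI_API_KEY"),
    tenantId: read("TENANT_ID"),
    clientId: read("CLIENT_ID"),
    clientSecret: read("CLIENT_SECRET"),
  };
}
```

When Azure Functions runs locally, the entries under `Values` in local.settings.json are exposed as environment variables, so a Node.js function could call `loadSettings(process.env)` once at startup. In Azure, the same names would be defined as application settings.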

Source code

You can find the complete source code on GitHub.

Author: Ejaz Hussain
Link: https://office365clinic.com/2024/12/29/building-document-assistant-with-openai/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.