xebia-gptvision
Service icon

Xebia_GPTVision

Stable version 1.0.0 (Compatible with OutSystems 11)
Uploaded
 on 17 December 2024
 by 
5.0
 (3 ratings)
xebia-gptvision

Xebia_GPTVision

Documentation
1.0.0

About The Connector & Demo 

The GPT Vision Connector is designed to leverage OpenAI's advanced capabilities for understanding visual content. It doesn’t just identify objects in an image – it goes much deeper to provide a comprehensive understanding of what’s happening in the picture. Accurately identifies the colors present in the image, understands the relationships between objects and can explain actions or interactions (e.g., "What are they doing?") and provides highly specific and detailed interpretations of the image. Alongside, we've crafted a functional demo application showcasing its use. 

 
Pre-requisites  

Here is the step-by-step documentation for getting the API key for "gpt-4o” model from Open AI. 

Click on the below URL to proceed further 

  1. Create an OpenAI account‍ 

  1. Verify your account‍ 

  1. Log into your account‍ 

  1. Navigate to the API section. 

  1. Generate a new API key. 

  1. Save your API key. 

 

Configuring OutSystems Demo Application 

 

Then add that key as Site property of our demo application to continue our services. 
 

 

 

 

About the Demo Application 

Step 1: Access the Demo Page 

 

1.Select File: Use the "Select File" option and the upload icon to upload an image. The uploaded image will then be displayed in the image preview. 

2.Dropdown Box: Uploaded files will appear in the dropdown, 

3.Delete Icon: You can delete the selected file from the dropdown list by clicking the delete icon. 

4.User Question: Type your question in the "User Question" box, and the answer will be displayed in the "Answer" section. 

5.Clear Chat: To reset the "Answer" section, use the "Clear Chat" button 

  This setup allows you to upload, select, delete, and interact with documents seamlessly. 

 

Step 2: Upload an Image File 

Upload an image file using the "Select File" option. The chosen image will appear in the preview section below. Click the upload icon to complete the process, which may take a few seconds. Once uploaded, the image will be available in the "Select document" dropdown. 
 
Step 3: Ask your Question 
 
Enter your question in the "User Question" box on the left side of the page, where the prompt says, "How can we help you?" For example, type "How many people are there?" and click the send icon to submit your query. 

  
Step 4: Generate the answer 
 
After submitting your query, OpenAI GPT Vision will process it and display the response in the "Answer" section, which also shows the chat history. To clear all chats, click the "Clear Chat" button, or delete a specific response using the delete icon next to it. 

 

Example Prompts and Responses: 

1.Identifying Objects  

Prompt: "How many people are there?" 

Response: "There are three people in the image." 

GPT Vision can accurately identify and count objects in the image. 

2.Explaining Relationships 

Prompt: "What are the doing?" 

Response: "They are harvesting tea leaves in a tea plantation." 

It not only recognizes objects but also understands their actions and relationships within the image. 

3.Describing Colors 

Prompt: "Can you list the colors you see in the image?" 

Response: "The image contains the following colors: Green (tea plants and trees) - White (shirts) - Red (scarves) - Purple (head covering) - Blue (scarf) - Yellow/Orange (bag) - Brown (tree trunks) - Black (clothing elements) - Beige (parts of clothing) 

GPT Vision provides a detailed breakdown of the colors in the image, highlighting even subtle variations. 

Note: Please refer to the image for prompt output 

 

Use Cases:  

1.Image Analysis and Description 

  • Generate detailed descriptions of images for accessibility (e.g., for visually impaired users). 

  • Identify objects, people, or actions in photos. 

  • Analyze complex visuals like charts, graphs, and infographics. 

 

2. Retail and E-commerce 

  • Identify products and categorize them based on visuals. 

  • Analyze images to detect product colors, patterns, and styles. 

  • Provide detailed item descriptions for online listings. 

3. Education and Training 

  • Explain diagrams, illustrations, and educational visuals. 

  • Provide detailed feedback on visual assignments or creative works. 

  • Support interactive learning with visual-based question-answering.