Multi-modal image analysis
Analyze images with GPT-4 Vision using our curated gallery of demo images.
Explore AI's visual understanding with suggested prompts and detailed insights.
AI can identify objects, read text, analyze scenes, and understand visual content in remarkable detail.
Typical uses include document analysis, scene description, object identification, text extraction, visual reasoning, and creative interpretation.
GPT-4o: Latest multimodal model with excellent vision capabilities
GPT-4o Mini: Faster, cost-effective option
GPT-4 Vision Preview: Original vision model
High: More detailed analysis, higher cost
Auto: Lets the API choose between low and high based on the image size (the default in the code below)
Low: Faster processing, lower cost
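One way to keep these model and detail options from drifting into free-form strings is to model them as enums. The sketch below is illustrative only: `VisionModel`, `ImageDetail`, and `fromString` are hypothetical names that do not appear in the original code, and the `gpt-4o-mini` and `gpt-4-vision-preview` identifiers are assumed to match OpenAI's published model IDs.

```kotlin
// Hypothetical type-safe wrappers for the options listed above.
// The enum names are illustrative; the string values are what would be sent to the API.
enum class VisionModel(val id: String) {
    GPT_4O("gpt-4o"),
    GPT_4O_MINI("gpt-4o-mini"),                  // assumed model ID
    GPT_4_VISION_PREVIEW("gpt-4-vision-preview") // assumed model ID
}

enum class ImageDetail(val apiValue: String) {
    HIGH("high"),
    AUTO("auto"),
    LOW("low");

    companion object {
        // Case-insensitive lookup that falls back to AUTO for unrecognized input.
        fun fromString(value: String): ImageDetail =
            values().firstOrNull { it.apiValue.equals(value.trim(), ignoreCase = true) } ?: AUTO
    }
}

fun main() {
    val detail = ImageDetail.fromString("High")
    println("${VisionModel.GPT_4O.id} with detail=${detail.apiValue}")
}
```

The `id` and `apiValue` properties can be passed wherever the existing code expects the `model` and `detail` strings, so the enums could be adopted without changing the function signature.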
Key Kotlin code for AI vision analysis using OpenAI's GPT-4 Vision API:
// Analyze an image with a vision-capable model.
// The user message carries two content parts: the text prompt and the image reference.
suspend fun createVisionCompletion(
    prompt: String,
    imageUrl: String,
    model: String = "gpt-4o",
    maxTokens: Int = 500,
    detail: String = "auto" // "low", "high", or "auto"
): ChatCompletionResponse {
    val messages = listOf(
        Message(
            role = "user",
            content = listOf(
                ContentPart(type = "text", text = prompt),
                // Must serialize as the "image_url" field in the request JSON
                ContentPart(type = "image_url", imageUrl = ImageUrl(url = imageUrl, detail = detail))
            )
        )
    )
    return createChatCompletionWithMessages(messages, model, maxTokens)
}
// Usage in controller (createVisionCompletion is a suspend function,
// so call it from a coroutine or another suspend function)
val visionResponse = openAI.createVisionCompletion(
    prompt = "Analyze this image and describe what you see",
    imageUrl = "https://example.com/image.jpg",
    model = "gpt-4o",
    detail = "auto"
)
val analysis = visionResponse.text()
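The snippet above relies on `Message`, `ContentPart`, `ImageUrl`, and `createChatCompletionWithMessages`, none of which are shown. As a rough sketch, assuming the data classes look as below and that the request is sent as JSON to the Chat Completions endpoint, the request body could be assembled like this. `jsonString`, `toJson`, and `buildVisionRequestBody` are hypothetical helpers written for illustration; a real project would more likely use a JSON library.

```kotlin
// Assumed shapes for the types referenced in the snippet above (not from the original).
data class ImageUrl(val url: String, val detail: String = "auto")

data class ContentPart(
    val type: String,
    val text: String? = null,
    val imageUrl: ImageUrl? = null
)

data class Message(val role: String, val content: List<ContentPart>)

// Minimal JSON string escaping: backslash, double quote, and control characters.
fun jsonString(s: String): String = buildString {
    append('"')
    for (c in s) {
        when {
            c == '\\' -> append("\\\\")
            c == '"' -> append("\\\"")
            c == '\n' -> append("\\n")
            c == '\r' -> append("\\r")
            c == '\t' -> append("\\t")
            c < ' ' -> append("\\u%04x".format(c.code))
            else -> append(c)
        }
    }
    append('"')
}

// Serialize one content part; the Kotlin property imageUrl becomes "image_url" on the wire.
fun ContentPart.toJson(): String = when (type) {
    "text" -> """{"type":"text","text":${jsonString(text.orEmpty())}}"""
    "image_url" -> {
        val img = requireNotNull(imageUrl) { "image_url part needs an ImageUrl" }
        """{"type":"image_url","image_url":{"url":${jsonString(img.url)},"detail":${jsonString(img.detail)}}}"""
    }
    else -> throw IllegalArgumentException("Unknown content part type: $type")
}

// Hypothetical helper: build the JSON body for a vision chat completion request.
fun buildVisionRequestBody(messages: List<Message>, model: String, maxTokens: Int): String {
    val messagesJson = messages.joinToString(",") { m ->
        val parts = m.content.joinToString(",") { it.toJson() }
        """{"role":${jsonString(m.role)},"content":[$parts]}"""
    }
    return """{"model":${jsonString(model)},"max_tokens":$maxTokens,"messages":[$messagesJson]}"""
}

fun main() {
    val body = buildVisionRequestBody(
        messages = listOf(
            Message(
                role = "user",
                content = listOf(
                    ContentPart(type = "text", text = "Analyze this image and describe what you see"),
                    ContentPart(type = "image_url", imageUrl = ImageUrl("https://example.com/image.jpg"))
                )
            )
        ),
        model = "gpt-4o",
        maxTokens = 500
    )
    println(body)
}
```

Hand-building JSON keeps the sketch free of dependencies, but it only covers the fields used here. With a serialization library, the `imageUrl` property would need an explicit field-name mapping to `image_url`.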