Introduction
A few days ago, Google presented its own multimodal LLM, named “Gemini”.
There was also an article, “How to Integrate google’s gemini AI model into R”, that briefly explains how to use the Gemini API in R.
Thanks to Deepanshu Bhalla (the author of that article), I got a lot of inspiration and did some research on how to make more use of the Gemini API, and I’m glad to share the results with you.
In this article, I want to highlight how to use Gemini with R and Shiny via an R package for the Gemini API.
(You can see the result and contribute in the GitHub repository: gemini.r)
Gemini API
As of today (December 26, 2023), the Gemini API mainly consists of 4 things. You can see more details in the official docs.
1. Gemini Pro: takes text and returns text
2. Gemini Pro Vision: takes text and an image and returns text
3. Gemini Pro Multi-turn: multi-turn conversation (chat)
4. Embedding: text embeddings for NLP
I’ll use 1 and 2.
You can get API keys in Google AI Studio.
However, the official docs don’t describe how to use the Gemini API in R. (How sad.)
But we can treat it as a REST API (I’ll explain that later).
Shiny application
I sketched a very simple Shiny application that uses the Gemini API: it takes an image and a text prompt (for example, “Explain this picture”) and returns Gemini’s answer.
(The numbers indicate the expected user flow.)
The UI consists of 5 components:
1. fileInput to upload an image
2. imageOutput to show the uploaded image
3. textInput for the prompt
4. actionButton to send the request to Gemini
5. textOutput to show the value returned from Gemini
And this is the resulting Shiny app and its R code (again, you can see it in the GitHub repository).
—
library(shiny)
library(gemini.R)

ui <- fluidPage(
  sidebarLayout(
    NULL, # no sidebar panel; everything lives in the main panel
    mainPanel(
      # 1. Upload an image
      fileInput(
        inputId = "file",
        label = "Choose file to upload"
      ),
      # 2. Preview the uploaded image
      div(
        style = "border: solid 1px blue;",
        imageOutput(outputId = "image1")
      ),
      # 3. Prompt to send along with the image
      textInput(
        inputId = "prompt",
        label = "Prompt",
        placeholder = "Enter Prompts Here"
      ),
      # 4. Send the request to Gemini
      actionButton("goButton", "Ask to gemini"),
      # 5. Show Gemini's answer
      div(
        style = "border: solid 1px blue; min-height: 100px;",
        textOutput("text1")
      )
    )
  )
)

server <- function(input, output) {
  # Render the uploaded image once a file has been chosen
  output$image1 <- renderImage({
    req(input$file)
    list(src = input$file$datapath)
  }, deleteFile = FALSE)

  # Call Gemini only when the button is clicked, not on every keystroke
  answer <- eventReactive(input$goButton, {
    req(input$file)
    gemini_image(input$prompt, input$file$datapath)
  })

  output$text1 <- renderText(answer())
}

shinyApp(ui = ui, server = server)
—
gemini.R package
You may be wondering, “What is the gemini_image function?”
It is a function that sends a request to the Gemini server and returns the result.
It consists of 3 main parts:
1. Model query
2. API key
3. Content
I used the gemini_image function in the example, but I’ll explain the gemini function first (it sends text and gets text back).
Gemini’s REST API is called with a POST request: the model query goes in the URL, the API key in the query string, and the content in the JSON body.
The same request can be written in R with an HTTP client package.
Also, the Gemini API key must be set with the “Sys.setenv” function before use.
The main thing to note is that the body of the request is built mostly from nested lists.
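The three parts listed above (model query, API key, content) can be sketched in R roughly as follows. This is a minimal sketch, not the package’s actual source. It assumes the httr2 package, an environment variable named GEMINI_API_KEY, and the v1beta generateContent endpoint with the gemini-pro model name from Google’s REST docs at the time of writing. The names build_text_body and gemini_sketch are hypothetical and used only for illustration.

```r
# Illustrative sketch only; the real gemini.R implementation may differ.
# Assumptions: httr2 is installed, the key lives in GEMINI_API_KEY, and the
# endpoint and model name match Google's REST docs at the time of writing.

# 3. Content: the JSON body is expressed as nested R lists.
build_text_body <- function(prompt) {
  list(
    contents = list(
      list(parts = list(list(text = prompt)))
    )
  )
}

gemini_sketch <- function(prompt) {
  # 2. API key: read from the environment and fail early if it is missing
  api_key <- Sys.getenv("GEMINI_API_KEY")
  if (!nzchar(api_key)) {
    stop("Set GEMINI_API_KEY first, e.g. Sys.setenv(GEMINI_API_KEY = '...')")
  }

  # 1. Model query: the model and the method are part of the URL
  url <- paste0(
    "https://generativelanguage.googleapis.com/v1beta/models/",
    "gemini-pro:generateContent"
  )

  req <- httr2::request(url)
  req <- httr2::req_url_query(req, key = api_key)
  req <- httr2::req_body_json(req, build_text_body(prompt)) # implies POST
  resp <- httr2::req_perform(req)

  # Return the text of the first candidate in the response
  httr2::resp_body_json(resp)$candidates[[1]]$content$parts[[1]]$text
}
```

With a valid key set via Sys.setenv(GEMINI_API_KEY = "..."), a call such as gemini_sketch("Explain this in one sentence") should return Gemini’s answer as a character string. Keeping the body builder separate from the HTTP call makes the list structure easy to inspect without sending a request.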
Similarly, the gemini_image function for the Gemini Pro Vision API sends the prompt together with the image.
Note that the image must be encoded as base64 using the base64encode function and provided as a separate list.
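The image request can be sketched in the same way. Again, this is only a sketch under assumptions, not the package’s actual code. It assumes the httr2 and base64enc packages, the GEMINI_API_KEY environment variable, and the gemini-pro-vision model name with the inline_data body format from Google’s REST docs at the time of writing. The names build_image_body, guess_image_mime and gemini_image_sketch are hypothetical.

```r
# Illustrative sketch only; the real gemini_image may differ.
# Assumptions: httr2 and base64enc are installed, the key lives in
# GEMINI_API_KEY, and the gemini-pro-vision endpoint matches Google's REST
# docs at the time of writing.

# The prompt and the image are two separate elements of `parts`.
build_image_body <- function(prompt, image_base64, mime_type = "image/png") {
  list(
    contents = list(
      list(parts = list(
        list(text = prompt),
        list(inline_data = list(mime_type = mime_type, data = image_base64))
      ))
    )
  )
}

# Map a file extension to the MIME type sent with the image.
guess_image_mime <- function(path) {
  switch(
    tolower(tools::file_ext(path)),
    png = "image/png",
    jpg = ,
    jpeg = "image/jpeg",
    stop("Unsupported image type: ", path)
  )
}

gemini_image_sketch <- function(prompt, path) {
  api_key <- Sys.getenv("GEMINI_API_KEY")
  if (!nzchar(api_key)) stop("Set GEMINI_API_KEY first")

  # Read the file and encode its contents as a base64 string
  image_base64 <- base64enc::base64encode(path)
  body <- build_image_body(prompt, image_base64, guess_image_mime(path))

  url <- paste0(
    "https://generativelanguage.googleapis.com/v1beta/models/",
    "gemini-pro-vision:generateContent"
  )
  req <- httr2::request(url)
  req <- httr2::req_url_query(req, key = api_key)
  req <- httr2::req_body_json(req, body) # implies POST
  resp <- httr2::req_perform(req)

  # Return the text of the first candidate in the response
  httr2::resp_body_json(resp)$candidates[[1]]$content$parts[[1]]$text
}
```

The only differences from the text-only request are the model name in the URL and the extra inline_data element that carries the encoded image and its MIME type.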
Example
With the Shiny application and the gemini.R package in place, you can now run the example application and ask Gemini about an image.
Summary
I made a very basic R package, “gemini.R”, for using the Gemini API.
It provides 2 functions: gemini and gemini_image.
There are still many possibilities for developing this package further,
such as a chat feature like Bard or NLP embeddings.
Finally, I’d like to hear feedback or contributions from you. (Really.)
Thanks.
* P.S. I think just using Bard / ChatGPT / Copilot is much better for personal use (unless you want to provide an AI service via R).