Compare commits


14 Commits

Author SHA1 Message Date
mudler
fd12eef074 Ci: do not run jobs for every branch
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 19:17:28 +02:00
mudler
1b2806d139 Better error handling during planning
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 18:58:35 +02:00
mudler
7fca3620f6 Back at arcee-agent as default
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 18:46:17 +02:00
mudler
289a6ce4c8 Simplify
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 18:38:20 +02:00
mudler
0ac5c13c4d docker compose unification, small changes
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 18:17:43 +02:00
mudler
4858f85ade Enable reasoning in some of the tests
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 18:00:58 +02:00
mudler
71320bc1cb chore(tests): set timeout
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 17:49:06 +02:00
mudler
525efb264e use openthinker, it's smaller
2025-04-12 17:37:19 +02:00
mudler
be5d1a7d80 use 12b
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 14:49:51 +02:00
mudler
9135e3fe57 this is not necessary anymore
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 14:43:29 +02:00
mudler
424ef2dedf change base cpu model
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 13:59:30 +02:00
mudler
0e3df1562c chore: cleanup, identify goal from conversation when evaluting achievement
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 11:41:19 +02:00
Ettore Di Giacinto
209a9989c4 Update README.md
2025-04-11 22:49:50 +02:00
Ettore Di Giacinto
5105b46f48 Add Github reviewer and improve reasoning (#27)
* Add Github reviewer and improve reasoning

* feat: improve action picking

Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: mudler <mudler@localai.io>
2025-04-11 21:57:19 +02:00
12 changed files with 268 additions and 312 deletions


@@ -3,7 +3,7 @@ name: Run Go Tests
 on:
   push:
     branches:
-      - '**'
+      - 'main'
   pull_request:
     branches:
       - '**'
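
For reference, this is the trigger block that results from the change above, reconstructed from the hunk (the workflow file name is not shown in this compare view):

```yaml
# Pushes now run CI only on main; pull requests still run for every branch.
on:
  push:
    branches:
      - 'main'
  pull_request:
    branches:
      - '**'
```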


@@ -3,7 +3,7 @@ IMAGE_NAME?=webui
 ROOT_DIR:=$(shell dirname $(realpath $(lastword $(MAKEFILE_LIST))))
 
 prepare-tests:
-	docker compose up -d
+	docker compose up -d --build
 
 cleanup-tests:
 	docker compose down

README.md

@@ -1,5 +1,5 @@
 <p align="center">
-  <img src="https://github.com/user-attachments/assets/6958ffb3-31cf-441e-b99d-ce34ec6fc88f" alt="LocalAGI Logo" width="220"/>
+  <img src="./webui/react-ui/public/logo_1.png" alt="LocalAGI Logo" width="220"/>
 </p>
 
 <h3 align="center"><em>Your AI. Your Hardware. Your Rules.</em></h3>
@@ -45,14 +45,100 @@ LocalAGI ensures your data stays exactly where you want it—on your hardware. N
 git clone https://github.com/mudler/LocalAGI
 cd LocalAGI
 
-# CPU setup
-docker compose up -f docker-compose.yml
+# CPU setup (default)
+docker compose up
 
-# GPU setup
-docker compose up -f docker-compose.gpu.yml
+# NVIDIA GPU setup
+docker compose --profile nvidia up
+
+# Intel GPU setup (for Intel Arc and integrated GPUs)
+docker compose --profile intel up
+
+# Start with a specific model (see available models in models.localai.io, or localai.io to use any model in huggingface)
+MODEL_NAME=gemma-3-12b-it docker compose up
+
+# NVIDIA GPU setup with custom multimodal and image models
+MODEL_NAME=gemma-3-12b-it \
+MULTIMODAL_MODEL=minicpm-v-2_6 \
+IMAGE_MODEL=flux.1-dev \
+docker compose --profile nvidia up
 ```
-Access your agents at `http://localhost:8080`
+Now you can access and manage your agents at [http://localhost:8080](http://localhost:8080)
+
+## 🖥️ Hardware Configurations
+
+LocalAGI supports multiple hardware configurations through Docker Compose profiles:
+
+### CPU (Default)
+- No special configuration needed
+- Runs on any system with Docker
+- Best for testing and development
+- Supports text models only
+
+### NVIDIA GPU
+- Requires NVIDIA GPU and drivers
+- Uses CUDA for acceleration
+- Best for high-performance inference
+- Supports text, multimodal, and image generation models
+- Run with: `docker compose --profile nvidia up`
+- Default models:
+  - Text: `arcee-agent`
+  - Multimodal: `minicpm-v-2_6`
+  - Image: `flux.1-dev`
+- Environment variables:
+  - `MODEL_NAME`: Text model to use
+  - `MULTIMODAL_MODEL`: Multimodal model to use
+  - `IMAGE_MODEL`: Image generation model to use
+  - `LOCALAI_SINGLE_ACTIVE_BACKEND`: Set to `true` to enable single active backend mode
+
+### Intel GPU
+- Supports Intel Arc and integrated GPUs
+- Uses SYCL for acceleration
+- Best for Intel-based systems
+- Supports text, multimodal, and image generation models
+- Run with: `docker compose --profile intel up`
+- Default models:
+  - Text: `arcee-agent`
+  - Multimodal: `minicpm-v-2_6`
+  - Image: `sd-1.5-ggml`
+- Environment variables:
+  - `MODEL_NAME`: Text model to use
+  - `MULTIMODAL_MODEL`: Multimodal model to use
+  - `IMAGE_MODEL`: Image generation model to use
+  - `LOCALAI_SINGLE_ACTIVE_BACKEND`: Set to `true` to enable single active backend mode
+
+## Customize models
+
+You can customize the models used by LocalAGI by setting environment variables when running docker-compose. For example:
+
+```bash
+# CPU with custom model
+MODEL_NAME=gemma-3-12b-it docker compose up
+
+# NVIDIA GPU with custom models
+MODEL_NAME=gemma-3-12b-it \
+MULTIMODAL_MODEL=minicpm-v-2_6 \
+IMAGE_MODEL=flux.1-dev \
+docker compose --profile nvidia up
+
+# Intel GPU with custom models
+MODEL_NAME=gemma-3-12b-it \
+MULTIMODAL_MODEL=minicpm-v-2_6 \
+IMAGE_MODEL=sd-1.5-ggml \
+docker compose --profile intel up
+```
+
+If no models are specified, it will use the defaults:
+- Text model: `arcee-agent`
+- Multimodal model: `minicpm-v-2_6`
+- Image model: `flux.1-dev` (NVIDIA) or `sd-1.5-ggml` (Intel)
+
+Good (relatively small) models that have been tested are:
+- `qwen_qwq-32b` (best in co-ordinating agents)
+- `gemma-3-12b-it`
+- `gemma-3-27b-it`
 
 ## 🏆 Why Choose LocalAGI?
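
A side note on the Customize models section above: the examples pass MODEL_NAME, MULTIMODAL_MODEL and IMAGE_MODEL inline on the command line. Since docker compose also reads a `.env` file placed next to the compose file, the same values can be kept there instead; this is only a sketch using the example models from above, not additional defaults:

```bash
# .env (hypothetical example, picked up automatically by docker compose)
MODEL_NAME=gemma-3-12b-it
MULTIMODAL_MODEL=minicpm-v-2_6
IMAGE_MODEL=flux.1-dev

# then run, for example:
#   docker compose --profile nvidia up
```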
@@ -98,6 +184,8 @@ Explore detailed documentation including:
 ### Environment Configuration
 
+LocalAGI supports environment configurations. Note that these environment variables needs to be specified in the localagi container in the docker-compose file to have effect.
+
 | Variable | What It Does |
 |----------|--------------|
 | `LOCALAGI_MODEL` | Your go-to model |
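
The note added above says the LOCALAGI_* variables only take effect when set on the localagi container in the compose file. A minimal sketch of what that looks like, assuming the service layout from the docker-compose.yaml changes further down (values are illustrative):

```yaml
services:
  localagi:
    environment:
      - LOCALAGI_MODEL=arcee-agent              # your go-to model
      - LOCALAGI_LLM_API_URL=http://localai:8080
      # any other LOCALAGI_* variable from the table goes here
```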


@@ -10,12 +10,11 @@ import (
 // NewGoal creates a new intention action
 // The inention action is special as it tries to identify
 // a tool to use and a reasoning over to use it
-func NewGoal(s ...string) *GoalAction {
-	return &GoalAction{tools: s}
+func NewGoal() *GoalAction {
+	return &GoalAction{}
 }
 
 type GoalAction struct {
-	tools []string
 }
 
 type GoalResponse struct {
 	Goal string `json:"goal"`


@@ -41,7 +41,7 @@ func (a *PlanAction) Plannable() bool {
 func (a *PlanAction) Definition() types.ActionDefinition {
 	return types.ActionDefinition{
 		Name:        PlanActionName,
-		Description: "Use this tool for solving complex tasks that involves calling more tools in sequence.",
+		Description: "Use it for situations that involves doing more actions in sequence.",
 		Properties: map[string]jsonschema.Definition{
 			"subtasks": {
 				Type: jsonschema.Array,


@@ -24,15 +24,27 @@ type decisionResult struct {
 func (a *Agent) decision(
 	ctx context.Context,
 	conversation []openai.ChatCompletionMessage,
-	tools []openai.Tool, toolchoice any, maxRetries int) (*decisionResult, error) {
+	tools []openai.Tool, toolchoice string, maxRetries int) (*decisionResult, error) {
+
+	var choice *openai.ToolChoice
+	if toolchoice != "" {
+		choice = &openai.ToolChoice{
+			Type:     openai.ToolTypeFunction,
+			Function: openai.ToolFunction{Name: toolchoice},
+		}
+	}
 
 	var lastErr error
 	for attempts := 0; attempts < maxRetries; attempts++ {
 		decision := openai.ChatCompletionRequest{
 			Model:    a.options.LLMAPI.Model,
 			Messages: conversation,
 			Tools:    tools,
-			ToolChoice: toolchoice,
 		}
+
+		if choice != nil {
+			decision.ToolChoice = *choice
+		}
 
 		resp, err := a.client.CreateChatCompletion(ctx, decision)
@@ -42,6 +54,9 @@ func (a *Agent) decision(
 			continue
 		}
 
+		jsonResp, _ := json.Marshal(resp)
+		xlog.Debug("Decision response", "response", string(jsonResp))
+
 		if len(resp.Choices) != 1 {
 			lastErr = fmt.Errorf("no choices: %d", len(resp.Choices))
 			xlog.Warn("Attempt to make a decision failed", "attempt", attempts+1, "error", lastErr)
@@ -189,10 +204,7 @@ func (a *Agent) generateParameters(ctx context.Context, pickTemplate string, act
 		result, attemptErr = a.decision(ctx,
 			cc,
 			a.availableActions().ToTools(),
-			openai.ToolChoice{
-				Type:     openai.ToolTypeFunction,
-				Function: openai.ToolFunction{Name: act.Definition().Name.String()},
-			},
+			act.Definition().Name.String(),
 			maxAttempts,
 		)
 		if attemptErr == nil && result.actionParams != nil {
@@ -253,6 +265,7 @@ func (a *Agent) handlePlanning(ctx context.Context, job *types.Job, chosenAction
 		params, err := a.generateParameters(ctx, pickTemplate, subTaskAction, conv, subTaskReasoning, maxRetries)
 		if err != nil {
+			xlog.Error("error generating action's parameters", "error", err)
 			return conv, fmt.Errorf("error generating action's parameters: %w", err)
 		}
@@ -282,6 +295,7 @@ func (a *Agent) handlePlanning(ctx context.Context, job *types.Job, chosenAction
 		result, err := a.runAction(ctx, subTaskAction, actionParams)
 		if err != nil {
+			xlog.Error("error running action", "error", err)
 			return conv, fmt.Errorf("error running action: %w", err)
 		}
@@ -367,7 +381,9 @@ func (a *Agent) prepareHUD() (promptHUD *PromptHUD) {
 func (a *Agent) pickAction(ctx context.Context, templ string, messages []openai.ChatCompletionMessage, maxRetries int) (types.Action, types.ActionParams, string, error) {
 	c := messages
 
-	xlog.Debug("[pickAction] picking action", "messages", messages)
+	xlog.Debug("[pickAction] picking action starts", "messages", messages)
+
+	// Identify the goal of this conversation
 
 	if !a.options.forceReasoning {
 		xlog.Debug("not forcing reasoning")
@@ -376,7 +392,7 @@ func (a *Agent) pickAction(ctx context.Context, templ string, messages []openai.
 		thought, err := a.decision(ctx,
 			messages,
 			a.availableActions().ToTools(),
-			nil,
+			"",
 			maxRetries)
 		if err != nil {
 			return nil, nil, "", err
@@ -415,120 +431,83 @@ func (a *Agent) pickAction(ctx context.Context, templ string, messages []openai.
 		}, c...)
 	}
 
-	actionsID := []string{}
+	thought, err := a.decision(ctx,
+		c,
+		types.Actions{action.NewReasoning()}.ToTools(),
+		action.NewReasoning().Definition().Name.String(), maxRetries)
+	if err != nil {
+		return nil, nil, "", err
+	}
+	originalReasoning := ""
+	response := &action.ReasoningResponse{}
+	if thought.actionParams != nil {
+		if err := thought.actionParams.Unmarshal(response); err != nil {
+			return nil, nil, "", err
+		}
+		originalReasoning = response.Reasoning
+	}
+	if thought.message != "" {
+		originalReasoning = thought.message
+	}
+
+	xlog.Debug("[pickAction] picking action", "messages", c)
+	// thought, err := a.askLLM(ctx,
+	// 	c,
+
+	actionsID := []string{"reply"}
 	for _, m := range a.availableActions() {
 		actionsID = append(actionsID, m.Definition().Name.String())
 	}
 
-	// thoughtPromptStringBuilder := strings.Builder{}
-	// thoughtPromptStringBuilder.WriteString("You have to pick an action based on the conversation and the prompt. Describe the full reasoning process for your choice. Here is a list of actions: ")
-	// for _, m := range a.availableActions() {
-	// 	thoughtPromptStringBuilder.WriteString(
-	// 		m.Definition().Name.String() + ": " + m.Definition().Description + "\n",
-	// 	)
-	// }
-	// thoughtPromptStringBuilder.WriteString("To not use any action, respond with 'none'")
-	//thoughtPromptStringBuilder.WriteString("\n\nConversation: " + Messages(c).RemoveIf(func(msg openai.ChatCompletionMessage) bool {
-	//	return msg.Role == "system"
-	//}).String())
-	//thoughtPrompt := thoughtPromptStringBuilder.String()
-	//thoughtConv := []openai.ChatCompletionMessage{}
-
-	thought, err := a.askLLM(ctx,
-		c,
-		maxRetries,
-	)
-	if err != nil {
-		return nil, nil, "", err
-	}
-	originalReasoning := thought.Content
-
-	// From the thought, get the action call
-	// Get all the available actions IDs
-	// by grammar, let's decide if we have achieved the goal
-	// 1. analyze response and check if goal is achieved
-	params, err := a.decision(ctx,
-		[]openai.ChatCompletionMessage{
-			{
-				Role:    "system",
-				Content: "Extract an action to perform from the following reasoning: ",
-			},
-			{
-				Role:    "user",
-				Content: originalReasoning,
-			}},
-		types.Actions{action.NewGoal()}.ToTools(),
-		action.NewGoal().Definition().Name, maxRetries)
-	if err != nil {
-		return nil, nil, "", fmt.Errorf("failed to get the action tool parameters: %v", err)
-	}
-	goalResponse := action.GoalResponse{}
-	err = params.actionParams.Unmarshal(&goalResponse)
-	if err != nil {
-		return nil, nil, "", err
-	}
-	if goalResponse.Achieved {
-		xlog.Debug("[pickAction] goal achieved", "goal", goalResponse.Goal)
-		return nil, nil, "", nil
-	}
-	// if the goal is not achieved, pick an action
-	xlog.Debug("[pickAction] goal not achieved", "goal", goalResponse.Goal)
+	xlog.Debug("[pickAction] actionsID", "actionsID", actionsID)
+
+	xlog.Debug("[pickAction] thought", "conv", c, "originalReasoning", originalReasoning)
 
+	intentionsTools := action.NewIntention(actionsID...)
 	// TODO: FORCE to select ana ction here
 	// NOTE: we do not give the full conversation here to pick the action
 	// to avoid hallucinations
-	params, err = a.decision(ctx,
-		[]openai.ChatCompletionMessage{
-			{
-				Role:    "system",
-				Content: "Extract an action to perform from the following reasoning: ",
-			},
-			{
-				Role:    "user",
-				Content: originalReasoning,
-			}},
-		a.availableActions().ToTools(),
-		nil, maxRetries)
+
+	// Extract an action
+	params, err := a.decision(ctx,
+		append(c, openai.ChatCompletionMessage{
+			Role:    "system",
+			Content: "Pick the relevant action given the following reasoning: " + originalReasoning,
+		}),
+		types.Actions{intentionsTools}.ToTools(),
+		intentionsTools.Definition().Name.String(), maxRetries)
 	if err != nil {
 		return nil, nil, "", fmt.Errorf("failed to get the action tool parameters: %v", err)
 	}
 
-	chosenAction := a.availableActions().Find(params.actioName)
-
-	// xlog.Debug("[pickAction] params", "params", params)
-
-	// if params.actionParams == nil {
-	// 	return nil, nil, params.message, nil
-	// }
-
-	// xlog.Debug("[pickAction] actionChoice", "actionChoice", params.actionParams, "message", params.message)
-	// actionChoice := action.IntentResponse{}
-	// err = params.actionParams.Unmarshal(&actionChoice)
-	// if err != nil {
-	// 	return nil, nil, "", err
-	// }
-	// if actionChoice.Tool == "" || actionChoice.Tool == "none" {
-	// 	return nil, nil, "", nil
-	// }
-	// // Find the action
-	// chosenAction := a.availableActions().Find(actionChoice.Tool)
-	// if chosenAction == nil {
-	// 	return nil, nil, "", fmt.Errorf("no action found for intent:" + actionChoice.Tool)
-	// }
+	if params.actionParams == nil {
+		xlog.Debug("[pickAction] no action params found")
+		return nil, nil, params.message, nil
+	}
+
+	actionChoice := action.IntentResponse{}
+	err = params.actionParams.Unmarshal(&actionChoice)
+	if err != nil {
+		return nil, nil, "", err
+	}
+
+	if actionChoice.Tool == "" || actionChoice.Tool == "reply" {
+		xlog.Debug("[pickAction] no action found, replying")
+		return nil, nil, "", nil
+	}
+
+	chosenAction := a.availableActions().Find(actionChoice.Tool)
+	xlog.Debug("[pickAction] chosenAction", "chosenAction", chosenAction, "actionName", actionChoice.Tool)
+
+	// // Let's double check if the action is correct by asking the LLM to judge it
+	// if chosenAction != nil {
+	// 	promptString := "Given the following goal and thoughts, is the action correct? \n\n"
+	// 	promptString += fmt.Sprintf("Goal: %s\n", goalResponse.Goal)
+	// 	promptString += fmt.Sprintf("Thoughts: %s\n", originalReasoning)
+	// 	promptString += fmt.Sprintf("Action: %s\n", chosenAction.Definition().Name.String())
+	// 	promptString += fmt.Sprintf("Action description: %s\n", chosenAction.Definition().Description)
+	// 	promptString += fmt.Sprintf("Action parameters: %s\n", params.actionParams)
+	// }
 
 	return chosenAction, nil, originalReasoning, nil
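
The hunks above change decision() so the forced tool is passed as a plain string ("" meaning the model is free to choose) and is translated into a go-openai ToolChoice only when set. A minimal self-contained sketch of that convention, assuming the sashabaranov/go-openai client used by the diffs (helper name and values are illustrative, not repository code):

```go
package main

import (
	"fmt"

	openai "github.com/sashabaranov/go-openai"
)

// buildRequest mirrors the pattern above: an empty toolchoice leaves the model free,
// a non-empty one forces that function via openai.ToolChoice.
func buildRequest(model string, msgs []openai.ChatCompletionMessage, tools []openai.Tool, toolchoice string) openai.ChatCompletionRequest {
	req := openai.ChatCompletionRequest{
		Model:    model,
		Messages: msgs,
		Tools:    tools,
	}
	if toolchoice != "" {
		req.ToolChoice = openai.ToolChoice{
			Type:     openai.ToolTypeFunction,
			Function: openai.ToolFunction{Name: toolchoice},
		}
	}
	return req
}

func main() {
	// e.g. force a "plan" tool, as generateParameters now does with act.Definition().Name.String()
	req := buildRequest("arcee-agent", nil, nil, "plan")
	fmt.Printf("tool choice: %+v\n", req.ToolChoice)
}
```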


@@ -249,7 +249,7 @@ func (a *Agent) runAction(ctx context.Context, chosenAction types.Action, params
 		}
 	}
 
-	xlog.Info("Running action", "action", chosenAction.Definition().Name, "agent", a.Character.Name)
+	xlog.Info("[runAction] Running action", "action", chosenAction.Definition().Name, "agent", a.Character.Name, "params", params.String())
 
 	if chosenAction.Definition().Name.Is(action.StateActionName) {
 		// We need to store the result in the state
@@ -270,6 +270,8 @@ func (a *Agent) runAction(ctx context.Context, chosenAction types.Action, params
 		}
 	}
 
+	xlog.Debug("[runAction] Action result", "action", chosenAction.Definition().Name, "params", params.String(), "result", result.Result)
+
 	return result, nil
 }
@@ -603,7 +605,13 @@ func (a *Agent) consumeJob(job *types.Job, role string) {
 		var err error
 		conv, err = a.handlePlanning(job.GetContext(), job, chosenAction, actionParams, reasoning, pickTemplate, conv)
 		if err != nil {
-			job.Result.Finish(fmt.Errorf("error running action: %w", err))
+			xlog.Error("error handling planning", "error", err)
+			//job.Result.Conversation = conv
+			//job.Result.SetResponse(msg.Content)
+			a.reply(job, role, append(conv, openai.ChatCompletionMessage{
+				Role:    "assistant",
+				Content: fmt.Sprintf("Error handling planning: %v", err),
+			}), actionParams, chosenAction, reasoning)
 			return
 		}
@@ -689,26 +697,6 @@ func (a *Agent) consumeJob(job *types.Job, role string) {
 			job.SetNextAction(&followingAction, &followingParams, reasoning)
 			a.consumeJob(job, role)
 			return
-		} else if followingAction == nil {
-			xlog.Info("Not following another action", "agent", a.Character.Name)
-
-			if !a.options.forceReasoning {
-				xlog.Info("Finish conversation with reasoning", "reasoning", reasoning, "agent", a.Character.Name)
-				msg := openai.ChatCompletionMessage{
-					Role:    "assistant",
-					Content: reasoning,
-				}
-				conv = append(conv, msg)
-				job.Result.SetResponse(msg.Content)
-				job.Result.Conversation = conv
-				job.Result.AddFinalizer(func(conv []openai.ChatCompletionMessage) {
-					a.saveCurrentConversation(conv)
-				})
-				job.Result.Finish(nil)
-				return
-			}
 		}
 
 		a.reply(job, role, conv, actionParams, chosenAction, reasoning)


@@ -126,6 +126,8 @@ var _ = Describe("Agent test", func() {
 			agent, err := New(
 				WithLLMAPIURL(apiURL),
 				WithModel(testModel),
+				EnableForceReasoning,
+				WithTimeout("10m"),
 				WithLoopDetectionSteps(3),
 				// WithRandomIdentity(),
 				WithActions(&TestAction{response: map[string]string{
@@ -174,7 +176,7 @@ var _ = Describe("Agent test", func() {
 			agent, err := New(
 				WithLLMAPIURL(apiURL),
 				WithModel(testModel),
+				WithTimeout("10m"),
 				// WithRandomIdentity(),
 				WithActions(&TestAction{response: map[string]string{
 					"boston": testActionResult,
@@ -199,6 +201,7 @@ var _ = Describe("Agent test", func() {
 			agent, err := New(
 				WithLLMAPIURL(apiURL),
 				WithModel(testModel),
+				WithTimeout("10m"),
 				EnableHUD,
 				// EnableStandaloneJob,
 				// WithRandomIdentity(),


@@ -115,7 +115,7 @@ Available Tools:
 const reSelfEvalTemplate = pickSelfTemplate
 
 const pickActionTemplate = hudTemplate + `
-Your only task is to analyze the situation and determine a goal and the best tool to use, or just a final response if we have fullfilled the goal.
+Your only task is to analyze the conversation and determine a goal and the best tool to use, or just a final response if we have fullfilled the goal.
 
 Guidelines:
 1. Review the current state, what was done already and context


@@ -1,75 +0,0 @@
services:
localai:
# See https://localai.io/basics/container/#standard-container-images for
# a list of available container images (or build your own with the provided Dockerfile)
# Available images with CUDA, ROCm, SYCL, Vulkan
# Image list (quay.io): https://quay.io/repository/go-skynet/local-ai?tab=tags
# Image list (dockerhub): https://hub.docker.com/r/localai/localai
image: localai/localai:master-sycl-f32-ffmpeg-core
command:
# - rombo-org_rombo-llm-v3.0-qwen-32b # minimum suggested model
- arcee-agent # (smaller)
- granite-embedding-107m-multilingual
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
interval: 60s
timeout: 10m
retries: 120
ports:
- 8081:8080
environment:
- DEBUG=true
#- LOCALAI_API_KEY=sk-1234567890
volumes:
- ./volumes/models:/build/models:cached
- ./volumes/images:/tmp/generated/images
devices:
# On a system with integrated GPU and an Arc 770, this is the Arc 770
- /dev/dri/card1
- /dev/dri/renderD129
localrecall:
image: quay.io/mudler/localrecall:main
ports:
- 8080
environment:
- COLLECTION_DB_PATH=/db
- EMBEDDING_MODEL=granite-embedding-107m-multilingual
- FILE_ASSETS=/assets
- OPENAI_API_KEY=sk-1234567890
- OPENAI_BASE_URL=http://localai:8080
volumes:
- ./volumes/localrag/db:/db
- ./volumes/localrag/assets/:/assets
localrecall-healthcheck:
depends_on:
localrecall:
condition: service_started
image: busybox
command: ["sh", "-c", "until wget -q -O - http://localrecall:8080 > /dev/null 2>&1; do echo 'Waiting for localrecall...'; sleep 1; done; echo 'localrecall is up!'"]
localagi:
depends_on:
localai:
condition: service_healthy
localrecall-healthcheck:
condition: service_completed_successfully
build:
context: .
dockerfile: Dockerfile.webui
ports:
- 8080:3000
image: quay.io/mudler/localagi:master
environment:
- LOCALAGI_MODEL=arcee-agent
- LOCALAGI_LLM_API_URL=http://localai:8080
#- LOCALAGI_LLM_API_KEY=sk-1234567890
- LOCALAGI_LOCALRAG_URL=http://localrecall:8080
- LOCALAGI_STATE_DIR=/pool
- LOCALAGI_TIMEOUT=5m
- LOCALAGI_ENABLE_CONVERSATIONS_LOGGING=false
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./volumes/localagi/:/pool


@@ -1,85 +0,0 @@
services:
localai:
# See https://localai.io/basics/container/#standard-container-images for
# a list of available container images (or build your own with the provided Dockerfile)
# Available images with CUDA, ROCm, SYCL, Vulkan
# Image list (quay.io): https://quay.io/repository/go-skynet/local-ai?tab=tags
# Image list (dockerhub): https://hub.docker.com/r/localai/localai
image: localai/localai:master-gpu-nvidia-cuda-12
command:
- mlabonne_gemma-3-27b-it-abliterated
- qwen_qwq-32b
# Other good alternative options:
# - rombo-org_rombo-llm-v3.0-qwen-32b # minimum suggested model
# - arcee-agent
- granite-embedding-107m-multilingual
- flux.1-dev
- minicpm-v-2_6
environment:
# Enable if you have a single GPU which don't fit all the models
- LOCALAI_SINGLE_ACTIVE_BACKEND=true
- DEBUG=true
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
interval: 10s
timeout: 20m
retries: 20
ports:
- 8081:8080
volumes:
- ./volumes/models:/build/models:cached
- ./volumes/images:/tmp/generated/images
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
localrecall:
image: quay.io/mudler/localrecall:main
ports:
- 8080
environment:
- COLLECTION_DB_PATH=/db
- EMBEDDING_MODEL=granite-embedding-107m-multilingual
- FILE_ASSETS=/assets
- OPENAI_API_KEY=sk-1234567890
- OPENAI_BASE_URL=http://localai:8080
volumes:
- ./volumes/localrag/db:/db
- ./volumes/localrag/assets/:/assets
localrecall-healthcheck:
depends_on:
localrecall:
condition: service_started
image: busybox
command: ["sh", "-c", "until wget -q -O - http://localrecall:8080 > /dev/null 2>&1; do echo 'Waiting for localrecall...'; sleep 1; done; echo 'localrecall is up!'"]
localagi:
depends_on:
localai:
condition: service_healthy
localrecall-healthcheck:
condition: service_completed_successfully
build:
context: .
dockerfile: Dockerfile.webui
ports:
- 8080:3000
image: quay.io/mudler/localagi:master
environment:
- LOCALAGI_MODEL=qwen_qwq-32b
- LOCALAGI_LLM_API_URL=http://localai:8080
#- LOCALAGI_LLM_API_KEY=sk-1234567890
- LOCALAGI_LOCALRAG_URL=http://localrecall:8080
- LOCALAGI_STATE_DIR=/pool
- LOCALAGI_TIMEOUT=5m
- LOCALAGI_ENABLE_CONVERSATIONS_LOGGING=false
- LOCALAGI_MULTIMODAL_MODEL=minicpm-v-2_6
- LOCALAGI_IMAGE_MODEL=flux.1-dev
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./volumes/localagi/:/pool


@@ -7,7 +7,8 @@ services:
     # Image list (dockerhub): https://hub.docker.com/r/localai/localai
     image: localai/localai:master-ffmpeg-core
     command:
-      - arcee-agent # (smaller)
+      # - gemma-3-12b-it
+      - ${MODEL_NAME:-arcee-agent}
       - granite-embedding-107m-multilingual
     healthcheck:
       test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
@@ -23,14 +24,44 @@ services:
       - ./volumes/models:/build/models:cached
       - ./volumes/images:/tmp/generated/images
 
-    # decomment the following piece if running with Nvidia GPUs
-    # deploy:
-    #   resources:
-    #     reservations:
-    #       devices:
-    #         - driver: nvidia
-    #           count: 1
-    #           capabilities: [gpu]
+  localai-nvidia:
+    profiles: ["nvidia"]
+    extends:
+      service: localai
+    environment:
+      - LOCALAI_SINGLE_ACTIVE_BACKEND=true
+      - DEBUG=true
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1
+              capabilities: [gpu]
+    command:
+      - ${MODEL_NAME:-arcee-agent}
+      - ${MULTIMODAL_MODEL:-minicpm-v-2_6}
+      - ${IMAGE_MODEL:-flux.1-dev}
+      - granite-embedding-107m-multilingual
+
+  localai-intel:
+    profiles: ["intel"]
+    environment:
+      - LOCALAI_SINGLE_ACTIVE_BACKEND=true
+      - DEBUG=true
+    extends:
+      service: localai
+    image: localai/localai:master-sycl-f32-ffmpeg-core
+    devices:
+      # On a system with integrated GPU and an Arc 770, this is the Arc 770
+      - /dev/dri/card1
+      - /dev/dri/renderD129
+    command:
+      - ${MODEL_NAME:-arcee-agent}
+      - ${MULTIMODAL_MODEL:-minicpm-v-2_6}
+      - ${IMAGE_MODEL:-sd-1.5-ggml}
+      - granite-embedding-107m-multilingual
+
   localrecall:
     image: quay.io/mudler/localrecall:main
     ports:
@@ -65,7 +96,7 @@ services:
       - 8080:3000
     #image: quay.io/mudler/localagi:master
     environment:
-      - LOCALAGI_MODEL=arcee-agent
+      - LOCALAGI_MODEL=${MODEL_NAME:-arcee-agent}
       - LOCALAGI_LLM_API_URL=http://localai:8080
       #- LOCALAGI_LLM_API_KEY=sk-1234567890
       - LOCALAGI_LOCALRAG_URL=http://localrecall:8080
@@ -76,3 +107,31 @@ services:
       - "host.docker.internal:host-gateway"
     volumes:
       - ./volumes/localagi/:/pool
+
+  localagi-nvidia:
+    profiles: ["nvidia"]
+    extends:
+      service: localagi
+    environment:
+      - LOCALAGI_MODEL=${MODEL_NAME:-arcee-agent}
+      - LOCALAGI_MULTIMODAL_MODEL=${MULTIMODAL_MODEL:-minicpm-v-2_6}
+      - LOCALAGI_IMAGE_MODEL=${IMAGE_MODEL:-flux.1-dev}
+      - LOCALAGI_LLM_API_URL=http://localai:8080
+      - LOCALAGI_LOCALRAG_URL=http://localrecall:8080
+      - LOCALAGI_STATE_DIR=/pool
+      - LOCALAGI_TIMEOUT=5m
+      - LOCALAGI_ENABLE_CONVERSATIONS_LOGGING=false
+
+  localagi-intel:
+    profiles: ["intel"]
+    extends:
+      service: localagi
+    environment:
+      - LOCALAGI_MODEL=${MODEL_NAME:-arcee-agent}
+      - LOCALAGI_MULTIMODAL_MODEL=${MULTIMODAL_MODEL:-minicpm-v-2_6}
+      - LOCALAGI_IMAGE_MODEL=${IMAGE_MODEL:-sd-1.5-ggml}
+      - LOCALAGI_LLM_API_URL=http://localai:8080
+      - LOCALAGI_LOCALRAG_URL=http://localrecall:8080
+      - LOCALAGI_STATE_DIR=/pool
+      - LOCALAGI_TIMEOUT=5m
+      - LOCALAGI_ENABLE_CONVERSATIONS_LOGGING=false
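
Since the profiles and ${VAR:-default} interpolation above decide which services start and which models they load, a quick way to sanity-check the result is docker compose's standard config command (plain Compose behaviour, nothing repository-specific):

```bash
# Show the fully resolved compose file for the NVIDIA profile with a custom text model
MODEL_NAME=gemma-3-12b-it docker compose --profile nvidia config

# List only the services that would start for the Intel profile
docker compose --profile intel config --services
```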