Compare commits


14 Commits

Author SHA1 Message Date
mudler
fd12eef074 Ci: do not run jobs for every branch
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 19:17:28 +02:00
mudler
1b2806d139 Better error handling during planning
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 18:58:35 +02:00
mudler
7fca3620f6 Back at arcee-agent as default
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 18:46:17 +02:00
mudler
289a6ce4c8 Simplify
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 18:38:20 +02:00
mudler
0ac5c13c4d docker compose unification, small changes
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 18:17:43 +02:00
mudler
4858f85ade Enable reasoning in some of the tests
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 18:00:58 +02:00
mudler
71320bc1cb chore(tests): set timeout
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 17:49:06 +02:00
mudler
525efb264e use openthinker, it's smaller
2025-04-12 17:37:19 +02:00
mudler
be5d1a7d80 use 12b
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 14:49:51 +02:00
mudler
9135e3fe57 this is not necessary anymore
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 14:43:29 +02:00
mudler
424ef2dedf change base cpu model
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 13:59:30 +02:00
mudler
0e3df1562c chore: cleanup, identify goal from conversation when evaluting achievement
Signed-off-by: mudler <mudler@localai.io>
2025-04-12 11:41:19 +02:00
Ettore Di Giacinto
209a9989c4 Update README.md
2025-04-11 22:49:50 +02:00
Ettore Di Giacinto
5105b46f48 Add Github reviewer and improve reasoning (#27)
* Add Github reviewer and improve reasoning

* feat: improve action picking

Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: mudler <mudler@localai.io>
2025-04-11 21:57:19 +02:00
12 changed files with 268 additions and 312 deletions


@@ -3,7 +3,7 @@ name: Run Go Tests
 on:
   push:
     branches:
-      - '**'
+      - 'main'
   pull_request:
     branches:
       - '**'
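
For reference, this is the trigger block that results from the change above, reconstructed from the hunk (the workflow file name is not shown in this compare view):

```yaml
# Pushes now run CI only on main; pull requests still run for every branch.
on:
  push:
    branches:
      - 'main'
  pull_request:
    branches:
      - '**'
```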


@@ -3,7 +3,7 @@ IMAGE_NAME?=webui
 ROOT_DIR:=$(shell dirname $(realpath $(lastword $(MAKEFILE_LIST))))
 
 prepare-tests:
-	docker compose up -d
+	docker compose up -d --build
 
 cleanup-tests:
 	docker compose down

README.md

@@ -1,5 +1,5 @@
 <p align="center">
-  <img src="https://github.com/user-attachments/assets/6958ffb3-31cf-441e-b99d-ce34ec6fc88f" alt="LocalAGI Logo" width="220"/>
+  <img src="./webui/react-ui/public/logo_1.png" alt="LocalAGI Logo" width="220"/>
 </p>
 
 <h3 align="center"><em>Your AI. Your Hardware. Your Rules.</em></h3>
@@ -45,14 +45,100 @@ LocalAGI ensures your data stays exactly where you want it—on your hardware. N
 git clone https://github.com/mudler/LocalAGI
 cd LocalAGI
 
-# CPU setup
-docker compose up -f docker-compose.yml
+# CPU setup (default)
+docker compose up
 
-# GPU setup
-docker compose up -f docker-compose.gpu.yml
+# NVIDIA GPU setup
+docker compose --profile nvidia up
+
+# Intel GPU setup (for Intel Arc and integrated GPUs)
+docker compose --profile intel up
+
+# Start with a specific model (see available models in models.localai.io, or localai.io to use any model in huggingface)
+MODEL_NAME=gemma-3-12b-it docker compose up
+
+# NVIDIA GPU setup with custom multimodal and image models
+MODEL_NAME=gemma-3-12b-it \
+MULTIMODAL_MODEL=minicpm-v-2_6 \
+IMAGE_MODEL=flux.1-dev \
+docker compose --profile nvidia up
 ```
-Access your agents at `http://localhost:8080`
+Now you can access and manage your agents at [http://localhost:8080](http://localhost:8080)
+
+## 🖥️ Hardware Configurations
+
+LocalAGI supports multiple hardware configurations through Docker Compose profiles:
+
+### CPU (Default)
+- No special configuration needed
+- Runs on any system with Docker
+- Best for testing and development
+- Supports text models only
+
+### NVIDIA GPU
+- Requires NVIDIA GPU and drivers
+- Uses CUDA for acceleration
+- Best for high-performance inference
+- Supports text, multimodal, and image generation models
+- Run with: `docker compose --profile nvidia up`
+- Default models:
+  - Text: `arcee-agent`
+  - Multimodal: `minicpm-v-2_6`
+  - Image: `flux.1-dev`
+- Environment variables:
+  - `MODEL_NAME`: Text model to use
+  - `MULTIMODAL_MODEL`: Multimodal model to use
+  - `IMAGE_MODEL`: Image generation model to use
+  - `LOCALAI_SINGLE_ACTIVE_BACKEND`: Set to `true` to enable single active backend mode
+
+### Intel GPU
+- Supports Intel Arc and integrated GPUs
+- Uses SYCL for acceleration
+- Best for Intel-based systems
+- Supports text, multimodal, and image generation models
+- Run with: `docker compose --profile intel up`
+- Default models:
+  - Text: `arcee-agent`
+  - Multimodal: `minicpm-v-2_6`
+  - Image: `sd-1.5-ggml`
+- Environment variables:
+  - `MODEL_NAME`: Text model to use
+  - `MULTIMODAL_MODEL`: Multimodal model to use
+  - `IMAGE_MODEL`: Image generation model to use
+  - `LOCALAI_SINGLE_ACTIVE_BACKEND`: Set to `true` to enable single active backend mode
+
+## Customize models
+
+You can customize the models used by LocalAGI by setting environment variables when running docker-compose. For example:
+
+```bash
+# CPU with custom model
+MODEL_NAME=gemma-3-12b-it docker compose up
+
+# NVIDIA GPU with custom models
+MODEL_NAME=gemma-3-12b-it \
+MULTIMODAL_MODEL=minicpm-v-2_6 \
+IMAGE_MODEL=flux.1-dev \
+docker compose --profile nvidia up
+
+# Intel GPU with custom models
+MODEL_NAME=gemma-3-12b-it \
+MULTIMODAL_MODEL=minicpm-v-2_6 \
+IMAGE_MODEL=sd-1.5-ggml \
+docker compose --profile intel up
+```
+
+If no models are specified, it will use the defaults:
+- Text model: `arcee-agent`
+- Multimodal model: `minicpm-v-2_6`
+- Image model: `flux.1-dev` (NVIDIA) or `sd-1.5-ggml` (Intel)
+
+Good (relatively small) models that have been tested are:
+- `qwen_qwq-32b` (best in co-ordinating agents)
+- `gemma-3-12b-it`
+- `gemma-3-27b-it`
 
 ## 🏆 Why Choose LocalAGI?
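
A side note on the Customize models section above: the examples pass MODEL_NAME, MULTIMODAL_MODEL and IMAGE_MODEL inline on the command line. Since docker compose also reads a `.env` file placed next to the compose file, the same values can be kept there instead; this is only a sketch using the example models from above, not additional defaults:

```bash
# .env (hypothetical example, picked up automatically by docker compose)
MODEL_NAME=gemma-3-12b-it
MULTIMODAL_MODEL=minicpm-v-2_6
IMAGE_MODEL=flux.1-dev

# then run, for example:
#   docker compose --profile nvidia up
```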
@@ -98,6 +184,8 @@ Explore detailed documentation including:
 ### Environment Configuration
 
+LocalAGI supports environment configurations. Note that these environment variables needs to be specified in the localagi container in the docker-compose file to have effect.
+
 | Variable | What It Does |
 |----------|--------------|
 | `LOCALAGI_MODEL` | Your go-to model |
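
The note added above says the LOCALAGI_* variables only take effect when set on the localagi container in the compose file. A minimal sketch of what that looks like, assuming the service layout from the docker-compose.yaml changes further down (values are illustrative):

```yaml
services:
  localagi:
    environment:
      - LOCALAGI_MODEL=arcee-agent              # your go-to model
      - LOCALAGI_LLM_API_URL=http://localai:8080
      # any other LOCALAGI_* variable from the table goes here
```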


@@ -10,12 +10,11 @@ import (
 // NewGoal creates a new intention action
 // The inention action is special as it tries to identify
 // a tool to use and a reasoning over to use it
-func NewGoal(s ...string) *GoalAction {
-	return &GoalAction{tools: s}
+func NewGoal() *GoalAction {
+	return &GoalAction{}
 }
 
 type GoalAction struct {
-	tools []string
 }
 
 type GoalResponse struct {
 	Goal string `json:"goal"`


@@ -41,7 +41,7 @@ func (a *PlanAction) Plannable() bool {
 func (a *PlanAction) Definition() types.ActionDefinition {
 	return types.ActionDefinition{
 		Name:        PlanActionName,
-		Description: "Use this tool for solving complex tasks that involves calling more tools in sequence.",
+		Description: "Use it for situations that involves doing more actions in sequence.",
 		Properties: map[string]jsonschema.Definition{
 			"subtasks": {
 				Type: jsonschema.Array,


@@ -24,15 +24,27 @@ type decisionResult struct {
 func (a *Agent) decision(
 	ctx context.Context,
 	conversation []openai.ChatCompletionMessage,
-	tools []openai.Tool, toolchoice any, maxRetries int) (*decisionResult, error) {
+	tools []openai.Tool, toolchoice string, maxRetries int) (*decisionResult, error) {
+
+	var choice *openai.ToolChoice
+	if toolchoice != "" {
+		choice = &openai.ToolChoice{
+			Type:     openai.ToolTypeFunction,
+			Function: openai.ToolFunction{Name: toolchoice},
+		}
+	}
 
 	var lastErr error
 	for attempts := 0; attempts < maxRetries; attempts++ {
 		decision := openai.ChatCompletionRequest{
 			Model:    a.options.LLMAPI.Model,
 			Messages: conversation,
 			Tools:    tools,
-			ToolChoice: toolchoice,
 		}
+
+		if choice != nil {
+			decision.ToolChoice = *choice
+		}
 
 		resp, err := a.client.CreateChatCompletion(ctx, decision)
@@ -42,6 +54,9 @@ func (a *Agent) decision(
 			continue
 		}
 
+		jsonResp, _ := json.Marshal(resp)
+		xlog.Debug("Decision response", "response", string(jsonResp))
+
 		if len(resp.Choices) != 1 {
 			lastErr = fmt.Errorf("no choices: %d", len(resp.Choices))
 			xlog.Warn("Attempt to make a decision failed", "attempt", attempts+1, "error", lastErr)
@@ -189,10 +204,7 @@ func (a *Agent) generateParameters(ctx context.Context, pickTemplate string, act
 		result, attemptErr = a.decision(ctx,
 			cc,
 			a.availableActions().ToTools(),
-			openai.ToolChoice{
-				Type:     openai.ToolTypeFunction,
-				Function: openai.ToolFunction{Name: act.Definition().Name.String()},
-			},
+			act.Definition().Name.String(),
 			maxAttempts,
 		)
 		if attemptErr == nil && result.actionParams != nil {
@@ -253,6 +265,7 @@ func (a *Agent) handlePlanning(ctx context.Context, job *types.Job, chosenAction
 		params, err := a.generateParameters(ctx, pickTemplate, subTaskAction, conv, subTaskReasoning, maxRetries)
 		if err != nil {
+			xlog.Error("error generating action's parameters", "error", err)
 			return conv, fmt.Errorf("error generating action's parameters: %w", err)
 		}
@@ -282,6 +295,7 @@ func (a *Agent) handlePlanning(ctx context.Context, job *types.Job, chosenAction
 		result, err := a.runAction(ctx, subTaskAction, actionParams)
 		if err != nil {
+			xlog.Error("error running action", "error", err)
 			return conv, fmt.Errorf("error running action: %w", err)
 		}
@@ -367,7 +381,9 @@ func (a *Agent) prepareHUD() (promptHUD *PromptHUD) {
 func (a *Agent) pickAction(ctx context.Context, templ string, messages []openai.ChatCompletionMessage, maxRetries int) (types.Action, types.ActionParams, string, error) {
 	c := messages
 
-	xlog.Debug("[pickAction] picking action", "messages", messages)
+	xlog.Debug("[pickAction] picking action starts", "messages", messages)
+
+	// Identify the goal of this conversation
 
 	if !a.options.forceReasoning {
 		xlog.Debug("not forcing reasoning")
@@ -376,7 +392,7 @@ func (a *Agent) pickAction(ctx context.Context, templ string, messages []openai.
 		thought, err := a.decision(ctx,
 			messages,
 			a.availableActions().ToTools(),
-			nil,
+			"",
 			maxRetries)
 		if err != nil {
 			return nil, nil, "", err
@@ -415,120 +431,83 @@ func (a *Agent) pickAction(ctx context.Context, templ string, messages []openai.
 		}, c...)
 	}
 
-	actionsID := []string{}
+	thought, err := a.decision(ctx,
+		c,
+		types.Actions{action.NewReasoning()}.ToTools(),
+		action.NewReasoning().Definition().Name.String(), maxRetries)
+	if err != nil {
+		return nil, nil, "", err
+	}
+	originalReasoning := ""
+	response := &action.ReasoningResponse{}
+	if thought.actionParams != nil {
+		if err := thought.actionParams.Unmarshal(response); err != nil {
+			return nil, nil, "", err
+		}
+		originalReasoning = response.Reasoning
+	}
+	if thought.message != "" {
+		originalReasoning = thought.message
+	}
+
+	xlog.Debug("[pickAction] picking action", "messages", c)
+	// thought, err := a.askLLM(ctx,
+	// 	c,
+
+	actionsID := []string{"reply"}
 	for _, m := range a.availableActions() {
 		actionsID = append(actionsID, m.Definition().Name.String())
 	}
 
-	// thoughtPromptStringBuilder := strings.Builder{}
-	// thoughtPromptStringBuilder.WriteString("You have to pick an action based on the conversation and the prompt. Describe the full reasoning process for your choice. Here is a list of actions: ")
-	// for _, m := range a.availableActions() {
-	// 	thoughtPromptStringBuilder.WriteString(
-	// 		m.Definition().Name.String() + ": " + m.Definition().Description + "\n",
-	// 	)
-	// }
-	// thoughtPromptStringBuilder.WriteString("To not use any action, respond with 'none'")
-	//thoughtPromptStringBuilder.WriteString("\n\nConversation: " + Messages(c).RemoveIf(func(msg openai.ChatCompletionMessage) bool {
-	//	return msg.Role == "system"
-	//}).String())
-	//thoughtPrompt := thoughtPromptStringBuilder.String()
-	//thoughtConv := []openai.ChatCompletionMessage{}
-
-	thought, err := a.askLLM(ctx,
-		c,
-		maxRetries,
-	)
-	if err != nil {
-		return nil, nil, "", err
-	}
-	originalReasoning := thought.Content
-
-	// From the thought, get the action call
-	// Get all the available actions IDs
-	// by grammar, let's decide if we have achieved the goal
-	// 1. analyze response and check if goal is achieved
-	params, err := a.decision(ctx,
-		[]openai.ChatCompletionMessage{
-			{
-				Role:    "system",
-				Content: "Extract an action to perform from the following reasoning: ",
-			},
-			{
-				Role:    "user",
-				Content: originalReasoning,
-			}},
-		types.Actions{action.NewGoal()}.ToTools(),
-		action.NewGoal().Definition().Name, maxRetries)
-	if err != nil {
-		return nil, nil, "", fmt.Errorf("failed to get the action tool parameters: %v", err)
-	}
-	goalResponse := action.GoalResponse{}
-	err = params.actionParams.Unmarshal(&goalResponse)
-	if err != nil {
-		return nil, nil, "", err
-	}
-	if goalResponse.Achieved {
-		xlog.Debug("[pickAction] goal achieved", "goal", goalResponse.Goal)
-		return nil, nil, "", nil
-	}
-	// if the goal is not achieved, pick an action
-	xlog.Debug("[pickAction] goal not achieved", "goal", goalResponse.Goal)
+	xlog.Debug("[pickAction] actionsID", "actionsID", actionsID)
+
+	xlog.Debug("[pickAction] thought", "conv", c, "originalReasoning", originalReasoning)
 
+	intentionsTools := action.NewIntention(actionsID...)
 	// TODO: FORCE to select ana ction here
 	// NOTE: we do not give the full conversation here to pick the action
 	// to avoid hallucinations
-	params, err = a.decision(ctx,
-		[]openai.ChatCompletionMessage{
-			{
-				Role:    "system",
-				Content: "Extract an action to perform from the following reasoning: ",
-			},
-			{
-				Role:    "user",
-				Content: originalReasoning,
-			}},
-		a.availableActions().ToTools(),
-		nil, maxRetries)
+
+	// Extract an action
+	params, err := a.decision(ctx,
+		append(c, openai.ChatCompletionMessage{
+			Role:    "system",
+			Content: "Pick the relevant action given the following reasoning: " + originalReasoning,
+		}),
+		types.Actions{intentionsTools}.ToTools(),
+		intentionsTools.Definition().Name.String(), maxRetries)
 	if err != nil {
 		return nil, nil, "", fmt.Errorf("failed to get the action tool parameters: %v", err)
 	}
 
-	chosenAction := a.availableActions().Find(params.actioName)
-
-	// xlog.Debug("[pickAction] params", "params", params)
-
-	// if params.actionParams == nil {
-	// 	return nil, nil, params.message, nil
-	// }
-
-	// xlog.Debug("[pickAction] actionChoice", "actionChoice", params.actionParams, "message", params.message)
-	// actionChoice := action.IntentResponse{}
-	// err = params.actionParams.Unmarshal(&actionChoice)
-	// if err != nil {
-	// 	return nil, nil, "", err
-	// }
-	// if actionChoice.Tool == "" || actionChoice.Tool == "none" {
-	// 	return nil, nil, "", nil
-	// }
-	// // Find the action
-	// chosenAction := a.availableActions().Find(actionChoice.Tool)
-	// if chosenAction == nil {
-	// 	return nil, nil, "", fmt.Errorf("no action found for intent:" + actionChoice.Tool)
-	// }
+	if params.actionParams == nil {
+		xlog.Debug("[pickAction] no action params found")
+		return nil, nil, params.message, nil
+	}
+
+	actionChoice := action.IntentResponse{}
+	err = params.actionParams.Unmarshal(&actionChoice)
+	if err != nil {
+		return nil, nil, "", err
+	}
+
+	if actionChoice.Tool == "" || actionChoice.Tool == "reply" {
+		xlog.Debug("[pickAction] no action found, replying")
+		return nil, nil, "", nil
+	}
+
+	chosenAction := a.availableActions().Find(actionChoice.Tool)
+	xlog.Debug("[pickAction] chosenAction", "chosenAction", chosenAction, "actionName", actionChoice.Tool)
+
+	// // Let's double check if the action is correct by asking the LLM to judge it
+	// if chosenAction != nil {
+	// 	promptString := "Given the following goal and thoughts, is the action correct? \n\n"
+	// 	promptString += fmt.Sprintf("Goal: %s\n", goalResponse.Goal)
+	// 	promptString += fmt.Sprintf("Thoughts: %s\n", originalReasoning)
+	// 	promptString += fmt.Sprintf("Action: %s\n", chosenAction.Definition().Name.String())
+	// 	promptString += fmt.Sprintf("Action description: %s\n", chosenAction.Definition().Description)
+	// 	promptString += fmt.Sprintf("Action parameters: %s\n", params.actionParams)
+	// }
 
 	return chosenAction, nil, originalReasoning, nil
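
The hunks above change decision() so the forced tool is passed as a plain string ("" meaning the model is free to choose) and is translated into a go-openai ToolChoice only when set. A minimal self-contained sketch of that convention, assuming the sashabaranov/go-openai client used by the diffs (helper name and values are illustrative, not repository code):

```go
package main

import (
	"fmt"

	openai "github.com/sashabaranov/go-openai"
)

// buildRequest mirrors the pattern above: an empty toolchoice leaves the model free,
// a non-empty one forces that function via openai.ToolChoice.
func buildRequest(model string, msgs []openai.ChatCompletionMessage, tools []openai.Tool, toolchoice string) openai.ChatCompletionRequest {
	req := openai.ChatCompletionRequest{
		Model:    model,
		Messages: msgs,
		Tools:    tools,
	}
	if toolchoice != "" {
		req.ToolChoice = openai.ToolChoice{
			Type:     openai.ToolTypeFunction,
			Function: openai.ToolFunction{Name: toolchoice},
		}
	}
	return req
}

func main() {
	// e.g. force a "plan" tool, as generateParameters now does with act.Definition().Name.String()
	req := buildRequest("arcee-agent", nil, nil, "plan")
	fmt.Printf("tool choice: %+v\n", req.ToolChoice)
}
```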


@@ -249,7 +249,7 @@ func (a *Agent) runAction(ctx context.Context, chosenAction types.Action, params
 		}
 	}
 
-	xlog.Info("Running action", "action", chosenAction.Definition().Name, "agent", a.Character.Name)
+	xlog.Info("[runAction] Running action", "action", chosenAction.Definition().Name, "agent", a.Character.Name, "params", params.String())
 
 	if chosenAction.Definition().Name.Is(action.StateActionName) {
 		// We need to store the result in the state
@@ -270,6 +270,8 @@ func (a *Agent) runAction(ctx context.Context, chosenAction types.Action, params
 		}
 	}
 
+	xlog.Debug("[runAction] Action result", "action", chosenAction.Definition().Name, "params", params.String(), "result", result.Result)
+
 	return result, nil
 }
@@ -603,7 +605,13 @@ func (a *Agent) consumeJob(job *types.Job, role string) {
 		var err error
 		conv, err = a.handlePlanning(job.GetContext(), job, chosenAction, actionParams, reasoning, pickTemplate, conv)
 		if err != nil {
-			job.Result.Finish(fmt.Errorf("error running action: %w", err))
+			xlog.Error("error handling planning", "error", err)
+			//job.Result.Conversation = conv
+			//job.Result.SetResponse(msg.Content)
+			a.reply(job, role, append(conv, openai.ChatCompletionMessage{
+				Role:    "assistant",
+				Content: fmt.Sprintf("Error handling planning: %v", err),
+			}), actionParams, chosenAction, reasoning)
 			return
 		}
@@ -689,26 +697,6 @@ func (a *Agent) consumeJob(job *types.Job, role string) {
 			job.SetNextAction(&followingAction, &followingParams, reasoning)
 			a.consumeJob(job, role)
 			return
-		} else if followingAction == nil {
-			xlog.Info("Not following another action", "agent", a.Character.Name)
-
-			if !a.options.forceReasoning {
-				xlog.Info("Finish conversation with reasoning", "reasoning", reasoning, "agent", a.Character.Name)
-				msg := openai.ChatCompletionMessage{
-					Role:    "assistant",
-					Content: reasoning,
-				}
-				conv = append(conv, msg)
-				job.Result.SetResponse(msg.Content)
-				job.Result.Conversation = conv
-				job.Result.AddFinalizer(func(conv []openai.ChatCompletionMessage) {
-					a.saveCurrentConversation(conv)
-				})
-				job.Result.Finish(nil)
-				return
-			}
 		}
 
 		a.reply(job, role, conv, actionParams, chosenAction, reasoning)


@@ -126,6 +126,8 @@ var _ = Describe("Agent test", func() {
 			agent, err := New(
 				WithLLMAPIURL(apiURL),
 				WithModel(testModel),
+				EnableForceReasoning,
+				WithTimeout("10m"),
 				WithLoopDetectionSteps(3),
 				// WithRandomIdentity(),
 				WithActions(&TestAction{response: map[string]string{
@@ -174,7 +176,7 @@ var _ = Describe("Agent test", func() {
 			agent, err := New(
 				WithLLMAPIURL(apiURL),
 				WithModel(testModel),
+				WithTimeout("10m"),
 				// WithRandomIdentity(),
 				WithActions(&TestAction{response: map[string]string{
 					"boston": testActionResult,
@@ -199,6 +201,7 @@ var _ = Describe("Agent test", func() {
 			agent, err := New(
 				WithLLMAPIURL(apiURL),
 				WithModel(testModel),
+				WithTimeout("10m"),
 				EnableHUD,
 				// EnableStandaloneJob,
 				// WithRandomIdentity(),


@@ -115,7 +115,7 @@ Available Tools:
 const reSelfEvalTemplate = pickSelfTemplate
 
 const pickActionTemplate = hudTemplate + `
-Your only task is to analyze the situation and determine a goal and the best tool to use, or just a final response if we have fullfilled the goal.
+Your only task is to analyze the conversation and determine a goal and the best tool to use, or just a final response if we have fullfilled the goal.
 
 Guidelines:
 1. Review the current state, what was done already and context


@@ -1,75 +0,0 @@
services:
localai:
# See https://localai.io/basics/container/#standard-container-images for
# a list of available container images (or build your own with the provided Dockerfile)
# Available images with CUDA, ROCm, SYCL, Vulkan
# Image list (quay.io): https://quay.io/repository/go-skynet/local-ai?tab=tags
# Image list (dockerhub): https://hub.docker.com/r/localai/localai
image: localai/localai:master-sycl-f32-ffmpeg-core
command:
# - rombo-org_rombo-llm-v3.0-qwen-32b # minimum suggested model
- arcee-agent # (smaller)
- granite-embedding-107m-multilingual
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
interval: 60s
timeout: 10m
retries: 120
ports:
- 8081:8080
environment:
- DEBUG=true
#- LOCALAI_API_KEY=sk-1234567890
volumes:
- ./volumes/models:/build/models:cached
- ./volumes/images:/tmp/generated/images
devices:
# On a system with integrated GPU and an Arc 770, this is the Arc 770
- /dev/dri/card1
- /dev/dri/renderD129
localrecall:
image: quay.io/mudler/localrecall:main
ports:
- 8080
environment:
- COLLECTION_DB_PATH=/db
- EMBEDDING_MODEL=granite-embedding-107m-multilingual
- FILE_ASSETS=/assets
- OPENAI_API_KEY=sk-1234567890
- OPENAI_BASE_URL=http://localai:8080
volumes:
- ./volumes/localrag/db:/db
- ./volumes/localrag/assets/:/assets
localrecall-healthcheck:
depends_on:
localrecall:
condition: service_started
image: busybox
command: ["sh", "-c", "until wget -q -O - http://localrecall:8080 > /dev/null 2>&1; do echo 'Waiting for localrecall...'; sleep 1; done; echo 'localrecall is up!'"]
localagi:
depends_on:
localai:
condition: service_healthy
localrecall-healthcheck:
condition: service_completed_successfully
build:
context: .
dockerfile: Dockerfile.webui
ports:
- 8080:3000
image: quay.io/mudler/localagi:master
environment:
- LOCALAGI_MODEL=arcee-agent
- LOCALAGI_LLM_API_URL=http://localai:8080
#- LOCALAGI_LLM_API_KEY=sk-1234567890
- LOCALAGI_LOCALRAG_URL=http://localrecall:8080
- LOCALAGI_STATE_DIR=/pool
- LOCALAGI_TIMEOUT=5m
- LOCALAGI_ENABLE_CONVERSATIONS_LOGGING=false
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./volumes/localagi/:/pool


@@ -1,85 +0,0 @@
services:
localai:
# See https://localai.io/basics/container/#standard-container-images for
# a list of available container images (or build your own with the provided Dockerfile)
# Available images with CUDA, ROCm, SYCL, Vulkan
# Image list (quay.io): https://quay.io/repository/go-skynet/local-ai?tab=tags
# Image list (dockerhub): https://hub.docker.com/r/localai/localai
image: localai/localai:master-gpu-nvidia-cuda-12
command:
- mlabonne_gemma-3-27b-it-abliterated
- qwen_qwq-32b
# Other good alternative options:
# - rombo-org_rombo-llm-v3.0-qwen-32b # minimum suggested model
# - arcee-agent
- granite-embedding-107m-multilingual
- flux.1-dev
- minicpm-v-2_6
environment:
# Enable if you have a single GPU which don't fit all the models
- LOCALAI_SINGLE_ACTIVE_BACKEND=true
- DEBUG=true
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
interval: 10s
timeout: 20m
retries: 20
ports:
- 8081:8080
volumes:
- ./volumes/models:/build/models:cached
- ./volumes/images:/tmp/generated/images
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
localrecall:
image: quay.io/mudler/localrecall:main
ports:
- 8080
environment:
- COLLECTION_DB_PATH=/db
- EMBEDDING_MODEL=granite-embedding-107m-multilingual
- FILE_ASSETS=/assets
- OPENAI_API_KEY=sk-1234567890
- OPENAI_BASE_URL=http://localai:8080
volumes:
- ./volumes/localrag/db:/db
- ./volumes/localrag/assets/:/assets
localrecall-healthcheck:
depends_on:
localrecall:
condition: service_started
image: busybox
command: ["sh", "-c", "until wget -q -O - http://localrecall:8080 > /dev/null 2>&1; do echo 'Waiting for localrecall...'; sleep 1; done; echo 'localrecall is up!'"]
localagi:
depends_on:
localai:
condition: service_healthy
localrecall-healthcheck:
condition: service_completed_successfully
build:
context: .
dockerfile: Dockerfile.webui
ports:
- 8080:3000
image: quay.io/mudler/localagi:master
environment:
- LOCALAGI_MODEL=qwen_qwq-32b
- LOCALAGI_LLM_API_URL=http://localai:8080
#- LOCALAGI_LLM_API_KEY=sk-1234567890
- LOCALAGI_LOCALRAG_URL=http://localrecall:8080
- LOCALAGI_STATE_DIR=/pool
- LOCALAGI_TIMEOUT=5m
- LOCALAGI_ENABLE_CONVERSATIONS_LOGGING=false
- LOCALAGI_MULTIMODAL_MODEL=minicpm-v-2_6
- LOCALAGI_IMAGE_MODEL=flux.1-dev
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./volumes/localagi/:/pool


@@ -7,7 +7,8 @@ services:
     # Image list (dockerhub): https://hub.docker.com/r/localai/localai
     image: localai/localai:master-ffmpeg-core
     command:
-      - arcee-agent # (smaller)
+      # - gemma-3-12b-it
+      - ${MODEL_NAME:-arcee-agent}
       - granite-embedding-107m-multilingual
     healthcheck:
       test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
@@ -23,14 +24,44 @@ services:
       - ./volumes/models:/build/models:cached
       - ./volumes/images:/tmp/generated/images
 
-    # decomment the following piece if running with Nvidia GPUs
-    # deploy:
-    #   resources:
-    #     reservations:
-    #       devices:
-    #         - driver: nvidia
-    #           count: 1
-    #           capabilities: [gpu]
+  localai-nvidia:
+    profiles: ["nvidia"]
+    extends:
+      service: localai
+    environment:
+      - LOCALAI_SINGLE_ACTIVE_BACKEND=true
+      - DEBUG=true
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1
+              capabilities: [gpu]
+    command:
+      - ${MODEL_NAME:-arcee-agent}
+      - ${MULTIMODAL_MODEL:-minicpm-v-2_6}
+      - ${IMAGE_MODEL:-flux.1-dev}
+      - granite-embedding-107m-multilingual
+
+  localai-intel:
+    profiles: ["intel"]
+    environment:
+      - LOCALAI_SINGLE_ACTIVE_BACKEND=true
+      - DEBUG=true
+    extends:
+      service: localai
+    image: localai/localai:master-sycl-f32-ffmpeg-core
+    devices:
+      # On a system with integrated GPU and an Arc 770, this is the Arc 770
+      - /dev/dri/card1
+      - /dev/dri/renderD129
+    command:
+      - ${MODEL_NAME:-arcee-agent}
+      - ${MULTIMODAL_MODEL:-minicpm-v-2_6}
+      - ${IMAGE_MODEL:-sd-1.5-ggml}
+      - granite-embedding-107m-multilingual
+
   localrecall:
     image: quay.io/mudler/localrecall:main
     ports:
@@ -65,7 +96,7 @@ services:
       - 8080:3000
     #image: quay.io/mudler/localagi:master
     environment:
-      - LOCALAGI_MODEL=arcee-agent
+      - LOCALAGI_MODEL=${MODEL_NAME:-arcee-agent}
       - LOCALAGI_LLM_API_URL=http://localai:8080
       #- LOCALAGI_LLM_API_KEY=sk-1234567890
       - LOCALAGI_LOCALRAG_URL=http://localrecall:8080
@@ -76,3 +107,31 @@ services:
       - "host.docker.internal:host-gateway"
     volumes:
       - ./volumes/localagi/:/pool
+
+  localagi-nvidia:
+    profiles: ["nvidia"]
+    extends:
+      service: localagi
+    environment:
+      - LOCALAGI_MODEL=${MODEL_NAME:-arcee-agent}
+      - LOCALAGI_MULTIMODAL_MODEL=${MULTIMODAL_MODEL:-minicpm-v-2_6}
+      - LOCALAGI_IMAGE_MODEL=${IMAGE_MODEL:-flux.1-dev}
+      - LOCALAGI_LLM_API_URL=http://localai:8080
+      - LOCALAGI_LOCALRAG_URL=http://localrecall:8080
+      - LOCALAGI_STATE_DIR=/pool
+      - LOCALAGI_TIMEOUT=5m
+      - LOCALAGI_ENABLE_CONVERSATIONS_LOGGING=false
+
+  localagi-intel:
+    profiles: ["intel"]
+    extends:
+      service: localagi
+    environment:
+      - LOCALAGI_MODEL=${MODEL_NAME:-arcee-agent}
+      - LOCALAGI_MULTIMODAL_MODEL=${MULTIMODAL_MODEL:-minicpm-v-2_6}
+      - LOCALAGI_IMAGE_MODEL=${IMAGE_MODEL:-sd-1.5-ggml}
+      - LOCALAGI_LLM_API_URL=http://localai:8080
+      - LOCALAGI_LOCALRAG_URL=http://localrecall:8080
+      - LOCALAGI_STATE_DIR=/pool
+      - LOCALAGI_TIMEOUT=5m
+      - LOCALAGI_ENABLE_CONVERSATIONS_LOGGING=false
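
Since the profiles and ${VAR:-default} interpolation above decide which services start and which models they load, a quick way to sanity-check the result is docker compose's standard config command (plain Compose behaviour, nothing repository-specific):

```bash
# Show the fully resolved compose file for the NVIDIA profile with a custom text model
MODEL_NAME=gemma-3-12b-it docker compose --profile nvidia config

# List only the services that would start for the Intel profile
docker compose --profile intel config --services
```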