fix(instance): POST /instance/pair returns empty pairing code instead of working code#39

Open
edilsonoliveirama wants to merge 6 commits into EvolutionAPI:main from
edilsonoliveirama:fix/pair-empty-code-issue-21

Conversation

@edilsonoliveirama edilsonoliveirama commented Apr 20, 2026

Closes #21

Root cause

Two bugs in the Pair function:

  1. Silenced error: PairPhone returned an error (client nil or not connected), but the error was only logged. The function returned HTTP 200 with PairingCode: "", misleading the caller.

  2. Client never started: whatsmeow requires the client to be connected to the WhatsApp websocket (in the "waiting for authentication" state) before PairPhone is called. The implementation called PairPhone directly with no setup.

Fix

  • Starts the instance automatically if there is no active connection (same as the QR code flow)
  • Waits 3 seconds for the WA connection and the initial QR generation to be established (a whatsmeow requirement before PairPhone)
  • Rejects with an error if the instance is already authenticated
  • Returns PairPhone errors to the caller: the handler responds with HTTP 500 and an actionable message instead of 200 with an empty code
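
The flow in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the `pairClient` interface, the `fakeClient` test double, the `start` callback and the `settle` parameter are hypothetical stand-ins for the whatsmeow client and the instance start-up logic, and the real `PairPhone` takes more arguments than shown here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// pairClient is a hypothetical stand-in for the parts of the WhatsApp client used here.
type pairClient interface {
	IsConnected() bool
	IsLoggedIn() bool
	PairPhone(phone string) (string, error)
}

// pair starts the client when needed, rejects authenticated sessions, and
// returns PairPhone errors instead of logging them and returning an empty code.
func pair(c pairClient, start func() (pairClient, error), settle time.Duration, phone string) (string, error) {
	if c == nil || !c.IsConnected() {
		started, err := start()
		if err != nil {
			return "", fmt.Errorf("start instance: %w", err)
		}
		time.Sleep(settle) // the PR description uses a fixed 3-second wait here
		c = started
	}
	if c == nil || !c.IsConnected() {
		return "", errors.New("client is not connected")
	}
	if c.IsLoggedIn() {
		return "", errors.New("instance is already authenticated")
	}
	code, err := c.PairPhone(phone)
	if err != nil {
		return "", fmt.Errorf("pair phone: %w", err)
	}
	return code, nil
}

// fakeClient is a test double for this sketch; it is not part of whatsmeow.
type fakeClient struct {
	connected, loggedIn, fail bool
	code                      string
}

func (f *fakeClient) IsConnected() bool { return f.connected }
func (f *fakeClient) IsLoggedIn() bool  { return f.loggedIn }
func (f *fakeClient) PairPhone(string) (string, error) {
	if f.fail {
		return "", errors.New("pairing failed")
	}
	return f.code, nil
}

func main() {
	start := func() (pairClient, error) {
		return &fakeClient{connected: true, code: "ABCD1234"}, nil
	}
	code, err := pair(nil, start, 0, "15550100")
	fmt.Println(code, err)
}
```

The point of the sketch is that every failure path returns a non-nil error, so a handler built on top of it has no way to answer 200 with an empty code.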

Validation

Tested locally with a real number: POST /instance/pair returned code XXXXXXXX, which was entered into WhatsApp → instance connected successfully.

Summary by Sourcery

Fix phone pairing endpoint to correctly establish WhatsApp client connection and surface errors while adding observability and dashboard improvements.

New Features:

  • Expose Prometheus metrics endpoint with HTTP and instance connectivity metrics and add a lightweight HTML dashboard for instance status and server health.
  • Allow configurable mute duration for chats via the API, including support for indefinite mutes.

Bug Fixes:

  • Ensure POST /instance/pair initializes the WhatsApp client connection when needed, rejects already authenticated instances, and returns errors instead of empty pairing codes.
  • Update instance status reporting to safely handle missing clients without failing.
  • Enable previously disabled chat pin/archive/mute endpoints by removing temporary 'not working' flags.

Enhancements:

  • Refine manager routing to serve a standalone metrics dashboard at /manager while preserving the existing React bundle routes.
  • Add basic validation and limits for chat mute duration values to prevent invalid or extreme inputs.

Build:

  • Include the new dashboard assets in the Docker image build and add Prometheus client dependencies.

edilsonoliveirama and others added 6 commits April 20, 2026 13:28
The /manager dashboard previously showed only a static placeholder
("Dashboard content will be implemented here..."). This replaces it
with a standalone HTML page that fetches live data from the API and
displays real metrics:

- Total instances count
- Connected instances count and percentage
- Disconnected instances count
- Server health status (GET /server/ok)
- AlwaysOnline count
- Instance table with name, status badge, phone number, client and
  AlwaysOnline indicator
- Auto-refresh every 30 seconds with manual refresh button

Implementation uses a standalone HTML file (Tailwind CDN + vanilla JS
fetch) served at GET /manager, keeping the existing compiled bundle
intact for all other routes (/manager/instances, /manager/login, etc.).

Changes:
- manager/dashboard/index.html: new self-contained dashboard page
- pkg/routes/routes.go: serve dashboard/index.html for GET /manager
  (exact), keep dist/index.html for GET /manager/*any (wildcard)
- Dockerfile: copy manager/dashboard/ into the final image
- .gitignore: exclude manager build artifacts from version control

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Removes the '// TODO: not working' markers from the six chat endpoints
(pin, unpin, archive, unarchive, mute, unmute). Investigation confirmed
the implementation is correct: the endpoints work on fully-established
sessions that have synced WhatsApp app state keys. The markers were
likely added after testing on a fresh session where keys had not yet
been distributed by the WhatsApp server.

Also fixes the hardcoded 1-hour mute duration: the BodyStruct now
accepts an optional `duration` field (seconds). Sending 0 or omitting
the field mutes the chat indefinitely, matching WhatsApp's own behaviour.

Reject negative duration values with a 400-level validation error.
Document that duration=0 maps to 'mute forever' (BuildMute treats 0
as a zero time.Duration, which causes BuildMuteAbs to set the
WhatsApp sentinel timestamp of -1).
Clamp duration to a maximum of 1 year (31536000 seconds) to avoid
unreasonably large timestamps being sent to the WhatsApp API.
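
The duration rules described in these commit messages can be sketched as a small helper. This is an illustration only: `muteDuration` and `maxMuteSeconds` are hypothetical names, and the sketch assumes (as the commit message states) that a zero duration is what the downstream mute builder interprets as "mute forever".

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// maxMuteSeconds is the one-year cap mentioned in the commit message.
const maxMuteSeconds int64 = 31536000

// muteDuration converts the request's duration (in seconds) to a time.Duration.
// Negative values are rejected, 0 is passed through as a zero duration
// (described as "mute forever"), and larger values are clamped to one year.
func muteDuration(seconds int64) (time.Duration, error) {
	if seconds < 0 {
		return 0, errors.New("duration must not be negative")
	}
	if seconds > maxMuteSeconds {
		seconds = maxMuteSeconds
	}
	return time.Duration(seconds) * time.Second, nil
}

func main() {
	for _, s := range []int64{-5, 0, 3600, 99999999} {
		d, err := muteDuration(s)
		fmt.Println(s, d, err)
	}
}
```

Clamping rather than rejecting oversized values follows the wording of the commit message; a handler could equally return a validation error for them.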

Adds GET /metrics serving standard Prometheus text format.
No authentication required — follows the Prometheus convention of
protecting the endpoint at the network/ingress level.

Metrics exposed:

  evolution_instances_total               total registered instances (gauge)
  evolution_instances_connected           connected instances (gauge)
  evolution_instances_disconnected        disconnected instances (gauge)
  evolution_http_requests_total           HTTP requests by method/path/status (counter)
  evolution_http_request_duration_seconds HTTP latency by method/path (histogram)
  evolution_build_info                    always 1, version label carries the value (gauge)
  evolution_uptime_seconds                seconds since server start (gauge)

Instance gauges use a custom Collector that queries the database on
each scrape, so values are always current without event hooks.
HTTP path labels use Gin registered route patterns (e.g. /instance/:instanceId)
to keep cardinality bounded regardless of distinct IDs in the path.
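
To illustrate the "query on each scrape" idea without pulling in a dependency, the sketch below swaps the PR's `prometheus/client_golang` custom Collector for hand-written output in the Prometheus text exposition format. The `instance` struct, `countInstances`, `renderGauges` and the `list` callback are hypothetical names for this example; only the metric names come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// instance is a hypothetical, reduced view of a stored instance row.
type instance struct {
	Name      string
	Connected bool
}

// countInstances derives the three gauge values from one repository read.
func countInstances(all []instance) (total, connected, disconnected int) {
	for _, in := range all {
		if in.Connected {
			connected++
		}
	}
	return len(all), connected, len(all) - connected
}

// renderGauges calls list on every invocation (one call per scrape) and writes
// the gauges in the Prometheus text exposition format.
func renderGauges(list func() ([]instance, error)) (string, error) {
	all, err := list()
	if err != nil {
		// Returning the error keeps a failed repository read visible to the caller.
		return "", fmt.Errorf("list instances: %w", err)
	}
	total, connected, disconnected := countInstances(all)
	var b strings.Builder
	gauge := func(name, help string, value int) {
		fmt.Fprintf(&b, "# HELP %s %s\n# TYPE %s gauge\n%s %d\n", name, help, name, name, value)
	}
	gauge("evolution_instances_total", "total registered instances", total)
	gauge("evolution_instances_connected", "connected instances", connected)
	gauge("evolution_instances_disconnected", "disconnected instances", disconnected)
	return b.String(), nil
}

func main() {
	list := func() ([]instance, error) {
		return []instance{{Name: "a", Connected: true}, {Name: "b"}}, nil
	}
	out, err := renderGauges(list)
	if err != nil {
		fmt.Println("scrape failed:", err)
		return
	}
	fmt.Print(out)
}
```

Because the values are computed at read time, nothing has to be updated when an instance connects or disconnects; the cost is one repository query per scrape.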

New dependency: github.com/prometheus/client_golang v1.20.5

…volutionAPI#20

GET /instance/status was calling ensureClientConnected, which returns
an error when the WhatsApp client exists but is not connected (e.g.
after the user manually removes the device from their phone).
This caused the endpoint to return HTTP 400 until the container was
restarted, making it impossible for clients to detect the disconnected
state without restarting the server.

Status is a read-only query: it should report the current state, not
require an active connection to do so. The fix reads clientPointer
directly and returns Connected=false/LoggedIn=false when the client
is nil or disconnected, without attempting reconnection.
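
A read-only status lookup along these lines might look like the sketch below. It is an assumption-laden illustration: `waClient`, `status` and `readStatus` are hypothetical names, and the plain map stands in for the service's `clientPointer` without any locking the real service may need.

```go
package main

import "fmt"

// waClient is a hypothetical stand-in for the WhatsApp client type.
type waClient struct{ connected, loggedIn bool }

func (c *waClient) IsConnected() bool { return c.connected }
func (c *waClient) IsLoggedIn() bool  { return c.loggedIn }

// status mirrors the two fields the commit message mentions.
type status struct {
	Connected bool
	LoggedIn  bool
}

// readStatus reports the current state without starting or reconnecting anything.
func readStatus(clients map[string]*waClient, instanceID string) status {
	c := clients[instanceID] // nil when the instance has no client
	if c == nil || !c.IsConnected() {
		return status{} // Connected=false, LoggedIn=false
	}
	return status{Connected: true, LoggedIn: c.IsLoggedIn()}
}

func main() {
	clients := map[string]*waClient{
		"a": {connected: true, loggedIn: true},
		"b": {}, // client exists but is no longer connected
	}
	for _, id := range []string{"a", "b", "missing"} {
		fmt.Printf("%s: %+v\n", id, readStatus(clients, id))
	}
}
```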

Fixes EvolutionAPI#20

The Pair function was calling PairPhone directly without checking if
the client was initialized, and was silently swallowing errors from
PairPhone. This caused two problems:

1. If the client was nil or disconnected, PairPhone returned an error
   but the function ignored it and returned HTTP 200 with an empty
   PairingCode field, misleading the caller into thinking it succeeded.

2. The client was never started before PairPhone was called. WhatsApp
   requires the client to be connected to the WA websocket (waiting
   for auth) before a pairing code can be generated.

Fix:
- Start the instance automatically if no active connection exists,
  mirroring the QR code flow
- Wait 3 seconds for the WA websocket connection to establish and
  the initial QR generation to begin (required by whatsmeow before
  PairPhone can be called)
- Reject early if the instance is already authenticated
- Return PairPhone errors to the caller instead of swallowing them,
  so the handler correctly responds with HTTP 500 and an actionable
  error message

Fixes EvolutionAPI#21

sourcery-ai bot commented Apr 20, 2026

Reviewer's Guide

Implements a robust fix for /instance/pair by properly initializing and validating the WhatsApp client before pairing, improves status handling and chat mute API, introduces Prometheus metrics and a new HTML dashboard, and exposes metrics/manager assets via updated routes and Docker config.

Updated class diagram for metrics, instance repository and chat mute

classDiagram
    class InstanceRepository {
        <<interface>>
        +Create(instance *Instance) error
        +Get(instanceId string) (*Instance, error)
        +GetAllConnectedInstances() []*Instance
        +GetAllConnectedInstancesByClientName(clientName string) []*Instance
        +GetAll(clientName string) []*Instance
        +GetAllInstances() ([]*Instance, error)
        +Delete(instanceId string) error
        +GetAdvancedSettings(instanceId string) (*AdvancedSettings, error)
        +UpdateAdvancedSettings(instanceId string, settings *AdvancedSettings) error
    }

    class instanceRepository {
        -db *gorm.DB
        +GetAllInstances() ([]*Instance, error)
        +Delete(instanceId string) error
        +GetAll(clientName string) ([]*Instance, error)
    }

    InstanceRepository <|.. instanceRepository

    class Registry {
        -reg *prometheus.Registry
        -httpRequests *prometheus.CounterVec
        -httpDuration *prometheus.HistogramVec
        +New(version string, instanceRepo InstanceRepository) *Registry
        +Handler() http.Handler
        +GinMiddleware() gin.HandlerFunc
    }

    class instanceCollector {
        -repo InstanceRepository
        -descTotal *prometheus.Desc
        -descConnected *prometheus.Desc
        -descDisconnected *prometheus.Desc
        +Describe(ch chan<- *prometheus.Desc)
        +Collect(ch chan<- prometheus.Metric)
    }

    Registry --> instanceCollector : uses
    instanceCollector --> InstanceRepository : queries

    class chatService {
        -loggerWrapper LoggerWrapper
        +ChatMute(data *BodyStruct, instance *Instance) (string, error)
        +ChatUnmute(data *BodyStruct, instance *Instance) (string, error)
        +ensureClientConnected(instanceId string) (*WAClient, error)
    }

    class BodyStruct {
        +Chat string
        +Duration int64
    }

    chatService --> BodyStruct : parameter

    class instances {
        -clientPointer map[string]*WAClient
        -loggerWrapper LoggerWrapper
        -whatsmeowService WhatsmeowService
        +Status(instance *Instance) (*StatusStruct, error)
        +GetQr(instance *Instance) (*QrcodeStruct, error)
        +Pair(data *PairStruct, instance *Instance) (*PairReturnStruct, error)
    }

    class StatusStruct {
        +Connected bool
        +LoggedIn bool
        +myJid string
        +Name string
    }

    class PairStruct {
        +Phone string
    }

    class PairReturnStruct {
        +PairingCode string
    }

    instances --> StatusStruct : returns
    instances --> PairStruct : parameter
    instances --> PairReturnStruct : returns
    instances --> InstanceRepository : may use via services

    class WAClient {
        +IsConnected() bool
        +IsLoggedIn() bool
        +PairPhone(ctx context.Context, phone string, force bool, clientType whatsmeow.PairClientType, clientName string) (string, error)
        +SendAppState(ctx context.Context, muteState appstate.MuteState) error
    }

    instances --> WAClient : uses clientPointer
    chatService --> WAClient : uses ensureClientConnected

    class WhatsmeowService {
        +StartInstance(instanceId string) error
    }

    instances --> WhatsmeowService : uses to start instance

    class LoggerWrapper {
        +GetLogger(instanceId string) Logger
    }

    class Logger {
        +LogInfo(format string, args interface)
        +LogError(format string, args interface)
    }

    instances --> LoggerWrapper : uses for Pair
    chatService --> LoggerWrapper : uses for ChatMute

    class appstate {
        +BuildMute(recipient JID, mute bool, duration time.Duration) MuteState
    }

    chatService --> appstate : BuildMute

    class Instance {
        +Id string
        +Connected bool
        +Name string
        +ClientName string
        +AlwaysOnline bool
    }

    instanceRepository --> Instance : manages
    instanceCollector --> Instance : reads fields

File-Level Changes

Change: Fix /instance/pair to correctly initialize the WhatsApp client, handle already-authenticated instances, and surface pairing errors to callers.
Details:
  • Refactors Pair to obtain a logger and client reference up front and ensure the client is connected before pairing.
  • Automatically starts the instance when there is no active connection, waits for the websocket/QR to establish, and re-reads the client pointer.
  • Rejects pairing when the client is already logged in and returns descriptive errors for initialization and PairPhone failures instead of silently logging and returning empty pairing codes.
Files:
  • pkg/instance/service/instance_service.go

Change: Adjust Status to handle missing clients without forcing a connection and to simplify return logic.
Details:
  • Stops calling ensureClientConnected in Status and instead uses the raw client pointer.
  • Returns a disconnected/not-logged-in StatusStruct when no client exists, and simplifies the construction/return of the status struct.
Files:
  • pkg/instance/service/instance_service.go

Change: Expose chat mute duration as an API field and enforce safe bounds when applying mutes.
Details:
  • Extends the chat request body with a duration field used by mute operations and documents it in the Swagger description.
  • Caps mute duration at one year and rejects negative values with clear errors.
  • Builds mute requests using the user-specified duration in seconds, treating 0 as mute-forever per appstate.BuildMute semantics.
Files:
  • pkg/chat/service/chat_service.go
  • pkg/chat/handler/chat_handler.go

Change: Add Prometheus-based metrics collection and HTTP instrumentation, including instance-level gauges and a /metrics endpoint.
Details:
  • Introduces a metrics.Registry that encapsulates a custom Prometheus registry, HTTP request counters/histograms, build_info and uptime gauges, and an instanceCollector.
  • Registers the metrics middleware and /metrics endpoint in the Gin router, wiring it with the instance repository and version.
  • Extends the InstanceRepository interface and implementation with a GetAllInstances method to support the instanceCollector.
Files:
  • pkg/metrics/metrics.go
  • cmd/evolution-go/main.go
  • pkg/instance/repository/instance_repository.go
  • go.mod
  • go.sum

Change: Add a standalone Tailwind-based HTML dashboard consuming existing APIs and wire it into routes and Docker image.
Details:
  • Adds manager/dashboard/index.html implementing a metric/instance overview UI that polls /instance/all and /server/ok and uses localStorage apikey.
  • Adjusts routes so /manager serves the new dashboard HTML, while /manager/*any continues to serve the SPA bundle for the existing manager.
  • Ensures the dashboard assets are copied into the container image under manager/dashboard in the Dockerfile and not ignored in git.
Files:
  • manager/dashboard/index.html
  • pkg/routes/routes.go
  • Dockerfile
  • .gitignore

Change: Enable pin/archive/mute chat endpoints by removing "TODO: not working" comments from route definitions.
Details:
  • Cleans up comments that marked several chat routes as non-working, effectively advertising them as supported.
  • Leaves the route wiring and middleware unchanged aside from comment removal.
Files:
  • pkg/routes/routes.go

Assessment against linked issues

Issue Objective Addressed Explanation
#21 Ensure that POST /instance/pair correctly initializes and uses the WhatsApp client so that a valid pairing code is generated for newly created instances instead of an empty string.
#21 Ensure that POST /instance/pair does not return HTTP 200 with an empty PairingCode on failure, but instead surfaces pairing errors to the caller as an error response.

@sourcery-ai sourcery-ai bot left a comment

Hey - I've found 2 issues, and left some high level feedback:

  • In instances.Pair, the hardcoded time.Sleep(3 * time.Second) both blocks the request and assumes a fixed websocket/QR setup time; consider replacing this with an event-/state-based wait (or at least a configurable timeout + retry loop) so pairing is more robust under slow/fast connections.
  • The new instanceCollector.Collect silently returns on GetAllInstances error, which can hide DB or repository issues; it would be safer to at least log the error (or expose it via Prometheus’s error reporting patterns) so scrape failures are observable.
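
One way to read the first suggestion is a bounded polling loop in place of the fixed sleep. The sketch below is a hedged illustration, not a proposal for the exact code: `waitReady`, `demoTimeout` and `demoInterval` are hypothetical names, and the `ready` predicate stands in for whatever state check the client exposes (for example a connected check).

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Demo values; a real service would likely make these configurable.
const (
	demoTimeout  = 50 * time.Millisecond
	demoInterval = time.Millisecond
)

// waitReady polls ready every interval until it reports true or timeout elapses.
func waitReady(timeout, interval time.Duration, ready func() bool) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if ready() {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("client not ready: %w", ctx.Err())
		case <-ticker.C:
			// Poll again on the next loop iteration.
		}
	}
}

func main() {
	calls := 0
	err := waitReady(demoTimeout, demoInterval, func() bool {
		calls++
		return calls >= 3 // pretend the client becomes ready on the third poll
	})
	fmt.Println("polls:", calls, "err:", err)
}
```

Compared with a fixed sleep, this returns as soon as the predicate holds on fast connections and fails with an explicit error on slow ones.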

## Individual Comments

### Comment 1
<location path="pkg/instance/service/instance_service.go" line_range="501" />
<code_context>
+		return nil, fmt.Errorf("instance is already authenticated")
+	}
+
+	code, err := client.PairPhone(context.Background(), data.Phone, true, whatsmeow.PairClientChrome, "Chrome (Linux)")
 	if err != nil {
-		i.loggerWrapper.GetLogger(instance.Id).LogError("[%s] something went wrong calling pair phone", instance.Id)
</code_context>
<issue_to_address>
**suggestion (bug_risk):** PairPhone uses context.Background, which can cause runaway calls under slow or hung connections.

Because this call isn’t tied to a request or timeout, it may keep running after the client disconnects or during shutdown, leaving goroutines stuck. Prefer a bounded or request-scoped context (e.g. `context.WithTimeout` or the incoming request’s context) so the pairing attempt can be cancelled appropriately.

Suggested implementation:

```golang
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()

	code, err := client.PairPhone(ctx, data.Phone, true, whatsmeow.PairClientChrome, "Chrome (Linux)")

```

1. Ensure `time` is imported in this file, e.g. in the import block add:
   - `import "time"` (or add `"time"` to the existing grouped import).
2. If the surrounding function already receives a request-scoped `context.Context` (e.g. `ctx`), prefer:
   - `ctx, cancel := context.WithTimeout(ctx, 60*time.Second)` instead of using `context.Background()` directly.
3. You may want to replace the hard-coded `60*time.Second` with a named constant or configuration value if your project has standard timeouts for external calls.
</issue_to_address>

### Comment 2
<location path="pkg/metrics/metrics.go" line_range="24-27" />
<code_context>
+func New(version string, instanceRepo instance_repository.InstanceRepository) *Registry {
</code_context>
<issue_to_address>
**suggestion:** Metrics registry omits default Go/process collectors, which may limit observability.

Using a fresh `prometheus.NewRegistry()` without default collectors means `go_*` and `process_*` metrics (GC, goroutines, memory, FDs, etc.) won’t be exposed. If those are useful for operating this service, consider also registering:

```go
reg.MustRegister(
    prometheus.NewGoCollector(),
    prometheus.NewProcessCollector(prometheus.ProcessCollectorOpts{}),
)
```

If you intentionally want a minimal registry, this is acceptable, but it trades off runtime visibility.

```suggestion
func New(version string, instanceRepo instance_repository.InstanceRepository) *Registry {
	reg := prometheus.NewRegistry()
	reg.MustRegister(
		prometheus.NewGoCollector(),
		prometheus.NewProcessCollector(prometheus.ProcessCollectorOpts{}),
	)

	httpRequests := prometheus.NewCounterVec(prometheus.CounterOpts{
```
</issue_to_address>

Development

Successfully merging this pull request may close these issues.

/instance/pair returns empty PairingCode despite success message

1 participant