AI Insights
April 30, 2025

Agent Performance Testing With Grafana K6 and InfluxDB

How to test your agent app performance with Grafana and InfluxDB

In today's rapidly evolving web landscape, the rise of agent programming and the integration of large language models (LLMs) are transforming how we build, test, and monitor applications. As systems become more autonomous and complex, ensuring robust performance and real-time observability is no longer optional—it's essential.

This article walks you through setting up a modern performance testing and monitoring stack using K6, InfluxDB, and Grafana, all within Docker containers. We'll also discuss why these tools matter in the context of agent-driven architectures and LLM-powered applications.

Why Performance Monitoring Matters in the Age of Agents and LLMs

Agent programming—where autonomous software agents interact, learn, and adapt—demands systems that are not only functional but also resilient under unpredictable loads. LLMs, meanwhile, introduce new performance variables: inference latency, API throughput, and dynamic user interactions.

Key reasons to invest in performance monitoring:

  • Scalability: Agents and LLMs can generate bursty, unpredictable traffic.
  • Reliability: Automated systems must recover gracefully from failures.
  • User Experience: Slow responses from LLM-powered features can degrade trust.
  • Continuous Improvement: Real-time metrics enable rapid iteration and optimization.

Setting Up Your Performance Testing and Monitoring Stack

Let's get hands-on! Here's how to deploy K6, InfluxDB, and Grafana using Docker, and connect them for seamless performance testing and visualization.

1. Directory Structure

Organize your files for clarity:

tests/performance/
├── docker-compose.yml
├── k6/
│   └── your_test_script.js
└── grafana-provisioning/
    ├── datasources/
    │   └── datasource.yml
    └── dashboards/
        └── k6-dashboard.json

2. Docker Compose File

Spin up InfluxDB and Grafana with this docker-compose.yml:

version: '3.7'

services:
  influxdb:
    image: influxdb:1.8
    container_name: influxdb
    ports:
      - "8086:8086"
    environment:
      - INFLUXDB_DB=k6
      - INFLUXDB_ADMIN_USER=admin
      - INFLUXDB_ADMIN_PASSWORD=admin123
    volumes:
      - influxdb-data:/var/lib/influxdb

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana-data:/var/lib/grafana
      - ./grafana-provisioning/datasources:/etc/grafana/provisioning/datasources
      - ./grafana-provisioning/dashboards:/etc/grafana/provisioning/dashboards

volumes:
  influxdb-data:
  grafana-data:

3. Provision Grafana Data Source

Create grafana-provisioning/datasources/datasource.yml:

apiVersion: 1
datasources:
  - name: InfluxDB
    type: influxdb
    access: proxy
    url: http://influxdb:8086
    database: k6
    isDefault: true

4. Start the Monitoring Stack

From your tests/performance directory, run:

docker-compose up -d

5. Install and Run K6

Option 1: Local Installation (macOS with Homebrew)

brew install k6

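Option 2: Run K6 in a Docker Container

The container must join the Compose network, otherwise the influxdb hostname will not resolve. Compose names the default network after the project directory, so for tests/performance it should be performance_default (confirm with docker network ls).

docker run -i --rm \
  --network performance_default \
  -v "$(pwd)/k6:/scripts" \
  grafana/k6 run \
  --out influxdb=http://influxdb:8086/k6 \
  /scripts/your_test_script.js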

6. Configure K6 Output

When running K6 on the host, send the results to InfluxDB through the published port:

k6 run your_test_script.js --out influxdb=http://localhost:8086/k6

Or, if running from a container in the same Docker network, use the service name:

k6 run your_test_script.js --out influxdb=http://influxdb:8086/k6
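
As a minimal sketch of how the output target might be chosen programmatically (for example in a wrapper script), assuming the Compose file above. The helper name influxOutput and its option names are hypothetical and not part of K6:

```javascript
// Hypothetical helper: builds the value passed to `k6 run --out ...`.
// Assumes the Compose setup above (service "influxdb", port 8086, DB "k6").
function influxOutput({ inDocker = false, db = 'k6', port = 8086 } = {}) {
  // Inside the Compose network the service name resolves via Docker DNS;
  // from the host, only the published port on localhost is reachable.
  const host = inDocker ? 'influxdb' : 'localhost';
  return `influxdb=http://${host}:${port}/${db}`;
}

console.log(influxOutput());                   // influxdb=http://localhost:8086/k6
console.log(influxOutput({ inDocker: true })); // influxdb=http://influxdb:8086/k6
```

The same value can also be supplied through the K6_OUT environment variable instead of the --out flag.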

7. Access Grafana

  • Open http://localhost:3000
  • Default login: admin / admin
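  • Import a K6 dashboard (JSON file) through the UI for real-time visualization. For Grafana to load k6-dashboard.json automatically, the mounted dashboards directory also needs a dashboard provider YAML file next to it.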

Summary Table of Components

Component   Command/Config Example
InfluxDB    Docker Compose service, port 8086, DB: k6
Grafana     Docker Compose service, port 3000, provisioned with InfluxDB
K6          k6 run your_test_script.js --out influxdb=http://influxdb:8086/k6

Real-World Scenarios: What to Test

  • LLM API Endpoints: Measure latency and throughput under concurrent requests.
  • Agent Coordination: Simulate multiple agents interacting with your backend.
  • User Workflows: Test login, dashboard, and data retrieval flows for bottlenecks.
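
To make "latency and throughput" concrete, here is a small sketch of how the two numbers relate to raw request timings. The sample data is made up, the function names are hypothetical, and the nearest-rank percentile shown is only one common definition; K6 computes its own percentiles and may interpolate differently.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Summarize request latencies (ms) observed over a window (seconds).
function summarize(latenciesMs, windowSeconds) {
  return {
    requestsPerSecond: latenciesMs.length / windowSeconds,
    p50: percentile(latenciesMs, 50),
    p95: percentile(latenciesMs, 95),
  };
}

// Made-up data: 10 requests in 2 seconds, one of them a slow outlier.
const sample = [120, 95, 310, 150, 180, 90, 2100, 130, 160, 140];
console.log(summarize(sample, 2));
```

With this data the median stays low while the 95th percentile lands on the outlier, which is why tail percentiles rather than averages are usually the better target for slow, variable endpoints such as LLM calls.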

Sample K6 Test Script

Here's a basic K6 script to get you started:

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  vus: 10,  // 10 virtual users
  duration: '30s',
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests must complete below 500ms
    'http_req_duration{name:healthcheck}': ['p(99)<50'], // 99% of healthchecks must complete below 50ms
  },
};

export default function() {
  // Test API endpoints
  let loginRes = http.post('http://localhost:5001/api/login', {
    username: 'testuser',
    password: 'password123',
  });
  
  check(loginRes, {
    'login successful': (r) => r.status === 200,
    'has auth token': (r) => !!r.json('token'), // fails when the token is missing or empty
  });
  
  // Extract token for authenticated requests
  let token = loginRes.json('token');
  
  // Health check endpoint
  let healthCheck = http.get('http://localhost:5001/api/health', {
    tags: { name: 'healthcheck' },
  });
  
  check(healthCheck, {
    'status is up': (r) => r.json('status') === 'healthy',
  });
  
  // Test main dashboard endpoint
  let dashboardRes = http.get('http://localhost:5001/api/dashboard', {
    headers: { 'Authorization': `Bearer ${token}` },
  });
  
  check(dashboardRes, {
    'dashboard loaded': (r) => r.status === 200,
    'has dashboard data': (r) => r.json('data') != null, // fails on both null and undefined
  });
  
  sleep(1);
}
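
The script above ties one threshold to the healthcheck request through its name tag. As a sketch of how that pattern might be extended to several endpoints, the two helpers below generate the threshold keys and the matching request params. Both helper names and the limits of 800 ms and 500 ms are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: turns { tagName: limitMs } into a k6 `thresholds` map.
// Keys use the `metric{tag:value}` sub-metric syntax seen in the script above.
function thresholdsFor(limits, percentileExpr = 'p(95)') {
  const thresholds = {};
  for (const [name, limitMs] of Object.entries(limits)) {
    thresholds[`http_req_duration{name:${name}}`] = [`${percentileExpr}<${limitMs}`];
  }
  return thresholds;
}

// Hypothetical helper: request params that attach the matching `name` tag,
// plus a bearer token header when one is given.
function taggedParams(name, token) {
  const params = { tags: { name } };
  if (token) params.headers = { Authorization: `Bearer ${token}` };
  return params;
}

// In a k6 script this might be wired up as:
//   export const options = { thresholds: thresholdsFor({ login: 800, dashboard: 500 }) };
//   http.get(url, taggedParams('dashboard', token));
console.log(thresholdsFor({ login: 800, dashboard: 500 }));
```

Keeping the tag name and the threshold key in one place avoids the silent failure mode where a typo in either leaves a threshold with no matching samples.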

Advanced Tips

  • Thresholds: Set performance thresholds in your K6 scripts (e.g., 95% of requests < 500ms).
  • Custom Metrics: Track business-specific KPIs alongside system metrics.
  • Alerting: Configure Grafana alerts for anomalies or SLA breaches.
  • Progressive Load Testing: Start with few virtual users and gradually increase to find breaking points.
  • Distributed Testing: For high-load scenarios, split the load across multiple machines, for example with the k6 Kubernetes operator; open-source K6 has no built-in distributed mode.
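
The progressive load testing tip can be sketched with K6's stages option, where each stage ramps the number of virtual users linearly toward its target. The helper below is hypothetical (the name buildRampStages and the numbers passed to it are assumptions for illustration); it builds a stepped profile that ramps up, holds, and finally ramps down.

```javascript
// Hypothetical helper: builds a k6 `stages` array that climbs to maxVus in
// equal steps, holding each level for stepDuration, then ramps back to zero.
function buildRampStages(maxVus, steps, stepDuration = '1m') {
  if (steps < 1 || maxVus < steps) throw new Error('need 1 <= steps <= maxVus');
  const stages = [];
  for (let i = 1; i <= steps; i++) {
    const target = Math.round((maxVus * i) / steps);
    stages.push({ duration: stepDuration, target }); // ramp up to this level
    stages.push({ duration: stepDuration, target }); // same target again: hold
  }
  stages.push({ duration: stepDuration, target: 0 }); // ramp back down
  return stages;
}

// In a k6 script, used in place of the fixed vus/duration pair:
//   export const options = { stages: buildRampStages(100, 4, '2m') };
console.log(buildRampStages(100, 4, '2m'));
```

Watching the Grafana dashboard while the steps climb shows at which level latency or error rate starts to degrade.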

Troubleshooting

Common issues and solutions:

  • Connection errors: If K6 can't reach InfluxDB, check Docker networking: use localhost:8086 from the host and influxdb:8086 from a container on the same Docker network.
  • Missing metrics: Ensure the database name in the --out URL (k6) matches the Grafana data source, and that dashboard queries only filter on tags your test script actually sets.
  • High memory usage: Consider batching results or using streaming output for long tests.

As agent programming and LLM integration become mainstream, robust performance testing and monitoring are critical. By combining K6, InfluxDB, and Grafana in a Dockerized environment, you gain a scalable, repeatable, and insightful workflow for ensuring your systems are ready for the demands of modern web development.

Ready to level up your observability? Try this stack on your next project and see the difference!


Have questions or want to see more example test scripts and dashboards? Drop an email to [email protected]
