Building a Sentry-like Error Tracing System for Next.js using Loki and Grafana
2024-07-31
Implementing Lightweight Error Tracing in Next.js with Loki and Grafana: An Alternative to Sentry
Recently, I had some free time and built an Error Boundary for my team's project to trace unhandled exceptions. That got me thinking about internal back-office systems and how we could build a custom solution that keeps all data in-house without relying on external services.
In this post, we'll explore how to implement a comprehensive error tracing system in a Next.js application using Loki for log aggregation and Grafana for visualization. This approach provides a cost-effective alternative to services like Sentry while offering customizable error tracking. We'll also dive into an enhanced feature: tracing the last 5 UI elements interacted with and sending this information along with the error report.
Do note that this implementation is specifically for Next.js 13, where we still wrap components with Error Boundary, unlike the latest Next.js 14 changes. However, the core concepts still apply.
The Stack
To provide more context when errors occur, we'll implement a system that tracks the last 5 UI elements the user interacted with. This information can be crucial for debugging and understanding the user's journey leading up to the error.
Our error tracing system leverages the following technologies:
- Next.js: A popular React framework for building web applications.
- Loki: A horizontally-scalable, highly-available log aggregation system.
- Grafana: An open-source platform for monitoring and observability.
- Pino: A super fast Node.js logger with JSON output.
- Custom hook: records UI interactions as references to the HTML elements involved. It is not a keylogger, for obvious security reasons.
Implementation
Let's go through the key components of our error tracing system.
1. Error Boundary Setup
First, create an ErrorBoundary component to catch and handle errors in your React tree:
// src/components/ErrorBoundary.tsx
import React from 'react';
import Router from 'next/router';
class ErrorBoundaryInner extends React.Component<
{ children: React.ReactNode; getInteractions: () => string[] },
{ hasError: boolean; error: Error | null }
> {
constructor(props: { children: React.ReactNode; getInteractions: () => string[] }) {
super(props);
this.state = { hasError: false, error: null };
}
static getDerivedStateFromError(error: Error) {
return { hasError: true, error };
}
componentDidCatch(error: Error, errorInfo: React.ErrorInfo) {
const interactions = this.props.getInteractions();
fetch('/api/log-error', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message: error.message,
stack: error.stack,
url: window.location.href,
userAgent: navigator.userAgent,
timestamp: new Date().toISOString(),
route: window.location.pathname,
interactions: interactions,
}),
}).catch(console.error);
}
render() {
if (this.state.hasError) {
return (
<div
style={{
display: 'flex',
justifyContent: 'center',
alignItems: 'center',
height: '100vh',
backgroundColor: '#f5f5f5',
}}
>
<div
style={{
padding: '2rem',
backgroundColor: 'white',
borderRadius: '8px',
boxShadow: '0 4px 6px rgba(0, 0, 0, 0.1)',
maxWidth: '400px',
textAlign: 'center',
}}
>
<h2>Oops! Something went wrong.</h2>
<p>
We're sorry for the inconvenience. Our team has been notified and is working on a fix.
</p>
<div style={{ marginTop: '1rem' }}>
<button onClick={() => window.location.reload()} style={buttonStyle}>
Reload
</button>
<button onClick={() => Router.back()} style={buttonStyle}>
Go Back
</button>
<button onClick={() => Router.push('/')} style={buttonStyle}>
Home
</button>
</div>
</div>
</div>
);
}
return this.props.children;
}
}
const buttonStyle = {
margin: '0 0.5rem',
padding: '0.5rem 1rem',
backgroundColor: '#007bff',
color: 'white',
border: 'none',
borderRadius: '4px',
cursor: 'pointer',
};
function ErrorBoundary({
children,
getInteractions,
}: {
children: React.ReactNode;
getInteractions: () => string[];
}) {
return <ErrorBoundaryInner getInteractions={getInteractions}>{children}</ErrorBoundaryInner>;
}
export default ErrorBoundary;
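The payload assembled inside componentDidCatch can also be pulled out into a pure function, which makes it testable without a browser. Here is a minimal sketch of that idea. Note that `buildErrorReport`, `ReportContext`, and the 2,000-character stack cap are hypothetical names and values for illustration, not part of the component above.

```typescript
// Hypothetical helper: builds the JSON body that componentDidCatch posts to /api/log-error.
interface ReportContext {
  url: string; // e.g. window.location.href
  route: string; // e.g. window.location.pathname
  userAgent: string; // e.g. navigator.userAgent
  interactions: string[];
  now?: Date; // injectable clock so tests are deterministic
}

interface ErrorReport {
  message: string;
  stack: string | undefined;
  url: string;
  userAgent: string;
  timestamp: string;
  route: string;
  interactions: string[];
}

// Assumption: cap the stack so request bodies stay small.
const MAX_STACK_CHARS = 2000;

function buildErrorReport(error: Error, ctx: ReportContext): ErrorReport {
  return {
    message: error.message,
    stack: error.stack ? error.stack.slice(0, MAX_STACK_CHARS) : undefined,
    url: ctx.url,
    userAgent: ctx.userAgent,
    timestamp: (ctx.now ?? new Date()).toISOString(),
    route: ctx.route,
    // Copy the list so later tracking cannot mutate a report already built.
    interactions: [...ctx.interactions],
  };
}
```

Inside componentDidCatch you would then pass the browser values (window.location.href, navigator.userAgent, and so on) and JSON.stringify the result as the request body.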
2. Logging API Route
Set up an API route in Next.js to receive and process error logs:
// pages/api/log-error.ts
import type { NextApiRequest, NextApiResponse } from 'next';
import logger from '../../lib/logger';
export default function handler(req: NextApiRequest, res: NextApiResponse) {
if (req.method !== 'POST') {
return res.status(405).end();
}
const { message, stack, url, userAgent, timestamp, route, interactions } = req.body;
const relevantStack = stack
? stack.split('\n').slice(0, 3).join('\n')
: 'No stack trace available';
const formattedInteractions = Array.isArray(interactions) && interactions.length > 0
? interactions
.map((interaction: string, index: number) => ` ${index + 1}. ${interaction}`)
.join('\n')
: 'No interactions logged';
logger.error(
{
message: message || 'Unknown error',
stack: relevantStack,
url: url || 'Unknown URL',
userAgent: userAgent ? userAgent.split(' ').pop() : 'Unknown user agent',
timestamp: timestamp || new Date().toISOString(),
// env: set this if you have multiple environments
route: route || 'Unknown route',
interactions: formattedInteractions,
},
`Client Exception Error: ${
message || 'Unknown error'
}\nLast User Interactions:\n${formattedInteractions}`
);
res.status(200).json({ received: true });
}
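Since this route accepts whatever the browser sends, it can help to normalize the body before logging it. The sketch below shows one way to do that, keeping the same three-line stack trimming as the handler. `normalizeReport`, `asString`, and `NormalizedReport` are hypothetical names, and only a subset of the fields is covered.

```typescript
// Hypothetical helper for the API route: never trust the shape of req.body.
interface NormalizedReport {
  message: string;
  stack: string;
  route: string;
  interactions: string[];
}

// Returns the value if it is a non-empty string, otherwise the fallback.
function asString(value: unknown, fallback: string): string {
  return typeof value === "string" && value.length > 0 ? value : fallback;
}

function normalizeReport(body: unknown): NormalizedReport {
  const raw = (typeof body === "object" && body !== null ? body : {}) as Record<string, unknown>;

  // Keep only the first three stack lines, as the handler does.
  const stack =
    typeof raw.stack === "string"
      ? raw.stack.split("\n").slice(0, 3).join("\n")
      : "No stack trace available";

  // Accept interactions only if they arrive as an array, and keep only the strings.
  const interactions = Array.isArray(raw.interactions)
    ? raw.interactions.filter((item): item is string => typeof item === "string")
    : [];

  return {
    message: asString(raw.message, "Unknown error"),
    stack,
    route: asString(raw.route, "Unknown route"),
    interactions,
  };
}
```

With this in place, a malformed or empty request body produces a log entry with fallback values instead of throwing inside the handler.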
3. Logger Configuration
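Use Pino for logging. This first version pretty-prints to the console via pino-pretty (install it alongside pino); the transport that forwards logs to Loki is added in the later step on configuring Next.js to send logs to Loki: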
// lib/logger.ts
import pino from 'pino';
const logger = pino({
level: 'warn',
transport: {
target: 'pino-pretty',
options: {
colorize: true,
translateTime: 'SYS:standard',
ignore: 'pid,hostname',
messageFormat: '{msg} | {type}: {metric} | Value: {value} | Rating: {rating} | URL: {url}',
},
},
formatters: {
level: (label) => {
return { level: label.toUpperCase() };
},
},
base: {
// env: set this if you have multiple environments, and add conditionals to trace only some of them (e.g. UAT)
// apiGateway: set this to your apiBasePath or gateway
},
});
export default logger;
4. Integration in _app.tsx with useInteractionTracker hook
Wrap your entire application with the ErrorBoundary in the custom App component, together with an interaction tracker that also writes to localStorage:
// pages/_app.tsx
import type { AppProps } from 'next/app';
import ErrorBoundary from 'src/components/ErrorBoundary';
import { useInteractionTracker } from 'src/hooks/useInteractionTracker';
function App({ Component, pageProps }: AppProps) {
const getInteractions = useInteractionTracker();
return (
<ErrorBoundary getInteractions={getInteractions}>
<Component {...pageProps} />
</ErrorBoundary>
);
}
export default App;
// src/hooks/useInteractionTracker.ts
import { useEffect, useRef } from "react";
const MAX_INTERACTIONS = 5;
function getElementIdentifier(element: HTMLElement): string {
if (element.id) {
return `#${element.id}`;
}
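if (typeof element.className === 'string' && element.className) { // SVG elements expose className as an object, not a string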
return `.${element.className.split(' ')[0]}`;
}
return element.tagName.toLowerCase();
}
function getElementText(element: HTMLElement): string {
let text = element.getAttribute("aria-label") ||
element.getAttribute("title") ||
element.textContent?.trim() || "";
return text ? ` "${text.slice(0, 20)}${text.length > 20 ? "..." : ""}"` : "";
}
function isInteractiveElement(element: HTMLElement): boolean {
const interactiveTags = ['A', 'BUTTON', 'INPUT', 'SELECT', 'TEXTAREA'];
const interactiveRoles = ['button', 'link', 'checkbox', 'menuitem', 'tab'];
return interactiveTags.includes(element.tagName) ||
interactiveRoles.includes(element.getAttribute('role') || '') ||
element.hasAttribute('onclick') ||
element.hasAttribute('tabindex');
}
function findInteractiveParent(element: HTMLElement): HTMLElement {
let current = element;
while (current && current !== document.body) {
if (isInteractiveElement(current)) {
return current;
}
current = current.parentElement!;
}
return element;
}
export function useInteractionTracker() {
const interactionsRef = useRef<string[]>([]);
useEffect(() => {
const trackInteraction = (event: MouseEvent | KeyboardEvent) => {
const target = event.target as HTMLElement;
const interactiveElement = findInteractiveParent(target);
const elementId = getElementIdentifier(interactiveElement);
const elementText = getElementText(interactiveElement);
let interaction = `${event.type} on ${elementId}${elementText}`;
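// Record only named keys (Enter, Escape, arrows, ...). Skipping single characters keeps typed text out of the logs.
if (event instanceof KeyboardEvent && event.key !== "Tab" && event.key.length > 1) {
interaction += ` (key: ${event.key})`;
}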
interactionsRef.current = [
interaction,
...interactionsRef.current.slice(0, MAX_INTERACTIONS - 1),
];
localStorage.setItem("userInteractions", JSON.stringify(interactionsRef.current));
};
window.addEventListener("click", trackInteraction, true);
window.addEventListener("keydown", trackInteraction, true);
return () => {
window.removeEventListener("click", trackInteraction, true);
window.removeEventListener("keydown", trackInteraction, true);
};
}, []);
return () => interactionsRef.current;
}
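The hook keeps a newest-first list capped at five entries and writes it to localStorage, but never reads it back. Here is a rough sketch of both halves as pure functions: the capped push, and a tolerant parser you could use to restore the history (for example after a reload). `pushInteraction` and `parseStoredInteractions` are hypothetical helpers, not part of the hook above.

```typescript
const MAX_TRACKED = 5; // mirrors MAX_INTERACTIONS in the hook

// Returns a new list with the newest interaction first, capped at `max` entries.
function pushInteraction(
  list: readonly string[],
  interaction: string,
  max: number = MAX_TRACKED
): string[] {
  return [interaction, ...list.slice(0, max - 1)];
}

// Hypothetical: parse what the hook wrote to localStorage,
// tolerating missing or corrupt values instead of throwing.
function parseStoredInteractions(raw: string | null, max: number = MAX_TRACKED): string[] {
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    if (!Array.isArray(parsed)) return [];
    return parsed
      .filter((item): item is string => typeof item === "string")
      .slice(0, max);
  } catch {
    return []; // corrupt JSON: start with an empty history
  }
}
```

In the browser, the parser would be fed `localStorage.getItem("userInteractions")`, which returns null when nothing has been stored yet.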
5. Setting up Loki and Grafana
To complete our error tracing system, we need to set up Loki for log aggregation and Grafana for visualization.
Loki Setup
- Install Loki: Create a docker-compose.yml file:
  version: '3'
  services:
    loki:
      image: grafana/loki:2.8.0
      ports:
        - '3100:3100'
      command: -config.file=/etc/loki/local-config.yaml
      volumes:
        - ./loki-config.yaml:/etc/loki/local-config.yaml
- Create a loki-config.yaml file:
  auth_enabled: false
  server:
    http_listen_port: 3100
  ingester:
    lifecycler:
      address: 127.0.0.1
      ring:
        kvstore:
          store: inmemory
      final_sleep: 0s
    chunk_idle_period: 5m
    chunk_retain_period: 30s
  schema_config:
    configs:
      - from: 2020-05-15
        store: boltdb
        object_store: filesystem
        schema: v11
        index:
          prefix: index_
          period: 168h
  storage_config:
    boltdb:
      directory: /tmp/loki/index
    filesystem:
      directory: /tmp/loki/chunks
  limits_config:
    enforce_metric_name: false
    reject_old_samples: true
    reject_old_samples_max_age: 168h
  chunk_store_config:
    max_look_back_period: 0s
  table_manager:
    retention_deletes_enabled: false
    retention_period: 0s
- Run Loki:
  docker-compose up -d
Grafana Setup
- Update your docker-compose.yml to include Grafana:
  version: '3'
  services:
    loki:
      image: grafana/loki:2.8.0
      ports:
        - '3100:3100'
      command: -config.file=/etc/loki/local-config.yaml
      volumes:
        - ./loki-config.yaml:/etc/loki/local-config.yaml
    grafana:
      image: grafana/grafana:latest
      ports:
        - '3000:3000'
      depends_on:
        - loki
- Run Grafana:
  docker-compose up -d
- Access Grafana at http://localhost:3000 (default credentials: admin/admin)
- Add Loki as a data source in Grafana:
  - Go to Configuration > Data Sources
  - Click "Add data source"
  - Select Loki
  - Set the URL to http://loki:3100
  - Click "Save & Test"
- Create a dashboard in Grafana:
  - Click "+ > Create > Dashboard"
  - Add a new panel
  - In the query editor, use LogQL to query your logs, e.g.:
    {job="next-app"} |= "error"
6. Configuring Next.js to Send Logs to Loki
To send logs from your Next.js application to Loki, we'll use a Pino transport that forwards logs to Loki. Here's how to set it up:
a. Install required packages:
```bash
npm install pino pino-loki
```
b. Update your logger configuration (lib/logger.ts):
import pino from 'pino';

// pino-loki is loaded by name as a Pino transport target, so it needs no direct import.
const transport = pino.transport({
  target: 'pino-loki',
  options: {
    host: 'http://localhost:3100', // Adjust this to your Loki server address
    basicAuth: {
      username: 'your-username', // Only needed if you've set up authentication
      password: 'your-password',
    },
    labels: {
      job: 'next-app', // This helps identify your app in Loki
      environment: process.env.NODE_ENV || 'development',
    },
  },
});

const logger = pino(
  {
    level: process.env.LOG_LEVEL || 'info',
    formatters: {
      level: (label) => {
        return { level: label.toUpperCase() };
      },
    },
    base: {
      // env: set this if you have multiple environments to trace, e.g. dev, UAT
      // apiGateway: set this based on your environment's apiBasePath or gateway
    },
  },
  transport
);

export default logger;
This configuration sets up Pino to send logs directly to Loki. Calling pino.transport with the 'pino-loki' target creates a transport that sends logs to the specified Loki server.
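It helps to know what ends up on the wire. Loki ingests logs through its HTTP push endpoint (/loki/api/v1/push), which takes streams of labels plus [timestamp, line] pairs, with the timestamp given as a nanosecond Unix epoch string. Below is a minimal sketch of building such a body by hand. `toLokiPushBody` and `LogEntry` are hypothetical names, and a real transport also handles batching and retries.

```typescript
// Shape of the JSON body accepted by Loki's push endpoint.
interface LokiStream {
  stream: Record<string, string>; // stream labels, e.g. { job: "next-app" }
  values: [string, string][]; // [nanosecond epoch as string, log line]
}

interface LokiPushBody {
  streams: LokiStream[];
}

// Hypothetical input type: a log line with a millisecond timestamp.
interface LogEntry {
  timeMs: number;
  line: string;
}

function toLokiPushBody(labels: Record<string, string>, entries: LogEntry[]): LokiPushBody {
  return {
    streams: [
      {
        stream: labels,
        // Appending six zeros converts integer milliseconds to nanoseconds
        // without losing precision to floating-point arithmetic.
        values: entries.map((entry): [string, string] => [
          `${Math.trunc(entry.timeMs)}000000`,
          entry.line,
        ]),
      },
    ],
  };
}

// Example: one error line for the same labels used in the logger configuration.
const exampleBody = toLokiPushBody(
  { job: "next-app", environment: "development" },
  [{ timeMs: 1722384000000, line: JSON.stringify({ level: "ERROR", msg: "Client Exception Error: boom" }) }]
);
```

Posting `JSON.stringify(exampleBody)` to the push endpoint with a Content-Type of application/json is a handy way to check that Loki is reachable before wiring up the logger.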
c. Update your error logging API (pages/api/log-error.ts):
import type { NextApiRequest, NextApiResponse } from 'next';
import logger from '../../lib/logger';
export default function handler(req: NextApiRequest, res: NextApiResponse) {
if (req.method !== 'POST') {
return res.status(405).end();
}
const { message, stack, url, userAgent, timestamp, route, interactions } = req.body;
const relevantStack = stack
? stack.split('\n').slice(0, 3).join('\n')
: 'No stack trace available';
const formattedInteractions = Array.isArray(interactions) && interactions.length > 0
? interactions
.map((interaction: string, index: number) => ` ${index + 1}. ${interaction}`)
.join('\n')
: 'No interactions logged';
logger.error(
{
message: message || 'Unknown error',
stack: relevantStack,
url: url || 'Unknown URL',
userAgent: userAgent ? userAgent.split(' ').pop() : 'Unknown user agent',
timestamp: timestamp || new Date().toISOString(),
route: route || 'Unknown route',
interactions: formattedInteractions,
},
`Client Exception Error: ${message || 'Unknown error'}`
);
res.status(200).json({ received: true });
}
Now, when errors occur in your Next.js application, they will be sent to Loki via the configured Pino transport.
7. Creating Useful Grafana Dashboards
With logs now flowing into Loki, you can create informative dashboards in Grafana. Here are some example queries and panels you might want to create:
a. Error Count Over Time:
Query: `sum(count_over_time({job="next-app"} |= "error" [$__interval]))`
Panel: Graph
b. Top 10 Error Messages:
Query: `topk(10, sum by (message) (count_over_time({job="next-app"} |= "error" | json [$__interval])))`
Panel: Table
c. Errors by Route:
Query: `sum by (route) (count_over_time({job="next-app"} |= "error" | json [$__interval]))`
Panel: Pie Chart
d. Latest Errors:
Query: `{job="next-app"} |= "error" | json | line_format "{{.message}} ({{.route}})"`
Panel: Logs
e. Error Distribution by Environment:
Query: `sum by (env) (count_over_time({job="next-app"} |= "error" | json [$__interval]))`
Panel: Bar Gauge
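The same LogQL can be run outside Grafana through Loki's HTTP API, which is useful for checking a query before building a panel around it. The sketch below builds a URL for the query_range endpoint. `buildQueryRangeUrl` is a hypothetical helper, and the base URL assumes the local Loki from the docker-compose setup.

```typescript
// Hypothetical helper: builds a URL for Loki's /loki/api/v1/query_range endpoint.
function buildQueryRangeUrl(
  baseUrl: string,
  logql: string,
  startMs: number,
  endMs: number,
  limit: number = 100
): string {
  const params: [string, string][] = [
    ["query", logql],
    // Nanosecond Unix epoch strings, built from integer milliseconds.
    ["start", `${Math.trunc(startMs)}000000`],
    ["end", `${Math.trunc(endMs)}000000`],
    ["limit", String(limit)],
  ];
  const queryString = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  // Strip trailing slashes so the path is not doubled up.
  return `${baseUrl.replace(/\/+$/, "")}/loki/api/v1/query_range?${queryString}`;
}

// Example: the last five minutes of error lines from the app.
const fiveMinutesMs = 5 * 60 * 1000;
const endMs = 1722384000000; // in real use: Date.now()
const exampleUrl = buildQueryRangeUrl(
  "http://localhost:3100",
  '{job="next-app"} |= "error"',
  endMs - fiveMinutesMs,
  endMs
);
```

Fetching `exampleUrl` with curl or fetch returns the matching log lines as JSON.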
8. Setting Up Alerts
Grafana allows you to set up alerts based on your log data. Here's an example of how to set up a simple alert:
a. In your Grafana dashboard, edit a panel showing error counts.
b. Go to the "Alert" tab.
c. Click "Create Alert".
d. Set conditions, for example:
"WHEN last() OF query(A, 5m, now) IS ABOVE 10"
This will trigger an alert when there are more than 10 errors in the last 5 minutes.
e. Set up notification channels (email, Slack, etc.) in Grafana's Alert Notification settings.
9. Best Practices
- Log Levels: Use appropriate log levels (error, warn, info, debug) to categorize your logs.
- Structured Logging: Always use structured logging to make it easier to query and analyze logs.
- Sensitive Information: Be careful not to log sensitive information like passwords or personal data.
- Performance: Monitor the performance impact of logging, especially in high-traffic applications.
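On the sensitive information point: the error report includes the full page URL, and URLs sometimes carry tokens or email addresses in the query string. Here is a minimal sketch of scrubbing those before logging. `scrubUrl` and the parameter list are assumptions for illustration, so adjust the names to whatever your application actually puts in URLs.

```typescript
// Assumption: query parameter names considered sensitive in this app.
const SENSITIVE_PARAMS: readonly string[] = ["token", "password", "email"];

// Hypothetical helper: replaces the values of sensitive query parameters with a placeholder.
function scrubUrl(url: string, sensitive: readonly string[] = SENSITIVE_PARAMS): string {
  // Split off the fragment first so a "?" inside it is not mistaken for a query string.
  const hashStart = url.indexOf("#");
  const main = hashStart === -1 ? url : url.slice(0, hashStart);
  const hash = hashStart === -1 ? "" : url.slice(hashStart);

  const queryStart = main.indexOf("?");
  if (queryStart === -1) return url; // no query string, nothing to scrub

  const base = main.slice(0, queryStart);
  const scrubbed = main
    .slice(queryStart + 1)
    .split("&")
    .map((pair) => {
      const eq = pair.indexOf("=");
      const key = eq === -1 ? pair : pair.slice(0, eq);
      return sensitive.indexOf(key.toLowerCase()) !== -1 ? `${key}=[REDACTED]` : pair;
    })
    .join("&");

  return `${base}?${scrubbed}${hash}`;
}
```

Applying this to the url field in the API route, before it reaches logger.error, keeps those values out of Loki entirely.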
TL;DR:
- This was an interesting way to work around paying for Sentry. I'd still suggest going with Sentry if resources allow, since with this approach we store the data ourselves, and that comes with a cost too.
- But it is a custom solution with no external party involved: all the data is yours to manage, and you know exactly what is logged.
- Note that I did not include a user ID or username in this tracing, since everyone has a different identity management system. You may want to add that if needed.