Polaris Engineering Write-Up: Technology Decisions, Trade-offs, and Architecture Rationale

Polaris grew into a multi-runtime cybersecurity platform with a few very different moving parts: a web app, a low-latency backend, an AI orchestration service, a scanner microservice, and a Go endpoint client.

I wrote this post to document how those pieces fit together, why I chose them, and where the architecture got better or more complicated because of those decisions.


1) System Architecture at a Glance

Polaris is designed as specialized services with clear workload boundaries, not a single monolith:

Why I split it up this way

This decomposition maps cleanly to different execution models:

The trade-off is higher operational complexity. More containers means more interfaces, more auth boundaries, and more things that can go wrong. I still preferred that over cramming everything into one service, because the separation made scaling and fault isolation much easier to reason about.


2) Technology Inventory (Exact Libraries and Platforms)

Frontend (FrontEnd/package.json)

Core platform

Styling and component system

Rich content and rendering

Agent UX and terminal interaction

Editor stacks (present for rich authoring surfaces)

PWA and build tooling

Why this stack fits the frontend

The Polaris frontend is not just dashboard CRUD. It handles:

React + Vite + TypeScript gave me a fast iteration loop and a mature ecosystem for these patterns. Tailwind and Radix let me build out the UI quickly without giving up too much control over accessibility or styling.

Trade-offs


Backend API (BackEnd/package.json)

Core runtime and framework

Security/auth/data

Why this stack fits the backend

I chose Elysia on Bun because I wanted a low-overhead TypeScript API with fast startup and solid performance. This backend handles:

Patterns I relied on in the backend

Trade-offs


DeepAgent (DeepAgent/package.json)

Core stack

Session/tooling and execution

Data/auth

Why this stack fits DeepAgent

DeepAgent is not a stateless prompt endpoint. I built it as a stateful orchestration runtime with:

LangGraph checkpointing into MongoDB ended up being one of the most important choices in this part of the system. It made it possible to continue sessions across turns and failures instead of treating every interaction like a fresh request. The custom summarization and token accounting logic came from the same need: long-running, tool-heavy workflows need more structure than normal chat endpoints.
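The token accounting side of that can be sketched roughly like this. The names (`estimateTokens`, `shouldSummarize`, `TOKEN_BUDGET`) and the 4-characters-per-token heuristic are illustrative assumptions, not the actual DeepAgent internals:

```typescript
const TOKEN_BUDGET = 8000;        // assumed context budget for a session
const SUMMARIZE_THRESHOLD = 0.75; // compact once 75% of the budget is used

// Crude token estimate: ~4 characters per token for English-like text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Decide whether the running transcript should be folded into a summary
// before the next model call, instead of letting context grow unbounded.
function shouldSummarize(messages: string[]): boolean {
  const used = messages.reduce((sum, m) => sum + estimateTokens(m), 0);
  return used >= TOKEN_BUDGET * SUMMARIZE_THRESHOLD;
}
```

The point of the check is that summarization becomes a policy the runtime applies automatically, rather than something each tool-heavy workflow has to manage itself.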

Trade-offs


Scanner Service (PyScript/requirements.txt + PyScript/main.py)

Runtime and service framework

Scanner ecosystem dependencies

Why this stack fits the scanner service

The Python scanner service wraps multiple security scanners and runs scans in background threads with progress callbacks persisted to the database. I wanted long-running scan workflows to live outside the Bun backend and the AI orchestration runtime so each service could stay focused.
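The progress-callback pattern looks roughly like this. The real service is Python with background threads; this TypeScript sketch only shows the shape, with a `Map` standing in for the database and all names (`runScan`, `progressStore`) invented for illustration:

```typescript
type Progress = { percent: number; stage: string };

// Stands in for the database the real service persists progress to.
const progressStore = new Map<string, Progress>();

function runScan(
  scanId: string,
  stages: string[],
  onProgress: (p: Progress) => void,
): void {
  for (let i = 0; i < stages.length; i++) {
    // ...invoke the underlying scanner for this stage here...
    const p = {
      percent: Math.round(((i + 1) / stages.length) * 100),
      stage: stages[i],
    };
    progressStore.set(scanId, p); // persist so clients can poll mid-scan
    onProgress(p);
  }
}
```

Persisting each progress update, rather than only the final result, is what lets the UI show live scan state even if the client reconnects halfway through.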

This service exposes:

It also integrates with a local paper display renderer, which is a hardware-adjacent concern that belongs outside the primary web API.

Trade-offs


Endpoint Client (PolarisClient/go.mod)

Core dependencies

Why I used Go for the endpoint client

For the endpoint client, I wanted efficient binaries, cross-platform support, access to system primitives, and predictable network behavior. Go was a very natural fit for that.

The backend's build endpoints dynamically inject configuration into the client source and compile it per target platform, so each generated artifact is tailored to its deployment.
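The injection step can be sketched as a simple template substitution before handing the source to `go build`. The placeholder names, template, and config values here are invented for illustration; the real endpoints and Go source differ:

```typescript
const clientTemplate = `package main

const serverURL = "__SERVER_URL__"
const clientID = "__CLIENT_ID__"
`;

// Substitute deployment-specific configuration into the client source
// before compiling it for the requested target platform.
function injectConfig(source: string, cfg: Record<string, string>): string {
  let out = source;
  for (const [placeholder, value] of Object.entries(cfg)) {
    out = out.split(placeholder).join(value);
  }
  return out;
}

const injected = injectConfig(clientTemplate, {
  __SERVER_URL__: "https://polaris.example.com",
  __CLIENT_ID__: "agent-7f3a",
});
// The backend would then write `injected` to disk and run, e.g.:
//   GOOS=windows GOARCH=amd64 go build -o client.exe
```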

Trade-offs


Infrastructure and Containerization (compose.yml, compose.debug.yml, Dockerfiles, Caddy)

Core infra components

Container/runtime details

Why this infrastructure setup made sense

Trade-offs


3) Cross-cutting decisions that shaped the project

A) Streaming architecture: SSE + WS used intentionally

I used each protocol where it fit best instead of forcing one transport to handle everything.

Trade-off: frontend/backend protocol complexity increases (SSE parsers, reconnection behavior, WS session routing).
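To make the "SSE parsers" part of that concrete, here is a minimal parser for the SSE wire format (`event:` / `data:` lines, blank line terminates an event). It follows the published event-stream format but is a sketch, not the parser Polaris actually uses:

```typescript
type SseEvent = { event: string; data: string };

function parseSse(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  let event = "message"; // default event type per the SSE format
  let data: string[] = [];
  for (const line of chunk.split("\n")) {
    if (line === "") {
      // Blank line: dispatch the accumulated event, if any.
      if (data.length) events.push({ event, data: data.join("\n") });
      event = "message";
      data = [];
    } else if (line.startsWith("event:")) {
      event = line.slice(6).trim();
    } else if (line.startsWith("data:")) {
      data.push(line.slice(5).trimStart());
    } // other fields (id:, retry:) ignored in this sketch
  }
  return events;
}
```

A real client also has to handle reconnection with `Last-Event-ID` and partial chunks split mid-line, which is exactly the complexity the trade-off above refers to.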

B) Stateful workflows with durability

This mattered because I did not want the platform to feel disposable. Persistent sessions and replay paths make a huge difference once users are doing real work in the system.
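The checkpoint-on-every-turn idea reduces to something like the following. A `Map` stands in for the MongoDB checkpoint store, and the names (`Checkpoint`, `saveCheckpoint`, `resumeSession`) are illustrative rather than LangGraph's actual API:

```typescript
type Checkpoint = { turn: number; state: Record<string, unknown> };

const checkpointStore = new Map<string, Checkpoint>();

function saveCheckpoint(sessionId: string, cp: Checkpoint): void {
  checkpointStore.set(sessionId, cp); // a durable write in the real system
}

// On reconnect or crash recovery, pick up from the last persisted turn
// instead of starting a fresh session.
function resumeSession(sessionId: string): Checkpoint {
  return checkpointStore.get(sessionId) ?? { turn: 0, state: {} };
}
```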

Trade-off: more state machines and lifecycle handling across memory + DB.

C) Human-in-the-loop interrupt/resume

This mattered because practical autonomous systems still need operator steering. I did not want a one-shot black box.
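One way to picture the interrupt/resume shape is a generator that yields when it needs operator approval and continues with whatever the operator decides. The names and the flow here are illustrative, not the actual Polaris control flow:

```typescript
type Interrupt = { reason: string };

function* runWorkflow(): Generator<Interrupt, string, boolean> {
  // ...autonomous steps run here...
  const approved = yield { reason: "about to run a destructive action" };
  return approved ? "action executed" : "action skipped";
}

const wf = runWorkflow();
const interrupt = wf.next().value as Interrupt; // workflow pauses here
// ...surface `interrupt.reason` to the operator, await their decision...
const outcome = wf.next(true).value;            // resume with approval
```

The real version has to persist the paused state so a resume can arrive minutes or hours later, from a different connection, which is where the durability from the previous section comes in.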

Trade-off: significantly more control-flow complexity compared with simple request/response AI APIs.

D) Shared contracts in schema/

Shared contracts cut down on drift and made the frontend/backend integration less fragile.
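A shared contract in this style pairs a type with a runtime guard that both sides can import. The `ScanJob` shape below is invented for illustration, not an actual Polaris schema:

```typescript
export type ScanJob = {
  id: string;
  target: string;
  status: "queued" | "running" | "done" | "failed";
};

// Runtime validation at the service boundary keeps the compile-time
// contract honest against data coming off the wire.
export function isScanJob(value: unknown): value is ScanJob {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === "string" &&
    typeof v.target === "string" &&
    ["queued", "running", "done", "failed"].includes(v.status as string)
  );
}
```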

Trade-off: contract evolution must be coordinated across services.


4) Trade-offs I accepted

1) Multi-runtime system (TypeScript/Bun + Python + Go)

2) Heavy DeepAgent capability set

3) Rich frontend capability stack

4) Security tooling inside runtime containers

5) Real-time protocols plus durable state


5) What Polaris taught me

The biggest lesson from Polaris was that the architecture only made sense once I stopped treating every part of the platform as the same kind of workload. The UI, the command-and-control backend, the scanner service, the agent runtime, and the endpoint client each wanted a different execution model, and the project got cleaner once I let that be true.

I also came away with a better sense of where complexity is justified. Durable sessions, interrupt and resume flows, SSE, WebSockets, Redis signaling, and multiple runtimes all add failure modes. But in a platform like this, those details are tied to the actual user experience and operator workflows, not just technical ambition for its own sake.


6) Closing thoughts

Looking back, Polaris feels like one of those projects where the stack tells the story pretty clearly. I used React and Tailwind where I needed a capable frontend, Bun and Elysia where I wanted a fast control plane, Python where the scanner ecosystem was stronger, and Go where I needed a reliable endpoint client. The result is not simple, but it is honest about the problem space.

If I were continuing the project, most of my next work would be around observability, stricter safety controls, and making the operational side easier to manage. The core architectural choices still feel right to me, though, because they came from the workload itself and not from trying to make the project sound impressive on paper.

