AI Stack Overview
LME includes an AI-powered security analysis layer that runs entirely on your own infrastructure. No data leaves your network. The AI stack helps security analysts investigate alerts, understand vulnerabilities, and get answers about LME documentation — all through a web-based dashboard.
What Gets Deployed
When LME installs, it automatically sets up these AI components:
Component Summary
| Component | Port | Purpose |
|---|---|---|
| LME Dashboard | 8502 | Web UI for alerts, AI chat, model management, detection rules, KEV tracking |
| LiteLLM Proxy | 4000 | API gateway that routes LLM requests to local or cloud models |
| llama.cpp | 8080 | Runs the local chat LLM (no internet required) |
| Embeddings Server | 8081 | Generates text embeddings for RAG document search |
| pgvector | 5432 | PostgreSQL with vector extensions — stores document embeddings for RAG |
What is RAG?
RAG stands for Retrieval-Augmented Generation. When you ask a question in the AI chat, LME:
- Converts your question into a vector embedding
- Searches the pgvector database for relevant LME documentation chunks
- Sends those relevant docs as context along with your question to the LLM
- The LLM gives you an answer grounded in the actual documentation
What Models are Used
LME ships with two local models that run on your hardware:
| Model | Size | Purpose | Server |
|---|---|---|---|
| LFM2.5-1.2B-Instruct | ~0.8 GB | Chat and analysis | llama.cpp (:8080) |
| nomic-embed-text-v1.5 | ~0.3 GB | Text embeddings for RAG | Embeddings (:8081) |
Both models are downloaded automatically during installation and stored at /opt/lme/llama-models/.
The default 1.2B parameter model is lightweight and runs on modest hardware. You can switch to larger, more capable models through the dashboard if your server has more resources. See Managing Models.
Optional: Cloud Model Support
While LME works completely offline with local models, you can optionally connect cloud LLM providers for more capable analysis:
- OpenAI (GPT)
- Anthropic (Claude)
- Azure OpenAI
- Google Vertex AI (Gemini)
- AWS Bedrock
- Ollama (self-hosted)
Cloud models are configured through the dashboard UI or by editing the LiteLLM config file. See Managing Models for details.
Accessing the AI Features
After LME is installed, access the dashboard at:
https://<your-lme-server-ip>:8502
The dashboard uses the LME TLS certificates (self-signed by default), so your browser will show a certificate warning. This is expected.
Updating the RAG Documentation Index
The RAG system needs a local copy of the LME documentation stored as vector embeddings in pgvector. This is set up automatically during installation, but you can re-index if the docs have been updated:
- Open the dashboard at
https://<your-lme-server-ip>:8502 - Go to Settings > Documents
- You will see:
- Chunk count — how many documentation chunks are stored
- Last updated — when the index was last refreshed
- Source — the LME docs website that was crawled
- Click "Pull Latest Documentation"
- The button shows "Scraping & indexing... this takes a few minutes"
- When complete: "Done — N chunks indexed"
This crawls the LME documentation website, converts pages to text, chunks them, generates embeddings, and stores everything in pgvector.
Next Steps
- LME Security Dashboard — full guide to the dashboard UI
- Using the AI Chat — how to interact with the AI for security analysis
- Detection Engineering — import Kibana rules, convert Sigma rules, create ElastAlert2 rules
- Managing Models — switch models, add cloud providers, download new local models
- KEV Integration — CISA Known Exploited Vulnerabilities enrichment
- LiteLLM API Reference — calling the LLM API directly from scripts