
Personal Knowledge Base

Build a searchable, semantic knowledge base from everything you read — articles, tweets, YouTube transcripts, PDFs — all ingested and retrievable via natural language chat with OpenClaw Ultra.

Core System Overview

This system turns your agent's memory into a personal research library. Drop any URL or file into chat, and it's automatically ingested, chunked, and indexed for semantic search. Later, ask questions and get ranked results with source attribution — no more lost bookmarks.

| System Layer | Core Function | Output Result |
| --- | --- | --- |
| Ingestion Layer | URL fetching, content extraction, format normalization | Clean, structured text with metadata |
| Processing Layer | Chunking, embedding generation, vector indexing | Semantically searchable knowledge store |
| Retrieval Layer | Hybrid search (semantic + keyword), relevance ranking | Ranked results with source context |
| Memory Layer | Cross-session persistence, auto-tagging, deduplication | Growing, non-duplicating knowledge base |
| Integration Layer | Feed into other workflows (SEO, social, meeting prep) | Reusable research across all agent tasks |

Prerequisites

| Item | Requirement |
| --- | --- |
| OpenClaw Ultra | Installed and running |
| Knowledge Base Skill | Install from ClawdHub (search for "knowledge-base") |
| Ingestion Channel | Telegram topic or Slack channel (recommended for auto-ingest) |

Step 0 — Initialize Knowledge Base System

Set up OpenClaw Ultra as your personal knowledge management engine.

Operation Steps

  1. Open a new OpenClaw Ultra chat session
  2. Install the knowledge-base skill
  3. Create a dedicated Telegram topic called "knowledge-base" (or use a Slack channel)
  4. Paste the initialization prompt below

Ready-to-Use Prompt

Act as my personal knowledge base system.

I want to save everything I find valuable — articles, tweets, YouTube videos, PDFs, code snippets — and be able to search them conversationally.

Build a system that:
- ingests content from URLs I drop in chat
- extracts and indexes the full content
- supports natural language queries over saved knowledge
- deduplicates and tags content automatically
- connects to other workflows when they need research context

Step 1 — Set Up Auto-Ingestion Pipeline

Configure the agent to automatically process any URL or file you send.

1.1 Configure Ingestion Channel

Prompt

Set up the "knowledge-base" topic for automatic content ingestion.

When I drop a URL in this topic:
1. Fetch the full content (article, tweet thread, YouTube transcript, PDF)
2. Extract clean text with metadata: title, URL, date, content type
3. Chunk into semantic segments with embeddings
4. Index with tags: source type, topic, key entities
5. Reply with: what was ingested, chunk count, suggested tags

Supported sources:
- Web articles (any URL)
- YouTube videos (auto-fetch transcript)
- Tweets and X threads
- PDF documents (via file upload)
- GitHub READMEs and docs
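
The chunking step in the prompt above can be pictured with a minimal sketch. It is not OpenClaw Ultra's actual implementation: the `Chunk` class, the `chunk_text` function, and the `max_words` and `overlap` parameters are hypothetical, and a plain overlapping word window stands in for semantic chunking and embedding.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """One indexed segment plus the metadata listed in step 2 (hypothetical shape)."""
    text: str
    title: str
    url: str
    content_type: str
    position: int


def chunk_text(text, title, url, content_type, max_words=200, overlap=20):
    """Split text into overlapping word windows (a stand-in for semantic chunking)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for position, start in enumerate(range(0, len(words), step)):
        window = words[start:start + max_words]
        chunks.append(Chunk(" ".join(window), title, url, content_type, position))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, which usually helps retrieval. `len(chunk_text(...))` gives the chunk count that step 5 of the prompt asks the agent to report.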

1.2 Batch Import Existing Bookmarks

Prompt

I have a collection of saved links I want to import:

[list URLs or export file]

Process each one through the ingestion pipeline.
Report progress: [X/N] ingested, any failed URLs with error reasons.
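
As a rough illustration of the progress report requested above, the sketch below loops over a list of URLs and records failures with their reasons. The `batch_ingest` function and the `ingest` callable are hypothetical placeholders for whatever the agent does per URL; they are not part of any documented OpenClaw Ultra API.

```python
def batch_ingest(urls, ingest):
    """Run `ingest` on each URL, collecting progress lines and failure reasons."""
    progress, failed = [], {}
    total = len(urls)
    done = 0
    for url in urls:
        try:
            ingest(url)
            done += 1
        except Exception as exc:  # keep going and report the reason afterwards
            failed[url] = str(exc)
        progress.append(f"[{done}/{total}] ingested")
    return progress, failed
```

Catching the error per URL means one dead link does not stop the rest of the import, and the `failed` mapping gives the list of URLs to retry.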

INFO

Your knowledge base grows automatically from this point forward: drop every interesting link you encounter into the topic.

Step 2 — Semantic Search & Retrieval

Query your knowledge base conversationally.

2.1 Basic Query

Prompt

Search my knowledge base for: [your question or topic]

Return:
- top 5 most relevant results
- for each: title, source URL, key excerpt, relevance score
- if no good matches, tell me explicitly
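
To make "relevance score" and "top 5" concrete, here is a minimal sketch of hybrid ranking that blends a semantic score with a keyword score. Everything in it is an assumption for illustration: the function names, the `alpha` weight, the `min_score` cutoff, and the toy vectors (a real system would get vectors from an embedding model).

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query words that also appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_search(query, query_vec, items, top_k=5, alpha=0.7, min_score=0.2):
    """Rank items (dicts with 'title', 'url', 'text', 'vector') by blended score."""
    scored = []
    for item in items:
        score = (alpha * cosine(query_vec, item["vector"])
                 + (1 - alpha) * keyword_score(query, item["text"]))
        if score >= min_score:  # drop weak matches instead of padding the list
            scored.append((round(score, 3), item["title"], item["url"]))
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:top_k]
```

An empty return value corresponds to the "no good matches" case, which the prompt asks the agent to state explicitly rather than return loosely related results.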

2.2 Cross-Reference Query

Prompt

I'm working on [current project/task].
Search my knowledge base for anything related to:
[list relevant topics or keywords]

Summarize what I already know, what sources I have, and what gaps exist.

Step 2 Output

Instant access to everything you've saved, organized by relevance.

Step 3 — Auto-Tagging & Organization

Keep your knowledge base structured without manual effort.

Prompt

Configure auto-tagging rules for ingested content:

Always tag by:
- Content type: article, tweet, video, pdf, code, discussion
- Domain: the primary subject area
- Entity: any companies, people, tools mentioned

Auto-create topic clusters when 3+ items share the same tag.
Flag duplicate or near-duplicate content before ingestion.
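
The two rules at the end of the prompt above (clusters at 3+ shared tags, near-duplicate flagging) can be sketched as follows. This is an illustration under stated assumptions, not the skill's real logic: `topic_clusters`, `jaccard`, `is_near_duplicate`, and the `0.8` threshold are hypothetical, and word-set Jaccard overlap stands in for whatever similarity measure the agent actually uses.

```python
from collections import defaultdict


def topic_clusters(items, min_items=3):
    """items maps title -> set of tags; return tags shared by at least min_items titles."""
    by_tag = defaultdict(list)
    for title, tags in items.items():
        for tag in tags:
            by_tag[tag].append(title)
    return {tag: sorted(titles) for tag, titles in by_tag.items()
            if len(titles) >= min_items}


def jaccard(a, b):
    """Word-set overlap between two texts, from 0.0 (disjoint) to 1.0 (identical sets)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def is_near_duplicate(new_text, existing_texts, threshold=0.8):
    """True if new_text overlaps any stored text at or above the threshold."""
    return any(jaccard(new_text, old) >= threshold for old in existing_texts)
```

Running the duplicate check before ingestion, as the prompt specifies, avoids spending time chunking and indexing content that is already stored.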

Step 4 — Connect Knowledge Base to Other Workflows

Make your saved knowledge available across all agent tasks.

4.1 Feed into Content Creation

Prompt

When generating content briefs (for SEO, social media, or YouTube),
automatically search the knowledge base for relevant saved content.

Include citations in the brief so I know where the insights came from.

Related Guide: SEO Content Workflow

4.2 Feed into Meeting Prep

Prompt

Before any meeting I have with [person/company/topic],
search my knowledge base for saved content about them or their industry.

Include findings in the meeting preparation brief.

4.3 Feed into Research Tasks

Prompt

When I ask a research question, always search the knowledge base FIRST.
Only search external sources if the KB doesn't have good results.
Report which source (KB vs external) the answer came from.
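
The KB-first rule above amounts to a simple fallback. The sketch below shows one way to express it; `answer`, `search_kb`, `search_external`, and the `min_score` cutoff for "good results" are all hypothetical names and values chosen for illustration.

```python
def answer(question, search_kb, search_external, min_score=0.5):
    """Try the knowledge base first; use external search only when KB results are weak."""
    kb_hits = [hit for hit in search_kb(question) if hit["score"] >= min_score]
    if kb_hits:
        return {"source": "KB", "results": kb_hits}
    # No KB hit cleared the cutoff, so fall back and label the source accordingly
    return {"source": "external", "results": search_external(question)}
```

Returning the `source` label alongside the results is what lets the agent report whether an answer came from saved material or from a fresh external search.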

Step 5 — Maintenance & Review

Keep the knowledge base healthy and useful over time.

Prompt

Set up weekly knowledge base maintenance:

Every Sunday at 10 AM:
1. Report new items added this week: count, top tags, top sources
2. Identify orphaned items (never retrieved) — suggest archiving
3. Merge duplicate or overlapping entries
4. Suggest 3 topics that need more coverage based on my recent queries
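
Two parts of this maintenance routine are easy to picture in code: computing the next Sunday 10 AM run and listing orphaned items. The sketch below assumes a hypothetical `retrieval_count` field on each stored item; how OpenClaw Ultra actually schedules jobs or tracks retrievals is not specified here.

```python
from datetime import datetime, timedelta


def next_sunday_10am(now):
    """Return the next Sunday 10:00 strictly after `now`."""
    days_ahead = (6 - now.weekday()) % 7  # Monday is 0, Sunday is 6
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=10, minute=0, second=0, microsecond=0
    )
    if candidate <= now:  # already at or past this Sunday's slot
        candidate += timedelta(days=7)
    return candidate


def orphaned_items(items):
    """Titles of items (dicts with 'title', 'retrieval_count') never retrieved."""
    return [item["title"] for item in items if item["retrieval_count"] == 0]
```

For example, `next_sunday_10am(datetime(2024, 1, 3, 12, 0))` falls on Sunday, 7 January 2024 at 10:00, since 3 January 2024 is a Wednesday.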

Final System Architecture

URL/File Drop → Content Ingestion → Chunking & Embedding →
Vector Index → Semantic Search → Retrieval with Context →
Feed into SEO / Social / Meeting / Research Workflows

Practical Usage Tips

  1. Create a habit: every time you read something useful, drop the link in the knowledge-base topic immediately
  2. Use specific questions when searching — "What did I save about RAG architecture?" works better than "tell me about AI"
  3. Periodically review the weekly maintenance report to spot knowledge gaps
  4. Combine with the Reddit Research Workflow: save Reddit findings directly to the KB for cross-referencing