Referenced cluely4Free repo: https://github.com/ry2009/cluely4Free/blob/main/brain/prompt_builder.py
The key part is the brain folder, which contains the prompt_builder, response_executer, and router files. The system prompt is:
system_prompt = f"""You are Cluely, a helpful and proactive desktop AI assistant. You understand context from what users say and what's visible on their screen.
Current Context:
- Time: {current_time}
- Date: {current_date}
- Active App: {app}
- User Said: "{audio_text}"
Response Guidelines:
- Be helpful, concise, and actionable (2-4 sentences max)
- Directly address what the user asked about
- Use information from the screen when relevant
- Provide specific suggestions or explanations
- Be conversational and friendly, not robotic
- If screen text is unclear, focus on the user's question
"""
# Add context-specific instructions
context_prompt = build_context_specific_prompt(app, audio_text, screen_text, context)
# Add screen content if relevant
screen_prompt = build_screen_content_prompt(screen_text, app)
# Combine all parts
full_prompt = system_prompt + context_prompt + screen_prompt
return full_prompt
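For orientation, the assembly step above can be sketched as a standalone function. This is a minimal sketch, not the repo's code: the name build_prompt_sketch, the strftime formats, the shortened system prompt, and the two stub helpers (standing in for build_context_specific_prompt and build_screen_content_prompt) are all assumptions.

```python
from datetime import datetime
from typing import Optional


def stub_context_prompt(app: str, audio_text: str) -> str:
    # Hypothetical stand-in for build_context_specific_prompt.
    return f"\nContext: General assistance needed in {app}.\n"


def stub_screen_prompt(screen_text: str) -> str:
    # Hypothetical stand-in for build_screen_content_prompt:
    # include the screen text only when there is some.
    if not screen_text.strip():
        return ""
    return f"\nScreen Content:\n{screen_text}\n"


def build_prompt_sketch(app: str, audio_text: str, screen_text: str,
                        now: Optional[datetime] = None) -> str:
    now = now or datetime.now()
    # The time/date formats are assumptions; the source does not show them.
    current_time = now.strftime("%H:%M")
    current_date = now.strftime("%Y-%m-%d")
    system_prompt = (
        "You are Cluely, a helpful and proactive desktop AI assistant.\n"
        "Current Context:\n"
        f"- Time: {current_time}\n"
        f"- Date: {current_date}\n"
        f"- Active App: {app}\n"
        f'- User Said: "{audio_text}"\n'
    )
    # Plain string concatenation, in the same order as the original.
    return (system_prompt
            + stub_context_prompt(app, audio_text)
            + stub_screen_prompt(screen_text))
```

Passing now explicitly keeps the output deterministic, which makes the prompt easier to inspect or test.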
Surprisingly, the build_context_specific_prompt function is itself quite “robotic”: it matches the active app name and keywords in the transcribed speech against hard-coded lists, then returns a canned block of instructions:
def build_context_specific_prompt(app: str, audio_text: str, screen_text: str, context: Dict[str, Any] = None) -> str:
    """
    Build app-specific prompt additions
    """
    app_lower = app.lower()
    audio_lower = audio_text.lower()
    # Social Media (Twitter/X)
    if 'twitter' in app_lower or 'x.com' in app_lower:
        if any(word in audio_lower for word in ['tweet', 'post', 'share']):
            return """
Context: User wants to create a tweet/post on Twitter/X.
Instructions:
- Suggest a compelling tweet based on screen content
- Keep it under 280 characters
- Make it engaging and authentic
- Include relevant hashtags if appropriate
- Consider current trends or topics visible on screen
"""
    # Email/Communication
    elif app_lower in ['mail', 'gmail', 'outlook'] or 'email' in audio_lower:
        return """
Context: User is working with email/messages.
Instructions:
- Help compose professional, clear communication
- Suggest appropriate tone based on context
- Offer template phrases if composing
- Suggest improvements if reviewing content
"""
    # Writing/Documentation
    elif app_lower in ['word', 'docs', 'notion', 'obsidian', 'pages']:
        if 'summarize' in audio_lower:
            return """
Context: User wants to summarize content.
Instructions:
- Provide a concise summary of visible text
- Highlight key points and main ideas
- Use bullet points if appropriate
- Keep summary to 2-3 sentences
"""
        else:
            return """
Context: User is writing or editing documents.
Instructions:
- Suggest improvements to writing
- Help with clarity and flow
- Offer alternative phrasings
- Assist with structure and organization
"""
    # Web Browsing
    elif app_lower in ['chrome', 'safari', 'firefox']:
        if any(word in audio_lower for word in ['chart', 'graph', 'data', 'visualization', 'plot']):
            return """
Context: User is asking about charts, graphs, or data visualizations on a web page.
Instructions:
- Focus on explaining the data and trends shown
- Identify the type of chart/graph if possible
- Explain what the data represents
- Point out key insights or patterns
- Explain axes, labels, and data points if visible
- If chart details are unclear, explain based on context
"""
        else:
            return """
Context: User is browsing the web.
Instructions:
- Help explain or summarize web content
- Suggest related topics or actions
- Offer to extract key information
- Provide context about what's being viewed
"""
    # Development/Coding
    elif app_lower in ['vscode', 'cursor', 'xcode', 'terminal']:
        return """
Context: User is coding or using development tools.
Instructions:
- Offer coding suggestions or explanations
- Help debug or improve code
- Suggest best practices
- Explain technical concepts if asked
"""
    # Default context
    return """
Context: General assistance needed.
Instructions:
- Provide helpful, relevant suggestions
- Consider the user's current activity
- Offer actionable next steps
- Be proactive but not intrusive
"""
Note the additional prompt functions: optimize_prompt_length, build_creative_prompt, build_question_prompt, build_reminder_prompt, build_screen_content_prompt.
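One detail worth noticing in build_context_specific_prompt is that it mixes two kinds of matching: the Twitter branch uses a substring test ('twitter' in app_lower), while the other branches test membership in a list (app_lower in [...]), which only succeeds on an exact match. The sketch below shows the difference in isolation. The helper names and the example app name "Google Chrome" are assumptions, since the notes do not show what app names the capture layer actually reports.

```python
from typing import List

BROWSERS = ['chrome', 'safari', 'firefox']


def matches_exact(app: str, names: List[str]) -> bool:
    # Same test as `app_lower in [...]`: the whole string must equal a list entry.
    return app.lower() in names


def matches_substring(app: str, names: List[str]) -> bool:
    # Same style as `'twitter' in app_lower`: an entry may appear anywhere in the name.
    return any(name in app.lower() for name in names)


print(matches_exact("Chrome", BROWSERS))             # True
print(matches_exact("Google Chrome", BROWSERS))      # False
print(matches_substring("Google Chrome", BROWSERS))  # True
```

If the reported app name were longer than the bare keyword, the list-membership branches would fall through to the default context without any warning.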
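brain/router.py determines when to respond to what the assistant hears and sees.
Its main functions are should_respond, get_response_priority, and should_interrupt_current_task. The last one applies when the speech contains one of the interrupt_phrases, such as stop, cancel, never mind, forget it, wait, hold on, or actually. The file also defines a list of question keywords: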
question_keywords = [
    "what", "how", "why", "when", "where", "who", "which",
    "explain", "tell me", "describe", "show me", "help me understand",
    "what's", "what is", "what are", "what does", "what do",
    "how does", "how do", "how can", "how should",
    "can you explain", "can you tell me", "can you show me",
    "what's this", "what's that", "what are these", "what are those",
    "tell me about", "explain this", "explain that",
    "what is this", "what is that", "what are these",
    "help me with", "help me understand"
]
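A minimal sketch of how keyword lists like these could drive the routing decisions. The body of router.py is not reproduced in these notes, so the function names is_question and should_interrupt, the shortened keyword list, and the choice of word-boundary matching (rather than plain substring tests) are all assumptions.

```python
import re
from typing import Iterable

# Interrupt phrases as listed in the notes above.
INTERRUPT_PHRASES = ["stop", "cancel", "never mind", "forget it",
                     "wait", "hold on", "actually"]
# Shortened, illustrative subset of question_keywords.
QUESTION_KEYWORDS = ["what", "how", "why", "explain", "tell me", "help me with"]


def _contains_phrase(text: str, phrases: Iterable[str]) -> bool:
    # \b keeps "what" from matching inside "whatever".
    lowered = text.lower()
    return any(re.search(rf"\b{re.escape(p)}\b", lowered) for p in phrases)


def is_question(text: str) -> bool:
    # Hypothetical helper: a trailing "?" or any question keyword counts.
    return text.strip().endswith("?") or _contains_phrase(text, QUESTION_KEYWORDS)


def should_interrupt(text: str) -> bool:
    # Hypothetical helper: true when any interrupt phrase appears as a whole word/phrase.
    return _contains_phrase(text, INTERRUPT_PHRASES)
```

Word-boundary matching is a deliberate choice here: with plain substring tests, a phrase like "the bus stops here" would count as an interrupt because it contains "stop".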
Cursor (an AI-powered code editor) and cluely4Free share several high-level concepts:
- Context-Awareness: Both need to understand what’s on screen to provide relevant assistance
- LLM Integration: Both use large language models to generate responses
- UI Integration: Both display AI responses within the user’s workflow
Key Differences
However, there are significant differences in implementation and focus:
- Purpose:
  - cluely4Free is a general desktop assistant with audio/visual capabilities
  - Cursor is specifically focused on coding assistance and editor integration
- Integration Depth:
  - cluely4Free uses screen capture and OCR, which is more generic but less precise
  - Cursor likely has deeper integration with code structure through direct editor access
  - Editor-integrated tools such as Cursor and Windsurf would therefore have more sophisticated code understanding
- Architecture:
  - cluely4Free uses a relatively simple architecture with screen capture and audio recording
  - Professional tools like Cursor would likely use more robust and optimized implementations
  - They would likely also have more sophisticated versioning, caching, and error handling
- Performance Optimization:
  - Commercial products like Cursor have likely invested significantly in performance optimization
  - The screen capture approach in cluely4Free is more resource-intensive than direct code parsing
cluely4Free provides a simple but functional UI system:
- Popup Windows
  - These appear when the system generates a response to your voice commands
  - The UI primarily consists of popup notifications that display responses
- Auto-dismiss Behavior
  - Popups can auto-dismiss after a configurable time period
  - Default is 10 seconds, configurable in cluely_config.json
  - High priority responses don’t auto-dismiss (require manual interaction)
- Terminal Interface
  - Simple logging and status messages in the terminal
  - Shows initialization progress and runtime status
  - Displays performance metrics when the app is stopped
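The auto-dismiss rule described above can be sketched as follows. Only the file name cluely_config.json, the 10-second default, and the high-priority exception come from these notes. The key name popup_timeout_seconds, the priority label "high", and both helper names are hypothetical.

```python
import json
from pathlib import Path
from typing import Optional

DEFAULT_TIMEOUT_SECONDS = 10.0  # default stated in the notes


def load_popup_timeout(config_path: str = "cluely_config.json") -> float:
    # "popup_timeout_seconds" is a hypothetical key name.
    path = Path(config_path)
    if not path.exists():
        return DEFAULT_TIMEOUT_SECONDS
    config = json.loads(path.read_text(encoding="utf-8"))
    return float(config.get("popup_timeout_seconds", DEFAULT_TIMEOUT_SECONDS))


def dismiss_after(priority: str, timeout: float) -> Optional[float]:
    # High-priority popups stay until the user dismisses them (None = no timer).
    return None if priority == "high" else timeout
```

A popup layer could then call dismiss_after(priority, load_popup_timeout()) and start a timer only when the result is not None.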
The UI appears to be non-intrusive by design, meant to provide helpful information without disrupting your workflow. It doesn’t have a persistent window or complex controls – it’s mainly popup-based responses triggered by voice commands, displaying the LLM-generated content.