How to Prompt the Data Processing Aspect of the Agent More Robust

It is implemented by process_table_query. Here are ideas to make it much more capable and robust without bloating the code too much:

1. Smarter routing before hitting the LLM

Right now every table query goes straight to “LLM generates pandas code”. You can get a lot more robustness by handling simple patterns yourself first.

1.1. Direct “show me table X” …
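The routing idea above can be sketched roughly as follows. This is a minimal illustration, not the post's actual implementation: the function name `route_table_query`, the regular expression, and the `("direct", ...)` / `("llm", ...)` return convention are all hypothetical, and the real process_table_query may be structured quite differently.

```python
import re
from typing import Iterable, Optional, Tuple

# Hypothetical pattern for requests such as "show me table sales".
SHOW_TABLE = re.compile(
    r"^\s*(?:show|display)(?:\s+me)?(?:\s+the)?\s+table\s+(\w+)\s*$",
    re.IGNORECASE,
)


def route_table_query(
    query: str, known_tables: Iterable[str]
) -> Tuple[str, Optional[str]]:
    """Return ("direct", table_name) for a simple 'show table' request,
    or ("llm", None) when the query should fall through to the LLM."""
    # Match table names case-insensitively, but return the stored spelling.
    lookup = {name.lower(): name for name in known_tables}
    match = SHOW_TABLE.match(query)
    if match:
        name = lookup.get(match.group(1).lower())
        if name is not None:
            return ("direct", name)
    # Unknown table or a more complex request: let the LLM generate code.
    return ("llm", None)
```

A caller could check the first element of the result and only invoke the LLM code-generation path when it is `"llm"`, so that simple lookups never depend on generated pandas code.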

Create a Simple Data Science Assistant Agent

Building a simple data science agent that can handle:

- Basic data operations (load, filter, aggregate)
- Code generation and execution
- Visualization with auto-display
- Excel/CSV support
- Conversation context

To make the system more robust, we need to incorporate advanced data handling capabilities, including large file support, database connectivity, multi-DataFrame input, and data versioning to enable undo or rollback functions. It …

Architecture of a Multi-Index Agent

After successfully creating a single-index agent, I am considering building a multi-index agent. The benefits are obvious:

- Scalability: Add new indexes without changing core code
- Maintainability: Each index config is isolated
- Flexibility: Easy to modify methodology for one index without affecting others
- Testability: Test each index independently
- Reusability: Share common logic (QC, processing rules, …

Everything Can Be Tokenized (by Jensen Huang)

At NVIDIA’s GTC 2025, Jensen Huang said it loud and clear: “Everything can be tokenized.” And with the sheer computing power of GPUs, he added, “everything can be decoded and figured out; it’s just a matter of electricity.” He’s right. But most people don’t fully grasp what “everything can be tokenized” really means. Let’s unpack …

Use Streamlit

Streamlit is a powerful and versatile tool, and it is worth investing the time to grasp its full potential. It is a Python library that converts scripts into reactive web apps without HTML, CSS, or JS. Each Streamlit run is stateless by default, so you need to store and append the conversation history yourself. Streamlit …
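To illustrate the point about keeping conversation history across reruns, here is a minimal sketch that stores the turns in `st.session_state`. The helper names `append_turn` and `render_chat` and the `"history"` key are assumptions made for this example; only `st.session_state`, `st.chat_input`, and `st.chat_message` are Streamlit's own API.

```python
from typing import Any, Dict, List, MutableMapping


def append_turn(
    state: MutableMapping[str, Any], role: str, content: str
) -> List[Dict[str, str]]:
    """Append one chat turn to state["history"], creating the list if needed."""
    history = state.setdefault("history", [])
    history.append({"role": role, "content": content})
    return history


def render_chat() -> None:
    """Hypothetical page body: replay stored turns, then record new input."""
    import streamlit as st  # imported here so the helper above has no dependency

    # The script reruns from the top on every interaction, so replay history.
    for turn in st.session_state.get("history", []):
        st.chat_message(turn["role"]).write(turn["content"])

    prompt = st.chat_input("Ask something")
    if prompt:
        append_turn(st.session_state, "user", prompt)
        st.chat_message("user").write(prompt)
```

In a real app, `render_chat()` would be called at the bottom of the script launched with `streamlit run`. Keeping `append_turn` independent of Streamlit makes it easy to test with a plain dictionary.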

Graph RAG — From Scattered Retrieval to Connected Understanding

Traditional RAG (Retrieval-Augmented Generation) works by embedding texts into high-dimensional vectors and then retrieving the most similar ones when a user asks a question. It’s effective for small and isolated chunks of knowledge. But as the knowledge base grows complex, especially when the content is interconnected like custom SDK code or component dependencies …
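The retrieval step of traditional RAG described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the vectors are tiny hand-made lists rather than real embeddings, and the function names `cosine` and `retrieve` are hypothetical. A real system would use an embedding model and a vector store.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[str]:
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

Each chunk is scored independently of the others, which is why this approach treats knowledge as isolated pieces and carries no information about how the chunks relate to each other.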