Making the Most from Snowflake Snowpark

Snowpark is a powerful, developer-centric framework that lets you execute code written in popular languages like Python, Scala, or Java directly within Snowflake's data cloud. It moves computation closer to the data, drastically reducing data movement and leveraging Snowflake’s highly scalable, optimized engine. The single most important principle of Snowpark is client-side coding, server-side execution …

Hilbert Spaces and Banach Spaces: What’s the Difference?

A Banach space is a complete normed vector space. Here, the norm is a function that measures the “length” of a vector v, and it must satisfy certain properties like the triangle inequality. The distance between two vectors u and v is defined simply by the norm of their difference, ∥u − v∥. Importantly, Banach …
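For reference, the properties alluded to above can be written out explicitly. This is the standard textbook definition, not something specific to this post; the symbols X (the vector space), α (a scalar), and d (the induced distance) are names introduced here for illustration.

```latex
% Norm axioms on a vector space X, for all u, v in X and all scalars \alpha
\begin{align*}
  \|v\| &\ge 0, \quad \|v\| = 0 \iff v = 0   && \text{(positive definiteness)} \\
  \|\alpha v\| &= |\alpha| \, \|v\|          && \text{(homogeneity)} \\
  \|u + v\| &\le \|u\| + \|v\|               && \text{(triangle inequality)} \\
  d(u, v) &= \|u - v\|                       && \text{(induced distance)}
\end{align*}
% Completeness: every Cauchy sequence in X converges, with respect to d, to a limit in X.
```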

Security Matters in Building Safer Software

The OWASP Top 10
1. Broken Access Control
This is the “users touching things they shouldn’t” category.
Example: A simple missing WHERE user_id = ? lets User A view or modify User B’s data.
Access control is not UI logic; it is backend logic.
2. Cryptographic Failures
Encryption done wrong is effectively no encryption.
Example: Sending passwords over HTTP …
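The missing WHERE user_id = ? example can be made concrete with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the notes table, its columns, and both function names are assumptions made up for illustration, not taken from any real application.

```python
import sqlite3

# Hypothetical schema for illustration: notes owned by users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO notes (id, user_id, body) VALUES (?, ?, ?)",
    [(1, 100, "User A's note"), (2, 200, "User B's note")],
)


def get_note_unsafe(note_id):
    # Broken access control: any caller can read any note by guessing its id.
    return conn.execute(
        "SELECT body FROM notes WHERE id = ?", (note_id,)
    ).fetchone()


def get_note(note_id, current_user_id):
    # The ownership check lives in the backend query, not in the UI.
    return conn.execute(
        "SELECT body FROM notes WHERE id = ? AND user_id = ?",
        (note_id, current_user_id),
    ).fetchone()
```

With User A as user 100, get_note_unsafe(2) returns User B's note, while get_note(2, 100) returns None because the query itself is scoped to the caller.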

How to Prompt the Data Processing Aspect of the Agent More Robust

It is realized by process_table_query. Here are ideas to make it much more capable and robust without bloating the code too much:
1. Smarter routing before hitting the LLM
Right now every table query goes straight to “LLM generates pandas code”. You can get a lot more robustness by handling simple patterns yourself first.
1.1. Direct “show me table X” …
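A minimal sketch of what such pre-LLM routing could look like, assuming the tables are held in a dictionary keyed by name. The function route_table_query, the regular expression, and the "direct"/"llm" labels are all hypothetical; the actual process_table_query is not shown in this excerpt.

```python
import re

# Hypothetical pattern for requests such as "show me table sales"
# or "display the orders table".
SHOW_TABLE = re.compile(
    r"^\s*(?:show|display|print)\s+(?:me\s+)?(?:the\s+)?"
    r"(?:table\s+(\w+)|(\w+)\s+table)\s*$",
    re.IGNORECASE,
)


def route_table_query(query, tables):
    """Return ("direct", table_name) for simple requests, else ("llm", None)."""
    match = SHOW_TABLE.match(query)
    if match:
        name = match.group(1) or match.group(2)
        # Only short-circuit when the table really exists (case-sensitive lookup).
        if name in tables:
            return ("direct", name)
    # Anything else falls through to the LLM code-generation path.
    return ("llm", None)
```

Only queries that match the pattern and name a known table skip the LLM; everything else takes the existing path, so the fast path cannot make results worse for complex questions.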

Create a Simple Data Science Assistant Agent

Building a simple data science agent that can handle:
Basic data operations (load, filter, aggregate)
Code generation and execution
Visualization with auto-display
Excel/CSV support
Conversation context
To make the system more robust, we need to incorporate advanced data handling capabilities, including large file support, database connectivity, multi-DataFrame input, and data versioning to enable undo or rollback functions. It …
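The data versioning idea can be sketched as a snapshot history. The class name VersionedData and its methods are hypothetical, and the sketch uses copy.deepcopy on generic Python objects rather than any particular DataFrame API; full copies are only reasonable for small data and would need a cheaper scheme for large files.

```python
import copy


class VersionedData:
    """Keeps snapshots of a dataset so that operations can be undone."""

    def __init__(self, data):
        # Store a private copy so later changes by the caller do not leak in.
        self._history = [copy.deepcopy(data)]

    @property
    def current(self):
        return self._history[-1]

    def apply(self, operation):
        # Run the operation on a copy so earlier snapshots stay untouched.
        new_version = operation(copy.deepcopy(self.current))
        self._history.append(new_version)
        return new_version

    def undo(self):
        # Drop the latest snapshot, but never the original data.
        if len(self._history) > 1:
            self._history.pop()
        return self.current
```

A filter step would be applied with something like data.apply(lambda rows: [r for r in rows if r["x"] > 2]) and reverted with data.undo().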

Architecture of a Multi-Index Agent

After successfully creating a single-index agent, I am considering a multi-index agent. The benefits are obvious:
Scalability: Add new indexes without changing core code
Maintainability: Each index config is isolated
Flexibility: Easy to modify methodology for one index without affecting others
Testability: Test each index independently
Reusability: Share common logic (QC, processing rules, …
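One way to get these properties is a registry of per-index configurations that a single core function consumes. The sketch below is an assumption about how that could look, not the architecture from the post; IndexConfig, REGISTRY, run_index, and the two toy weighting rules are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Values = Dict[str, float]


@dataclass
class IndexConfig:
    # All fields are hypothetical; a real config would carry the full methodology.
    name: str
    weighting: Callable[[Values], Values]
    qc_checks: List[Callable[[Values], bool]] = field(default_factory=list)


REGISTRY: Dict[str, IndexConfig] = {}


def register(config: IndexConfig) -> None:
    REGISTRY[config.name] = config


def non_empty(values: Values) -> bool:
    # Shared QC rule reused by every index.
    return len(values) > 0


def equal_weight(values: Values) -> Values:
    return {key: 1 / len(values) for key in values}


def cap_weight(values: Values) -> Values:
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


def run_index(name: str, values: Values) -> Values:
    # Core logic: it never changes when a new index is registered.
    config = REGISTRY[name]
    if not all(check(values) for check in config.qc_checks):
        raise ValueError(f"QC failed for index {name!r}")
    return config.weighting(values)


register(IndexConfig("equal", equal_weight, [non_empty]))
register(IndexConfig("cap", cap_weight, [non_empty]))
```

Adding a third index is a single register(...) call with its own weighting function, and each config can be tested in isolation by calling run_index with a small input.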