Maximum Likelihood Estimation (MLE)

Maximum Likelihood Estimation (MLE) is a statistical method for estimating the parameters of a probability distribution by maximizing the likelihood function. The likelihood function matters well beyond classical statistics: its negative logarithm serves as the loss function in many neural networks. Loss functions in neural networks can … Continue reading Maximum Likelihood Estimation (MLE)
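
A minimal sketch of the idea (function and variable names here are illustrative, not from the post): fit a Gaussian by MLE using the closed-form estimates, and check them against the negative log-likelihood, the same quantity that doubles as a loss function.

```python
import math

def gaussian_nll(data, mu, sigma):
    """Negative log-likelihood of data under N(mu, sigma^2) -- the quantity
    MLE minimizes, and the same form used as a loss in neural networks."""
    n = len(data)
    return (n / 2) * math.log(2 * math.pi * sigma ** 2) + \
           sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2)

data = [4.8, 5.1, 5.0, 4.9, 5.2]

# Closed-form Gaussian MLE: sample mean and (biased) sample standard deviation.
mu_hat = sum(data) / len(data)
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / len(data))

# The MLE parameters give a lower (better) NLL than an arbitrary candidate.
print(gaussian_nll(data, mu_hat, sigma_hat) < gaussian_nll(data, 4.0, 1.0))  # True
```
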

Kolmogorov-Arnold Networks (KANs)

Kolmogorov-Arnold Networks (KANs) are a novel class of neural networks inspired by the Kolmogorov-Arnold Representation Theorem, which states that any continuous multivariate function can be represented as a superposition, sums and compositions, of continuous univariate functions. KANs differ from traditional neural networks by replacing fixed activations on weighted sums with learnable, non-linear univariate functions on the edges. This allows them to have adaptive basis functions, … Continue reading Kolmogorov-Arnold Networks (KANs)
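
A toy sketch of a KAN-style neuron, with all names illustrative: each input passes through its own univariate function (here a fixed piecewise-linear function standing in for a learnable spline), and the neuron simply sums the results, rather than taking a weighted sum and applying a fixed activation.

```python
def piecewise_linear(x, knots, values):
    """A univariate function defined by (knot, value) pairs -- in a real KAN
    the values would be learnable parameters (e.g. spline coefficients)."""
    if x <= knots[0]:
        return values[0]
    if x >= knots[-1]:
        return values[-1]
    for i in range(len(knots) - 1):
        if knots[i] <= x <= knots[i + 1]:
            t = (x - knots[i]) / (knots[i + 1] - knots[i])
            return (1 - t) * values[i] + t * values[i + 1]

def kan_neuron(inputs, edge_functions):
    """KAN-style neuron: a sum of per-input univariate functions,
    instead of a weighted sum followed by a fixed activation."""
    return sum(phi(x) for phi, x in zip(edge_functions, inputs))

knots = [0.0, 0.5, 1.0]
phi1 = lambda x: piecewise_linear(x, knots, [0.0, 1.0, 0.0])  # a "bump"
phi2 = lambda x: piecewise_linear(x, knots, [0.0, 0.5, 1.0])  # a ramp
print(kan_neuron([0.5, 1.0], [phi1, phi2]))  # 1.0 + 1.0 = 2.0
```
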

AdaBoost and Random Forest: Powerful Ensemble Methods

In the realm of data science, ensemble methods play a crucial role in improving predictive performance by combining multiple weak learners into a stronger model. One of the most well-known ensemble techniques is AdaBoost (Adaptive Boosting), introduced by Freund and Schapire in 1996. AdaBoost is a powerful yet intuitive algorithm that enhances the accuracy of … Continue reading AdaBoost and Random Forest: Powerful Ensemble Methods
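
A compact sketch of the AdaBoost loop under simplifying assumptions (1-D data, threshold "stumps" as weak learners; all names are illustrative): each round picks the weak learner with the lowest weighted error, up-weights the examples it misclassifies, and combines learners by a weighted vote.

```python
import math

def stump(threshold, sign):
    # Weak learner: predicts +1 on one side of the threshold, -1 on the other.
    return lambda x: sign if x > threshold else -sign

def adaboost(X, y, stumps, rounds=5):
    """Minimal AdaBoost: reweight examples toward the ones the current
    weak learner gets wrong, and combine learners by weighted vote."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        best = min(stumps, key=lambda h: sum(
            wi for wi, xi, yi in zip(w, X, y) if h(xi) != yi))
        err = max(sum(wi for wi, xi, yi in zip(w, X, y) if best(xi) != yi),
                  1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, best))
        # Up-weight misclassified points, down-weight the rest, renormalize.
        w = [wi * math.exp(-alpha * yi * best(xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) > 0 else -1

X = [1, 2, 3, 4, 5, 6]
y = [1, 1, -1, -1, 1, 1]   # not separable by any single threshold
stumps = [stump(t + 0.5, s) for t in X for s in (1, -1)]
clf = adaboost(X, y, stumps)
print([clf(x) for x in X])  # matches y: no single stump can do this
```
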

Statistical Distributions and Methods to Perform Statistical Analysis

We'll go through the normal distribution, Bernoulli distribution, binomial distribution, Poisson distribution, chi-square distribution, etc. ANOVA (Analysis of Variance) is primarily associated with the F-test, but its scope extends beyond just the F-test. ANOVA tests whether the means of three or more groups are equal by comparing between-group variability (differences across group means) with within-group variability (differences … Continue reading Statistical Distributions and Methods to Perform Statistical Analysis
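
The ANOVA comparison above can be sketched directly (names and sample data are illustrative): the F-statistic is the ratio of between-group variability to within-group variability, each divided by its degrees of freedom.

```python
def one_way_anova_F(groups):
    """One-way ANOVA F-statistic: between-group vs within-group variability."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand = sum(sum(g) for g in groups) / n
    # Between-group sum of squares (k - 1 degrees of freedom).
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares (n - k degrees of freedom).
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

groups = [[6, 8, 4, 5, 3, 4],
          [8, 12, 9, 11, 6, 8],
          [13, 9, 11, 8, 7, 12]]
print(round(one_way_anova_F(groups), 2))  # 9.26 -- large F suggests unequal means
```

A large F (relative to the F-distribution with k-1 and n-k degrees of freedom) is evidence that at least one group mean differs from the others.
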

OOP and Functional Programming

Besides the widely used object-oriented programming (OOP), functional programming is another powerful paradigm, especially in applications like Apache Kafka, where handling high-volume streaming data is crucial. To understand functional programming, it's helpful to first examine non-functional programming paradigms. OOP is one example, but there are several others, including imperative programming, procedural programming, event-driven programming, logic … Continue reading OOP and Functional Programming
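
A small illustration of the contrast (variable names are illustrative): the same computation written imperatively with mutation, then functionally with pure filter/reduce composition, the pattern that underlies stream-processing APIs such as Kafka Streams.

```python
from functools import reduce

orders = [120, 45, 300, 80, 210]

# Imperative style: mutate an accumulator step by step.
total = 0
for amount in orders:
    if amount > 100:
        total += amount

# Functional style: pure functions, no mutation -- the logic is expressed
# as a pipeline of filter and reduce, easy to parallelize over a stream.
total_fn = reduce(lambda acc, x: acc + x,
                  filter(lambda amount: amount > 100, orders), 0)

print(total, total_fn)  # both 630
```
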

Optimization Problem and Methods

While Lagrange multipliers are a powerful technique for constrained optimization, there are several other methods used to solve optimization problems: Gradient Descent: An iterative first-order optimization algorithm for finding a local minimum of a differentiable function. It's commonly used in machine learning for training models by minimizing error functions. Newton's Method: A second-order optimization technique that … Continue reading Optimization Problem and Methods
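
The two methods can be contrasted on a simple function (all names illustrative): gradient descent takes many small first-order steps, while Newton's method uses curvature and, on a quadratic, reaches the minimizer in a single step.

```python
def f(x):   return (x - 2.0) ** 2 + 1.0   # minimized at x = 2
def df(x):  return 2.0 * (x - 2.0)        # first derivative
def d2f(x): return 2.0                    # second derivative (curvature)

# Gradient descent: first-order, iterates x <- x - lr * f'(x).
x = 10.0
for _ in range(100):
    x -= 0.1 * df(x)

# Newton's method: second-order, x <- x - f'(x) / f''(x);
# exact in one step here because f is quadratic.
x_newton = 10.0 - df(10.0) / d2f(10.0)

print(round(x, 4), x_newton)  # both land at the minimizer x = 2
```
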

DeepSeek Innovations per Their Papers Summarized by YC

Diana Hu did a great job summarizing the key innovations and ingenuities of DeepSeek. First, the FP8 (8-bit floating point) format to save memory without sacrificing performance. But FP8 suffers from accumulation precision issues, so the papers describe a strategy to improve accumulation precision in FP8 GEMM operations on NVIDIA H800 GPUs. Because Tensor Core accumulation is limited to roughly 14 bits of precision, large matrix … Continue reading DeepSeek Innovations per Their Papers Summarized by YC
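
The accumulation problem, and the flavor of the fix, can be simulated in plain Python (this is a rough sketch, not DeepSeek's actual implementation; the 14-bit quantizer and 128-element promotion interval are illustrative stand-ins): naive low-precision accumulation loses tiny addends once the running sum grows, while periodically promoting partial sums into a full-precision accumulator keeps the error small.

```python
import math

def quantize(x, mantissa_bits):
    """Round x to a float with the given number of mantissa bits --
    a crude stand-in for a Tensor Core's limited-precision accumulator."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)                 # x = m * 2**e, with 0.5 <= |m| < 1
    scale = 2.0 ** mantissa_bits
    return math.ldexp(round(m * scale) / scale, e)

values = [1.0] + [1e-4] * 100_000       # one large value + many tiny ones
exact = 1.0 + 1e-4 * 100_000            # = 11.0

# Naive low-precision accumulation: once the sum is large enough,
# each tiny addend is rounded away and the total stalls.
acc = 0.0
for v in values:
    acc = quantize(acc + v, 14)

# Promotion strategy (sketch): accumulate short blocks in low precision,
# then fold each partial sum into a full-precision accumulator.
high, low, block = 0.0, 0.0, 0
for v in values:
    low = quantize(low + v, 14)
    block += 1
    if block == 128:                    # promotion interval (illustrative)
        high += low                     # full-precision add
        low, block = 0.0, 0
high += low

print(abs(high - exact) < abs(acc - exact))  # promotion reduces the error
```
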

Three Types of AI Talent Needed

There are people who research and develop new algorithms and architectures, advancing the field of artificial intelligence. There are people who train the neural network models such as OpenAI's ChatGPT, Qwen, and DeepSeek models, pushing the boundaries of what AI can achieve. There are people who use these AI models to transform and disrupt existing … Continue reading Three Types of AI Talent Needed