Cryptocurrency Statistical Analysis for Strategy Formulation

After collecting the time series data, statistical analysis is needed to understand it and to decide on candidate strategies.

I collected the daily price and the following daily metrics: fee, ratio_mvrv, count_exctransfer, hash_rate, total_issurance, total_liquidation, miner_revenue, tansfer_value, spot_volume.

Note: I shifted the metrics data one day forward to make the backtest realistic, so that each day's row contains exactly the metric data that was available on that day.
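A minimal sketch of that one-day shift, assuming the data sits in a pandas DataFrame indexed by date. The frame `raw`, its values, and the reading of "one day forward" as `shift(1)` are my assumptions for illustration, not the actual dataset.

```python
import pandas as pd

# Hypothetical daily frame: one price column plus two on-chain metrics (made-up values).
raw = pd.DataFrame(
    {
        "price": [100.0, 102.0, 101.0, 105.0],
        "fee": [1.0, 1.2, 0.9, 1.1],
        "hash_rate": [50.0, 51.0, 53.0, 52.0],
    },
    index=pd.date_range("2019-04-19", periods=4, freq="D"),
)

metric_cols = [c for c in raw.columns if c != "price"]

aligned = raw.copy()
# Move every metric one row later: the row for day t now holds the metric of day t-1,
# while the price column stays on its own date.
aligned[metric_cols] = raw[metric_cols].shift(1)
# The first day has no previous metric value, so it is dropped.
aligned = aligned.dropna()
```

With this alignment a signal computed from a row never uses a metric value published later than that row's price.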

The sample runs from 20190419 to 20240509. I transformed the data to daily returns, which are approximately stationary, then standardized and winsorized them for inspection.
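The transformation above could look roughly like the sketch below. The data is synthetic, and the winsorization rule (clipping z-scores at ±3) is an assumption, since the post does not state the cutoff that was used.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic positive level series standing in for price and one metric.
levels = pd.DataFrame(
    {
        "price": 100 * np.exp(np.cumsum(rng.normal(0, 0.03, 500))),
        "fee": 10 * np.exp(np.cumsum(rng.normal(0, 0.05, 500))),
    }
)

# 1) Daily simple returns; the first row is NaN and is dropped.
returns = levels.pct_change().dropna().add_prefix("return_")

# 2) Standardize each column to zero mean and unit standard deviation.
zscores = (returns - returns.mean()) / returns.std()

# 3) Winsorize: cap extreme z-scores at an assumed threshold of 3.
clean = zscores.clip(lower=-3.0, upper=3.0)
```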

Correlation map: this shows directly how the data points are correlated with each other on the same day, but it cannot identify whether a lead-lag correlation exists.

    From the heat map below, the correlation between return_price and return_ratio_mvrv is almost 1, as is the correlation between return_hash_rate and return_total_issurance, indicating that each of these pairs is redundant.

    This explains why the MVRV ratio showed a “strong” signal in the initial momentum strategy: the ratio is essentially a replication of the price data, so it is not meaningful to use it for investing.

    Moreover, return_fee, total_liquidation and miner_revenue also show some correlation with the daily price return: 0.19, 0.19 and 0.25 respectively.
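To illustrate how redundant pairs can be read off a correlation matrix programmatically, here is a small sketch on synthetic data. The column names are borrowed from the post, but the numbers are generated, and both the helper `redundant_pairs` and the 0.95 threshold are hypothetical choices.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 1000
price = rng.normal(size=n)

# Synthetic returns: one near-copy of price, one weakly related series.
data = pd.DataFrame(
    {
        "return_price": price,
        "return_ratio_mvrv": price + rng.normal(scale=0.05, size=n),
        "return_fee": 0.2 * price + rng.normal(size=n),
    }
)

# Pearson correlation matrix (this is what the heat map visualizes).
corr = data.corr()


def redundant_pairs(corr, threshold=0.95):
    """Return column pairs whose absolute correlation is at least `threshold`."""
    cols = list(corr.columns)
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if abs(corr.iloc[i, j]) >= threshold:
                pairs.append((cols[i], cols[j]))
    return pairs


pairs = redundant_pairs(corr)
```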

    Cross Correlation Function (CCF): used to verify and identify whether any data points are highly correlated in a lead-lag fashion. Consistent with the correlation matrix above, mvrv is highly correlated with price return at a lag, and so is miner revenue, though less strongly, while spot volume shows no such effect on price return.
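A lead-lag check of this kind can be sketched with plain pandas by correlating one series against shifted copies of the other. The helper `lagged_corr`, the lag window and the synthetic series are assumptions for illustration and are not the output shown in the post.

```python
import numpy as np
import pandas as pd


def lagged_corr(x, y, max_lag=5):
    """Correlation of x[t-k] with y[t] for k in -max_lag..max_lag.

    A peak at positive k means x leads y by k periods.
    """
    return pd.Series({k: x.shift(k).corr(y) for k in range(-max_lag, max_lag + 1)})


rng = np.random.default_rng(2)
n = 1000
metric = pd.Series(rng.normal(size=n))
# Synthetic target that depends on the metric from two days earlier.
target = 0.8 * metric.shift(2) + pd.Series(rng.normal(scale=0.5, size=n))

ccf = lagged_corr(metric, target)
# Lag with the largest absolute correlation.
best_lag = ccf.abs().idxmax()
```

Only positive lags are useful for trading, because they mean the metric moves before the price return does.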

    Vector Autoregression (VAR), with the results fed to Anthropic for deeper insight:

    Without prompting it as a crypto expert, the insight, based purely on the math, indicates that price return has a strong autoregressive component, which backs up my attempt to use a momentum strategy.
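The post does not say how the VAR was fitted, so the following is only a sketch of the idea: a VAR(1) estimated by ordinary least squares with NumPy on synthetic data. The lag order of 1, the helper `fit_var1` and the coefficient matrix `A_true` are assumptions; a library such as statsmodels offers a full VAR implementation with lag selection and diagnostics.

```python
import numpy as np


def fit_var1(Y):
    """OLS fit of Y[t] = c + A @ Y[t-1] + e[t] for a (T, k) array Y.

    Returns the intercept c with shape (k,) and the matrix A with shape (k, k).
    """
    X = np.column_stack([np.ones(len(Y) - 1), Y[:-1]])  # regressors: constant and lag 1
    B, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)       # B has shape (k + 1, k)
    return B[0], B[1:].T


rng = np.random.default_rng(3)
# Assumed true dynamics: series 0 depends on its own past (autoregressive),
# series 1 depends mostly on the past of series 0.
A_true = np.array([[0.5, 0.0], [0.3, 0.1]])
T = 5000
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = A_true @ Y[t - 1] + rng.normal(size=2)

c_hat, A_hat = fit_var1(Y)
```

A large diagonal entry of `A_hat` for the price return row is what "autoregressive" means here: yesterday's return helps explain today's.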

    Variance Inflation Factor (VIF): a more formal check is to calculate the VIF for each variable. VIF values greater than 5 indicate multicollinearity that may be too high.

      Hence the output below confirms that the ratio of MVRV, Hash Rate, Total Issuance and Miner Revenue are each highly correlated with other variables.
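For reference, the VIF of a variable is 1 / (1 - R²), where R² comes from regressing that variable on all the others. The sketch below computes it with NumPy on synthetic data; the helper `vif` is hypothetical, and the column names are borrowed from the post while the values are generated.

```python
import numpy as np
import pandas as pd


def vif(df):
    """VIF per column: regress each column on the others (with intercept)."""
    X = df.to_numpy(dtype=float)
    out = {}
    for j, col in enumerate(df.columns):
        y = X[:, j]
        others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - resid.var() / y.var()
        out[col] = 1.0 / (1.0 - r2)
    return pd.Series(out)


rng = np.random.default_rng(4)
n = 1000
base = rng.normal(size=n)

# Synthetic returns: two nearly identical columns and one independent column.
features = pd.DataFrame(
    {
        "return_hash_rate": base,
        "return_total_issurance": base + 0.1 * rng.normal(size=n),
        "return_fee": rng.normal(size=n),
    }
)

v = vif(features)
# Columns above the rule-of-thumb threshold of 5.
flagged = v[v > 5].index.tolist()
```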

      SVD: similar to PCA, it is another way to spot redundant features/metrics.

        The singular values (sigma) of the SVD below indicate that the 10 features can be represented by 4 to 5 principal vectors (linear combinations of the features).
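As a sketch of how singular values reveal redundancy, the example below builds 6 synthetic features from only 3 underlying factors and counts how many components are needed to explain 95% of the variance. The data and the 95% cutoff are assumptions for illustration, not the post's 10-feature result.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
factors = rng.normal(size=(n, 3))

# 6 features: the 3 factors plus 3 near-copies of them.
X = np.column_stack([factors, factors + 0.01 * rng.normal(size=(n, 3))])

# Standardize columns so the squared singular values reflect shares of variance.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Singular values only, sorted in descending order.
s = np.linalg.svd(X, compute_uv=False)

# Share of total variance captured by each component.
explained = s**2 / np.sum(s**2)

# Smallest number of components whose cumulative share reaches 95%.
n_components = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)
```

Here the last three singular values are close to zero, which is the same pattern that signals redundant metrics in the real data.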

        If the price return feature is removed:

        Stepwise signal selection: note that I removed mvrv and total issuance (hash rate is kept, so total issuance is redundant).
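The post does not spell out the stepwise procedure, so here is one common variant as a hedged sketch: greedy forward selection that keeps adding the feature with the largest R² gain until the gain falls below a threshold. The helpers `r_squared` and `forward_select`, the `min_gain` value, and the feature names and data are all hypothetical.

```python
import numpy as np


def r_squared(X, y):
    """R² of an OLS regression of y on the columns of X plus an intercept."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return 1.0 - resid.var() / y.var()


def forward_select(X, y, names, min_gain=0.01):
    """Greedily add the feature with the biggest R² gain; stop when gain < min_gain."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = {j: r_squared(X[:, selected + [j]], y) for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best < min_gain:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best = scores[j_best]
    return [names[j] for j in selected]


rng = np.random.default_rng(6)
n = 2000
X = rng.normal(size=(n, 3))
# Synthetic target: driven by columns 0 and 2, column 1 is pure noise.
y = 0.5 * X[:, 0] + 0.3 * X[:, 2] + rng.normal(size=n)

chosen = forward_select(X, y, ["signal_a", "noise_b", "signal_c"])
```

Dropping near-duplicate features before running the selection, as done above with mvrv and total issuance, keeps the greedy step from choosing arbitrarily between two copies of the same information.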

        Hence in conclusion, we can choose

        • Miner_Revenue
        • Hash_Rate
        • Fee
        • Total Liquidation

        as signals for our crypto strategies.
