A high-level architectural diagram to visualize the system
/index-platform/
├── docs/
│   ├── methodology_guides/
│   ├── api_documentation.md
│   └── architecture.md
│
├── libs/ (Shared Libraries & Core Logic)
│   ├── data-connectors/ (Code to talk to Bloomberg, Refinitiv, etc.)
│   │   ├── bloomberg.py
│   │   └── refinitiv.py
│   ├── index-methodologies/ (The “Rules as Code” – CRITICAL)
│   │   ├── sp500.yaml
│   │   ├── msci_world.yaml
│   │   └── solactive_clean_energy.yaml
│   ├── quant-tools/ (Mathematical and financial functions)
│   │   ├── risk_models.py
│   │   └── portfolio_math.py
│   └── shared-utils/
│       ├── data_sanitizer.py
│       └── logging_config.py
│
├── jobs/ (Scheduled, Batch Processes)
│   ├── rebalance-runner/
│   │   ├── main.py (The script that runs the rebalance)
│   │   └── Dockerfile
│   └── corp-action-processor/
│       ├── main.py (Processes daily splits, dividends, M&A)
│       └── Dockerfile
│
├── services/ (Live, running microservices)
│   ├── calculation-engine/ (The core real-time calculator)
│   │   ├── main.cpp
│   │   └── Dockerfile
│   ├── data-dissemination-api/ (Serves index data to clients)
│   │   ├── routes.py
│   │   └── Dockerfile
│   └── analytics-dashboard-backend/
│       ├── api.py
│       └── Dockerfile
│
├── .gitlab-ci.yml (The CI/CD pipeline definition)
├── docker-compose.yml (For local development setup)
└── README.md
Pseudocode for run_rebalance in jobs/rebalance-runner/main.py
import argparse

# NOTE: pseudocode. The folders on disk use hyphens (data-connectors, quant-tools,
# shared-utils), which Python cannot import directly; the underscore names below
# assume the libraries are installed as importable packages.
from libs.data_connectors import refinitiv
from libs.quant_tools import portfolio_math
from libs.shared_utils import load_methodology_yaml, get_current_constituents


def run_rebalance(methodology_file):
    # Step 1: Load the index rules from the YAML file
    rules = load_methodology_yaml(methodology_file)
    print(f"Starting rebalance for {rules['index_name']}...")

    # Step 2: Fetch the full data universe based on the rules
    # This is a massive API call to the data provider
    universe_data = refinitiv.fetch_universe(rules['universe_source'])

    # Step 3: Apply all eligibility filters programmatically
    eligible_securities = universe_data
    for f in rules['filters']:
        eligible_securities = portfolio_math.apply_filter(eligible_securities, f)

    # Step 4: Rank securities and select the top N
    pro_forma_list = portfolio_math.rank_and_select(
        eligible_securities,
        rules['selection']['ranking_metric'],
        rules['selection']['constituent_count'],
    )

    # Step 5: Get the current list of stocks in the index
    current_list = get_current_constituents(rules['id'])

    # Step 6: Generate the "diff" file showing adds and deletes
    changes = portfolio_math.generate_diff(current_list, pro_forma_list)
    print("Rebalance Changes:", changes)

    # Step 7: Save the results to the database and generate dissemination files
    # ... code to write to SQL database and create files for FTP ...
    print("Rebalance complete.")


if __name__ == "__main__":
    # The script is run from a command line with the methodology file as an argument
    # e.g., > python jobs/rebalance-runner/main.py --methodology sp500.yaml
    parser = argparse.ArgumentParser()
    # required=True so a missing flag fails fast instead of passing None downstream
    parser.add_argument("--methodology", required=True)
    args = parser.parse_args()
    run_rebalance(args.methodology)
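The pseudocode above leans on three portfolio_math helpers that are never shown. As a rough illustration only, here is one way steps 3, 4 and 6 could behave on plain Python dictionaries. The rule shape (`field`/`min`), the `ticker`, `market_cap` and `adv` keys, and the sample universe are all assumptions made for this sketch, not the platform's real data model.

```python
from typing import Any

# Hypothetical record shape: each security is a plain dict of attributes.
Security = dict[str, Any]


def apply_filter(securities: list[Security], rule: dict[str, Any]) -> list[Security]:
    """Keep securities whose `field` is at least `min` (assumed rule shape)."""
    return [s for s in securities if s[rule["field"]] >= rule["min"]]


def rank_and_select(securities: list[Security], metric: str, count: int) -> list[Security]:
    """Sort descending by `metric` and keep the top `count` names."""
    ranked = sorted(securities, key=lambda s: s[metric], reverse=True)
    return ranked[:count]


def generate_diff(current: list[Security], pro_forma: list[Security]) -> dict[str, list[str]]:
    """Compare two constituent lists by ticker and report adds and deletes."""
    current_ids = {s["ticker"] for s in current}
    new_ids = {s["ticker"] for s in pro_forma}
    return {
        "adds": sorted(new_ids - current_ids),
        "deletes": sorted(current_ids - new_ids),
    }


# Made-up universe: market cap and average daily volume (adv) in arbitrary units.
universe = [
    {"ticker": "AAA", "market_cap": 500.0, "adv": 12.0},
    {"ticker": "BBB", "market_cap": 300.0, "adv": 0.5},
    {"ticker": "CCC", "market_cap": 200.0, "adv": 8.0},
    {"ticker": "DDD", "market_cap": 100.0, "adv": 9.0},
]

eligible = apply_filter(universe, {"field": "adv", "min": 1.0})  # drops BBB
pro_forma = rank_and_select(eligible, "market_cap", 2)           # keeps AAA, CCC
current = [{"ticker": "AAA"}, {"ticker": "BBB"}]
changes = generate_diff(current, pro_forma)
print(changes)  # {'adds': ['CCC'], 'deletes': ['BBB']}
```

A production version would more likely operate on DataFrames and support several filter types, but the set difference in generate_diff is the core of the "adds and deletes" file either way.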
.gitlab-ci.yml – Professional Software Engineering
This file proves it’s a real software project. It defines the Continuous Integration/Continuous Deployment (CI/CD) pipeline. Every time a developer makes a change, an automated process is triggered to:
- Lint: Check the code for stylistic errors.
- Test: Run thousands of unit tests and integration tests to ensure nothing broke. (e.g., “Does the market cap calculation still work?”, “Does a stock split get handled correctly?”).
- Build: Package the code into deployable units (like Docker containers).
- Deploy: Automatically deploy the new code to a staging environment for final review before pushing it to production.
The last step matters just as much: the CI/CD pipeline is the mechanism that provides the speed, safety, and auditability required to manage a codebase where a single error could have billion-dollar consequences.
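To make the Test step concrete, the two questions quoted in the list ("Does the market cap calculation still work?", "Does a stock split get handled correctly?") could be expressed as pytest-style tests along these lines. The functions market_cap and apply_split are hypothetical stand-ins written for this sketch; the real library functions and their signatures are not shown in this article.

```python
import math


def market_cap(price: float, shares: float) -> float:
    """Market capitalisation = price per share x shares outstanding."""
    return price * shares


def apply_split(price: float, shares: float, ratio: float) -> tuple[float, float]:
    """Apply a ratio-for-1 split: shares multiply, price divides."""
    if ratio <= 0:
        raise ValueError("split ratio must be positive")
    return price / ratio, shares * ratio


def test_market_cap_calculation():
    assert market_cap(50.0, 1_000_000) == 50_000_000.0


def test_split_preserves_market_cap():
    # A 4-for-1 split should change price and share count but not market cap.
    price, shares = apply_split(120.0, 1_000_000, 4)
    assert (price, shares) == (30.0, 4_000_000)
    assert math.isclose(market_cap(price, shares), market_cap(120.0, 1_000_000))
```

Saved under a tests directory, a file like this would be picked up by pytest automatically because the function names start with test_. The second test checks an invariant (market cap is unchanged by a split) rather than a single hard-coded number, which tends to catch more regressions.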
- Developer Pushes Code: A Quant Analyst updates the sp500.yaml file and a Developer pushes it to a feature branch.
- CI Kicks In: The validate and test stages run automatically on the feature branch. The developer gets immediate feedback if their change broke a test.
- Merge to Main: After review, the code is merged into the main branch.
- Full Pipeline Runs: Now the entire pipeline is triggered on main.
  - Validate: Code and YAML are checked again.
  - Test: Unit tests are run again as a final check.
  - Build: The build-rebalance-job-image job runs. It creates a new Docker image containing the updated code/rules and pushes it to GitLab’s private container registry.
  - Deploy (Staging): The deploy-to-staging job runs automatically. It tells the staging server to pull the new Docker image and restart the service. The QA team can now verify the changes in a production-like environment.
  - Deploy (Production): The deploy-to-production job appears in the pipeline UI with a “play” button. It is paused. After the Index Committee gives final approval, an authorized engineer clicks the button. Only then is the command executed to update the production system.
Example of a .gitlab-ci.yml file: a configuration that describes a series of scripts to be run by a separate executor (the GitLab Runner) in a highly structured and controlled way:
# Default Docker image for all jobs. Creates a clean Python environment.
image: python:3.10-slim

# Define the sequence of the pipeline stages.
stages:
  - validate
  - test
  - build
  - deploy

# ---- VALIDATE STAGE ----
# Ensure code and configuration files are well-formatted before running expensive tests.
lint-python-code:
  stage: validate
  script:
    - pip install flake8
    - echo "Linting all Python files..."
    - flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics

validate-methodology-files:
  stage: validate
  script:
    - pip install pyyaml
    - echo "Validating all index methodology YAML files..."
    - python ./scripts/validate_yaml.py ./libs/index-methodologies/ # A custom script to check YAML schema

# ---- TEST STAGE ----
# Run automated tests to verify the logic of the calculation engine.
unit-tests:
  stage: test
  script:
    - pip install -r requirements.txt # Install dependencies like pandas, pytest
    - echo "Running unit tests..."
    # Run tests on individual functions (e.g., market cap calculation).
    # The --cov flags write coverage.xml; they assume pytest-cov is listed in requirements.txt.
    - pytest ./tests/unit/ --cov=libs --cov-report=xml
  # This job creates an artifact (a report) that can be viewed later.
  artifacts:
    paths:
      - coverage.xml

# ---- BUILD STAGE ----
# If tests pass, package the application into a distributable format (a Docker image).
build-rebalance-job-image:
  stage: build
  # We need Docker-in-Docker to build an image inside a GitLab CI job.
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  script:
    - echo "Logging into GitLab Container Registry..."
    # Pass the password on stdin so it does not appear in the process list or logs.
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - echo "Building Docker image for the rebalance job..."
    - docker build -t $CI_REGISTRY_IMAGE/rebalance-job:latest ./jobs/rebalance-runner
    - echo "Pushing Docker image to registry..."
    - docker push $CI_REGISTRY_IMAGE/rebalance-job:latest
  rules:
    # This job only runs on the main branch to avoid building images for every feature branch.
    - if: '$CI_COMMIT_BRANCH == "main"'

# ---- DEPLOY STAGE ----
# Deploy the packaged application to the servers.
deploy-to-staging:
  stage: deploy
  script:
    - echo "Deploying to Staging environment for final review..."
    # A real script would use tools like Ansible, Terraform, or Kubernetes commands.
    # This is a simplified example using SSH.
    - ssh user@staging-server.com "docker pull $CI_REGISTRY_IMAGE/rebalance-job:latest && ./restart-service.sh"
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'

deploy-to-production:
  stage: deploy
  script:
    - echo "Deploying to PRODUCTION. This is a high-stakes operation."
    - ssh user@production-server.com "docker pull $CI_REGISTRY_IMAGE/rebalance-job:latest && ./restart-service.sh"
  # CRITICAL SAFETY FEATURE: This job does not run automatically.
  # A human must go into the GitLab UI and manually click "play" to run it.
  # This is CONTINUOUS DELIVERY, not Continuous Deployment.
  rules:
    # "when: manual" sits inside the rule so the branch check and the manual gate apply together.
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual
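
The validate-methodology-files job above calls a custom script, scripts/validate_yaml.py, which is not shown. As a minimal sketch of the kind of schema check such a script might perform, the function below validates an already-parsed methodology (the dictionary that a YAML loader such as PyYAML's yaml.safe_load would return). The required keys are inferred from the fields the rebalance pseudocode reads; the actual schema, error format, and sample values are assumptions.

```python
from typing import Any

# Keys the rebalance pseudocode reads, with the types assumed for this sketch.
REQUIRED_TOP_LEVEL = {
    "id": str,
    "index_name": str,
    "universe_source": str,
    "filters": list,
    "selection": dict,
}
REQUIRED_SELECTION = {"ranking_metric": str, "constituent_count": int}


def _check(section: dict[str, Any], required: dict[str, type], prefix: str) -> list[str]:
    """Report missing keys and wrong types for one section of the rules."""
    errors = []
    for key, expected in required.items():
        if key not in section:
            errors.append(f"missing key: {prefix}{key}")
        elif not isinstance(section[key], expected):
            errors.append(f"{prefix}{key} should be {expected.__name__}")
    return errors


def validate_methodology(rules: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the rules look usable."""
    errors = _check(rules, REQUIRED_TOP_LEVEL, "")
    selection = rules.get("selection")
    if isinstance(selection, dict):
        errors += _check(selection, REQUIRED_SELECTION, "selection.")
        count = selection.get("constituent_count")
        if isinstance(count, int) and count <= 0:
            errors.append("selection.constituent_count must be positive")
    return errors


# Illustrative values only, not a real index methodology.
good = {
    "id": "example_index",
    "index_name": "Example Index",
    "universe_source": "example_universe",
    "filters": [{"field": "adv", "min": 1.0}],
    "selection": {"ranking_metric": "market_cap", "constituent_count": 50},
}
bad = {
    "id": "example_index",
    "index_name": "Example Index",
    "filters": [],
    "selection": {"ranking_metric": "market_cap", "constituent_count": 0},
}

print(validate_methodology(good))  # []
print(validate_methodology(bad))
# ['missing key: universe_source', 'selection.constituent_count must be positive']
```

In a CI job, the script would exit with a non-zero status whenever the error list is non-empty, which is what makes the validate stage fail before any expensive tests run.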