Transform unstructured earnings call transcripts into actionable investment insights using Snowflake Cortex AI, ML Model Registry, and intelligent agents - all accelerated by Cortex Code.
Financial analysts spend countless hours manually reviewing earnings call transcripts. This guide demonstrates how to systematically process unstructured data at scale using AI Functions (AI_COMPLETE, AI_SQL) - turning raw transcript text into structured sentiment scores, analyst participation metrics, and investment signals that feed directly into quantitative models.
Full Guide: For detailed architecture, business impact, and use cases, see the Snowflake Developers Guide.
- How to use Cortex Code to build data pipelines through natural language
- How to extract structured insights from unstructured text using AI_COMPLETE()
- How to train and register ML models in Snowflake's Model Registry
- How to create semantic search over unstructured data with Cortex Search
- How to build a Cortex Agent that combines multiple AI tools
- A sentiment analysis pipeline that scores earnings call transcripts (1-10 scale)
- A LightGBM stock return prediction model with walk-forward validation
- A Cortex Search service for natural language queries over sentiment data
- A Cortex Agent accessible via Snowflake Intelligence
- Snowflake account (sign up for a free trial) with ACCOUNTADMIN access (see note below)
- Access to Snowflake Marketplace
Note on Privileges: This guide uses ACCOUNTADMIN for simplicity in demo and learning environments. For production deployments, follow the principle of least privilege by creating a dedicated role with only the specific grants required.
- Navigate to Snowflake Public Data (Free) in Snowflake Marketplace
- Click Get and accept the terms
- Keep the default database name SNOWFLAKE_PUBLIC_DATA_FREE
- Grant access to the ACCOUNTADMIN role
- Click Get to install
- In Snowsight, navigate to Projects > Workspaces
- Create a new SQL file and copy the contents from scripts/setup.sql
- Run the entire script
This creates the complete demo environment including database, warehouse, tables, stored procedures, and deploys notebooks from this repository.
- In Snowsight, navigate to Projects > Notebooks
- Switch your role to FSI_DEMO_ROLE (bottom-left, click on your username)
- Open the START_HERE notebook
- Run all cells to extract analyst sentiment from earnings call transcripts using Cortex AI Functions
Try with Cortex Code (before running cells):
Explain what the AI_COMPLETE function does in this notebook
After running all cells:
Summarize the sentiment analysis results shown in the notebook output
This is where Cortex Code shines - build an entire ML pipeline through conversation.
Note: To use Snowflake Intelligence in Step 6, you must complete either Option A or Option B to register an ML model.
Option A: Run pre-built code
- Open the TRAIN_ML_MODELS notebook and run all cells
Option B: Build it yourself with Cortex Code
- Create a new blank notebook with these settings:
  - Notebook location: FSI_DEMO_DB/ANALYTICS
  - Runtime: Run on warehouse
  - Query warehouse: FSI_DEMO_WH
  - Notebook warehouse: FSI_DEMO_WH
- Install required packages using the Packages selector (top of the page): lightgbm, scikit-learn, matplotlib, seaborn, statsmodels, snowflake-ml-python
- Add a Python cell with the following imports and run it:
# Import python packages
import pandas as pd
# We can also use Snowpark for our analyses!
from snowflake.snowpark.context import get_active_session
session = get_active_session()
- Open Cortex Code (bottom-right icon)
- Use these prompts one at a time, running the generated code after each:
Note: You can start with Prompt 1 immediately - Cortex Code will explore the FSI_DATA table directly.
Prompt 1: Feature Engineering
Using FSI_DEMO_DB.ANALYTICS.FSI_DATA table, help me construct features with returns: the last 1 day return using close price, return from t-4 to t-1, return from t-9 to t-5, return from t-20 to t-11, and return from t-62 to t-21. Also construct the predictive variable as future return from t+2 to t+6. Take the log across all return variables. Keep as panel data where ticker is a column.
After Cortex Code generates the code:
- Copy the SQL into a new SQL cell
- Click on the cell name (e.g., cell2) in the top-left corner and rename it to features_df
- Run the SQL cell
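For orientation, the return windows requested in Prompt 1 can be sketched in pandas. This is a minimal illustration, not the SQL that Cortex Code will generate. It assumes hypothetical column names (ticker, date, close) and reads "return from t-a to t-b" as the sum of daily log returns earned on days t-a through t-b, i.e. log(close[t-b] / close[t-a-1]); other readings of the prompt are possible.

```python
import numpy as np
import pandas as pd


def add_log_return_features(prices: pd.DataFrame) -> pd.DataFrame:
    """Add log-return features and the forward target to panel price data.

    `prices` is assumed to have columns ticker, date, close (hypothetical names).
    """
    df = prices.sort_values(["ticker", "date"]).copy()
    df["log_close"] = np.log(df["close"])
    by_ticker = df.groupby("ticker")["log_close"]

    def at(day_offset: int) -> pd.Series:
        # Log close at day t + day_offset, computed within each ticker
        # so that values never leak across tickers.
        if day_offset == 0:
            return df["log_close"]
        return by_ticker.shift(-day_offset)

    def window(first_day: int, last_day: int) -> pd.Series:
        # Sum of daily log returns on days t+first_day .. t+last_day.
        return at(last_day) - at(first_day - 1)

    df["r_1"] = window(0, 0)            # last 1-day return
    df["r_4_1"] = window(-4, -1)        # days t-4 .. t-1
    df["r_9_5"] = window(-9, -5)        # days t-9 .. t-5
    df["r_20_11"] = window(-20, -11)    # days t-20 .. t-11
    df["r_62_21"] = window(-62, -21)    # days t-62 .. t-21
    df["target"] = window(2, 6)         # future return, days t+2 .. t+6
    return df.drop(columns="log_close")
```

Rows near the start or end of each ticker's history come back as NaN for the windows they cannot fill, which makes it straightforward to drop incomplete rows before training.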
Prompt 2: Train ML Model
Using features_df (a SQL cell result - use .to_pandas() to convert), train a predictive LightGBM model with L2 metric. Do walk-forward training on a quarterly basis. For each test quarter: Train on all quarters < (Q-2), Validate on (Q-2, Q-1), Test on Q. Enforce strict cutoffs so rows needing returns beyond the split end are dropped (no look-ahead).
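The split scheme described in Prompt 2 can be sketched in plain Python. This is an illustrative outline under stated assumptions, not the code Cortex Code will produce: quarter labels are assumed to be a chronologically sorted list, and the helper names (Split, walk_forward_splits, within_cutoff) are hypothetical. The cutoff check assumes the label for a row on day t needs prices up to day t+6.

```python
from typing import Iterator, NamedTuple


class Split(NamedTuple):
    train: list[str]   # all quarters before Q-2
    valid: list[str]   # Q-2 and Q-1
    test: str          # Q


def walk_forward_splits(quarters: list[str]) -> Iterator[Split]:
    """Yield one split per test quarter from a chronologically sorted list."""
    # The first usable test quarter needs one training quarter
    # plus two validation quarters before it.
    for i in range(3, len(quarters)):
        yield Split(
            train=quarters[: i - 2],
            valid=quarters[i - 2 : i],
            test=quarters[i],
        )


def within_cutoff(day_idx: int, split_last_day_idx: int, horizon: int = 6) -> bool:
    """True if the label for day_idx is fully known by the split's last day.

    Indices are trading-day positions. Rows that fail this check would be
    dropped from the split so no future prices leak into it.
    """
    return day_idx + horizon <= split_last_day_idx
```

For example, with quarters 2020Q1 through 2021Q2, the first split trains on 2020Q1, validates on 2020Q2 and 2020Q3, and tests on 2020Q4.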
Prompt 3: Backtesting
Test if the strategy works starting 2021. For each portfolio construction, generate forecasts on Tuesdays. At Wednesday close, go long top-5 and short bottom-5 by predicted return (equal weight). Hold through Thu to next Wed (the t+2..t+6 window). Transaction cost: 3.0 bps one-way via weekly turnover. Show Information Ratio (before/after costs), Max drawdown, and plot the equity curve.
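The metrics named in Prompt 3 can be sketched with the standard library. This is a rough illustration under several assumptions: weekly returns are simple returns, the benchmark is zero (a market-neutral long/short book) so the Information Ratio reduces to annualized mean over standard deviation, turnover is traded notional as a fraction of capital, and at least 2n tickers are available. All function names are hypothetical.

```python
import math
import statistics

COST_PER_UNIT_TURNOVER = 3.0 / 10_000  # 3.0 bps one-way


def long_short_weights(predicted: dict[str, float], n: int = 5) -> dict[str, float]:
    """Equal-weight long the top-n and short the bottom-n predicted returns."""
    ranked = sorted(predicted, key=predicted.get)  # ascending by prediction
    weights = {ticker: 0.0 for ticker in ranked}
    for ticker in ranked[-n:]:
        weights[ticker] = 1.0 / n
    for ticker in ranked[:n]:
        weights[ticker] = -1.0 / n
    return weights


def net_returns(gross: list[float], turnover: list[float]) -> list[float]:
    """Subtract transaction costs proportional to each week's turnover."""
    return [g - COST_PER_UNIT_TURNOVER * t for g, t in zip(gross, turnover)]


def information_ratio(weekly_returns: list[float], periods_per_year: int = 52) -> float:
    """Annualized mean / sample standard deviation, with a zero benchmark."""
    mean = statistics.fmean(weekly_returns)
    stdev = statistics.stdev(weekly_returns)
    return mean / stdev * math.sqrt(periods_per_year)


def max_drawdown(weekly_returns: list[float]) -> float:
    """Largest peak-to-trough loss of the compounded equity curve (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in weekly_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst
```

Computing the Information Ratio on both the gross series and the output of net_returns gives the before/after-cost comparison the prompt asks for.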
Prompt 4: Register Model in Snowflake
Register the final model in Snowflake Registry with model name "FIS_STOCK_RETURN_PREDICTOR_GBM", sample input of 100 rows, target_platforms=["WAREHOUSE"], and method_options for predict with case_sensitive=True.
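The registration that Prompt 4 describes might look roughly like the sketch below. It is an assumption about the shape of the generated code, not the repository's implementation: register_final_model is a hypothetical wrapper, registry is expected to be a snowflake.ml.registry.Registry, model a fitted LightGBM model, and features_sample a DataFrame holding only the feature columns.

```python
def register_final_model(registry, model, features_sample):
    """Log the model with the options named in Prompt 4.

    Assumes `registry` exposes Registry.log_model from snowflake-ml-python
    and `features_sample` supports .head() (e.g. a pandas DataFrame).
    """
    return registry.log_model(
        model,
        model_name="FIS_STOCK_RETURN_PREDICTOR_GBM",
        # The first 100 rows are used to infer the model signature.
        sample_input_data=features_sample.head(100),
        target_platforms=["WAREHOUSE"],
        # Keep column names case sensitive for the predict method.
        options={"method_options": {"predict": {"case_sensitive": True}}},
    )
```

In the notebook, the registry would typically be created with Registry(session=session, database_name="FSI_DEMO_DB", schema_name="ANALYTICS") after importing Registry from snowflake.ml.registry, and registry.show_models() lists what has been registered.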
Prompt 5: Verify Model Registration
Show me all models registered in FSI_DEMO_DB.ANALYTICS schema
Explore the code further:
Explain how the walk-forward validation prevents look-ahead bias
Compare model performance across different feature combinations to identify which return windows matter most
Why might the model be overfitting? Suggest fixes
- In Snowsight, navigate to Projects > Notebooks
- Open the CREATE_CORTEX_COMPONENTS notebook
- Run all cells to create Cortex Search, Semantic View, and the Agent
Navigate to AI & ML → Snowflake Intelligence and try these example questions:
ML Predictions (StockPerformancePredictor):
Give me top 3 vs bottom 3 trade predictions for the next period
What are the model's top stock picks right now?
Show me the bottom 5 predicted performers
Structured Data Queries (Cortex Analyst):
Which companies have the highest sentiment score?
What is the average sentiment score by company?
Show me sentiment trends over time for MSFT
Semantic Search (Cortex Search):
Search for companies with concerns about margins
Find earnings calls where analysts discussed supply chain issues
Search for bullish commentary about revenue growth
Combined Analysis:
Compare the top predicted stocks with their analyst sentiment scores
Check whether any of the bottom 3 predicted performers have high sentiment, and summarize the qualitative insights from the earnings call with the highest sentiment
Show me the top 5 predictions and search for any negative sentiment in their earnings calls
Email Reports (SendEmail):
Note: Email functionality requires your Snowflake user to have a verified email address. Verify your email in Snowsight: User menu → Settings → Profile → Verify Email.
Send me an email summary of today's top stock picks
Email me a report of companies with sentiment scores above 8
Send an email with the top 3 vs bottom 3 predictions
Beyond following the guided steps, here's what Cortex Code can do:
What tables exist in FSI_DEMO_DB.ANALYTICS? Describe each one.
Explain this notebook cell by cell
What's the schema of the AI_TRANSCRIPTS_ANALYSTS_SENTIMENTS table?
This cell is throwing an error - help me fix it
Why is my model prediction returning NULL?
The query is slow - how can I optimize it?
Summarize the model's feature importance
Compare actual vs predicted returns for Q4
Which stocks had the biggest prediction errors?
To remove all demo objects, run the teardown script:
- In Snowsight, navigate to Projects > Workspaces
- Create a new SQL file and copy the contents from scripts/teardown.sql
- Run the script
You've built an end-to-end AI-powered quantitative research pipeline entirely within Snowflake. From here, you can:
- Expand coverage - Add more companies beyond the DOW 30
- Add new features - Use Cortex Code to add technical indicators (RSI, MACD, Bollinger Bands)
- Improve the model - Experiment with different ML algorithms or hyperparameters
- Build dashboards - Create a Streamlit app to visualize sentiment trends
- Automate updates - Schedule daily predictions with Snowflake Tasks
The best part? You can use Cortex Code to help with all of it—just describe what you want to build.
- Cortex Code Documentation
- Cortex AI Functions
- Cortex Search
- Cortex Analyst
- Snowflake Intelligence
- Snowflake ML Model Registry
- Snowflake Notebooks
Copyright (c) Snowflake Inc. All rights reserved.
The code in this repository is licensed under the Apache 2.0 License.






