Recently I built a small Python program that automates one of the most
repetitive tasks in financial research. The task involves downloading
historical data for multiple assets from Yahoo Finance. The intention was not
to create something complex or overly ambitious, but rather a tool that is
simple, functional, and adaptable to different research needs. Anyone working
with time series data in CSV format will understand how tedious it becomes to
gather the same categories of data repeatedly from web interfaces or custom
scripts. This program replaces that routine with a guided, menu-based process
that runs entirely in the terminal.
The structure is deliberately minimal. It allows the user to set tickers,
define a time interval, choose a
start and end date, and select a destination folder for saving the output.
Everything runs step by step without requiring the user to write or modify
code. Once the configuration is complete, the program uses the yfinance
package to retrieve the data and automatically stores each dataset in a
well-named CSV file.
Although I developed this program as part of my own research workflow in
risk management and financial econometrics, it may
also benefit students or analysts who need an efficient and reproducible
method for collecting financial data.
Necessary Modules
To run the program, you need Python 3.10 or later (the script uses match statements) and the yfinance package. The yfinance package retrieves historical data from Yahoo Finance through its publicly accessible endpoints and returns it in a structured form; it is a community-maintained library rather than an official Yahoo API. The os and datetime modules, which are part of Python's standard library, handle file paths and the validation of the dates you enter. If you do not already have yfinance installed, you can add it by running the following command in your terminal:
pip install yfinance
How to Use the Program
Once you run the program, a clean menu will appear in your terminal. From there, the entire process is guided step by step. First, you will be asked to enter the tickers of the assets you want to download. These tickers must follow Yahoo Finance's format. You can include stocks, indices, or commodities. Multiple tickers are separated by commas. For example, entering AAPL,^GSPC,GC=F will select Apple, the S&P 500 index, and gold futures.
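As a minimal sketch of how such a comma-separated entry can be turned into a list of symbols, the snippet below splits the input, trims whitespace, and drops empty items. The helper name parse_tickers is hypothetical and used only for illustration.

```python
def parse_tickers(user_input: str) -> list[str]:
    """Split a comma-separated string into clean ticker symbols."""
    # Trim whitespace around each item and skip empty entries
    # (for example, those produced by a trailing comma).
    return [item.strip() for item in user_input.split(",") if item.strip()]


symbols = parse_tickers("AAPL, ^GSPC ,GC=F,")
print(symbols)  # ['AAPL', '^GSPC', 'GC=F']
```

An input that contains only commas or spaces yields an empty list, which is the case where the program asks you to try again.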
After selecting your tickers, the program will ask for the start and end date.
You must enter both dates in the form DD-MM-YYYY (day-month-year), for example
01-01-2024. If the format is incorrect, the program will prompt you to try
again. This ensures that all date entries are valid and interpretable before
the data request is sent.
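The validation step can be sketched with the standard library alone. The helper name parse_date_range is hypothetical, and the check that the start date precedes the end date is an extra assumption on my part, not necessarily something the script enforces.

```python
from datetime import datetime


def parse_date_range(start_text: str, end_text: str) -> tuple[str, str]:
    """Convert two DD-MM-YYYY strings to ISO dates (YYYY-MM-DD)."""
    # strptime raises ValueError for malformed or impossible dates,
    # such as 31-02-2024.
    start = datetime.strptime(start_text, "%d-%m-%Y")
    end = datetime.strptime(end_text, "%d-%m-%Y")
    # Assumption: an additional sanity check added for this sketch.
    if start >= end:
        raise ValueError("start date must be earlier than end date")
    return start.strftime("%Y-%m-%d"), end.strftime("%Y-%m-%d")


print(parse_date_range("01-01-2024", "30-06-2024"))
# ('2024-01-01', '2024-06-30')
```

Converting to the ISO form once, right after validation, means every later step (the data request and the filename) can use the same strings.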
Next, you will choose the frequency of the data. You have three options
available:
daily, weekly, or monthly. Each option corresponds to the resolution of the
time series that will be retrieved. For most use cases in econometrics or
portfolio analysis, daily data is sufficient, but weekly and monthly data are
also useful for reducing noise or constructing long-horizon models.
Once you have defined the frequency, you will be asked to choose where the
data should be saved. You can either specify an existing directory or leave
the field empty. If you leave it empty, the program saves the files in the
current working directory, which is the script's own folder when you launch it
from there. The main menu then shows a summary of all your inputs and waits
for confirmation.
When you press ENTER at the main menu (typing START also works), the program
begins downloading the historical data for each ticker you selected. Each
dataset is saved as a separate .csv file named after the asset's name, its
ticker symbol, and the date range you specified. For instance, a file for
Apple may be called Apple_Inc._AAPL_2024-01-01_to_2024-06-30.csv.
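The naming scheme can be illustrated with a small helper that mirrors the string handling in the script's download_data function, which combines the asset's name, the symbol, and the date range. The function name build_filename is hypothetical, and the asset name passed in below is illustrative rather than the exact value Yahoo Finance returns.

```python
def build_filename(name: str, symbol: str, start: str, end: str) -> str:
    """Build a CSV filename from the asset name, symbol, and date range."""
    # Replace or remove characters that are awkward in filenames.
    safe_name = name.replace(" ", "_").replace("/", "_").replace(",", "").replace(":", "")
    # Yahoo symbols may contain "=" (futures) or "^" (indices).
    safe_symbol = symbol.replace("=", "").replace("^", "")
    return f"{safe_name}_{safe_symbol}_{start}_to_{end}.csv"


print(build_filename("Gold Futures", "GC=F", "2024-01-01", "2024-06-30"))
# Gold_Futures_GCF_2024-01-01_to_2024-06-30.csv
```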
If an error occurs during the download, such as a typo in the ticker or an
unavailable dataset, the program reports it and continues with the remaining
symbols. Once the process is complete, you will see a confirmation message on
screen.
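The "report and continue" behaviour is a plain try/except inside the loop. The sketch below uses a stand-in fetch function instead of a real network call so that it runs offline; download_all, fake_fetch, and the symbol BADTICKER are all hypothetical names chosen for illustration.

```python
from typing import Callable


def download_all(symbols: list[str], fetch: Callable[[str], object]) -> dict[str, str]:
    """Try every symbol and record failures instead of stopping."""
    results = {}
    for symbol in symbols:
        try:
            fetch(symbol)
            results[symbol] = "done"
        except Exception as exc:  # keep going after any failure
            results[symbol] = f"error: {exc}"
    return results


def fake_fetch(symbol: str) -> None:
    """Stand-in for a real download; fails for one made-up symbol."""
    if symbol == "BADTICKER":
        raise ValueError("no data found")


print(download_all(["AAPL", "BADTICKER", "^GSPC"], fake_fetch))
# {'AAPL': 'done', 'BADTICKER': 'error: no data found', '^GSPC': 'done'}
```

Because the exception is caught per symbol, one bad entry in a long list does not cost you the rest of the session.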
At the end, you may exit the program or return to the main menu and run
another session. The tool was designed to allow repeated use without
restarting the entire script.
Code
I created this program to support academic work and non-commercial projects. Feel free to use or adapt the code for your own learning or research. Just please do not use it for commercial purposes. All rights remain with me, Stefanos Stavrianos. You can also download the complete script directly from this link. The script is part of my public GitHub repository, where I will gradually include all Python tools I develop for financial data analysis and applied econometrics. You may explore the full repository here and follow its progress.
import os
import yfinance as yf
from datetime import datetime
tickers = []
start_date = None
end_date = None
interval = None
save_dir = None


def clear_screen():
    """Clear the terminal and print the banner."""
    os.system("clear" if os.name == "posix" else "cls")
    dscr = "Risk Management & Financial Econometrics "
    # The banner lines stay at column 0 so the printed text is not indented.
    banner = f"""
{'*' * len(dscr)}
Stefanos Stavrianos, PhD Candidate
{dscr}
University of Patras, GR
www.stefanstavrianos.eu/en
{'*' * len(dscr)}
"""
    print(banner.strip(), end="\n")


def show_exit_message():
    print("Thank you for using the Data Downloader.")
    print("Wishing you accurate data and insightful research!\n")


def download_data():
    """Download every configured ticker and save each one as a CSV file."""
    clear_screen()
    print("Start downloading...\n")
    os.makedirs(save_dir, exist_ok=True)
    failed = 0
    for i, symbol in enumerate(tickers, start=1):
        try:
            ticker_obj = yf.Ticker(symbol)
            info = ticker_obj.info
            name = info.get("shortName") or info.get("longName") or "Unknown"
            display_name = name.strip()
            safe_name = name.replace(" ", "_").replace("/", "_").replace(",", "").replace(":", "")
            data = ticker_obj.history(start=start_date, end=end_date, interval=interval)
            # Treat an empty result as an error instead of writing an empty file.
            if data.empty:
                raise ValueError("no data returned")
            safe_symbol = symbol.replace("=", "").replace("^", "")
            filename = f"{safe_name}_{safe_symbol}_{start_date}_to_{end_date}.csv"
            output_path = os.path.join(save_dir, filename)
            data.to_csv(output_path)
            print(f"{i}) {display_name}... done")
        except Exception as e:
            failed += 1
            print(f"{i}) Error with {symbol}: {e}")
    # Report the outcome before pausing; the menu clears the screen on return.
    if failed:
        print(f"\nFinished with {failed} error(s).")
    else:
        print("\nAll data downloaded successfully!")
    input("\nPress ENTER to return to the main menu...")


def set_tickers():
    global tickers
    while True:
        clear_screen()
        print("Enter Yahoo Finance codes separated by comma (,)")
        print("Example: GC=F,^GSPC,AAPL")
        user_input = input("Tickers: ")
        # Trim whitespace and drop empty entries.
        tickers = [x.strip() for x in user_input.split(",") if x.strip()]
        if tickers:
            break
        print("\nAt least one ticker required.")
        input("Press ENTER to try again...")


def set_dates():
    global start_date, end_date
    while True:
        clear_screen()
        s = input("Enter START date (DD-MM-YYYY): ")
        e = input("Enter END date (DD-MM-YYYY): ")
        try:
            # Parse each date once, then store it in ISO form (YYYY-MM-DD).
            start = datetime.strptime(s, "%d-%m-%Y")
            end = datetime.strptime(e, "%d-%m-%Y")
            start_date = start.strftime("%Y-%m-%d")
            end_date = end.strftime("%Y-%m-%d")
            break
        except ValueError:
            print("\nInvalid date format. Use DD-MM-YYYY.")
            input("Press ENTER to try again...")


def set_interval():
    global interval
    while True:
        clear_screen()
        print("Choose interval")
        print("(a) 1d")
        print("(b) 1wk")
        print("(c) 1mo")
        choice = input("Enter option (a/b/c): ").lower()
        if choice == "a":
            interval = "1d"
            break
        elif choice == "b":
            interval = "1wk"
            break
        elif choice == "c":
            interval = "1mo"
            break
        print("\nInvalid choice.")
        input("Press ENTER to try again...")


def set_location():
    global save_dir
    while True:
        clear_screen()
        print("Choose path or leave it empty to save in the current working directory")
        path = input("Directory: ").strip()
        # os.getcwd() is the folder the program was launched from.
        save_dir = path if path else os.getcwd()
        if os.path.isdir(save_dir):
            break
        print("\nPath not valid.")
        input("Press ENTER to try again...")


def configuration_complete():
    return all([tickers, start_date, end_date, interval, save_dir])


def handle_incomplete_config():
    clear_screen()
    print("Configuration incomplete. Please set all required fields.\n")
    while True:
        print("(1) Main Menu")
        print("(2) Exit")
        choice = input("\nChoose option: ").strip()
        match choice:
            case "1":
                return
            case "2":
                clear_screen()
                show_exit_message()
                # SystemExit ends the program without relying on the
                # interactive-only exit() helper.
                raise SystemExit
            case _:
                clear_screen()
                print("Configuration incomplete. Please set all required fields.\n")


def show_menu():
    while True:
        clear_screen()
        title = "Data Downloader"
        border = "=" * max(len(title), 40)
        print(border)
        print(title)
        print(border)
        print("")
        print("(1) Set Tickers")
        print("(2) Set Date Range")
        print("(3) Set Interval")
        print("(4) Set Save Location")
        print("(5) Exit")
        print("")
        config_header = "Configuration Summary "
        border = "-" * max(len(config_header), 40)
        print(border)
        print(config_header)
        print(border)
        print(f"Tickers: {', '.join(tickers) if tickers else 'Not set'}")
        print(f"Start Date: {start_date if start_date else 'Not set'}")
        print(f"End Date: {end_date if end_date else 'Not set'}")
        print(f"Interval: {interval if interval else 'Not set'}")
        print(f"Save Location: {save_dir if save_dir else 'Not set'}")
        print(border)
        choice = input("\nChoose option or press ENTER to download: \n").strip().upper()
        # An empty answer (plain ENTER) starts the download.
        if choice == "":
            choice = "START"
        match choice:
            case "1":
                set_tickers()
            case "2":
                set_dates()
            case "3":
                set_interval()
            case "4":
                set_location()
            case "5":
                clear_screen()
                show_exit_message()
                break
            case "START":
                if configuration_complete():
                    download_data()
                else:
                    clear_screen()
                    handle_incomplete_config()
            case _:
                handle_incomplete_config()


if __name__ == "__main__":
    try:
        show_menu()
    except KeyboardInterrupt:
        # Exit cleanly when the user presses Ctrl+C.
        clear_screen()
        show_exit_message()