Posted by quantdev_alex | 3 days ago
I'm building a backtesting engine for some trading strategies I've been working on and I need a solid historical data API that won't break the bank. Currently looking at Alpha Vantage and Polygon.io but the rate limits on free tiers are pretty brutal. I need at least 5 years of daily OHLCV data for US equities and ideally some forex pairs too. What are you guys using for backtesting? I don't need real-time data, just reliable historical stuff that I can pull in bulk without getting throttled every 5 minutes.
Reply by tradingbot_jenny | 2 days ago
Alex, I feel your pain on the rate limits, it's so annoying when you're trying to test something quickly. I ended up going with Yahoo Finance through the yfinance Python library and honestly it's been solid for my needs - completely free, and you can pull decades of data pretty easily. The data quality isn't perfect and there are occasional gaps or odd values around stock splits and dividends, but for basic backtesting it works fine. If you need more serious data though, I'd say bite the bullet and pay for Polygon.io's starter plan. It's like $200/month, but the data is way cleaner and you get minute-level granularity, which is crucial if you're testing intraday strategies.
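For the gaps mentioned above, a quick sanity check is to look for weekdays that are missing from the date index you downloaded. This is only a rough sketch: `find_missing_weekdays` and the sample dates are made up for illustration, and it knows nothing about exchange holidays, so those will show up as false positives unless you filter them against a trading calendar.

```python
from datetime import date, timedelta

def find_missing_weekdays(dates):
    """Return Mon-Fri dates absent between the earliest and latest date given."""
    have = set(dates)
    if not have:
        return []
    day, end = min(have), max(have)
    missing = []
    while day <= end:
        # weekday() is 0-4 for Monday-Friday; weekends are never expected
        if day.weekday() < 5 and day not in have:
            missing.append(day)
        day += timedelta(days=1)
    return missing

# Hypothetical sample: one trading week with Wednesday's bar missing
sample = [date(2024, 3, 4), date(2024, 3, 5), date(2024, 3, 7), date(2024, 3, 8)]
print(find_missing_weekdays(sample))
```

Running it per ticker right after a bulk download makes it easier to catch holes before they end up in backtest results.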
Reply by MarketDataPro | 2 days ago
Stay away from free APIs if you're doing anything remotely serious, survivorship bias will absolutely wreck your backtest results. I learned this the hard way after spending 3 months optimizing a strategy on Yahoo data, only to find out the results were inflated garbage because delisted stocks weren't included lol. We switched to Tiingo for our firm and it's been worth every penny - they have survivorship-bias-free data and corporate actions are handled properly. Their API is super developer-friendly too, with good documentation and reasonable rate limits. For forex I'd recommend OANDA or the Interactive Brokers API, both have solid historical data going back pretty far.
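To see why missing delisted stocks matters so much, here is a toy illustration of survivorship bias. The tickers and returns are entirely made up; the only point is that averaging over the names that survived gives a much rosier number than averaging over everything that was tradable at the start of the window.

```python
# Hypothetical universe: ticker -> (total return over the test window, still listed at the end?)
universe = {
    "AAA": (0.40, True),
    "BBB": (0.15, True),
    "CCC": (-0.90, False),  # delisted during the window
    "DDD": (0.05, True),
    "EEE": (-1.00, False),  # went to zero
}

def mean_return(returns):
    """Plain equal-weighted average of a list of returns."""
    return sum(returns) / len(returns)

# What a dataset containing only currently listed tickers would show
survivors_only = mean_return([r for r, listed in universe.values() if listed])

# What an investor picking from the full starting universe would have seen
full_universe = mean_return([r for r, _ in universe.values()])

print(f"survivors only: {survivors_only:.1%}")
print(f"full universe:  {full_universe:.1%}")
```

A strategy tuned on the first number is being graded against a universe that never existed in real time.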
Reply by quantdev_alex | 1 day ago
Jenny, yeah, I've used yfinance before for quick prototypes but you're right about the data quality issues, I've definitely run into weird gaps. MarketDataPro, that's a really good point about survivorship bias. I hadn't even thought about that affecting the results - that could totally explain why some of my backtests look too good to be true haha. What's the pricing like on Tiingo compared to Polygon? I'm bootstrapping this project, so I'm trying to keep costs under $300/month if possible. Also, does anyone have experience with Quandl (now part of Nasdaq Data Link)?
Reply by tradingbot_jenny | 1 day ago
Tiingo's cheaper than Polygon for sure. I think their pro plan is around $30/month for end-of-day data, which is perfect for daily/swing trading strategies. Quandl used to be amazing when it was free, but after the Nasdaq acquisition the pricing got kind of insane for anything beyond basic stuff. One thing I'd add: make sure whatever API you choose has a good Python/REST interface, because you'll be making thousands of requests during development and you don't want to deal with weird authentication issues or poorly documented endpoints. Also check whether they provide both adjusted and unadjusted prices - you want both for proper backtesting.
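On adjusted vs unadjusted: the usual reasoning is that adjusted prices give you correct returns across splits, while unadjusted prices are what actually traded on a given day. A minimal sketch of split-only back-adjustment, assuming a made-up 2-for-1 split (`split_adjust`, the prices, and the dates are all hypothetical, and dividends are deliberately ignored):

```python
from datetime import date

def split_adjust(closes, splits):
    """Back-adjust raw closes for splits.

    closes: list of (date, raw_close) tuples.
    splits: {ex_date: ratio}, e.g. 2.0 for a 2-for-1 split.
    Every bar dated before an ex-date is divided by that split's ratio.
    """
    adjusted = []
    for day, close in closes:
        factor = 1.0
        for ex_date, ratio in splits.items():
            if day < ex_date:
                factor *= ratio
        adjusted.append((day, close / factor))
    return adjusted

# Hypothetical raw closes with a 2-for-1 split effective on 2024-06-05
raw = [
    (date(2024, 6, 3), 100.0),
    (date(2024, 6, 4), 102.0),
    (date(2024, 6, 5), 51.0),
    (date(2024, 6, 6), 52.0),
]
splits = {date(2024, 6, 5): 2.0}

for day, close in split_adjust(raw, splits):
    print(day, close)
```

On the raw series the split looks like a 50% crash overnight; on the adjusted series it is flat, which is what a long-only holder actually experienced.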
Reply by crypto_quant_mike | 20 hours ago
If you're open to crypto data at all, the Binance and Coinbase APIs are actually really solid for historical data and completely free with generous limits. I know that's not what you asked for, but I'm just throwing it out there since a lot of backtesting strategies can be adapted across markets. For traditional markets though, I second the Tiingo recommendation. Their support is actually responsive too, which is rare. Whatever you do, don't try to scrape data from TradingView or other sites, you'll get IP-banned so fast and it's against their ToS anyway lol.