Remote Data Access¶
Warning
The access_key
keyword argument of DataReader
has been deprecated in favor of api_key
.
Functions from pandas_datareader.data
and pandas_datareader.wb
extract data from various Internet sources into a pandas DataFrame.
Currently the following sources are supported:
It should be noted, that various sources support different kinds of data, so not all sources implement the same methods and the data elements returned might also differ.
Tiingo¶
Tiingo is a tracing platform that provides a data api with historical end-of-day prices on equities, mutual funds and ETFs. Free registration is required to get an API key. Free accounts are rate limited and can access a limited number of symbols (500 at the time of writing).
In [1]: import os
In [2]: import pandas_datareader as pdr
In [3]: df = pdr.get_data_tiingo('GOOG', api_key=os.getenv('TIINGO_API_KEY'))
In [4]: df.head()
close high low open volume adjClose adjHigh adjLow adjOpen adjVolume divCash splitFactor
symbol date
GOOG 2014-03-27 00:00:00+00:00 558.46 568.00 552.92 568.000 13100 558.46 568.00 552.92 568.000 13100 0.0 1.0
2014-03-28 00:00:00+00:00 559.99 566.43 558.67 561.200 41100 559.99 566.43 558.67 561.200 41100 0.0 1.0
2014-03-31 00:00:00+00:00 556.97 567.00 556.93 566.890 10800 556.97 567.00 556.93 566.890 10800 0.0 1.0
2014-04-01 00:00:00+00:00 567.16 568.45 558.71 558.710 7900 567.16 568.45 558.71 558.710 7900 0.0 1.0
2014-04-02 00:00:00+00:00 567.00 604.83 562.19 565.106 146700 567.00 604.83 562.19 565.106 146700 0.0 1.0
IEX¶
Warning
Usage of all IEX readers now requires an API key. See below for additional information.
The Investors Exchange (IEX) provides a wide range of data through an
API. Historical stock
prices are available for up to 15 years. The usage of these readers requires the publishable API key from IEX Cloud Console, which can be stored in the IEX_API_KEY
environment variable.
In [1]: import pandas_datareader.data as web
In [2]: from datetime import datetime
In [3]: start = datetime(2016, 9, 1)
In [4]: end = datetime(2018, 9, 1)
In [5]: f = web.DataReader('F', 'iex', start, end)
In [6]: f.loc['2018-08-31']
Out[6]:
open 9.64
high 9.68
low 9.40
close 9.48
volume 76424884.00
Name: 2018-08-31, dtype: float64
Note
You must provide an API Key when using IEX. You can do this using
os.environ["IEX_API_KEY"] = "pk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
or by exporting the key before starting the IPython session.
There are additional interfaces to this API that are directly exposed: tops (‘iex-tops’) and last (‘iex-lasts’). A third interface to the deep API is exposed through Deep class or the get_iex_book function.
Todo
Execute block when markets are open
import pandas_datareader.data as web
f = web.DataReader('gs', 'iex-tops')
f[:10]
Alpha Vantage¶
Alpha Vantage provides realtime equities and forex data. Free registration is required to get an API key.
Historical Time Series Data¶
Through the Alpha Vantage Time Series endpoints, it is possible to obtain historical equities and currency rate data for individual symbols. For daily, weekly, and monthly frequencies, 20+ years of historical data is available. The past 3-5 days of intraday data is also available.
The following endpoints are available:
av-intraday
- Intraday Time Seriesav-daily
- Daily Time Seriesav-daily-adjusted
- Daily Time Series (Adjusted)av-weekly
- Weekly Time Seriesav-weekly-adjusted
- Weekly Time Series (Adjusted)av-monthly
- Monthly Time Seriesav-monthly-adjusted
- Monthly Time Series (Adjusted)av-forex-daily
- Daily Time Series
In [1]: import os
In [2]: from datetime import datetime
In [3]: import pandas_datareader.data as web
In [4]: f = web.DataReader("AAPL", "av-daily", start=datetime(2017, 2, 9),
...: end=datetime(2017, 5, 24),
...: api_key=os.getenv('ALPHAVANTAGE_API_KEY'))
In [5]: f.loc["2017-02-09"]
Out[5]:
open 1.316500e+02
high 1.324450e+02
low 1.311200e+02
close 1.324200e+02
volume 2.834990e+07
Name: 2017-02-09, dtype: float64
To request the historical exchange rate of physical or digital currencies, use
av-forex-daily
and format the symbol as “FROM/TO”, for example “USD/JPY”.
The top-level function get_data_alphavantage
is also provided. This
function will
return the TIME_SERIES_DAILY
endpoint for the symbol and date range
provided.
Quotes¶
Alpha Vantage Batch Stock Quotes
endpoint allows the retrieval of realtime stock quotes for up to 100 symbols at
once. These quotes are accessible through the top-level function
get_quote_av
.
In [1]: import os
In [2]: from datetime import datetime
In [3]: import pandas_datareader.data as web
In [4]: web.get_quote_av(["AAPL", "TSLA"])
Out[4]:
price volume timestamp
symbol
AAPL 219.87 NaN 2019-09-16 15:59:59
TSLA 242.80 NaN 2019-09-16 15:59:57
Note
Most quotes are only available during market hours.
Forex¶
Alpha Vantage provides realtime currency exchange rates (for physical and digital currencies).
To request the exchange rate of physical or digital currencies, simply format as “FROM/TO” as in “USD/JPY”.
In [1]: import os
In [2]: import pandas_datareader.data as web
In [3]: f = web.DataReader("USD/JPY", "av-forex",
...: api_key=os.getenv('ALPHAVANTAGE_API_KEY'))
In [4]: f
Out[4]:
USD/JPY
From_Currency Code USD
From_Currency Name United States Dollar
To_Currency Code JPY
To_Currency Name Japanese Yen
Exchange Rate 108.17000000
Last Refreshed 2019-09-17 10:43:36
Time Zone UTC
Bid Price 108.17000000
Ask Price 108.17000000
Multiple pairs are are allowable:
In [1]: import os
In [2]: import pandas_datareader.data as web
In [3]: f = web.DataReader(["USD/JPY", "BTC/CNY"], "av-forex",
...: api_key=os.getenv('ALPHAVANTAGE_API_KEY'))
In [4]: f
Out[4]:
USD/JPY BTC/CNY
From_Currency Code USD BTC
From_Currency Name United States Dollar Bitcoin
To_Currency Code JPY CNY
To_Currency Name Japanese Yen Chinese Yuan
Exchange Rate 108.17000000 72230.38039500
Last Refreshed 2019-09-17 10:44:35 2019-09-17 10:44:01
Time Zone UTC UTC
Bid Price 108.17000000 72226.26407700
Ask Price 108.17000000 72230.02554000
Sector Performance¶
Alpha Vantage provides sector
performances through the top-level function get_sector_performance_av
.
In [1]: import os
In [2]: import pandas_datareader.data as web
In [3]: web.get_sector_performance_av().head()
Out[4]:
RT 1D 5D 1M 3M YTD 1Y 3Y 5Y 10Y
Energy 3.29% 3.29% 4.82% 11.69% 3.37% 9.07% -15.26% -7.69% -32.31% 12.15%
Real Estate 1.02% 1.02% -1.39% 1.26% 3.49% 24.95% 16.55% NaN NaN NaN
Utilities 0.08% 0.08% 0.72% 2.77% 3.72% 18.16% 16.09% 27.95% 48.41% 113.09%
Industrials -0.15% -0.15% 2.42% 8.59% 5.10% 22.70% 0.50% 34.50% 43.53% 183.47%
Health Care -0.23% -0.23% 0.88% 1.91% 0.09% 5.20% -2.38% 26.37% 43.43% 216.01%
Econdb¶
Econdb provides economic data from 90+ official statistical agencies. Free API allows access to the complete Econdb database of time series aggregated into datasets.
Reading data for a single series such as the RGDP
for United States, is as simple as taking the ticker segment from the URL path
(RGDPUS
in https://www.econdb.com/series/RGDPUS/) and passing it in like:
import os
import pandas_datareader.data as web
f = web.DataReader('ticker=RGDPUS', 'econdb')
f.head()
The code snippet for exporting the whole dataset, or its filtered down subset, can be generated by using the Export -> Pandas Python3 functionality on any of the numerous datasets available, such as the Eurostat’s GDP and main components
import os
import pandas_datareader.data as web
df = web.DataReader('dataset=NAMQ_10_GDP&v=Geopolitical entity (reporting)&h=TIME&from=2018-05-01&to=2021-01-01&GEO=[AL,AT,BE,BA,BG,HR,CY,CZ,DK,EE,EA19,FI,FR,DE,EL,HU,IS,IE,IT,XK,LV,LT,LU,MT,ME,NL,MK,NO,PL,PT,RO,RS,SK,SI,ES,SE,CH,TR,UK]&NA_ITEM=[B1GQ]&S_ADJ=[SCA]&UNIT=[CLV10_MNAC]', 'econdb')
df.columns
Datasets can be located through Econdb’s search engine, or discovered by exploring the tree of available statistical sources.
Enigma¶
Access datasets from Enigma,
the world’s largest repository of structured public data. Note that the Enigma
URL has changed from app.enigma.io as of release
0.6.0
, as the old API deprecated.
Datasets are unique identified by the uuid4
at the end of a dataset’s web address.
For example, the following code downloads from USDA Food Recalls 1996 Data.
In [1]: import os
In [2]: import pandas_datareader as pdr
In [3]: df = pdr.get_data_enigma('292129b0-1275-44c8-a6a3-2a0881f24fe1', os.getenv('ENIGMA_API_KEY'))
In [4]: df.columns
Out[4]:
Index(['case_number', 'recall_notification_report_number',
'recall_notification_report_url', 'date_opened', 'date_closed',
'recall_class', 'press_release', 'domestic_est_number', 'company_name',
'imported_product', 'foreign_estab_number', 'city', 'state', 'country',
'product', 'problem', 'description', 'total_pounds_recalled',
'pounds_recovered'],
dtype='object')
Quandl¶
Daily financial data (prices of stocks, ETFs etc.) from Quandl. The symbol names consist of two parts: DB name and symbol name. DB names can be all the free ones listed on the Quandl website. Symbol names vary with DB name; for WIKI (US stocks), they are the common ticker symbols, in some other cases (such as FSE) they can be a bit strange. Some sources are also mapped to suitable ISO country codes in the dot suffix style shown above, currently available for BE, CN, DE, FR, IN, JP, NL, PT, UK, US.
As of June 2017, each DB has a different data schema, the coverage in terms of time range is sometimes surprisingly small, and the data quality is not always good.
In [1]: import pandas_datareader.data as web
In [2]: symbol = 'WIKI/AAPL' # or 'AAPL.US'
In [3]: df = web.DataReader(symbol, 'quandl', '2015-01-01', '2015-01-05')
In [4]: df.loc['2015-01-02']
Out[4]:
Open High Low Close Volume ... AdjOpen AdjHigh AdjLow AdjClose AdjVolume
Date ...
2015-01-02 111.39 111.44 107.35 109.33 53204626.0 ... 105.820966 105.868466 101.982949 103.863957 53204626.0
FRED¶
In [1]: import pandas_datareader.data as web
In [2]: import datetime
In [3]: start = datetime.datetime(2010, 1, 1)
In [4]: end = datetime.datetime(2013, 1, 27)
In [5]: gdp = web.DataReader('GDP', 'fred', start, end)
In [6]: gdp.loc['2013-01-01']
Out[6]:
GDP 16569.591
Name: 2013-01-01 00:00:00, dtype: float64
# Multiple series:
In [7]: inflation = web.DataReader(['CPIAUCSL', 'CPILFESL'], 'fred', start, end)
In [8]: inflation.head()
Out[8]:
CPIAUCSL CPILFESL
DATE
2010-01-01 217.488 220.633
2010-02-01 217.281 220.731
2010-03-01 217.353 220.783
2010-04-01 217.403 220.822
2010-05-01 217.290 220.962
Fama/French¶
Access datasets from the Fama/French Data Library.
The get_available_datasets
function returns a list of all available datasets.
In [9]: from pandas_datareader.famafrench import get_available_datasets
In [10]: import pandas_datareader.data as web
In [11]: len(get_available_datasets())
Out[11]: 297
In [12]: ds = web.DataReader('5_Industry_Portfolios', 'famafrench')
In [13]: print(ds['DESCR'])
5 Industry Portfolios
---------------------
This file was created by CMPT_IND_RETS using the 202105 CRSP database. It contains value- and equal-weighted returns for 5 industry portfolios. The portfolios are constructed at the end of June. The annual returns are from January to December. Missing data are indicated by -99.99 or -999. Copyright 2021 Kenneth R. French
0 : Average Value Weighted Returns -- Monthly (59 rows x 5 cols)
1 : Average Equal Weighted Returns -- Monthly (59 rows x 5 cols)
2 : Average Value Weighted Returns -- Annual (5 rows x 5 cols)
3 : Average Equal Weighted Returns -- Annual (5 rows x 5 cols)
4 : Number of Firms in Portfolios (59 rows x 5 cols)
5 : Average Firm Size (59 rows x 5 cols)
6 : Sum of BE / Sum of ME (5 rows x 5 cols)
7 : Value-Weighted Average of BE/ME (5 rows x 5 cols)
In [14]: ds[4].head()
Out[14]:
Cnsmr Manuf HiTec Hlth Other
Date
2016-07 539 622 719 620 1109
2016-08 536 621 713 614 1099
2016-09 534 615 705 609 1090
2016-10 530 613 699 604 1087
2016-11 529 611 688 600 1084
World Bank¶
pandas
users can easily access thousands of panel data series from the
World Bank’s World Development Indicators
by using the wb
I/O functions.
Indicators¶
Either from exploring the World Bank site, or using the search function included, every world bank indicator is accessible.
For example, if you wanted to compare the Gross Domestic Products per capita in
constant dollars in North America, you would use the search
function:
In [1]: from pandas_datareader import wb
In [2]: matches = wb.search('gdp.*capita.*const')
Then you would use the download
function to acquire the data from the World
Bank’s servers:
In [3]: dat = wb.download(indicator='NY.GDP.PCAP.KD', country=['US', 'CA', 'MX'], start=2005, end=2008)
In [4]: print(dat)
NY.GDP.PCAP.KD
country year
Canada 2008 36005.5004978584
2007 36182.9138439757
2006 35785.9698172849
2005 35087.8925933298
Mexico 2008 8113.10219480083
2007 8119.21298908649
2006 7961.96818458178
2005 7666.69796097264
United States 2008 43069.5819857208
2007 43635.5852068142
2006 43228.111147107
2005 42516.3934699993
The resulting dataset is a properly formatted DataFrame
with a hierarchical
index, so it is easy to apply .groupby
transformations to it:
In [6]: dat['NY.GDP.PCAP.KD'].groupby(level=0).mean()
Out[6]:
country
Canada 35765.569188
Mexico 7965.245332
United States 43112.417952
dtype: float64
Now imagine you want to compare GDP to the share of people with cellphone contracts around the world.
In [7]: wb.search('cell.*%').iloc[:,:2]
Out[7]:
id name
3990 IT.CEL.SETS.FE.ZS Mobile cellular telephone users, female (% of ...
3991 IT.CEL.SETS.MA.ZS Mobile cellular telephone users, male (% of po...
4027 IT.MOB.COV.ZS Population coverage of mobile cellular telepho...
Notice that this second search was much faster than the first one because
pandas
now has a cached list of available data series.
In [13]: ind = ['NY.GDP.PCAP.KD', 'IT.MOB.COV.ZS']
In [14]: dat = wb.download(indicator=ind, country='all', start=2011, end=2011).dropna()
In [15]: dat.columns = ['gdp', 'cellphone']
In [16]: print(dat.tail())
gdp cellphone
country year
Swaziland 2011 2413.952853 94.9
Tunisia 2011 3687.340170 100.0
Uganda 2011 405.332501 100.0
Zambia 2011 767.911290 62.0
Zimbabwe 2011 419.236086 72.4
Finally, we use the statsmodels
package to assess the relationship between
our two variables using ordinary least squares regression. Unsurprisingly,
populations in rich countries tend to use cellphones at a higher rate:
In [17]: import numpy as np
In [18]: import statsmodels.formula.api as smf
In [19]: mod = smf.ols('cellphone ~ np.log(gdp)', dat).fit()
In [20]: print(mod.summary())
OLS Regression Results
==============================================================================
Dep. Variable: cellphone R-squared: 0.297
Model: OLS Adj. R-squared: 0.274
Method: Least Squares F-statistic: 13.08
Date: Thu, 25 Jul 2013 Prob (F-statistic): 0.00105
Time: 15:24:42 Log-Likelihood: -139.16
No. Observations: 33 AIC: 282.3
Df Residuals: 31 BIC: 285.3
Df Model: 1
===============================================================================
coef std err t P>|t| [95.0% Conf. Int.]
-------------------------------------------------------------------------------
Intercept 16.5110 19.071 0.866 0.393 -22.384 55.406
np.log(gdp) 9.9333 2.747 3.616 0.001 4.331 15.535
==============================================================================
Omnibus: 36.054 Durbin-Watson: 2.071
Prob(Omnibus): 0.000 Jarque-Bera (JB): 119.133
Skew: -2.314 Prob(JB): 1.35e-26
Kurtosis: 11.077 Cond. No. 45.8
==============================================================================
Country Codes¶
The country
argument accepts a string or list of mixed
two or three character
ISO country codes, as well as dynamic World Bank exceptions to the ISO standards.
For a list of the the hard-coded country codes (used solely for error handling logic) see pandas_datareader.wb.country_codes
.
Problematic Country Codes & Indicators¶
Note
The World Bank’s country list and indicators are dynamic. As of 0.15.1,
wb.download()
is more flexible. To achieve this, the warning
and exception logic changed.
The world bank converts some country codes, in their response, which makes error checking by pandas difficult. Retired indicators still persist in the search.
Given the new flexibility of 0.15.1, improved error handling by the user may be necessary for fringe cases.
To help identify issues:
There are at least 4 kinds of country codes:
Standard (2/3 digit ISO) - returns data, will warn and error properly.
Non-standard (WB Exceptions) - returns data, but will falsely warn.
Blank - silently missing from the response.
Bad - causes the entire response from WB to fail, always exception inducing.
There are at least 3 kinds of indicators:
Current - Returns data.
Retired - Appears in search results, yet won’t return data.
Bad - Will not return data.
Use the errors
argument to control warnings and exceptions. Setting
errors to ignore or warn, won’t stop failed responses. (ie, 100% bad
indicators, or a single ‘bad’ (#4 above) country code).
See docstrings for more info.
OECD¶
OECD Statistics are available via DataReader
.
You have to specify OECD’s data set code.
To confirm data set code, access to each data -> Export -> SDMX Query
. Following
example is to download ‘Trade Union Density’ data which set code is ‘TUD’.
In [15]: import pandas_datareader.data as web
In [16]: import datetime
In [17]: df = web.DataReader('TUD', 'oecd')
In [18]: df.columns
Out[18]:
MultiIndex([( 'Australia', 'Annual', 'Percentage of employees'),
( 'Austria', 'Annual', 'Percentage of employees'),
( 'Belgium', 'Annual', 'Percentage of employees'),
( 'Canada', 'Annual', 'Percentage of employees'),
( 'Czech Republic', 'Annual', 'Percentage of employees'),
( 'Denmark', 'Annual', 'Percentage of employees'),
( 'Finland', 'Annual', 'Percentage of employees'),
( 'France', 'Annual', 'Percentage of employees'),
( 'Germany', 'Annual', 'Percentage of employees'),
( 'Greece', 'Annual', 'Percentage of employees'),
( 'Hungary', 'Annual', 'Percentage of employees'),
( 'Iceland', 'Annual', 'Percentage of employees'),
( 'Ireland', 'Annual', 'Percentage of employees'),
( 'Italy', 'Annual', 'Percentage of employees'),
( 'Japan', 'Annual', 'Percentage of employees'),
( 'Korea', 'Annual', 'Percentage of employees'),
( 'Luxembourg', 'Annual', 'Percentage of employees'),
( 'Mexico', 'Annual', 'Percentage of employees'),
( 'Netherlands', 'Annual', 'Percentage of employees'),
( 'New Zealand', 'Annual', 'Percentage of employees'),
( 'Norway', 'Annual', 'Percentage of employees'),
( 'Poland', 'Annual', 'Percentage of employees'),
( 'Portugal', 'Annual', 'Percentage of employees'),
('Slovak Republic', 'Annual', 'Percentage of employees'),
( 'Spain', 'Annual', 'Percentage of employees'),
( 'Sweden', 'Annual', 'Percentage of employees'),
( 'Switzerland', 'Annual', 'Percentage of employees'),
( 'Turkey', 'Annual', 'Percentage of employees'),
( 'United Kingdom', 'Annual', 'Percentage of employees'),
( 'United States', 'Annual', 'Percentage of employees'),
( 'OECD - Total', 'Annual', 'Percentage of employees'),
( 'Chile', 'Annual', 'Percentage of employees'),
( 'Colombia', 'Annual', 'Percentage of employees'),
( 'Costa Rica', 'Annual', 'Percentage of employees'),
( 'Estonia', 'Annual', 'Percentage of employees'),
( 'Israel', 'Annual', 'Percentage of employees'),
( 'Latvia', 'Annual', 'Percentage of employees'),
( 'Lithuania', 'Annual', 'Percentage of employees'),
( 'Slovenia', 'Annual', 'Percentage of employees')],
names=['Country', 'Frequency', 'Measure'])
In [19]: df[['Japan', 'United States']]
Out[19]:
Country Japan United States
Frequency Annual Annual
Measure Percentage of employees Percentage of employees
Time
2017-01-01 17.500000 10.6
2018-01-01 17.200001 10.3
2019-01-01 16.799999 9.9
2020-01-01 NaN 10.3
Eurostat¶
Eurostat are available via DataReader
.
Get Rail accidents by type of accident (ERA data) data. The result will be a DataFrame
which has DatetimeIndex
as index and MultiIndex
of attributes or countries as column. The target URL is:
You can specify dataset ID ‘tran_sf_railac’ to get corresponding data via DataReader
.
In [20]: import pandas_datareader.data as web
In [21]: df = web.DataReader('tran_sf_railac', 'eurostat')
In [22]: df
Out[22]:
UNIT Number ...
ACCIDENT Collisions of trains, including collisions with obstacles within the clearance gauge ... Unknown
GEO Austria Belgium Bulgaria Switzerland Channel Tunnel Czechia ... Romania Sweden Slovenia Slovakia Turkey United Kingdom
FREQ Annual Annual Annual Annual Annual Annual ... Annual Annual Annual Annual Annual Annual
TIME_PERIOD ...
2017-01-01 7.0 1.0 1.0 3.0 0.0 11.0 ... NaN NaN NaN NaN 0.0 NaN
2018-01-01 4.0 0.0 1.0 3.0 0.0 6.0 ... NaN NaN NaN NaN 0.0 NaN
2019-01-01 1.0 2.0 4.0 4.0 0.0 7.0 ... NaN NaN NaN NaN NaN NaN
[3 rows x 264 columns]
Thrift Savings Plan (TSP) Fund Data¶
Download mutual fund index prices for the Thrift Savings Plan (TSP).
In [23]: import pandas_datareader.tsp as tsp
In [24]: tspreader = tsp.TSPReader(start='2015-10-1', end='2015-12-31')
In [25]: tspreader.read()
Out[25]:
L Income L 2025 L 2030 L 2035 L 2040 L 2045 L 2050 L 2055 L 2060 L 2065 G Fund F Fund C Fund S Fund I Fund
Date
2015-12-31 17.7733 NaN 25.0635 NaN 26.5715 NaN 15.0263 NaN NaN NaN 14.9154 16.9549 27.5622 35.2356 24.0952
2015-12-30 17.8066 NaN 25.2267 NaN 26.7707 NaN 15.1556 NaN NaN NaN 14.9146 16.9249 27.8239 35.5126 24.4184
2015-12-29 17.8270 NaN 25.3226 NaN 26.8905 NaN 15.2319 NaN NaN NaN 14.9137 16.9150 28.0236 35.8047 24.4757
2015-12-28 17.7950 NaN 25.1691 NaN 26.7015 NaN 15.1101 NaN NaN NaN 14.9128 16.9799 27.7230 35.4625 24.2816
2015-12-24 17.7991 NaN 25.2052 NaN 26.7481 NaN 15.1407 NaN NaN NaN 14.9093 16.9596 27.7831 35.6084 24.3272
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2015-10-07 17.6639 NaN 24.8629 NaN 26.3598 NaN 14.9063 NaN NaN NaN 14.8429 17.0725 26.7751 35.6035 24.3671
2015-10-06 17.6338 NaN 24.7268 NaN 26.1898 NaN 14.7979 NaN NaN NaN 14.8421 17.0790 26.5513 35.1320 24.2294
2015-10-05 17.6395 NaN 24.7571 NaN 26.2306 NaN 14.8233 NaN NaN NaN 14.8413 17.0531 26.6467 35.3565 24.1475
2015-10-02 17.5707 NaN 24.4472 NaN 25.8518 NaN 14.5805 NaN NaN NaN 14.8388 17.0924 26.1669 34.6504 23.6367
2015-10-01 17.5164 NaN 24.2159 NaN 25.5690 NaN 14.4009 NaN NaN NaN 14.8380 17.0467 25.7953 34.0993 23.3202
[62 rows x 15 columns]
Nasdaq Trader Symbol Definitions¶
Download the latest symbols from Nasdaq.
Note that Nasdaq updates this file daily, and historical versions are not available. More information on the field definitions.
In [12]: from pandas_datareader.nasdaq_trader import get_nasdaq_symbols
In [13]: symbols = get_nasdaq_symbols()
In [14]: print(symbols.loc['IBM'])
Nasdaq Traded True
Security Name International Business Machines Corporation Co...
Listing Exchange N
Market Category
ETF False
Round Lot Size 100
Test Issue False
Financial Status NaN
CQS Symbol IBM
NASDAQ Symbol IBM
NextShares False
Name: IBM, dtype: object
Stooq Index Data¶
Google finance doesn’t provide common index data download. The Stooq site has the data for download.
In [26]: import pandas_datareader.data as web
In [27]: f = web.DataReader('^DJI', 'stooq')
In [28]: f[:10]
Out[28]:
Open High Low Close Volume
Date
2021-07-12 34836.75 35014.90 34730.15 34996.18 344606907
2021-07-09 34457.51 34893.72 34457.51 34870.16 340542786
2021-07-08 34569.01 34569.01 34145.59 34421.93 374853414
2021-07-07 34604.17 34708.78 34435.59 34681.79 340215866
2021-07-06 34790.16 34814.20 34358.42 34577.37 390545107
2021-07-02 34642.42 34821.93 34613.49 34786.35 332517041
2021-07-01 34507.32 34640.28 34498.85 34633.53 309690314
2021-06-30 34290.74 34553.16 34245.48 34502.51 333493947
2021-06-29 34338.89 34469.83 34266.83 34292.29 321388212
2021-06-28 34428.10 34449.65 34186.13 34283.27 320257590
MOEX Data¶
The Moscow Exchange (MOEX) provides historical data.
pandas_datareader.get_data_moex(*args)
is equivalent to
pandas_datareader.moex.MoexReader(*args).read()
In [29]: import pandas_datareader as pdr
In [30]: f = pdr.get_data_moex(['USD000UTSTOM', 'MAGN'], '2020-07-02', '2020-07-07')
In [31]: f.head()
Out[31]:
ADMITTEDQUOTE ADMITTEDVALUE BEICLOSE BID BOARDID BOARDNAME BUYBACKDATE CBRCLOSE CLOSE CLOSEAUCTIONPRICE ... TRENDWAP TRENDWAPPR TYPE VALUE VOLRUR VOLUME WAPRICE WAVAL YIELDLASTCOUPON YIELDTOOFFER
TRADEDATE ...
2020-07-02 37.605 670695507.0 NaN NaN TQBR NaN NaN NaN 37.6050 NaN ... NaN NaN NaN 670695507.0 NaN 17876400.0 37.5150 0.0 NaN NaN
2020-07-03 37.950 517431999.5 NaN NaN TQBR NaN NaN NaN 37.9500 NaN ... NaN NaN NaN 517431999.5 NaN 13648900.0 37.9100 0.0 NaN NaN
2020-07-06 38.560 640380631.5 NaN NaN TQBR NaN NaN NaN 38.5600 NaN ... NaN NaN NaN 640380631.5 NaN 16565100.0 38.6600 0.0 NaN NaN
2020-07-07 38.140 781482393.5 NaN NaN TQBR NaN NaN NaN 38.1400 NaN ... NaN NaN NaN 781482393.5 NaN 20373700.0 38.3550 0.0 NaN NaN
2020-07-02 NaN NaN NaN NaN CETS NaN NaN NaN 70.5525 NaN ... NaN NaN NaN NaN 1.760773e+11 NaN 70.4005 NaN NaN NaN
[5 rows x 66 columns]
In [32]: f = pdr.moex.MoexReader('SBER', '2020-07-02', '2020-07-03').read()
In [33]: f.head()
Out[33]:
ADMITTEDQUOTE ADMITTEDVALUE BEICLOSE BID BOARDID BOARDNAME BUYBACKDATE CBRCLOSE CLOSE CLOSEAUCTIONPRICE ... TRENDCLSPR TRENDWAP TRENDWAPPR TYPE VALUE VOLUME WAPRICE WAVAL YIELDLASTCOUPON YIELDTOOFFER
TRADEDATE ...
2020-07-02 210.00 1.514438e+10 NaN NaN TQBR NaN NaN NaN 209.60 NaN ... NaN NaN NaN NaN 1.589563e+10 76382050.0 208.11 0.0 NaN NaN
2020-07-03 210.08 1.036950e+10 NaN NaN TQBR NaN NaN NaN 210.95 NaN ... NaN NaN NaN NaN 1.065166e+10 50644210.0 210.32 0.0 NaN NaN
[2 rows x 65 columns]
In [34]: f = pdr.moex.MoexReader('SBER', '2020-07-02', '2020-07-03').read_all_boards()
In [35]: f.head()
Out[35]:
ADMITTEDQUOTE ADMITTEDVALUE BEICLOSE BID BOARDID BOARDNAME BUYBACKDATE CBRCLOSE CLOSE CLOSEAUCTIONPRICE ... TRENDCLSPR TRENDWAP TRENDWAPPR TYPE VALUE VOLUME WAPRICE WAVAL YIELDLASTCOUPON YIELDTOOFFER
TRADEDATE ...
2020-07-02 NaN NaN NaN 0.0 MXBD NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2020-07-03 NaN NaN NaN 0.0 MXBD NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2020-07-02 NaN 0.000000e+00 NaN NaN SMAL NaN NaN NaN 209.90 NaN ... NaN NaN NaN NaN 1.014523e+04 49.0 208.11 0.0 NaN NaN
2020-07-02 210.0 1.514438e+10 NaN NaN TQBR NaN NaN NaN 209.60 NaN ... NaN NaN NaN NaN 1.589563e+10 76382050.0 208.11 0.0 NaN NaN
2020-07-03 NaN 0.000000e+00 NaN NaN SMAL NaN NaN NaN 210.25 NaN ... NaN NaN NaN NaN 5.882460e+03 28.0 210.32 0.0 NaN NaN
[5 rows x 65 columns]
Yahoo Finance Data¶
Yahoo Finance provides stock market data
The following endpoints are available:
yahoo
- retrieve daily stock prices (high, open, close, volu,e and adjusted close)yahoo-actions
- retrieve historical corporate actions (dividends and stock splits)yahoo-dividends
- retrieve historical dividends
In [39]: import pandas_datareader.data as web
In [40]: import pandas as pd
In [41]: import datetime as dt
In [42]: df = web.DataReader('GE', 'yahoo', start='2019-09-10', end='2019-10-09')
In [43]: df.head()
Out[43]:
High Low Open Close Volume Adj Close
Date
2019-09-10 9.27 8.90 8.91 9.14 62617200.0 9.062220
2019-09-11 9.36 9.06 9.15 9.36 57094900.0 9.280347
2019-09-12 9.52 9.22 9.40 9.26 68115100.0 9.181198
2019-09-13 9.45 9.14 9.31 9.34 45589400.0 9.270529
2019-09-16 9.42 9.17 9.30 9.38 45748400.0 9.310231
In [44]: start = dt.datetime(2010, 1, 29)
In [45]: end = dt.datetime.today()
In [46]: actions = web.DataReader('GOOG', 'yahoo-actions', start, end)
In [47]: actions.head()
Out[47]:
action value
2015-04-27 SPLIT 0.997262
2014-03-27 SPLIT 0.499500
In [48]: dividends = web.DataReader('IBM', 'yahoo-dividends', start, end)
In [49]: dividends.head()
Out[49]:
action value
2021-05-07 DIVIDEND 1.64
2021-02-09 DIVIDEND 1.63
2020-11-09 DIVIDEND 1.63
2020-08-07 DIVIDEND 1.63
2020-05-07 DIVIDEND 1.63