Data input
Let's first generate a very simple input dataset, saved as c:/temp/test.txt. The dataset is a plain text file with the following contents:
a b
1 2
3 4
The code is shown here:
>>> f = open("c:/temp/test.txt", "r")
>>> x = f.read()
>>> f.close()
The print() function can be used to show the value of x:
>>> print(x)
a b
1 2
3 4
>>>
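The same read can also be written with a with statement, which closes the file automatically when the block ends. The following is a minimal sketch, not the book's own code: it first writes the three-line sample to a temporary file (the temporary path is an assumption that stands in for c:/temp/test.txt) and then reads it back.

```python
import os
import tempfile

# Write the sample dataset to a temporary file
# (a stand-in for c:/temp/test.txt, used here for illustration only).
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as f:
    f.write("a b\n1 2\n3 4\n")

# Read the whole file; the with block closes it on exit.
with open(path, "r") as f:
    x = f.read()

print(x)

# Split the text into rows and columns for further processing.
rows = [line.split() for line in x.splitlines()]
print(rows)
```

Note that read() returns one string, so every value in rows is still text and has to be converted with int() or float() before any arithmetic.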
For the second example, let's first download the daily historical prices for IBM from Yahoo!Finance. To do so, we visit http://finance.yahoo.com, enter IBM to find its related web page, click Historical Data, and then click Download.
Assume that we save the daily data as ibm.csv under c:/temp/. The first five lines are shown here:
Date,Open,High,Low,Close,Volume,Adj Close
2016-11-04,152.399994,153.639999,151.869995,152.429993,2440700,152.429993
2016-11-03,152.509995,153.740005,151.800003,152.369995,2878800,152.369995
2016-11-02,152.479996,153.350006,151.669998,151.949997,3074400,151.949997
2016-11-01,153.50,153.910004,151.740005,152.789993,3191900,152.789993
The first line shows the variable names: date, open price, high price achieved during the trading day, low price achieved during the trading day, close price of the last transaction during the trading day, trading volume, and adjusted price for the trading day. The delimiter is a comma. There are several ways of loading the text file. Some methods are discussed here:
- Method I: We could use read_csv from the pandas module:
  >>> import pandas as pd
  >>> x = pd.read_csv("c:/temp/ibm.csv")
  >>> x[1:3]
           Date        Open        High         Low       Close   Volume  \
  1  2016-11-02  152.479996  153.350006  151.669998  151.949997  3074400
  2  2016-11-01  153.500000  153.910004  151.740005  152.789993  3191900

      Adj Close
  1  151.949997
  2  152.789993
  >>>
- Method II: We could use read_table from the pandas module, passing the comma as the separator; see the following code:
  >>> import pandas as pd
  >>> x = pd.read_table("c:/temp/ibm.csv", sep=',')
Alternatively, we could read the IBM daily price data directly from a URL without saving the file first. The following code uses a copy of the data hosted on the author's web page:
>>> import pandas as pd
>>> url = 'http://canisius.edu/~yany/data/ibm.csv'
>>> x = pd.read_csv(url)
>>> x[1:5]
         Date        Open        High         Low       Close   Volume  \
1  2016-11-03  152.509995  153.740005  151.800003  152.369995  2843600
2  2016-11-02  152.479996  153.350006  151.669998  151.949997  3074400
3  2016-11-01  153.500000  153.910004  151.740005  152.789993  3191900
4  2016-10-31  152.759995  154.330002  152.759995  153.690002  3553200

    Adj Close
1  152.369995
2  151.949997
3  152.789993
4  153.690002
>>>
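To experiment with read_csv and read_table without a saved file or a network connection, the first rows of ibm.csv listed earlier can be supplied through io.StringIO. The following is a sketch under that assumption; the variable names text, y, and z are introduced here for illustration, and the parse_dates and index_col arguments are an optional extra that the text above does not use.

```python
import io
import pandas as pd

# First rows of ibm.csv as listed earlier, held in memory instead of on disk.
text = """Date,Open,High,Low,Close,Volume,Adj Close
2016-11-04,152.399994,153.639999,151.869995,152.429993,2440700,152.429993
2016-11-03,152.509995,153.740005,151.800003,152.369995,2878800,152.369995
2016-11-02,152.479996,153.350006,151.669998,151.949997,3074400,151.949997
2016-11-01,153.50,153.910004,151.740005,152.789993,3191900,152.789993
"""

# Method I: read_csv uses a comma as the default delimiter.
x = pd.read_csv(io.StringIO(text))

# Method II: read_table defaults to tabs, so the comma must be given.
y = pd.read_table(io.StringIO(text), sep=",")

# Compare the two results; both calls parse the same text.
print(x.equals(y))

# Optional: parse the Date column and use it as the index.
z = pd.read_csv(io.StringIO(text), parse_dates=["Date"], index_col="Date")
print(z["Close"].max())
```

With dates as the index, rows can be selected by date rather than by position, which is usually more convenient for price series.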
We could retrieve data from an Excel file by using the ExcelFile class from the pandas module. First, we generate an Excel file with just a few observations; see the following screenshot:
Let's call this Excel file stockReturns.xlsx and assume that it is saved under c:/temp/. The Python code is given here:
>>> infile = pd.ExcelFile("c:/temp/stockReturns.xlsx")
>>> x = infile.parse("Sheet1")
>>> x
   date  returnA  returnB
0  2001     0.10     0.12
1  2002     0.03     0.05
2  2003     0.12     0.15
3  2004     0.20     0.22
>>>
To retrieve Python datasets with an extension of .pkl or .pickle, we can use the following code. First, we download the Python dataset called ffMonthly.pkl from the author's web page at http://www3.canisius.edu/~yany/python/ffMonthly.pkl. Assume that the dataset is saved under c:/temp/. The function called read_pickle() included in the pandas module can be used to load a dataset with an extension of .pkl or .pickle:
>>> import pandas as pd
>>> x = pd.read_pickle("c:/temp/ffMonthly.pkl")
>>> x[1:3]
        Mkt_Rf     SMB     HML      Rf
196308  0.0507 -0.0085  0.0163  0.0042
196309 -0.0157 -0.0050  0.0019 -0.0080
>>>
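The counterpart of read_pickle() is the DataFrame method to_pickle(), which saves a dataset in the same format. The following round trip is a sketch only: the small DataFrame copies the two rows printed above, the integer index is an assumption about how ffMonthly.pkl stores its dates, and the file name demo.pkl is hypothetical.

```python
import os
import tempfile
import pandas as pd

# Small frame mimicking the layout of the output above
# (the integer index is an assumption for illustration).
df = pd.DataFrame(
    {
        "Mkt_Rf": [0.0507, -0.0157],
        "SMB": [-0.0085, -0.0050],
        "HML": [0.0163, 0.0019],
        "Rf": [0.0042, -0.0080],
    },
    index=[196308, 196309],
)

# Save to a temporary location (demo.pkl is a hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "demo.pkl")
df.to_pickle(path)

# Load it back and confirm that nothing changed on the way.
y = pd.read_pickle(path)
print(y.equals(df))
```

Because pickle files can execute code when loaded, it is safest to call read_pickle() only on files from a source we trust.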
The following is the simplest if statement: when our interest rate is negative, print a warning message:
if r < 0:
    print("interest rate is less than zero")
Conditions related to logical AND and OR are shown here:
>>> if a > 0 and b > 0:
        print("both positive")
>>> if a > 0 or b > 0:
        print("at least one is positive")
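To see how the two conditions behave for different inputs, they can be wrapped in a small function and called with a few pairs of values. This is a sketch for illustration; the function name describe_signs and the sample values are assumptions, not part of the original code.

```python
def describe_signs(a, b):
    """Return the messages that apply to the pair (a, b)."""
    messages = []
    if a > 0 and b > 0:   # true only when both conditions hold
        messages.append("both positive")
    if a > 0 or b > 0:    # true when at least one condition holds
        messages.append("at least one is positive")
    return messages


print(describe_signs(3, 5))
print(describe_signs(3, -1))
print(describe_signs(-3, -1))
```

Whenever the and condition is true, the or condition is true as well, so the first call returns both messages, the second only the or message, and the third an empty list.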
For multiple if...elif conditions, the following program illustrates their application by converting a number grade to a letter grade:
grade = 74
if grade >= 90:
    print('A')
elif grade >= 85:
    print('A-')
elif grade >= 80:
    print('B+')
elif grade >= 75:
    print('B')
elif grade >= 70:
    print('B-')
elif grade >= 65:
    print('C+')
else:
    print('D')
Note that it is a good idea to end such a multiple if...elif statement with an else clause, so that we know exactly what the result is when none of the conditions is met.
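Wrapping the chain in a function makes it easy to check the boundary values and to confirm that the else branch catches everything below the lowest threshold. The following sketch reuses the thresholds from the program above; the function name letter_grade is an assumption introduced for illustration.

```python
def letter_grade(grade):
    """Convert a number grade to a letter grade."""
    # Conditions are tested from the top; the first true one wins.
    if grade >= 90:
        return 'A'
    elif grade >= 85:
        return 'A-'
    elif grade >= 80:
        return 'B+'
    elif grade >= 75:
        return 'B'
    elif grade >= 70:
        return 'B-'
    elif grade >= 65:
        return 'C+'
    else:
        # Reached only when every condition above is false.
        return 'D'


for g in (95, 75, 74, 64):
    print(g, letter_grade(g))
```

The order of the conditions matters: because each elif is evaluated only when all earlier conditions are false, the thresholds must run from highest to lowest.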