Pandas

Posted on September 2, 2022

Tags: codeetc

indexing
- iloc selects by integer position (0-based)
- loc selects by index label

>>> s = pd.Series(list("abcdef"), index=[49, 48, 47, 0, 1, 2])
>>> s
49    a
48    b
47    c
0     d
1     e
2     f

>>> s.loc[0]    # value at index label 0
'd'

>>> s.iloc[0]   # value at index location 0
'a'

>>> s.loc[0:1]  # rows at index labels between 0 and 1 (inclusive)
0    d
1    e

>>> s.iloc[0:1] # rows at positions 0 up to, but not including, 1
49    a

selecting a column

df["wage"]

SQL equivalent:

SELECT wage FROM Dataset;
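
A minimal sketch of the column selection above. The DataFrame contents and the "name" column are made up for illustration; only the "wage" column name comes from these notes.

```python
import pandas as pd

# Hypothetical data: only the "wage" column name appears in the notes.
df = pd.DataFrame({"name": ["ann", "bob", "cy"], "wage": [10.0, 15.5, 12.0]})

wage = df["wage"]               # single brackets give a Series
wage_frame = df[["wage"]]       # a list of names gives a DataFrame
subset = df[["name", "wage"]]   # roughly SELECT name, wage FROM Dataset;

print(type(wage).__name__, type(wage_frame).__name__)
```

Single brackets return a Series, while a list of column names returns a DataFrame, which is usually the closer match to a SQL result set.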

filtering

df[df.wage > 12]
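
The filter above works through a boolean mask. A small sketch, assuming a made-up DataFrame (the "name" and "hours" columns are hypothetical, added only to show a combined condition):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "name": ["ann", "bob", "cy", "di"],
    "wage": [10.0, 15.5, 12.0, 20.0],
    "hours": [40, 35, 20, 45],
})

mask = df["wage"] > 12    # boolean Series, one entry per row
high = df[mask]           # roughly SELECT * FROM Dataset WHERE wage > 12;

# Combine conditions with & / | and parenthesize each comparison.
both = df[(df["wage"] > 12) & (df["hours"] >= 40)]

print(high["name"].tolist())
print(both["name"].tolist())
```

Python's `and` / `or` do not work element-wise on a Series, so the `&` and `|` operators are needed here.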

wh[wh.isnull().any(axis=1)] # Rows with at least one missing value
wh.dropna(axis=1).shape # Shape after dropping the columns containing missing values
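
A sketch of the two missing-value calls above on a tiny frame. The notes do not show what `wh` contains, so the column names and values here are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for wh; the real data is not shown in the notes.
wh = pd.DataFrame({
    "temp": [20.0, np.nan, 22.0],
    "wind": [5.0, 6.0, 7.0],
    "rain": [np.nan, 0.0, 1.0],
})

rows_with_nan = wh[wh.isnull().any(axis=1)]  # rows having any NaN
no_nan_cols = wh.dropna(axis=1)              # drop columns that contain NaN
no_nan_rows = wh.dropna(axis=0)              # drop rows that contain NaN

print(rows_with_nan.index.tolist())
print(no_nan_cols.shape, no_nan_rows.shape)
```

`axis=1` in `any` means "look across the columns of each row", while `axis=1` in `dropna` means "drop columns", which is easy to mix up.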

fillna method parameter
- None (default) - fill missing entries with the value passed to fillna
- ffill - fill each missing entry with the previous valid value
- bfill - fill each missing entry with the next valid value
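
A short sketch of the three fill strategies on a made-up Series. It uses the `ffill()` and `bfill()` methods directly, since newer pandas releases deprecate `fillna(method=...)`.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a gap in the middle.
s = pd.Series([1.0, np.nan, np.nan, 4.0])

filled_value = s.fillna(0)  # constant value: 1, 0, 0, 4
filled_ffill = s.ffill()    # previous valid value: 1, 1, 1, 4
filled_bfill = s.bfill()    # next valid value: 1, 4, 4, 4

print(filled_value.tolist())
print(filled_ffill.tolist())
print(filled_bfill.tolist())
```

With ffill, a leading NaN stays NaN because nothing precedes it. The same holds for a trailing NaN with bfill.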

import pandas as pd
url = r'https://en.wikipedia.org/wiki/List_of_S%26P_500_companies'
tables = pd.read_html(url) # Returns list of all tables on page
sp500_table = tables[0] # Select table of interest

import pandas as pd
tables = pd.read_html("https://www.finra.org/investors/learn-to-invest/advanced-investing/margin-statistics")