python – How to apply a function to two columns of Pandas dataframe

Here's an example using apply on the dataframe, called with axis=1.

The difference is that, instead of trying to pass two values to the function f, you rewrite f to accept a pandas Series object and then index the Series to get the values you need.

In [49]: df
Out[49]: 
          0         1
0  1.000000  0.000000
1 -0.494375  0.570994
2  1.000000  0.000000
3  1.876360 -0.229738
4  1.000000  0.000000

In [50]: def f(x):    
   ....:  return x[0] + x[1]  
   ....:  

In [51]: df.apply(f, axis=1) #passes a Series object, row-wise
Out[51]: 
0    1.000000
1    0.076619
2    1.000000
3    1.646622
4    1.000000
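
For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same idea. The frame below is hypothetical data that copies the first rows shown above, and the names row and result are only illustrative.

```python
import pandas as pd

# Hypothetical data copying the first three rows of the frame above.
df = pd.DataFrame({0: [1.0, -0.494375, 1.0], 1: [0.0, 0.570994, 0.0]})

def f(row):
    # With axis=1, each call receives one row as a Series whose
    # index holds the column labels (here the integers 0 and 1).
    return row[0] + row[1]

result = df.apply(f, axis=1)
print(result)
```

Because f only ever sees one argument, any number of columns can be read inside it by label.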

Depending on your use case, it is sometimes helpful to create a pandas groupby object and then use apply on each group.
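
A rough sketch of what that could look like, assuming a hypothetical grouping column named key and a made-up function combine (neither comes from the question). Each group arrives as a small DataFrame, so both columns are available together.

```python
import pandas as pd

# Hypothetical data: "key" is an assumed grouping column.
df = pd.DataFrame({
    "key": ["a", "a", "b"],
    "col_1": [1, 2, 3],
    "col_2": [10, 20, 30],
})

def combine(group):
    # group is a sub-DataFrame holding the rows of one key,
    # so the function can use both columns at once.
    return (group["col_1"] * group["col_2"]).sum()

# Selecting the two columns first keeps the grouping column out of the groups.
result = df.groupby("key")[["col_1", "col_2"]].apply(combine)
print(result)
```

This only makes sense when the function is meant to work on several rows at a time; for a purely row-by-row calculation, apply with axis=1 is the more direct fit.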

There is a clean, one-line way of doing this in Pandas:

df['col_3'] = df.apply(lambda x: f(x.col_1, x.col_2), axis=1)

This allows f to be a user-defined function with multiple input values, and it accesses the columns by name rather than by numeric position, which is less error-prone if the column order ever changes.

Example with data (based on original question):

import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})
mylist = ['a', 'b', 'c', 'd', 'e', 'f']

def get_sublist(sta,end):
    return mylist[sta:end+1]

df['col_3'] = df.apply(lambda x: get_sublist(x.col_1, x.col_2), axis=1)

Output of print(df):

   ID  col_1  col_2      col_3
0   1      0      1     [a, b]
1   2      2      4  [c, d, e]
2   3      3      5  [d, e, f]

If your column names contain spaces or share a name with an existing dataframe attribute, you can index with square brackets:

df['col_3'] = df.apply(lambda x: f(x['col 1'], x['col 2']), axis=1)
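
To illustrate why brackets matter, here is a small sketch with made-up column names: one contains a space, and the other is called size, which clashes with the built-in size attribute. The function f and the variable attr are hypothetical too.

```python
import pandas as pd

# Hypothetical frame: "col 1" has a space, "size" clashes with an attribute.
df = pd.DataFrame({"col 1": [0, 2, 3], "size": [1, 4, 5]})

def f(a, b):
    return a + b

# Attribute access does not reach the column: x.size is the number of
# elements in the row Series (2 here), not the value stored under "size".
attr = df.apply(lambda x: x.size, axis=1)

# Bracket access always looks the label up, so it returns the column value.
df["col_3"] = df.apply(lambda x: f(x["col 1"], x["size"]), axis=1)
print(df)
```

Using brackets everywhere is a reasonable default if the column names are not under your control.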

A simple solution is:

df['col_3'] = df[['col_1', 'col_2']].apply(lambda x: f(*x), axis=1)
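
One thing to keep in mind with this form is that *x unpacks the row positionally, so the arguments reach f in the order the columns are listed. A minimal sketch, assuming a hypothetical f that subtracts its second argument from its first:

```python
import pandas as pd

df = pd.DataFrame({"ID": [1, 2, 3], "col_1": [0, 2, 3], "col_2": [1, 4, 5]})

def f(a, b):
    # Hypothetical two-argument function; subtraction makes the order visible.
    return a - b

# Arguments follow the order of the selected columns: f(col_1, col_2).
df["col_3"] = df[["col_1", "col_2"]].apply(lambda x: f(*x), axis=1)

# Listing the columns the other way round swaps the arguments: f(col_2, col_1).
df["col_4"] = df[["col_2", "col_1"]].apply(lambda x: f(*x), axis=1)
print(df)
```

Selecting only the needed columns first also means f never receives extra values such as ID.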
