In this tutorial, we are going to learn how to perform a range lookup using pandas’ library in Python.

This lesson is a bit more advanced and is gear toward users with at least some experience using pandas for data analysis. But if you don’t have experience working with pandas or just started with pandas, you can follow along as I will be going everything in detail.



I will be showing you two different approaches. First method is using the numpy dot function, and the 2nd method is to create a pandas interval index object.



Example 1:

import numpy as np
import pandas as pd

df_RawData = pd.DataFrame({'Dates': {0: '06-01', 1: '06-02', 2: '06-03', 3: '06-04', 4: '06-05'},
                           'Value': {0: 3, 1: 7, 2: 13, 3: 16, 4: 22}, })

df_LookupTable = pd.DataFrame({'Start': {0: 1, 1: 6, 2: 11, 3: 16},
                               'Stop': {0: 5, 1: 10, 2: 15, 3: 20},
                              'Color':  {0: 'Red', 1: 'Orange', 2: 'Green', 3: 'Blue'}, })

result = np.dot(
    (df_RawData['Value'].values[:, None] >= df_LookupTable['Start'].values) &
    (df_RawData['Value'].values[:, None] <= df_LookupTable['Stop'].values),
    df_LookupTable['Color']
)

print(result)

df_RawData['Color'] = result



Example 2:

import numpy as np
import pandas as pd

dfStudents = pd.DataFrame({
    'Student': ['Joe', 'Mark', 'Mary' ,' Jane', 'Lucy', 'Tom', 'Jerry'],
    'Overall Score': np.random.randint(49, 100, 7) 
})

dfGradeLookup = pd.DataFrame({
    'Grade': ['A', 'B', 'C', 'D', 'F'],
    'Min': [90, 80, 70, 60, 0],
    'Max': [100, 89, 79, 69, 59]
})


val = dfGradeLookup.loc[:, 'Min':'Max'].apply(tuple, 1).tolist()
print(val)

indx = pd.IntervalIndex.from_tuples(val, closed='both')
print(indx)

dfStudents['Grade'] = dfGradeLookup.loc[indx.get_indexer(dfStudents['Overall Score']), 'Grade'].values
print(dfStudents)