python – Pandas: ValueError: cannot convert float NaN to integer

For identifying NaN values use boolean indexing:

print(df[df['x'].isnull()])

Then, to get rid of all non-numeric values, use to_numeric with the parameter errors='coerce', which replaces non-numeric values with NaNs:

df['x'] = pd.to_numeric(df['x'], errors='coerce')

To remove all rows with NaNs in column x, use dropna:

df = df.dropna(subset=['x'])

Finally, convert the values to ints:

df['x'] = df['x'].astype(int)
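
The steps above can be put together as a minimal sketch. The sample data and the column name 'x' are made up for illustration. Here the coercion runs before the NaN check so that non-numeric strings show up in the inspection as well:

```python
import numpy as np
import pandas as pd

# Hypothetical data: column 'x' mixes numeric strings, a non-numeric
# string and a missing value.
df = pd.DataFrame({'x': ['1', '2', 'abc', np.nan, '5']})

# Coerce anything non-numeric to NaN (the column becomes float64).
df['x'] = pd.to_numeric(df['x'], errors='coerce')

# Inspect the rows that are NaN after coercion.
print(df[df['x'].isnull()])

# Drop those rows; with no NaNs left, the cast to int is safe.
df = df.dropna(subset=['x'])
df['x'] = df['x'].astype(int)

print(df['x'].tolist())
# [1, 2, 5]
```

Note that this approach discards rows, so it only fits cases where the incomplete rows are not needed.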

ValueError: cannot convert float NaN to integer

From v0.24, you actually can. pandas introduced Nullable Integer Data Types, which allow integers to coexist with NaNs.

Given a series of whole float numbers with missing data,

import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, np.nan, 4.0])
s

0    1.0
1    2.0
2    NaN
3    4.0
dtype: float64

s.dtype
# dtype('float64')

You can convert it to a nullable integer type (choose one of 'Int16', 'Int32', or 'Int64') with:

s2 = s.astype('Int32') # note the 'I' is uppercase
s2

0      1
1      2
2    NaN
3      4
dtype: Int32

s2.dtype
# Int32Dtype()

Your column needs to have whole numbers for the cast to happen. Anything else will raise a TypeError:

s = pd.Series([1.1, 2.0, np.nan, 4.0])

s.astype('Int32')
# TypeError: cannot safely cast non-equivalent float64 to int32
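
If the fractional parts are just noise, one possible workaround is to round before casting. This is only a sketch, and it assumes that rounding is acceptable for your data, which the answer above does not claim:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.1, 2.0, np.nan, 4.0])

# The direct cast fails because 1.1 is not a whole number.
# ValueError is caught as well, in case the exception type
# differs between pandas versions.
try:
    s.astype('Int32')
    cast_failed = False
except (TypeError, ValueError):
    cast_failed = True

# Assumption: rounding is acceptable here. NaN survives round()
# and stays a missing value in the nullable integer column.
rounded = s.round().astype('Int32')
print(rounded.dtype)
# Int32
```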

Also, even with the latest versions of pandas, if the column is of object type you have to convert it to float first, something like:

# np.float was removed in NumPy 1.24; the built-in float is equivalent
df[column_name].astype(float).astype('Int32')

NB: You have to go through float first and then to nullable Int32, for some reason.
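
As a small illustration of the two-step cast, here is a sketch with a made-up object column holding whole numbers as strings plus one missing value. The column name and data are assumptions:

```python
import numpy as np
import pandas as pd

# Hypothetical object column: numeric strings and a missing value.
df = pd.DataFrame(
    {'column_name': pd.Series(['1', '2', np.nan, '4'], dtype=object)}
)

as_float = df['column_name'].astype(float)  # object -> float64
as_int = as_float.astype('Int32')           # float64 -> nullable Int32

print(as_int.dtype)
# Int32
```

If the column may also contain strings that are not numbers, pd.to_numeric with errors='coerce' is a possible substitute for the first step.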

Whether you need a 32-bit or a 64-bit int depends on your data. Be aware that you may lose information if your numbers are too big for the chosen type.
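
One way to choose the size is to compare the column's range against the limits reported by np.iinfo. The sketch below uses made-up data, and the selection logic is an illustration rather than anything pandas does for you:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the last value is larger than a 32-bit int can hold.
s = pd.Series([1.0, np.nan, 3_000_000_000.0])

# Limits of a signed 32-bit integer.
int32_info = np.iinfo(np.int32)

# Series.min() and Series.max() skip NaN by default.
fits_int32 = int32_info.min <= s.min() and s.max() <= int32_info.max

target = 'Int32' if fits_int32 else 'Int64'
s2 = s.astype(target)

print(s2.dtype)
# Int64
```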
