My dataset is pd, and I have split it into training and testing data as pd_train1 and pd_train2. The first rows look like this:
      sku national_inv lead_time in_transit_qty forecast_3_month forecast_6_month
1 3921548            8        12              0                0                0
2 3191009           83         2             33              157              377
3 2935810            8         4              0                0                0
4 2205847           31         4             63               70              160
5 4953497            3        12              0                0                0
6 2286884            0         8              0                0                0
  forecast_9_month sales_1_month sales_3_month sales_6_month sales_9_month min_bank
1                0             1             1             2             5        2
2              603            44            98           148           156       53
3                0             0             0             1             1        0
4              223            27            90           164           219        0
5                0             0             0             0             0        0
6                0             0             0             0             0        0
  potential_issue pieces_past_due perf_6_month_avg perf_12_month_avg local_bo_qty
1               0               0             0.63              0.75            0
2               0               0             0.68              0.66            0
3               0               0             0.73              0.78            0
4               0               0             0.73              0.78            0
5               0               0             0.81              0.74            0
6               0               0             0.91              0.96            0
  deck_risk oe_constraint ppap_risk stop_auto_buy rev_stop went_on_backorder  data
1         0             0         0             1        0                No train
2         0             0         0             1        0                No train
3         0             0         0             1        0                No train
4         0             0         1             1        0                No train
5         0             0         0             1        0                No train
6         0             0         0             1        0                No train
I want to fit a linear model with lm on my training data pd_train1,
but I get the following error:
> fit=lm(went_on_backorder~.,data=pd_train1)
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  NA/NaN/Inf in 'y'
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion
I tried searching for infinite values:
sapply(pd_train1, function(x) sum(is.infinite(x)))
              sku      national_inv         lead_time    in_transit_qty  forecast_3_month
                0                 0                 0                 0                 0
 forecast_6_month  forecast_9_month     sales_1_month     sales_3_month     sales_6_month
                0                 0                 0                 0                 0
    sales_9_month          min_bank   potential_issue   pieces_past_due  perf_6_month_avg
                0                 0                 0                 0                 0
perf_12_month_avg      local_bo_qty         deck_risk     oe_constraint         ppap_risk
                0                 0                 0                 0                 0
    stop_auto_buy          rev_stop went_on_backorder              data
                0                 0                 0                 0
I also checked for NA/NaN values in the training data that I want to fit the linear model on:
              sku      national_inv         lead_time    in_transit_qty  forecast_3_month
                0                 0                 0                 0                 0
 forecast_6_month  forecast_9_month     sales_1_month     sales_3_month     sales_6_month
                0                 0                 0                 0                 0
    sales_9_month          min_bank   potential_issue   pieces_past_due  perf_6_month_avg
                0                 0                 0                 0                 0
perf_12_month_avg      local_bo_qty         deck_risk     oe_constraint         ppap_risk
                0                 0                 0                 0                 0
    stop_auto_buy          rev_stop went_on_backorder
                0                 0                 0

Inf %in% pd_train1$went_on_backorder
[1] FALSE
NaN %in% pd_test$went_on_backorder
[1] FALSE
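The command that produced the NA/NaN counts is not shown above. A minimal sketch of such a check, assuming it mirrors the is.infinite call, could look like the following. Here toy is a hypothetical stand-in for pd_train1 with only three of the columns, not the real data.

```r
# Hypothetical stand-in for pd_train1; values copied from the first rows of the preview.
toy <- data.frame(
  national_inv      = c(8, 83, 8),
  lead_time         = c(12, 2, 4),
  went_on_backorder = c("No", "No", "No"),
  stringsAsFactors  = FALSE
)

# Count missing values per column; is.na() is TRUE for both NA and NaN.
na_counts <- sapply(toy, function(x) sum(is.na(x)))
print(na_counts)
```

Note that is.na() only reports values that are already missing. A text column such as went_on_backorder has no missing entries, so it passes this check even though it is not numeric.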
So I cannot find any NA/NaN/Inf values in my dataset. Can someone please help me understand why this throws an error? Here went_on_backorder is my target variable.
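One check not covered above is the type of the target column. The preview shows went_on_backorder holding text such as "No", while lm() expects a numeric response, and the "NAs introduced by coercion" warning would be consistent with R trying to convert that text to numbers. This is only an assumption about the cause and has not been confirmed on the real data. A sketch on a hypothetical toy data frame follows; the "Yes" values, the backorder_num column, and the 0/1 recoding are all made up for illustration.

```r
# Hypothetical toy data; the "Yes" entries are assumed, the preview only shows "No".
toy <- data.frame(
  national_inv      = c(8, 83, 8, 31, 3, 0),
  lead_time         = c(12, 2, 4, 4, 12, 8),
  went_on_backorder = c("No", "No", "Yes", "No", "Yes", "No"),
  stringsAsFactors  = FALSE
)

# Inspect the type of the target column.
target_class <- class(toy$went_on_backorder)
print(target_class)

# Converting the text to numbers is where the NAs appear.
coerced <- suppressWarnings(as.numeric(toy$went_on_backorder))
print(sum(is.na(coerced)))

# Assumed recoding: 1 for "Yes", 0 for "No".
toy$backorder_num <- as.integer(toy$went_on_backorder == "Yes")

# With a numeric response the model can be fitted.
fit <- lm(backorder_num ~ national_inv + lead_time, data = toy)
print(coef(fit))
```

If class(pd_train1$went_on_backorder) turns out to be "character" on the real data, a recoding along these lines may be worth trying before calling lm().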