
Questions tagged [missing-data]

For questions about missing-data problems, which can involve special data structures, algorithms, statistical methods, modeling techniques, and visualization, among other considerations.

0 votes
0 answers
26 views

I am using a .NET 4.8 application with Angular v22. When I try to log in via a controller method, it always throws an exception saying that the email claim is missing, even though it is present in the JWT token. Here ...
Ashu's user avatar
  • 40
1 vote
0 answers
33 views

Found multiple crashes in Firebase Crashlytics with the following info: Crashed: com.apple.main-thread EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x000000000000002b ...
Hikari's user avatar
  • 11
0 votes
1 answer
94 views

I'm working on a sales platform with messaging. A buyer can contact a seller about a product. Their message concerns a specific product. The exchange of messages is only possible between a ...
AMP59's user avatar
  • 45
0 votes
0 answers
53 views

I wrote code to modify values if the values in the two arrays satisfy a condition. However, in the process, all missing values that did not satisfy the conditions were converted to 0. I want to ...
user31457645's user avatar
-2 votes
5 answers
177 views

When you estimate a model, the estimation function will drop observations (i.e., rows) for which at least one variable (i.e., column) used either in the LHS or in the RHS of the formula is missing. ...
robertspierre's user avatar
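
The row-dropping behavior described in this excerpt is often called listwise deletion or complete-case analysis. As a minimal sketch in plain Python (not the asker's estimation function, and with hypothetical column names `y`, `x1`, `x2`, `z`), only the columns that appear in the model decide whether a row is kept:

```python
import math

def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def complete_cases(rows, columns):
    """Keep only rows with no missing value in any of the given columns."""
    return [row for row in rows if not any(is_missing(row[c]) for c in columns)]

# Hypothetical data: y is the LHS, x1 and x2 are the RHS; z is not in the model.
rows = [
    {"y": 1.0, "x1": 2.0, "x2": 3.0, "z": None},           # kept: z is not used
    {"y": None, "x1": 2.5, "x2": 1.0, "z": 4.0},           # dropped: y missing
    {"y": 2.0, "x1": float("nan"), "x2": 0.5, "z": 1.0},   # dropped: x1 missing
    {"y": 3.0, "x1": 1.5, "x2": 2.5, "z": 2.0},            # kept
]

used = complete_cases(rows, ["y", "x1", "x2"])
print(len(used), "of", len(rows), "rows would enter the model")
```

Comparing `len(used)` with `len(rows)` before fitting is a cheap way to see how many observations a model would silently discard.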
0 votes
1 answer
103 views

I am trying to use the smcfcs package to impute missing values. I have repeatedly run into the error seen in the title, and have not been able to find a solution anywhere. I have tested out the code ...
Jenny's user avatar
  • 69
0 votes
0 answers
41 views

My configuration is: Java version JDK 17, Elasticsearch client version 8.16, Elasticsearch cluster version 7.17. When we deployed the above configuration in production, we faced a severe production issue and our ...
uma mahesh's user avatar
0 votes
0 answers
117 views

I am doing some work involving a dataset that I call via API every day. This gives a large JSON file of transactions. We're talking 40k plus entries a day with each having 500 parameters. ...
Noob2025's user avatar
0 votes
0 answers
25 views

I am using miffForest to impute my data for my causal forest analysis. I keep running into a problem with my post imputed outcome, smoke category variable, and financial strain variables should be ...
NICOLA CHURCHILL's user avatar
0 votes
2 answers
108 views

I have a dataset of olive oil samples and the goal of creating a classification model for oil quality. I'm having trouble deciding how to deal with missing data. Have a look at the data here if you ...
BOBTHEBUILDER's user avatar
-1 votes
3 answers
196 views

It looks like tidyr's drop_na will drop rows if any of the specified columns contain missing values. Example: ...
robertspierre's user avatar
1 vote
0 answers
59 views

I have been trying to resolve this for two days and feel the need for help. I've created a cumulative graph, only it's showing as cumulative! That is because there aren't necessarily rows of data ...
Sally Parkes's user avatar
1 vote
2 answers
217 views

I am looking for guidance on how to check for missing values in a DataFrame that are not the typical "NaN" or "np.nan" in Python. I have a dataset/DataFrame that has a string ...
gnocchi17's user avatar
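
One way to approach the question above is to normalize each value and compare it against a set of sentinel spellings. The sketch below uses plain Python lists rather than a DataFrame so that it runs without dependencies; the token list `MISSING_TOKENS` and the sample columns are assumptions for illustration, not taken from the asker's data:

```python
import math

# Assumed sentinel spellings; adjust to whatever the dataset actually uses.
MISSING_TOKENS = {"", "na", "n/a", "nan", "null", "none", "missing", "?", "-"}

def is_missing(value):
    """True for None, float NaN, and strings that match a sentinel token."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    if isinstance(value, str):
        return value.strip().lower() in MISSING_TOKENS
    return False

def count_missing(table):
    """table maps column name -> list of values; returns missing count per column."""
    return {name: sum(is_missing(v) for v in values) for name, values in table.items()}

# Hypothetical columns mixing real values with disguised missing markers.
table = {
    "age": [34, None, 51, float("nan")],
    "city": ["Oslo", "N/A", " ", "Lima"],
    "status": ["ok", "Missing", "ok", "?"],
}

counts = count_missing(table)
print(counts)
```

In pandas, a comparable effect can usually be had by converting the sentinels to NaN first, for example with `DataFrame.replace` or the `na_values` argument of `pd.read_csv`, and then calling `isna()`.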
1 vote
1 answer
107 views

I am training a random forest classifier in Python with sklearn; see the code below: ...
lsr729's user avatar
  • 854
0 votes
0 answers
163 views

I'm using ydata-profiling to generate profiling reports from a large PySpark DataFrame without converting it to Pandas (to avoid memory issues on large datasets). Some columns contain the string "...
hexxetexxeh's user avatar
1 vote
0 answers
147 views

I have a large data frame with a lot of variables measured at three time points t1, t2 and t3. I only want to impute those missing values where the corresponding time point was answered at all, that is where ...
Qwertzu-iop's user avatar
1 vote
1 answer
58 views

I am processing data using Random Forest, and I am trying to create random artificial gaps in my dataset so that I can test how accurate the random forest predictions are. ...
shrimp's user avatar
  • 101
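
The idea in this excerpt, hiding known values at random and scoring predictions against them, can be sketched as follows. This is an illustration under stated assumptions: the helper names `mask_at_random` and `rmse` are hypothetical, and a simple mean fill stands in for the random forest, whose predictions would replace `predicted`:

```python
import random

def mask_at_random(values, fraction, seed=0):
    """Return a copy with a random fraction of entries set to None, plus their indices."""
    rng = random.Random(seed)  # seeded so the gaps are reproducible
    n_mask = round(len(values) * fraction)
    masked_idx = sorted(rng.sample(range(len(values)), n_mask))
    masked = list(values)
    for i in masked_idx:
        masked[i] = None
    return masked, masked_idx

def rmse(truth, predicted, idx):
    """Root mean squared error, evaluated only at the masked positions."""
    return (sum((truth[i] - predicted[i]) ** 2 for i in idx) / len(idx)) ** 0.5

truth = [float(i) for i in range(20)]          # hypothetical complete series
masked, idx = mask_at_random(truth, 0.25, seed=42)

# Stand-in "model": fill every gap with the mean of the observed values.
observed = [v for v in masked if v is not None]
fill = sum(observed) / len(observed)
predicted = [fill if v is None else v for v in masked]

print("masked positions:", idx)
print("RMSE at masked positions:", rmse(truth, predicted, idx))
```

Keeping the original values in `truth` and scoring only at `idx` is the important design choice: error on positions that were never hidden says nothing about imputation quality.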
0 votes
2 answers
71 views

I am trying to make an app that controls aspects of a garden: changing the temperature, the humidity, wind, etc. My new issue is that my data keeps disappearing after I click another ...
Isa's user avatar
  • 23
0 votes
0 answers
26 views

Setting up my problem: I have two pandas dataframes. First, I have main: ...
bismo's user avatar
  • 1,655
0 votes
0 answers
57 views

I have data that are responses to several multi-item scales. My plan is to use Multiple Imputation (via Proc MI in SAS) to deal with missing values, and then examine the relationships among the scale ...
Steve Scher's user avatar
0 votes
4 answers
179 views

How to insert a new row every time there is a break in the y column numbering, while filling the other columns with NA? Edit: ...
denis's user avatar
  • 1,248
1 vote
0 answers
123 views

I've been consistently running into a problem when running some basic code using the PySerial library in Python. Specifically, when I use the write function given by PySerial, I lose bytes of the ...
namettra's user avatar
1 vote
1 answer
64 views

For a study, I need to generate five complete data sets for each of the 100 incomplete data sets with the help of mice package in R. This code is working correctly (...
MetehanGungor's user avatar
2 votes
1 answer
319 views

I am doing multiple imputation with R mice before my primary analysis which is a binary logistic regression with functional decline as outcome and a number of variables as predictors (continuous and ...
paola's user avatar
  • 135
0 votes
1 answer
96 views

I'm working on a project with a dataset that has quite a lot of missing values—really a lot. Here's the output of colSums(is.na(dati_train)), showing the number of ...
giulio lo verde's user avatar
0 votes
2 answers
119 views

I have a genetic dataset as shown below, which contains replicates (same genomic positions) in the column pos. I want to group the data by ...
SAL's user avatar
  • 2,296
0 votes
2 answers
91 views

Why does data.frame(a=c(TRUE,NA)) %>% mutate(a=replace_na(FALSE)) return a column a with FALSE in both rows, and so set the entire column ...
robertspierre's user avatar
0 votes
1 answer
151 views

After the OpenID update I can't see any way to retrieve the company name and the user's position. Scopes: openid (Use your name and photo), profile (Use your name and photo), w_member_social (Create, modify, and delete ...
Sahil's user avatar
  • 1
0 votes
1 answer
201 views

I have a dataset with missing data. They are encoded as NaN. This is fine for model fitting with XGBoost. When I want to understand the model, analyzing model importance with SHAP scatter plots, I am ...
LudvigH's user avatar
  • 4,973
2 votes
2 answers
86 views

I have two dataframes like this: ...
LulY's user avatar
  • 1,405
0 votes
1 answer
84 views

I have these two dataframes. In the first one, I have category2 and category3, while category1 is missing. I need to fill this for each year, each month, each class and each region based on ...
user12715151's user avatar
0 votes
1 answer
53 views

The app crashed with an "index out of range" error. In the CloudKit dashboard, querying the table showed half of the records missing. No changes were made to either the app or the database since the last ...
nixxe's user avatar
  • 1
1 vote
2 answers
63 views

I have a data.frame like the one below, where I don't have records for the winter months, i.e. January, February, November and December. I want to complete the ...
Hydro's user avatar
  • 1,127
0 votes
1 answer
48 views

I'm working with a wealth component dataset that includes variables for housing, business, financial assets, loans, and non-housing loans. These variables have varying levels of randomly allocated ...
Jack's user avatar
  • 867
0 votes
1 answer
81 views

Update: Adding uncertainty bounds to the original data.frame that should be part of the plot I've asked this question before, but I'm asking again to see if there's another method for shading missing ...
Hydro's user avatar
  • 1,127
0 votes
1 answer
38 views

I am working on converting DDL master files to SQL Server tables, and every one of them has MISSING=ON in the file, which I am interpreting as an "if missing then make aware" sort of thing. ...
Georgia Miller's user avatar
1 vote
3 answers
144 views

Question: How to fill missing values of a pandas dataframe based on the relationship between an existing preceding row (predictions for a commodity), and an associated existing value in another column ...
vestland's user avatar
  • 62.2k
2 votes
3 answers
1k views

I'm working with a large dataset (~10 million rows and 50 columns) in pandas and experiencing significant performance issues during data manipulation and analysis. The operations include filtering, ...
Olusoji's user avatar
  • 21
0 votes
0 answers
55 views

I'm trying to run a workflow on Snakemake. I have to automate a couple of steps, all of which depend on Python scripts or pipelines that already exist. My rule Gene_flow_between _species has to run once ...
Awa's user avatar
  • 1
2 votes
0 answers
90 views

I have a MATLAB R2020b cell array with <missing> values: ...
AmericanJael's user avatar
1 vote
1 answer
521 views

I am uploading data from an Excel file which was provided in a specific format, for which a minimal reproducible example is shown below: I am trying to save each column into an array, using the ...
AmericanJael's user avatar
2 votes
2 answers
119 views

This is an MRE that shows an inconsistency in the use of emmeans() along with fct_na_value_to_level(). It wasn't easy to get the why of the error from my initial code ;) I prefer to put it here ...
doana's user avatar
  • 75
0 votes
1 answer
71 views

I am trying to replace missing value using K nearest neighbor classification (KNN). After intensive search, I found simputation package which I can use. The documentation gives the following format: <...
Faisal Mustafa's user avatar
0 votes
1 answer
78 views

I want to interpolate zero values in a time series dataframe but only if: 1) there is only one missing value, so the subsequent and preceding values are non-zero, 2) the surrounding non-zero values are ...
Dove_pigeon's user avatar
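
The first condition in this excerpt (a single zero flanked by non-zero neighbors) can be sketched in plain Python as below. The second condition is truncated in the excerpt, so it is deliberately not implemented; the function name `fill_isolated_zeros` and the sample series are hypothetical, and linear interpolation between the two neighbors is an assumed choice:

```python
def fill_isolated_zeros(values):
    """Replace a zero with the mean of its neighbors only when both neighbors are non-zero."""
    out = list(values)
    # Endpoints are skipped because they have only one neighbor.
    for i in range(1, len(values) - 1):
        # Read neighbors from the original list so filled values never cascade.
        if values[i] == 0 and values[i - 1] != 0 and values[i + 1] != 0:
            out[i] = (values[i - 1] + values[i + 1]) / 2
    return out

# Hypothetical series: one isolated zero, one run of two zeros, one trailing zero.
series = [5, 0, 7, 0, 0, 9, 0]
filled = fill_isolated_zeros(series)
print(filled)  # [5, 6.0, 7, 0, 0, 9, 0]
```

Only the isolated zero at position 1 is filled; the run of two zeros and the trailing zero are left untouched, which matches the "only one missing value" requirement.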
0 votes
1 answer
39 views

I have this data set: ...
MetehanGungor's user avatar
1 vote
0 answers
73 views

For MCAR, it is simply missing completely at random, and using random might work for MCAR. Example: ...
Rajesh Bhandari's user avatar
0 votes
1 answer
277 views

I read that the default setting of lme4 is listwise deletion. My data is in long format (repeated measures with two time points) and it doesn't appear that it really deletes listwise (as I understand ...
FabRic's user avatar
  • 1
1 vote
2 answers
88 views

I have two dataframes (A and B). Dataframe A contains weather from 2010 to 2013. But the row which is supposed to contain data for the first day is missing for each of the years (e.g. '2013-01-01'). ...
Kelechi Igwe's user avatar
0 votes
1 answer
30 views

Basically, I have a df with car model+year as a string in 1 column. The dataframe is a collection of used cars for sale so there are duplicate model+year rows but not duplicated entire rows. In this ...
Dalv32's user avatar
  • 1
0 votes
1 answer
52 views

Here is one dataset. ...
J.K Kim's user avatar
  • 964
