Questions tagged [missing-data]
For questions relating to missing data problems, which can involve special data structures, algorithms, statistical methods, modeling techniques, visualization, among other considerations.
2,870 questions
0
votes
0
answers
26
views
"preferred_username" and Email are present in token, but show null in claims .NET 4.8
I am using a .NET 4.8 application with Angular v22. When I try to log in via a controller method, it always throws an exception saying that the email claim is missing, even though the claim is present in the JWT token.
Here ...
1
vote
0
answers
33
views
iOS Firebase EXC_BAD_ACCESS KERN_INVALID_ADDRESS
Found multiple crashes in Firebase Crashlytics with following info:
Crashed: com.apple.main-thread
EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x000000000000002b
...
0
votes
1
answer
94
views
Database data missing when reading - ASP NET MVC
I'm working on a messaging platform type sales platform. A buyer can contact a seller about a product. Their message concerns a specific product.
The exchange of messages is only possible between a ...
0
votes
0
answers
53
views
The SAS program changed missing values to 0 while using array functions
I wrote code to modify values if the values in the two arrays satisfy a condition.
However, in the process, all missing values that did not satisfy the conditions were converted to 0.
I want to ...
-2
votes
5
answers
177
views
Get dataframe of observations dropped in estimates
When you estimate a model, the estimation function will drop observations (i.e., rows) for which at least one variable (i.e., column) used either in the LHS or in the RHS of the formula is missing.
...
0
votes
1
answer
103
views
How to resolve error in rmultinom(1, size = 1, prob = prob) : NA in probability vector
I am trying to use the smcfcs package to impute missing values. I have repeatedly run into the error seen in the title, and have not been able to find a solution anywhere.
I have tested out the code ...
0
votes
0
answers
41
views
Some json events are missing while bulkingester writes data to elasticsearch
My Configuration is
Java version JDK 17
Elastic search client version = 8.16
Elastic Cluster Version = 7.17
When we deployed the above configuration in production, we faced a severe prod issue and our ...
0
votes
0
answers
117
views
Missing data when calling an API
I am doing some work involving a dataset that I call via API every day. This gives a large JSON file of transactions. We're talking 40k plus entries a day with each having 500 parameters. ...
0
votes
0
answers
25
views
MissForest outcome imputation
I am using missForest to impute my data for my causal forest analysis. I keep running into a problem with my post imputed outcome, smoke category variable, and financial strain variables should be ...
0
votes
2
answers
108
views
Missing values in olive oil dataset
I have a dataset of olive oil samples and the goal of creating a classification model for oil quality. I'm having trouble deciding how to deal with missing data. Have a look at the data here if you ...
-1
votes
3
answers
196
views
Drop rows with missing values in all columns [duplicate]
It looks like tidyr's drop_na will drop rows if any of the specified columns contain missing values.
Example:
...
1
vote
0
answers
59
views
How can I add zero for empty or missing rows?
I have been trying to resolve this for two days and feel the need for help. I've created a cumulative graph, only it's showing as cumulative! That is because there aren't necessarily rows of data ...
1
vote
2
answers
217
views
Python - How to check for missing values not represented by NaN? [duplicate]
I am looking for guidance on how to check for missing values in a DataFrame that are not the typical "NaN" or "np.nan" in Python. I have a dataset/DataFrame that has a string ...
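A minimal sketch of one common approach to the question above, assuming the missing entries are encoded as placeholder strings. The sentinel list and the column names are hypothetical, since the excerpt does not show the actual data; the idea is to convert the placeholders to NaN so that the usual `isna()` checks work.

```python
import pandas as pd

# Hypothetical placeholder strings; replace with the ones the dataset really uses.
SENTINELS = ["", "?", "N/A", "unknown"]

def mask_sentinels(df, sentinels=SENTINELS):
    """Return a copy of df in which placeholder strings are replaced by NaN."""
    cleaned = df.copy()
    for col in cleaned.columns:
        # Strip surrounding whitespace from strings only; leave other values as they are.
        stripped = cleaned[col].map(lambda v: v.strip() if isinstance(v, str) else v)
        # Series.mask replaces entries where the condition is True with NaN.
        cleaned[col] = stripped.mask(stripped.isin(sentinels))
    return cleaned

# Illustrative data (not from the question).
df = pd.DataFrame({"city": ["Oslo", "?", " N/A "], "age": [31, 45, 28]})
missing_counts = mask_sentinels(df).isna().sum()
print(missing_counts)
```

When the data comes from a CSV file, passing the same list through the `na_values` parameter of `pandas.read_csv` achieves a similar result at load time.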
1
vote
1
answer
107
views
Why does RandomForestClassifier in scikit-learn predict even on all-NaN input?
I am training a random forest classifier in python sklearn, see code below-
...
0
votes
0
answers
163
views
Why does ydata-profiling not detect missing values in PySpark DataFrame when using None?
I'm using ydata-profiling to generate profiling reports from a large PySpark DataFrame without converting it to Pandas (to avoid memory issues on large datasets).
Some columns contain the string "...
1
vote
0
answers
147
views
R mice leaves missing values when I use a where-matrix
I have a large data frame with a lot of variables measured at three time points t1, t2 and t3. I only want to impute those missings where the according time point was answered at all, that is where ...
1
vote
1
answer
58
views
Creating Artificial Gaps in R Dataset [duplicate]
I am processing data using Random Forest, and I am trying to create random artificial gaps in my dataset so that I can test how accurate the random forest predictions are.
...
0
votes
2
answers
71
views
How do I get my data to not disappear when I click another fragment? Android Studio
I am trying to make an app that controls the aspects of a garden: changing the temperature, the humidity, wind, etc. My new issue is that my data keeps disappearing after I click another ...
0
votes
0
answers
26
views
Pandas - How to backfill a main dataframe with values from another while prioritizing the main dataframe [duplicate]
SET UP MY PROBLEM
I have two pandas dataframes. First, I have main:
...
0
votes
0
answers
57
views
Box-Cox transformation in SAS Proc MI
I have data that are responses to several multi-item scales. My plan is to use Multiple Imputation (via Proc MI in SAS) to deal with missing values, and then examine the relationships among the scale ...
0
votes
4
answers
179
views
How to insert a new row for each missing number of one variable while filling with NA for other variables using data.table in R?
How to insert a new row every time there is a break in the y column numbering, while filling the other columns with NA?
Edit: ...
1
vote
0
answers
123
views
Losing bytes over PySerial.write()
I've been consistently running into a problem when running some basic code using the PySerial library in Python.
Specifically, when I use the write function given by PySerial, I lose bytes of the ...
1
vote
1
answer
64
views
Creating 5 complete data sets from one incomplete data set in a simulation study [mice package in R]
For a study, I need to generate five complete data sets for each of the 100 incomplete data sets with the help of mice package in R.
This code is working correctly (...
2
votes
1
answer
319
views
R mice code for delta sensitivity analysis for a binary variable : apparently not working
I am doing multiple imputation with R mice before my primary analysis which is a binary logistic regression with functional decline as outcome and a number of variables as predictors (continuous and ...
0
votes
1
answer
96
views
Handling Systematic Missing Values in a Dataset for Logistic Regression, LDA, and Tree-Based Models
I'm working on a project with a dataset that has quite a lot of missing values—really a lot.
Here's the output of colSums(is.na(dati_train)), showing the number of ...
0
votes
2
answers
119
views
How to handle complementary pairs of rows and fill missing values based on reference column?
I have a genetic dataset as shown below, which contains replicates (same genomic positions) in the column pos. I want to group the data by ...
0
votes
2
answers
91
views
Why doesn't replace_na with mutate work as expected?
Why:
data.frame(a=c(TRUE,NA)) %>% mutate(a=replace_na(FALSE))
returns
a
1 FALSE
2 FALSE
and so set the entire column ...
0
votes
1
answer
151
views
How can I retrieve company name and position of a user using Linkedin developer API
After the OpenId update I can't see any way to retrieve company name and the user position
openid
Use your name and photo
profile
Use your name and photo
w_member_social
Create, modify, and delete ...
0
votes
1
answer
201
views
How to make shap.plots.scatter with xgboost.DMatrix holding missing data?
I have a dataset with missing data. They are encoded as NaN. This is fine for model fitting with XGBoost. When I want to understand the model, analyzing model importance with SHAP scatter plots, I am ...
2
votes
2
answers
86
views
Merge two dataframes and keep non-missing entries
I have two dataframes like this:
...
0
votes
1
answer
84
views
How to generate missing data in one dataframe based on a distribution from another dataframe
I have these two dataframes. In the first one, I have category2 and category3, while category1 is missing. I need to fill this for each year, each month, each class and each region based on ...
0
votes
1
answer
53
views
Cloudkit records disappeared
App crashed showing error index out of range. In cloudkit dashboard, querying the table showed half of the records missing.
No changes were made to either the app or the database since the last ...
1
vote
2
answers
63
views
Complete data sequence with NA for missing months in R?
I have a data.frame like the one below where I don't have records for the winter months, i.e. January, February, November and December. I want to complete the ...
0
votes
1
answer
48
views
Calculating net assets with missing values: discrepancy in results
I'm working with a wealth component dataset that includes variables for housing, business, financial assets, loans, and non-housing loans. These variables have varying levels of randomly allocated ...
0
votes
1
answer
81
views
ggplot with shade for the missing months in R?
Update: Adding uncertainty bounds to the original data.frame that should be part of the plot
I've asked this question before, but I'm asking again to see if there's another method for shading missing ...
0
votes
1
answer
38
views
Converting DDL MasterFile table to SQL Server table, How do I write in missing data?
I am working on converting DDL master files to SQL Server tables, and every one of them has MISSING=ON in the file, which I am interpreting as an "if missing then make aware" sort of thing. ...
1
vote
3
answers
144
views
How to fill missing values based on relationships between existing data
Question
How to fill missing values of a pandas dataframe based on the relationship between an existing preceding row (predictions for a commodity), and an associated existing value in another column ...
2
votes
3
answers
1k
views
Optimizing pandas performance on large datasets
I'm working with a large dataset (~10 million rows and 50 columns) in pandas and experiencing significant performance issues during data manipulation and analysis. The operations include filtering, ...
0
votes
0
answers
55
views
Missing Output Exception Error in Snakemake with output directory modified
I'm trying to run a workflow on snakemake. I have to automate a couple of steps which are depending all on python scripts or pipelines already made. My rule Gene_flow_between _species has to run once ...
2
votes
0
answers
90
views
Completely deleting elements from Matlab R2020b cell array [duplicate]
I have a Matlab R2020b cell array with <missing> values:
...
1
vote
1
answer
521
views
<Missing> Values in Cell Arrays Created from Matlab readcell()
I am uploading data from an Excel file which was provided in a specific format, for which a minimum reproducible example is shown below:
I am trying to save each column into an array, using the ...
2
votes
2
answers
119
views
Bug report when emmeans() is used along with fct_na_value_to_level()
This is an MRE that shows an inconsistency in the use of emmeans() along with fct_na_value_to_level().
It wasn't easy to get the why of the error from my initial code ;)
I prefer to put it here ...
0
votes
1
answer
71
views
Using impute_knn from simputation package in r
I am trying to replace missing value using K nearest neighbor classification (KNN). After intensive search, I found simputation package which I can use. The documentation gives the following format:
<...
0
votes
1
answer
78
views
Interpolate zero values only if one zero and surrounding values are bigger than zero
I want to interpolate zero values in a time series dataframe but only if: 1) there is only one missing value, so the preceding and subsequent values are non-zero, 2) the surrounding non-zero values are ...
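A minimal sketch of the rule stated in the title, assuming an evenly spaced series so that linear interpolation reduces to the mean of the two neighbours. The second condition in the excerpt is truncated and is not implemented here; the function name and the sample values are hypothetical.

```python
import pandas as pd

def fill_isolated_zeros(s):
    """Replace a zero with the mean of its neighbours only when both neighbours are > 0."""
    prev_val = s.shift(1)    # value one step earlier
    next_val = s.shift(-1)   # value one step later
    # A zero is "isolated" when the values on both sides are strictly positive.
    # Comparisons with the NaN produced by shift() at the edges evaluate to False.
    isolated = (s == 0) & (prev_val > 0) & (next_val > 0)
    return s.mask(isolated, (prev_val + next_val) / 2)

# Illustrative data: one isolated zero, then a run of two zeros.
s = pd.Series([4.0, 0.0, 6.0, 0.0, 0.0, 8.0])
result = fill_isolated_zeros(s)
print(result.tolist())
```

The run of two consecutive zeros is left untouched because each of them has a zero neighbour.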
0
votes
1
answer
39
views
assigning row median to NA values [duplicate]
I have this data set:
...
1
vote
0
answers
73
views
How can you generate MCAR, MNAR and MAR missingness pattern in a dataset using python?
For MCAR, it is simply missing completely at random, and using random might work for MCAR.
Example:
...
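A minimal sketch of the MCAR case described above, assuming a pandas DataFrame: every cell is deleted independently with the same probability, regardless of any value in the data. The function name `make_mcar`, the `frac` parameter and the sample frame are hypothetical. MAR and MNAR are not covered; they would require the deletion probability to depend on observed or unobserved values respectively.

```python
import numpy as np
import pandas as pd

def make_mcar(df, frac, seed=0):
    """Return a copy of df in which each cell is set to NaN with probability frac."""
    rng = np.random.default_rng(seed)
    # One independent uniform draw per cell; the mask ignores the data entirely,
    # which is what makes the missingness "completely at random".
    mask = rng.random(df.shape) < frac
    return df.mask(mask)

# Illustrative data (not from the question).
df = pd.DataFrame({"x": np.arange(1000.0), "y": np.arange(1000.0)})
holed = make_mcar(df, frac=0.2)
observed_rate = holed.isna().to_numpy().mean()
print(round(observed_rate, 3))
```

The realised missing rate fluctuates around `frac` because each cell is drawn independently; fixing `seed` keeps the pattern reproducible.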
0
votes
1
answer
277
views
Mixed-effects models: Does lmer function really do listwise deletion?
I read that the default setting of lme4 is listwise deletion. My data is in long format (repeated measures with two time points) and it doesn't appear that it really deletes listwise (as I understand ...
1
vote
2
answers
88
views
Replace missing rows in dataframe A, along with their corresponding values with data from dataframe B
I have two dataframes (A and B). Dataframe A contains weather from 2010 to 2013. But the row which is supposed to contain data for the first day is missing for each of the years (e.g. '2013-01-01'). ...
0
votes
1
answer
30
views
How do I fill missing values in a dataframe with the values found in another dataframe using a lookup value? [duplicate]
Basically, I have a df with car model+year as a string in 1 column. The dataframe is a collection of used cars for sale so there are duplicate model+year rows but not duplicated entire rows. In this ...
0
votes
1
answer
52
views
In R, when missing values exist, how to draw line graphs? [duplicate]
Here is one dataset.
...