Does R not read CSVs as they were saved with respect to factors? Columns were mutated to factors with levels and are now characters again
  • Time: 2023-12-22 23:18:17
  • Tags:
  • r
  • read-csv

I'm working with a massive spreadsheet of over 5 million rows, and I mutated 2 character columns into factors with levels: one column became a factor with 3 levels and the other a factor with 2 levels. After filtering sets of data from this source, I saved them in separate .csv files to continue working on them later. Now, when I read any of the .csv files back into RStudio, it treats all of those adjusted columns in all the tables as characters again. Do I have to redo the factor work every time I open RStudio?

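I loaded all of my previous libraries before using the read functions, except for one library, which caused a series of conflicts when I tried to manipulate the data.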

Libraries currently loaded:

library(data.table)
library(readr)
library(tidyverse)
library(lubridate)
library(dplyr)

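It still keeps the dbl and similar columns, and it saved and correctly read back the other columns I had mutated, so why did my mutated factor columns revert? I have tried a few different ways of reading the files into RStudio, but I don't know the specific way to read these files so that I can get back to where I left off without redundant work.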

I've tried the read.csv, read_csv, and data.table::fread import functions, but I feel like I'm shooting in the dark here. I thought that just importing a .csv file would get me right back to where I was when I left it. I use glimpse(df) to check whether it's being read correctly, but it's never as I left it, or it gets warped by the other import functions. If there's some special function to use in conjunction with "stringsAsFactors = FALSE, UTF-8", or if there's a special way to initially write the .csv file that I didn't use, maybe that's my answer. I'm just trying NOT to have to rerun all my factor and factor-level conversions on my now separate data sets every time I open them.

Answers

Both Phil and Onyambu make valid points, but I thought the question was how to properly read in CSV files that will be stacked and have some or all of the character-valued columns converted to factors, which is what "stringsAsFactors" controls, as you already appear to understand. The read.* functions formerly brought character columns in as factors by default, but since R 4.0.0 the default of that parameter has been FALSE, so character-valued columns are now read as plain character vectors. If you are considering stacking the results of reading multiple CSV files and converting to factors, then by all means do the stacking first, and convert the columns to factors only after that has succeeded. Otherwise you will experience the grief of trying to concatenate factor columns that have different labels and numbering systems.
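
A minimal base-R sketch of that "stack first, convert afterwards" order follows. The data frame, the column names `size` and `flag`, and their levels are made up for illustration and are not taken from the question.

```r
# Hypothetical data standing in for the filtered sets from the question.
df <- data.frame(
  id   = 1:4,
  size = factor(c("low", "high", "mid", "low"), levels = c("low", "mid", "high")),
  flag = factor(c("yes", "no", "yes", "yes"), levels = c("no", "yes"))
)

# Save two subsets to separate CSV files (temporary files for this sketch).
f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")
write.csv(df[1:2, ], f1, row.names = FALSE)
write.csv(df[3:4, ], f2, row.names = FALSE)

# Read everything back as character; the CSV holds only the labels as text.
parts <- lapply(c(f1, f2), read.csv, stringsAsFactors = FALSE)

# Stack first ...
stacked <- do.call(rbind, parts)

# ... and only then convert, with one explicit set of levels per column.
stacked$size <- factor(stacked$size, levels = c("low", "mid", "high"))
stacked$flag <- factor(stacked$flag, levels = c("no", "yes"))

str(stacked)
```

A CSV file is plain text and stores no column types, so a conversion like this has to be reapplied after every read. If the files only need to be reopened in R, writing them with saveRDS() and reading them with readRDS() is one base-R way to keep factor columns and their levels intact.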

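I admit I don't know whether data.table::fread changed its default at the same time as, or after, the R read.* functions did. That should not be hard to determine by experiment.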
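
One way to run that experiment is sketched below. The tiny CSV and the column name `grp` are invented for the test, and the data.table part only runs if that package is installed.

```r
# Write a small throwaway CSV with one character-valued column.
csv <- tempfile(fileext = ".csv")
writeLines(c("id,grp", "1,a", "2,b", "3,a"), csv)

# What does read.csv do with its defaults on this R version?
base_df <- read.csv(csv)
print(class(base_df$grp))

# What does fread do with its defaults (if data.table is available)?
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::fread(csv)
  print(class(dt$grp))
}

# Independent of the defaults, a column can be requested as a factor on read.
fac_df <- read.csv(csv, colClasses = c(grp = "factor"))
print(class(fac_df$grp))
```

Requesting the type through colClasses does not depend on the defaults, but the levels then come out in sorted order, so a factor with a specific level order still needs an explicit factor(..., levels = ...) call after reading.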




