Problem when importing into R with fixed column widths

I am trying to import my data with fixed column widths. After the first special character appears, the number of characters per column changes. What could be the reason for this? File: https://gist.github.com/1902r/0b596431e15bde833e9d9a0640e12ba7

library(readr)
columns_widths <- c(3, 8, 7)
source_data <- "data.rpt"
source_data_raw <- read_fwf(source_data, fwf_widths(columns_widths))

I also tried reading the data with a different reader, but it has no argument for fixed column widths. That is my stumbling block.

Best answer

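The code below, which uses utils::read.fwf() from base R, should solve your problem. You need this approach if the (unquoted) character fields themselves contain spaces (e.g. "Jo hn"). Otherwise, data.table::fread() without any arguments should be fine.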

UTF-8-BOM encodings should not be used (if possible). See https://cran.r-project.org/doc/manuals/r-patched/R-data.pdf and https://en.wikipedia.org/wiki/Byte_order_mark
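
One way to check whether a file starts with a UTF-8 byte order mark is to compare its first three bytes with EF BB BF. The sketch below does this for a temporary file that stands in for data.rpt (the file contents and the variable names are assumptions for illustration), and then reads the header through a connection opened with encoding = "UTF-8-BOM", which drops the mark on input.

```r
# UTF-8 byte order mark: the three bytes EF BB BF.
bom <- as.raw(c(0xEF, 0xBB, 0xBF))

# Hypothetical stand-in for data.rpt: a BOM followed by an ASCII header line.
tmp <- tempfile(fileext = ".rpt")
writeBin(c(bom, charToRaw("ID Name    Amount\n")), tmp)

# Compare the first three bytes of the file with the BOM.
first_bytes <- readBin(tmp, what = "raw", n = 3L)
has_bom <- identical(first_bytes, bom)
has_bom   # TRUE for this file

# Reading through a "UTF-8-BOM" connection removes the mark from the text.
con <- file(tmp, open = "r", encoding = "UTF-8-BOM")
header <- readLines(con, n = 1L)
close(con)
header    # "ID Name    Amount"
```

If has_bom is FALSE for a given file, plain "UTF-8" is probably the right encoding to pass instead.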

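With readr, the column widths appear to be counted in bytes rather than characters. As a result, rows that contain multi-byte characters are shifted.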
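
The displaced name in the sample data is consistent with byte counting: in UTF-8, "í" and "š" take two bytes each, so "Bíluška" is 7 characters but 9 bytes, and a field of width 8 cut by bytes ends after "Bílušk". A minimal check in base R, with the name written using \u escapes so that it does not depend on the encoding of the script file (the variable names are illustrative):

```r
# "Bíluška" written with Unicode escapes; R stores it as UTF-8.
name <- "B\u00edlu\u0161ka"

n_chars <- nchar(name, type = "chars")   # 7 characters
n_bytes <- nchar(name, type = "bytes")   # 9 bytes

# Cutting after 8 bytes, as a byte-based reader would for a width-8 field,
# leaves the final "a" to spill into the next column.
first8 <- rawToChar(charToRaw(name)[1:8])
Encoding(first8) <- "UTF-8"
first8                                   # "Bílušk"
```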

library(readr)
columns_widths <- c(3, 8, 7)
source_data <- "https://gist.githubusercontent.com/1902r/0b596431e15bde833e9d9a0640e12ba7/raw/215e50d1db79b4a7aeb3560680a55bba8c9f1503/data.rpt"
source_data_raw <- read_fwf(source_data, fwf_widths(columns_widths))
#> Rows: 4 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> 
#> chr (3): X1, X2, X3
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
source_data_raw
#> # A tibble: 4 × 3
#>   X1    X2     X3    
#>   <chr> <chr>  <chr> 
#> 1 ID    Name   Amount
#> 2 1     Jo hn  100   
#> 3 2     Bílušk a 450 
#> 4 3     Jane   200

## first read colnames from file
datar_names <- read.fwf(source_data,
  widths = columns_widths,
  n = 1, fileEncoding = "UTF-8-BOM",
  strip.white = TRUE
)

## read data using names from above
datar <- read.fwf(source_data,
  widths = columns_widths,
  skip = 1, col.names = datar_names,
  fileEncoding = "UTF-8-BOM",
  strip.white = TRUE
)
datar
#>   ID    Name Amount
#> 1  1   Jo hn    100
#> 2  2 Bíluška    450
#> 3  3    Jane    200
str(datar)
#> 'data.frame':	3 obs. of  3 variables:
#>  $ ID    : int  1 2 3
#>  $ Name  : chr  "Jo hn" "Bíluška" "Jane"
#>  $ Amount: int  100 450 200

Created on 2024-04-26 with reprex v2.1.0
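
As a rough alternative sketch (not part of the reprex above), fixed-width fields can also be cut by character position with substring(), which counts characters rather than bytes for UTF-8 strings. The sample lines, the helper split_fixed(), and the variable names are assumptions for illustration. Real input would more likely come from readLines() on a connection opened with encoding = "UTF-8-BOM".

```r
# Column widths from the question, counted in characters.
columns_widths <- c(3, 8, 7)

# Sample lines imitating the layout of data.rpt (an assumption, not the real file).
lines_in <- c(
  "ID Name    Amount",
  "1  Jo hn   100",
  "2  B\u00edlu\u0161ka 450",
  "3  Jane    200"
)

# Start and end position of each field, in characters.
ends <- cumsum(columns_widths)
starts <- ends - columns_widths + 1

# Hypothetical helper: cut one line into its fields and trim the padding.
split_fixed <- function(line) trimws(substring(line, starts, ends))

# One row per input line, one column per field.
fields <- t(vapply(lines_in, split_fixed, character(3), USE.NAMES = FALSE))

# The first row holds the column names; the rest is data (kept as character).
parsed <- as.data.frame(fields[-1, , drop = FALSE])
names(parsed) <- fields[1, ]
parsed
```

All columns stay as character here. A call such as type.convert(parsed, as.is = TRUE) could be added if integer columns are wanted.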
