Representing XML in CSV: advice for dealing with child tags

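I'm building a data visualisation, and I'd like to use CSV as my underlying data format because it is lightweight and easy to use. My source data is in heavy XML, so I'm converting it to CSV using Python and lxml.

My question is this. When I have multiple child tags in the XML, such as the following <City> tags: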

<Country>
   <Name>France</Name>
   <Cities>
   <City><Name>Paris</Name></City>
   <City><Name>Lyon</Name></City>
   </Cities>
</Country>
<Country>
   <Name>Germany</Name>
   <Cities>
   <City><Name>Berlin</Name></City>
   <City><Name>Munich</Name></City>
   <City><Name>Aachen</Name></City>
   </Cities>
</Country>

how should I represent them in my CSV file? I can think of two options. The first is to add a column for each city, up to CityN:

 Country,City1,City2,City3
 France,Paris,Lyon,
 Germany,Berlin,Munich,Aachen

The second is to use a single array for all cities:

 Country,Cities
 France,[Paris,Lyon]
 Germany,[Berlin,Munich,Aachen]

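Perhaps the best format simply depends on how I want to query the data, but I thought I'd check here to see whether there is an established or better way of doing this.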

Answer

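Given that you will be using CSV, the array version preserves the field-based structure of the document. Without the array notation, the comma is overloaded as both the field separator and the separator between values within a field, and there is no way to tell which role it is playing except by counting fields from the left of the record.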
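A minimal sketch of the point above, using Python's standard csv module and the rows from the question. Encoding the list with json.dumps is an assumption of this sketch; the question does not prescribe how the array field is serialised. It also shows that the array field has to be quoted, since an unquoted `[Paris,Lyon]` is still split on its inner comma.

```python
import csv
import io
import json

rows = [("France", ["Paris", "Lyon"]), ("Germany", ["Berlin", "Munich", "Aachen"])]

# An unquoted array field is still split on its inner comma by a CSV parser.
naive = next(csv.reader(["France,[Paris,Lyon]"]))
print(naive)  # ['France', '[Paris', 'Lyon]']

# Write one array-valued field per record. JSON is an assumed encoding for
# the list; csv.writer quotes the field because it contains commas.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["Country", "Cities"])
for country, cities in rows:
    writer.writerow([country, json.dumps(cities)])
csv_text = buf.getvalue()

# Reading back: every record has exactly two fields and the list survives intact.
parsed = [
    (rec["Country"], json.loads(rec["Cities"]))
    for rec in csv.DictReader(io.StringIO(csv_text))
]
print(parsed)
```

With quoting in place, the outer commas are always field separators and the inner ones are always part of the array value, so no field counting is needed.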

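The non-array version also limits your data to one nested collection per record type. That is not a problem in the current example, but it may become one for another record type in your application. Using a single, standard approach throughout improves clarity and maintainability.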
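As a rough sketch of how the array approach extends to more than one nested collection, the conversion below emits one JSON-encoded column per collection. Several things here are assumptions rather than anything from the question: the `<Countries>` wrapper element (added so the fragment parses as a single document), the helper names `country_rows` and `to_csv`, and the JSON encoding. The standard-library xml.etree.ElementTree is used so the example runs without extra dependencies; lxml.etree provides the same fromstring, findall and findtext calls.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The question's data, wrapped in an assumed <Countries> root element.
XML = """<Countries>
<Country>
   <Name>France</Name>
   <Cities>
   <City><Name>Paris</Name></City>
   <City><Name>Lyon</Name></City>
   </Cities>
</Country>
<Country>
   <Name>Germany</Name>
   <Cities>
   <City><Name>Berlin</Name></City>
   <City><Name>Munich</Name></City>
   <City><Name>Aachen</Name></City>
   </Cities>
</Country>
</Countries>"""


def country_rows(xml_text, collections):
    """Yield one row per <Country>: its name, then one JSON list per collection.

    `collections` maps a container tag to its child tag, e.g. {"Cities": "City"}.
    """
    root = ET.fromstring(xml_text)
    for country in root.findall("Country"):
        row = [country.findtext("Name")]
        for container, child in collections.items():
            names = [c.findtext("Name") for c in country.findall(f"{container}/{child}")]
            row.append(json.dumps(names))
        yield row


def to_csv(xml_text, collections):
    """Return CSV text with a header row and one array column per collection."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Country", *collections])
    writer.writerows(country_rows(xml_text, collections))
    return buf.getvalue()


csv_text = to_csv(XML, {"Cities": "City"})
print(csv_text)
```

Adding a second entry to the mapping adds a second array column without changing the row layout, which the CityN layout cannot do. A collection that is missing from a record simply becomes an empty list.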




