How to make a Ruby string safe for a filesystem?

I have user entries as filenames. Of course this is not a good idea, so I want to drop everything except [a-z], [A-Z], [0-9], _ and -.

For instance:

my§document$is°°   very&interesting___thisIs%nice445.doc.pdf

should become

my_document_is_____very_interesting___thisIs_nice445_doc.pdf

and ideally

my_document_is_very_interesting_thisIs_nice445_doc.pdf

Is there a nice and elegant way of doing this?

Best answer

From http://web.archive.org/web/20110529023841/http://devblog.muziboo.com/2008/06/17/attachment-fu-sanitize-filename-regex-and-unicode-gotcha/:

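def sanitize_filename(filename)
  # NOTE: `returning` is a legacy ActiveSupport helper, not plain Ruby
  returning filename.strip do |name|
    # NOTE: File.basename doesn't work right with Windows paths on Unix
    # get only the filename, not the whole path
    name.gsub!(/^.*(\\|\/)/, '')

    # Replace every character outside 0-9, A-Z, a-z, "." and "-"
    # (including all non-ASCII characters) with an underscore
    name.gsub!(/[^0-9A-Za-z.\-]/, '_')
  end
end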
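
For readers without ActiveSupport, the snippet above can be approximated in plain Ruby. This is only a sketch: it assumes Object#tap is an acceptable stand-in for `returning`, and the method name sanitize_filename_plain is made up for illustration.

```ruby
# Plain-Ruby sketch of the snippet above (not the original author's code).
# Object#tap yields the stripped copy to the block and then returns it.
def sanitize_filename_plain(filename)
  filename.strip.tap do |name|
    # Keep only the last path component; handles "/" and "\" separators.
    name.gsub!(/^.*(\\|\/)/, '')
    # Replace anything outside 0-9, A-Z, a-z, "." and "-" with "_".
    name.gsub!(/[^0-9A-Za-z.\-]/, '_')
  end
end

puts sanitize_filename_plain('C:\\Users\\me\\my report (final).pdf')
# prints "my_report__final_.pdf"
```

Each disallowed character becomes its own underscore, so runs are not collapsed and inner periods are kept.
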
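Other answers

I'd like to suggest a solution that differs from the old one. Note that the old one uses the deprecated returning. By the way, it's specific to Rails anyway, and you didn't explicitly mention Rails in your question (only as a tag). Also, the existing solution fails to turn .doc.pdf into _doc.pdf, as you requested. And, of course, it doesn't collapse the underscores into one.

Here's my solution: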

def sanitize_filename(filename)
  # Split the name when finding a period which is preceded by some
  # character, and is followed by some character other than a period,
  # if there is no following period that is followed by something
  # other than a period (yeah, confusing, I know)
  fn = filename.split(/(?<=.)\.(?=[^.])(?!.*\.[^.])/m)

  # We now have one or two parts (depending on whether we could find
  # a suitable period). For each of these parts, replace any unwanted
  # sequence of characters with an underscore
  fn.map! { |s| s.gsub(/[^a-z0-9\-]+/i, '_') }

  # Finally, join the parts with a period and return the result
  fn.join('.')
end

You didn't specify all the details about the conversion. Thus, I'm making the following assumptions:

  • There should be at most one filename extension, which means that there should be at most one period in the filename
  • Trailing periods do not mark the start of an extension
  • Leading periods do not mark the start of an extension
  • Any sequence of characters beyond A-Z, a-z, 0-9 and - should be collapsed into a single _ (i.e. underscore is itself regarded as a disallowed character, and the string '$%__°#' would become '_', rather than '___' from the parts '$%', '__' and '°#')
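
The last assumption comes down to a single `+` in the character-class regex. The sketch below contrasts the two behaviours; the helper names replace_each and collapse_runs are made up for illustration.

```ruby
# Hypothetical helpers contrasting the two replacement strategies.

# Replaces each disallowed character with its own underscore.
def replace_each(str)
  str.gsub(/[^a-z0-9\-]/i, '_')
end

# Replaces each run of disallowed characters with a single underscore.
# The underscore itself is disallowed, so existing runs collapse too.
def collapse_runs(str)
  str.gsub(/[^a-z0-9\-]+/i, '_')
end

puts replace_each('$%__°#')    # prints "______"
puts collapse_runs('$%__°#')   # prints "_"
puts collapse_runs('a  b__c')  # prints "a_b_c"
```
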

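The complicated part is where I split the filename into the main part and the extension. With the help of a regular expression, I search for the last period that is followed by something other than a period, so that no later period in the string matches the same criteria. It must, however, be preceded by some character, to make sure it's not the first character in the string.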
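
To see what that split does on its own, here is a small sketch that applies the same regular expression to a few edge cases. The constant EXTENSION_SPLIT and the helper split_extension are illustrative names, not part of the answer.

```ruby
# Same pattern as in sanitize_filename, pulled out for experimentation.
#   (?<=.)        the period must be preceded by some character
#   \.            the period itself
#   (?=[^.])      it must be followed by something other than a period
#   (?!.*\.[^.])  and no later period may satisfy the same condition
EXTENSION_SPLIT = /(?<=.)\.(?=[^.])(?!.*\.[^.])/m

def split_extension(filename)
  filename.split(EXTENSION_SPLIT)
end

p split_extension('report.doc.pdf')  # => ["report.doc", "pdf"]
p split_extension('.bashrc')         # => [".bashrc"]  (leading period)
p split_extension('notes.')          # => ["notes."]   (trailing period)
p split_extension('archive')         # => ["archive"]  (no period at all)
```
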

My results from testing the function:

1.9.3p125 :006 > sanitize_filename 'my§document$is°°   very&interesting___thisIs%nice445.doc.pdf'
 => "my_document_is_very_interesting_thisIs_nice445_doc.pdf"

I think this is what you requested. I hope it is good enough.

In Rails you might also be able to use ActiveStorage::Filename#sanitized:

ActiveStorage::Filename.new("foo:bar.jpg").sanitized # => "foo-bar.jpg"
ActiveStorage::Filename.new("foo/bar.jpg").sanitized # => "foo-bar.jpg"

If you use Rails you can also use String#parameterize. It is not specifically intended for this, but you will get a satisfying result.

"my§document$is°°   very&interesting___thisIs%nice445.doc.pdf".parameterize

For Rails, I found myself wanting to keep any file extensions while using parameterize for the remainder of the characters:

filename = "my§doc$is°° very&itng___thsIs%nie445.doc.pdf"
cleaned = filename.split(".").map(&:parameterize).join(".")

For implementation details and ideas, see the source: https://github.com/rails/rails/blob/master/activesupport/lib/active_support/inflector/transliterate.rb

def parameterize(string, separator: "-", preserve_case: false)
  # (excerpt: parameterized_string is set up earlier in the real method)
  # Turn unwanted chars into the separator.
  parameterized_string.gsub!(/[^a-z0-9\-_]+/i, separator)
  #... some more stuff
end
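
Outside Rails, the idea of keeping the extension and cleaning the rest can be sketched with File.extname and File.basename from the standard library. This is a rough approximation, not what parameterize does: it reuses only the character class from the excerpt above, keeps just the last extension, and does not transliterate, downcase, or trim separators. The method name sanitize_keeping_extension is made up for illustration.

```ruby
# Hypothetical helper: clean the base name, keep only the last extension.
def sanitize_keeping_extension(filename)
  ext  = File.extname(filename)        # e.g. ".pdf", or "" when there is none
  base = File.basename(filename, ext)  # drops any directory part and the extension

  # Same character class as the parameterize excerpt, with "-" as separator.
  clean_base = base.gsub(/[^a-z0-9\-_]+/i, '-')
  # Drop anything unexpected from the extension, keeping its period.
  clean_ext = ext.gsub(/[^a-z0-9.]/i, '')

  clean_base + clean_ext
end

puts sanitize_keeping_extension('my report (v2).final.pdf')
# prints "my-report-v2-final.pdf"
```

Unlike the split(".") version, inner periods are replaced here, so only one extension survives.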



