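The table below shows the speed of comparing 4 files with identical copies of themselves, by 7 different methods. The 4 files were chosen for convenience from a path on my computer and a corresponding path on an external drive. They are all video files, which should not be relevant except for their internal structure, which turned out to have a significant effect on one of the methods. The table shows the speed of the comparison process, in Mb/sec, for each method and each file. The speed was calculated by dividing the size of the file by the time the process took to complete. The scripts that ran and timed the comparisons are presented further below.
The table columns are: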
- Size: The size of the files in Mb.
- PS: PowerShell version in which the process was run. A hyphen ("-") indicates the process was run in a Windows batch file, not in PowerShell. The scripts were run directly in Windows, not in a scripting environment (ISE or VS Code).
- The methods:
  - Comp: The Windows command `comp`.
  - FC: The Windows command `FC`.
  - Compare-object: The PowerShell command `compare-object` acting on a `get-content` of each file to be compared.
  - Compare raw: The PowerShell command `compare-object` in which the `get-content`s have the parameter `-raw`.
  - (not included:) Compare as byte: I attempted to include the PowerShell command `compare-object` in which the `get-content`s have just the parameter `-encoding byte` (PS 5) or `-AsByteStream` (PS 7), but this sat for over a half-hour in both PS 5 and 7, so either the process hung or it took so long that it might as well have hung.
  - Compare as byte raw: The PowerShell command `compare-object` in which the `get-content`s have the parameters `-encoding byte` (PS 5) or `-AsByteStream` (PS 7) plus `-raw`.
  - Compare as byte read 0: The PowerShell command `compare-object` in which the `get-content`s have the parameters `-encoding byte` (PS 5) or `-AsByteStream` (PS 7) plus `-ReadCount 0`.
  - Buffered: The PowerShell custom function `bFilesCompareBinary`, which performs a buffered comparison (code included in script below).
  - (not included:) Hash comparisons. All the methods tested do direct byte-by-byte comparisons of the file contents.
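To make the difference between the `get-content` variants in the list above concrete, here is a minimal sketch (not part of the measurements) of what each call hands to `compare-object`. The temporary file and the variable names are made up for illustration.

```powershell
# Hypothetical three-line text file standing in for a real file.
$sFile = [System.IO.Path]::GetTempFileName()
Set-Content -Path $sFile -Value "line 1", "line 2", "line 3"

# Plain get-content: an array of strings, one element per line.
$oLines = Get-Content $sFile
# With -raw: a single string holding the whole file.
$oRaw = Get-Content $sFile -Raw
# As bytes with -raw: a single byte array (parameter name depends on the PowerShell version).
if ($PSVersionTable.PSVersion.Major -ge 6)
    {$oBytes = Get-Content $sFile -AsByteStream -Raw}
else
    {$oBytes = Get-Content $sFile -Encoding Byte -Raw}

"Lines: {0} elements of type {1}" -f $oLines.Count, $oLines[0].GetType().Name
"Raw:   one object of type {0}" -f $oRaw.GetType().Name
"Bytes: {0} elements of type {1}" -f $oBytes.Count, $oBytes[0].GetType().Name

Remove-Item $sFile
```

A file with few line breaks gives plain `get-content` only a few very long strings, while the byte variants always produce one element per byte.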
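Results
In all cases, the buffered method was fastest.
Because the files tested were identical, every measurement had to compare all the bytes in the files. For files that are not necessarily identical, the Windows commands and the buffered method can quit as soon as they find a difference, so they would run faster. The `compare-object` methods compare the entire files, even if the first byte is different.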
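As a rough illustration of the early exit described above, the following sketch compares two in-memory buffers that differ in their very first byte and stops at the first mismatch. The buffers and variable names are hypothetical; this is not the code that was measured.

```powershell
# Hypothetical data: two 1 MB buffers that differ only in their first byte.
$aBytes_1 = New-Object byte[] 1048576
$aBytes_2 = New-Object byte[] 1048576
$aBytes_2[0] = 1

# Early-exit comparison: leave the loop at the first mismatch
# instead of examining the remaining bytes.
$nFirstDif = -1
for ($i = 0; $i -lt $aBytes_1.Length; $i++)
    {if ($aBytes_1[$i] -ne $aBytes_2[$i]) {$nFirstDif = $i ; break}}

"First difference at offset $nFirstDif"
```

Here the loop ends after a single iteration, whereas a comparison with no early exit would still walk through the whole megabyte.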
| Size | PS | Comp | FC | Compare-object | Compare raw | Compare as byte raw | Compare as byte read 0 | Buffered |
|---|---|---|---|---|---|---|---|---|
| 74 | - | 29.0 | 30.3 | | | | | |
| "" | 5 | 29.2 | 30.2 | 4.1 | 18.7 | 0.5 | 0.5 | 35.2 |
| "" | 7 | 29.2 | 30.9 | 3.4 | 20.7 | 1.2 | 0.9 | 36.5 |
| 66 | - | 25.3 | 26.2 | | | | | |
| "" | 5 | 25.5 | 26.1 | 5.6 | 20.4 | 0.5 | 0.5 | 35.4 |
| "" | 7 | 25.4 | 26.3 | 2.8 | 22.0 | 1.2 | 1.0 | 37.1 |
| 162 | - | 25.6 | 26.1 | | | | | |
| "" | 5 | 25.5 | 26.5 | 15.0 | 18.7 | 0.5 | Error | 35.8 |
| "" | 7 | 25.8 | 26.8 | 17.8 | 24.6 | 1.2 | 1.0 | 36.8 |
| 56 | - | 25.5 | 25.8 | | | | | |
| "" | 5 | 25.5 | 26.0 | 21.6 | 3.0 | 0.5 | 0.5 | 35.2 |
| "" | 7 | 26.0 | 26.5 | 17.6 | 25.1 | 1.3 | 1.1 | 36.0 |
Table: Speed, in Mb/sec, of comparing four identical pairs of files (identified by their size in Mb) by seven methods running in Windows batch, in Windows PowerShell 5.1, and in PowerShell 7.
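Note that, with the `Compare-object` method, the third and fourth files ran much faster than the first two. This is what my original question asked about, and it is explained in the answers there.
Errors and warnings
In the case shown as "Error" in the table (Compare as byte read 0 in PS 5 on the largest file), the process crashed with a message ending "exceeded supported range".
As I noted elsewhere, the `compare-object` methods fail when presented with a pair of files of 3.7 Gb.
Warning: In preliminary tests, the results seemed to show that the Windows command `FC` was about seven times faster than the buffered method. I had been using the buffered method to compare a 1 Tb folder with its backup, which took about 10 hours. Thinking `FC` could do the job faster, I rewrote my script, repeated the comparison, and was confounded to find that it took 14 hours. Then I realised that in the initial results the files had been cached by Windows when I compared them with `comp`, so the comparison ran much faster when I did it again with `FC`. In the results reported above, the measurements were made with an empty cache. I could not find a way to remove files from the cache, so each measurement was made immediately after rebooting the computer (and without running anything else).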
Environment
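Measurements were made on an AMD Ryzen 7 Pro 6850H processor with 32 Gb of RAM, running Windows 11 Pro 64-bit. For each pair, one file was stored on the internal SSD and the other on an external USB SSD.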
Code
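I would welcome feedback on improving these scripts, with two provisos. First, I know the batch scripts are crude; they were thrown together quickly to get the job done, and I paid more attention to the design of the PowerShell script. Second, I know my coding style is unconventional, but I have been developing it for years, and if you don't like it, I can only apologise. But if you see ways to improve how the scripts work, please say so.
I would also be interested to know whether anyone else runs the scripts and gets results similar to or different from mine.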
Batch script for `comp` and `FC`:

```bat
rem Script: "measure speed - comp.bat"
rem Measure the time taken to compare two files using "comp" running in a Windows batch script.
rem To ensure that none of the files is in cache, run this immediately after booting the computer.
rem "time < nul" prints the current time without waiting for a new time to be entered.
time < nul
comp /m "<path 1><file 1>" "<path 2><file 1>"
time < nul
comp /m "<path 1><file 2>" "<path 2><file 2>"
time < nul
comp /m "<path 1><file 3>" "<path 2><file 3>"
time < nul
comp /m "<path 1><file 4>" "<path 2><file 4>"
time < nul
```

The console output was copied and pasted into Excel, which then subtracted the times to get the elapsed time of each process. The batch for `FC` was the same, with `comp /m` replaced by `FC /b`.
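For anyone who would rather not subtract clock times by hand, the same elapsed-time and Mb/sec arithmetic can be sketched in PowerShell with a stopwatch. This is only an illustration under stated assumptions, not how the batch figures above were obtained: the temporary files stand in for a real pair, the variable names are made up, and 1 Mb is taken to be 1,048,576 bytes.

```powershell
# Hypothetical stand-in for a real pair: a 1 MB temp file and an identical copy.
$sFile_1 = [System.IO.Path]::GetTempFileName()
$sFile_2 = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllBytes($sFile_1, (New-Object byte[] 1048576))
Copy-Item $sFile_1 $sFile_2 -Force

# Time one comparison. "FC.exe" is spelled out because "fc" alone
# is an alias for Format-Custom in PowerShell.
$oWatch = [System.Diagnostics.Stopwatch]::StartNew()
FC.exe /b $sFile_1 $sFile_2 | Out-Null
$oWatch.Stop()

# Speed = file size divided by elapsed time.
$nSizeMb = (Get-Item $sFile_1).Length / 1MB
$nSpeed = $nSizeMb / $oWatch.Elapsed.TotalSeconds
"{0:N1} Mb in {1:N3} s = {2:N1} Mb/sec" -f $nSizeMb, $oWatch.Elapsed.TotalSeconds, $nSpeed

Remove-Item $sFile_1, $sFile_2
```

With a file this small and freshly written, the figure mostly reflects process start-up and the Windows cache, which is the same pitfall described in the warning above.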
PowerShell script, including function `bFilesCompareBinary`:

```powershell
# measure-speed-of-file-comparisons.ps1
# Set the $sFolder_n to a pair of folders with identical content. This script will measure and record,
# by one of eight different methods, the time taken to verify that all the files are identical.
# To ensure that none of the files is in cache, run this immediately after booting the computer.
# On use of get-content parameters "-encoding byte", "-AsByteStream", "-raw", and "-ReadCount 0":
#   www.jonathanmedd.net/2017/12/powershell-core-does-not-have-encoding-byte.-replaced-with-new-parameter-asbytestream.html/
#   www.powershellmagazine.com/2014/03/17/pstip-reading-file-content-as-a-byte-array/
#   www.github.com/PowerShell/PowerShell/issues/11266
#   www.github.com/MicrosoftDocs/PowerShell-Docs/issues/3215
# Calls to get-content with as-byte parameters are wrapped in an array ("@(, )") per instructions in
#   www.stackoverflow.com/questions/76842081/powershell-why-is-this-timing-not-working/#76843506

# =========================================================================
# Manually set these paths before running:
# =========================================================================
$sFolder_1 = "<path to first folder, including final \>"
$sFolder_2 = "<path to second folder, including final \>"
$sOutputFilespec = "<filespec of output csv file>"

# =========================================================================
# Function bFilesCompareBinary()
# =========================================================================
function bFilesCompareBinary ([System.IO.FileInfo] $oFile_1, [System.IO.FileInfo] $oFile_2, `
                              [uint32] $nBufferSize = 524288, $sRetIfSame = "Same", $sRetIfDif = "Dif")
   {# Return message for whether two given files are identical by binary comparison, or error description.
    # Assumes the files are the same size, else error.
    # From "www.stackoverflow.com/questions/19990788/powershell-binary-file-comparison#22800663"
    # But comment by @mclayton on "www.stackoverflow.com/questions/76842081/powershell-why-is-this-timing-not-working/#76843506"
    # warns that .read() does not always get all the bytes requested, so I've added a test for that.
    # FileInfo Class: "https://learn.microsoft.com/en-us/dotnet/api/system.io.fileinfo"
    # FileStream Class: "https://learn.microsoft.com/en-us/dotnet/api/system.io.filestream"
    if ($nBufferSize -eq 0) {$nBufferSize = 524288}
    try{$oStream_1 = $oFile_1.OpenRead()
        $oStream_2 = $oFile_2.OpenRead()
        $oBuffer_1 = New-Object byte[] $nBufferSize
        $oBuffer_2 = New-Object byte[] $nBufferSize
        # "$(...)" is needed so that the Length property, not the file name, is put in the message.
        if ($oFile_1.Length -ne $oFile_2.Length) {throw "Files are different sizes: $($oFile_1.Length) , $($oFile_2.Length)"}
        $nBytesLeft = $oFile_1.Length
        $bDifferenceFound = $false
        $sError = ""
        do {$nBytesToGet = [math]::Min($nBytesLeft, $nBufferSize)
            $nBytesRead_1 = $oStream_1.read($oBuffer_1, 0, $nBytesToGet)
            $nBytesRead_2 = $oStream_2.read($oBuffer_2, 0, $nBytesToGet)
            if ($nBytesRead_1 -ne $nBytesRead_2) {throw "Different byte count each file: $nBytesRead_1 , $nBytesRead_2"}
            if ($nBytesRead_1 -ne $nBytesToGet) {throw "Byte count different from requested: $nBytesRead_1 , $nBytesToGet"}
            $nBytesLeft -= $nBytesRead_1
            if (-not [System.Linq.Enumerable]::SequenceEqual($oBuffer_1, $oBuffer_2)) {$bDifferenceFound = $true}
           } while ((-not $bDifferenceFound) -and $nBytesLeft -gt 0)
       }
    catch {$sError = "Error: $_"}
    # Close only the streams that were actually opened, so "finally" cannot itself throw.
    finally {if ($oStream_1) {$oStream_1.Close()} ; if ($oStream_2) {$oStream_2.Close()}}
    if ($sError -ne "") {return $sError}
    elseif ($bDifferenceFound) {return $sRetIfDif}
    else {return ($sRetIfSame)}
   }

# =========================================================================
# User interaction
# =========================================================================
$bBooted = (read-host ("Did you boot the computer immediately before running this? (Enter ""Y"" or ""N"".)")).ToUpper()
$sPSenv = (read-host ("PowerShell environment: Enter ""D"" if running directly in Windows or ""S"" if in scripting environment (ISE or VS Code)")).ToUpper()
$nMethod = read-host ("Comparison method: Enter 1 for comp, 2 for FC, 3 for compare-object, 4 for compare raw, " + `
#                     "5 for compare as byte, " + `
                      "6 for compare as byte raw, 7 for compare as byte read 0, or 8 for buffered")
switch ($nMethod) {1 {$sMethod = "comp"}                   2 {$sMethod = "FC"}
                   3 {$sMethod = "compare-object"}         4 {$sMethod = "compare raw"}
                   5 {$sMethod = "compare as byte"}        6 {$sMethod = "compare as byte raw"}
                   7 {$sMethod = "compare as byte read 0"} 8 {$sMethod = "buffered"}}

# =========================================================================
# Scan the folders and compare files.
# =========================================================================
$nLen_1 = $sFolder_1.Length
$PSversion = $PSVersionTable.PSVersion.Major
$nItem = 0      # Running count of files compared (was reported in the output but never set).
get-ChildItem -path $sFolder_1 -Recurse | ForEach-Object `
   {$oItem_1 = $_
    $sItem_1 = $oItem_1.FullName
    # If it's a file, compare in both folders:
    if (Test-Path -Type Leaf $sItem_1) `
       {$nItem++
        $nSize_1 = $oItem_1.Length
        $sItem_rel = $sItem_1.Substring($nLen_1)
        $sItem_2 = join-path $sFolder_2 $sItem_rel
        $oItem_2 = get-item $sItem_2
        # Native commands set the global variable, so reset that one rather than a script-scope copy.
        $global:LastExitCode = 99
        $nMid = ""
        write-output "Check $sItem_rel"
        $dStart = $(get-date)
        switch ($nMethod)
           {{$_ -in 1, 2}
               {switch ($nMethod)
                   {1 {comp /m "$sItem_1" "$sItem_2"}
                    2 {FC.exe /b "$sItem_1" "$sItem_2"}}
                switch ($LastExitCode) {0 {$sResult = "Same"} 1 {$sResult = "Dif"} default {$sResult = "Error: $LastExitCode"}}}
            {$_ -in 3, 4, 5, 6, 7}
               {switch ($nMethod)
                   {3 {$oContent_1 = (get-content $sItem_1)
                       $oContent_2 = (get-content $sItem_2)}
                    4 {$oContent_1 = (get-content $sItem_1 -raw)
                       $oContent_2 = (get-content $sItem_2 -raw)}
                    {$_ -in 5, 6, 7}
                       {switch ($PSversion)
                           {5 {switch ($nMethod)
                                   {5 {$oContent_1 = @(, (get-content $sItem_1 -encoding byte))
                                       $oContent_2 = @(, (get-content $sItem_2 -encoding byte))}
                                    6 {$oContent_1 = @(, (get-content $sItem_1 -encoding byte -raw))
                                       $oContent_2 = @(, (get-content $sItem_2 -encoding byte -raw))}
                                    7 {$oContent_1 = @(, (get-content $sItem_1 -encoding byte -ReadCount 0))
                                       $oContent_2 = @(, (get-content $sItem_2 -encoding byte -ReadCount 0))}
                               }   }
                            7 {switch ($nMethod)
                                   {5 {$oContent_1 = @(, (get-content $sItem_1 -AsByteStream))
                                       $oContent_2 = @(, (get-content $sItem_2 -AsByteStream))}
                                    6 {$oContent_1 = @(, (get-content $sItem_1 -AsByteStream -raw))
                                       $oContent_2 = @(, (get-content $sItem_2 -AsByteStream -raw))}
                                    7 {$oContent_1 = @(, (get-content $sItem_1 -AsByteStream -ReadCount 0))
                                       $oContent_2 = @(, (get-content $sItem_2 -AsByteStream -ReadCount 0))}
                               }   }
                            default {$sResult = "Error: PowerShell version is $PSversion"}
                   }   }   }
                # tMid: seconds taken to read both files, before the comparison itself starts.
                $nMid = ($(get-date) - $dStart).Ticks / 1e7
                if (compare-object $oContent_1 $oContent_2) `
                   {$sResult = "Dif"} else {$sResult = "Same"}}
            8 {$sResult = bFilesCompareBinary $oItem_1 $oItem_2}
           }
        # tElapsed: total seconds for this file pair (ticks are 100 ns, hence the division by 1e7).
        $nElapsed = ($(get-date) - $dStart).Ticks / 1e7
        $oOutput = [PSCustomObject]@{Booted = $bBooted ; PSversion = $PSversion ; PSenv = $sPSenv ; Method = $sMethod ; Item = $nItem ; Result = $sResult
                                     Size = $nSize_1 ; tStart = $dStart ; tMid = $nMid ; tElapsed = $nElapsed ; Filespec = $sItem_rel}
        Export-Csv -InputObject $oOutput -Path $sOutputFilespec -Append -NoTypeInformation
   }   }

# =========================================================================
# End of script
# =========================================================================
```