Question

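Our web application sends email. We have a lot of users, and we get a lot of bounces. For example, when a user changes companies, their company email address stops working.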

To find the bounces, I analyze the SMTP log files with Log Parser. The logs come from Microsoft SMTP Server.

Some bounces are great, such as 550+#5.1.0+Address+rejected+user@domain.com. The bounce itself contains user@domain.com.

But some error messages contain no email address, such as 550+No+such+recipient.

I wrote a simple Ruby script that parses the logs (using Log Parser) to find which email caused a response like 550+No+such+recipient.

I was surprised that I couldn't find an existing tool that does this. I did find log analysis tools such as Zabbix and Splunk, but they look like overkill for such a simple task.

Does anyone know of a tool that parses SMTP logs and finds bounces together with the emails that caused them?
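
For illustration only, the correlation described in the question (matching a 550 reply that carries no address back to the recipient that triggered it) could be sketched in Python as follows. This is not the Ruby script from the question. The record layout is an assumption: each log line is presumed to have already been reduced to a hypothetical (session_id, kind, text) tuple, where kind is "RCPT" or "REPLY". The real Microsoft SMTP log fields would have to be mapped onto that shape first.

```python
import re

# Matches the address inside angle brackets, e.g. "TO:<user@domain.com>".
ADDRESS = re.compile(r"<([^<>\s]+@[^<>\s]+)>")


def find_rejected_recipients(records):
    """Pair each 550 reply with the RCPT command that preceded it.

    `records` is an iterable of hypothetical (session_id, kind, text)
    tuples; this layout is an assumption, not the real log format.
    """
    pending = {}   # session id -> recipient still waiting for a reply
    rejected = []
    for session_id, kind, text in records:
        if kind == "RCPT":
            match = ADDRESS.search(text)
            if match:
                pending[session_id] = match.group(1)
        elif kind == "REPLY":
            # Any reply consumes the pending recipient for that session.
            recipient = pending.pop(session_id, None)
            if recipient and text.startswith("550"):
                rejected.append((recipient, text))
    return rejected


if __name__ == "__main__":
    sample = [
        ("A", "RCPT", "TO:<gone@domain.com>"),
        ("B", "RCPT", "TO:<ok@domain.com>"),
        ("B", "REPLY", "250 OK"),
        ("A", "REPLY", "550 No such recipient"),
    ]
    for recipient, reply in find_rejected_recipients(sample):
        print(recipient, "->", reply)
```

Tracking the pending recipient per session, rather than relying on line order, matters when lines from several sessions are interleaved in the same log.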

Answer 1

This article is exactly what you're looking for. It is based on the great Log Parser tool.

Log parser is a powerful, versatile tool that provides universal query access to text-based data such as log files, XML files and CSV files, as well as key data sources on the Windows® operating system such as the Event Log, the Registry, the file system, and Active Directory®. You tell Log Parser what information you need and how you want it processed. The results of your query can be custom-formatted in text based output, or they can be persisted to more specialty targets like SQL, SYSLOG, or a chart. Most software is designed to accomplish a limited number of specific tasks. Log Parser is different... the number of ways it can be used is limited only by the needs and imagination of the user. The world is your database with Log Parser.

Answer 2

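As far as I know, log file analysis is only really useful for detecting mail that is rejected at the SMTP session level. What about bounces that occur after the remote MTA has accepted the message for delivery?

We use the following setup to detect and classify all bounces that arrive after delivery to the remote MTA.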

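All outgoing mail carries a unique bounce header which, when decoded, identifies the recipient's email address and the specific message.
An Apache James server that receives the mail sent to the return-path addresses.
A custom Mailet, written in Java and run inside Apache James, that decodes the recipient address, sends the email text to BoogieTools BounceStudio for bounce-type classification, and then persists the result to our database.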
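
The encode/decode step in this setup can be sketched roughly as below. This is not the Java Mailet described above; it is a minimal Python illustration of one possible VERP-style scheme. The "bounce-" prefix, the address layout, and the bounces.example.com domain are all assumptions made for the example, and the message ID is assumed to contain no hyphen.

```python
def encode_return_path(recipient, message_id,
                       bounce_domain="bounces.example.com"):
    """Build a per-recipient, per-message return path (hypothetical layout).

    user@domain.com, message 42 -> bounce-42-user=domain.com@bounces.example.com
    """
    local, _, domain = recipient.rpartition("@")
    if not local or not domain:
        raise ValueError(f"not an email address: {recipient!r}")
    if "-" in message_id:
        raise ValueError("message_id must not contain '-' in this sketch")
    return f"bounce-{message_id}-{local}={domain}@{bounce_domain}"


def decode_return_path(return_path):
    """Recover (recipient, message_id) from an address built above."""
    local_part = return_path.rpartition("@")[0]
    pieces = local_part.split("-", 2)
    if len(pieces) != 3 or pieces[0] != "bounce":
        raise ValueError(f"not a bounce address: {return_path!r}")
    _, message_id, encoded = pieces
    # Domains cannot contain '=', so the last '=' separates local and domain.
    local, _, domain = encoded.rpartition("=")
    if not local or not domain:
        raise ValueError(f"not a bounce address: {return_path!r}")
    return f"{local}@{domain}", message_id


if __name__ == "__main__":
    path = encode_return_path("user@domain.com", "42")
    print(path)
    print(decode_return_path(path))
```

Including the message ID makes each address specific to a single send. A production scheme would probably also sign the address so that forged or harvested addresses can be rejected.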

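It works very well. We can detect permanent hard bounces and temporary soft bounces, and further classify them into very fine-grained bounce types such as spam rejections, out-of-office replies, and so on.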

Answer 3

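You don't want to parse logs to try to identify bounces. If you look only at logs, you will get both false negatives and false positives.

Bounces can be generated downstream of the server you hand your mail to. In your sending server's logs, those look like successfully delivered messages.

Naive pattern matching on inbound logs for bounces sent to the VERP addresses of one of your virtual senders will also be inaccurate, for several reasons: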

There will be delay warnings mixed in with actual failure bounces.
Most Out-of-Office and similar autoresponders use the null sender to prevent battlin-bots syndrome.
Similarly, challenge-response systems (like *spit* boxbe.com) tend to use the null sender.
Your VERP-ed sender addresses, if they are persistent per recipient, will get harvested by spammers and come back as either spam targets or backscatter.

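So, sadly, the only reliable approach is to examine the bounce messages themselves. Most of them will contain an RFC 1894 compliant "report / delivery-status" MIME part, and depending on your language of choice there may be libraries or modules that help with the other bounce formats. The only one I have direct experience with is the Perl Mail::DeliveryStatus::BounceParser module, which works well enough.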
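
As a rough illustration of reading the delivery-status part, here is a minimal Python sketch that uses only the standard library email package. It is not the Perl module mentioned above. It handles only well-formed DSNs (RFC 1894, since obsoleted by RFC 3464) and ignores the other bounce formats. The sample message, its hostnames, and the function names are made up for the example.

```python
import email

# A made-up, minimal DSN used only to exercise the code below.
SAMPLE_DSN = """\
From: MAILER-DAEMON@mx.example.net
To: bounce@example.com
Subject: Undelivered Mail Returned to Sender
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status; boundary="BOUND"

--BOUND
Content-Type: text/plain

Delivery to the following recipient failed.

--BOUND
Content-Type: message/delivery-status

Reporting-MTA: dns; mx.example.net

Final-Recipient: rfc822; user@domain.com
Action: failed
Status: 5.1.1

--BOUND--
"""


def extract_recipient_reports(raw_message):
    """Return (address, status, action) for each per-recipient DSN block."""
    msg = email.message_from_string(raw_message)
    reports = []
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        # The parser exposes this part as a list of header blocks: the
        # per-message block first, then one block per recipient.
        for block in part.get_payload():
            final_recipient = block.get("Final-Recipient")
            if final_recipient is None:
                continue  # per-message block or empty trailing block
            # "rfc822; user@domain.com" -> "user@domain.com"
            address = final_recipient.split(";", 1)[-1].strip()
            status = block.get("Status", "").strip()
            action = block.get("Action", "").strip().lower()
            reports.append((address, status, action))
    return reports


def hard_failures(raw_message):
    """Addresses with a permanent failure, skipping delay warnings."""
    return [address
            for address, status, action in extract_recipient_reports(raw_message)
            if action == "failed" and status.startswith("5")]


if __name__ == "__main__":
    print(hard_failures(SAMPLE_DSN))
```

Checking both the Action and Status fields is what separates real failures from the delay warnings mentioned in the list above. Autoresponder and challenge-response mail normally has no delivery-status part at all, so this function returns nothing for it.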

Answer 4

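I like LogParser. When I need to parse something very specific, custom, or regex-based, I use biterScripting. They actually have some sample scripts that I used to get started. One of them is at http://www.biterscripting.com/Download/SS_WebLogParser.txt.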

Answer 5

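I wrote a bounce-counter program based on this article, only to find out later that the approach does not work for high-volume senders, because the SMTP logs are not written in sequential order. My blog post has more on this: Email bounce detection in SMTP logs and why it is impossible.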
