What's a good way to collect logs from Amazon EC2 instances?

My app is hosted on an Amazon EC2 cluster. Each instance writes events to log files. I need to collect these logs (and data-mine them) at the end of each day. What's a recommended way to collect them in a central location? I have thought of several options, but I'm not sure which way to go:

  1. scp them to an instance using a cron job
  2. Log all events over TCP/IP to an instance
Answers

We use Logstash on each host (deployed via Puppet) to gather and ship log events to a message queue (RabbitMQ, but could be Redis) on a central host. Another Logstash instance retrieves the events, processes them and stuffs the result into ElasticSearch. A Kibana web interface is used to search through this database.

It's very capable, scales easily and is very flexible. Logstash has tons of filters to process events from various inputs, and can output to lots of services, ElasticSearch being one of them. We currently ship about 1.2 million log events per day from our EC2 instances, on light hardware. The latency from the moment an event is logged until it is searchable is about 1 second in our setup.

Here's some documentation on this kind of setup: https://www.elastic.co/guide/en/logstash/current/getting-started-with-logstash.html, and a demo of the Kibana search interface with some live data.

This question is old now (December 2014) but still ranks highly during a Google search on this topic.

Amazon now provides a way to do some of this through CloudWatch. It has the capability to pattern-match the log message and trigger alarms based on things happening in the application. Depending on the nature of the data-mining that needs to be done, it may be possible to use their API to fetch the desired, aggregate events. See http://aws.amazon.com/blogs/aws/cloudwatch-log-service/

I've been using Loggly and it seems to do the trick:

http://loggly.com/

It allows me to send all my logs over TCP to their service and have a central place to monitor all my log files.

It also allows me to archive my log files to S3, which is nice too.

I've not tried it for this purpose, but Amazon appears to recommend using SimpleDB:

http://aws.amazon.com/simpledb/usecases_logging/

Here's a package that you might find useful - it says you can use it to capture stdout/stderr into SimpleDB:

http://code.google.com/p/simpledb-appender/

Have a look at the free version of Splunk - it will handle the remote log file collection and also give you some really nice search and analysis tools.

Use syslog-ng, a popular tool that transfers log messages over TCP, optionally encrypted.

http://www.balabit.com/network-security/syslog-ng/
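As a rough illustration of the application side of this approach, the sketch below uses Python's standard `logging.handlers.SysLogHandler` to forward records over TCP to a syslog receiver such as syslog-ng. The helper name `make_tcp_syslog_logger`, the message format and the newline framing are assumptions made for this example, not something the answer specifies; encryption is not shown.

```python
import logging
import logging.handlers
import socket


def make_tcp_syslog_logger(host, port, name="app"):
    """Return a logger that forwards records to a syslog receiver over TCP.

    Hypothetical helper: host and port are whatever the central
    collector listens on.
    """
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        socktype=socket.SOCK_STREAM,  # TCP instead of the default UDP
    )
    # Assumption: the receiver expects newline-delimited messages, so the
    # trailing NUL byte is disabled and the formatter appends "\n".
    handler.append_nul = False
    handler.setFormatter(
        logging.Formatter("%(name)s: %(levelname)s %(message)s\n")
    )
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

Calling `make_tcp_syslog_logger("logs.example.internal", 514).info("user signed in")` would send one line to the collector; the host name here is a placeholder. The handler connects when it is created, so the receiver needs to be reachable at that point.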

I haven't implemented it yet, but I came across Facebook Scribe, and it seems like a good idea. https://github.com/facebookarchive/scribe

Until I get that going, I've been doing just what you mention for #1 - I have a cron job which uses sftp to pull the files. I chose this because even if I did #2, I've had cases where the EC2 machine crashed and I ended up having to pull the log files off anyway.
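A minimal sketch of such a pull job, meant to be run from cron on the central machine, might look like the following. It shells out to `scp` rather than sftp, purely to keep the example short; the function names, the directory layout (one folder per day and host) and the paths are all assumptions for illustration.

```python
import subprocess
from datetime import date
from pathlib import Path, PurePosixPath


def build_pull_command(host, remote_log, dest_root, day):
    """Return (destination directory, scp argument list) for one host."""
    dest_dir = Path(dest_root) / day.isoformat() / host
    dest_file = dest_dir / PurePosixPath(remote_log).name
    return dest_dir, ["scp", "-q", f"{host}:{remote_log}", str(dest_file)]


def pull_logs(hosts, remote_log, dest_root, day=None):
    """Copy one log file from every host; return the hosts that failed."""
    day = day or date.today()
    failed = []
    for host in hosts:
        dest_dir, cmd = build_pull_command(host, remote_log, dest_root, day)
        dest_dir.mkdir(parents=True, exist_ok=True)
        # A non-zero exit code usually means the host was unreachable
        # or the file was missing; record it and move on.
        if subprocess.run(cmd).returncode != 0:
            failed.append(host)
    return failed
```

Returning the failed hosts makes it easy to retry them later, which matters in the crash scenario described above.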

Disclaimer: I work at Sumo.

Sumo Logic Free is a relatively easy option as well:

https://www.sumologic.com/pricing/




