Flink fails initial checkpoints

I have a Flink job deployed on a local kind cluster; it saves checkpoints to AWS S3.

The following error kept occurring in the JobManager log during the initial stage:

2023-07-07 19:33:48,657 INFO org.apache.flink.runtime.checkpoint.CheckpointFailureManager [] - Failed to trigger checkpoint for job 1ff112ff1bdf5c91c6e88f4112ecaf25 since Checkpoint triggering task Source: App Event Source (1/1) of job **** is not being executed at the moment. Aborting checkpoint. Failure reason: Not all required tasks are currently running..

It then disappeared, and checkpointing started working normally after these two log lines:

"2023-07-07 18:23:03,819 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: App Event Source (1/1) (****) switched from INITIALIZING to RUNNING."

"2023-07-07 18:23:04,719 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Audit filter -> App Transform -> App Event Sink: Writer -> App Event Sink: Committer (1/1) (*****) switched from INITIALIZING to RUNNING."

Is there a way to fix this?

Answer

I would presume that this is an issue between your source and the checkpointing frequency. It appears the job is attempting to take a checkpoint before this specific source operator is fully initialized, so the checkpoint fails.

You could consider increasing the checkpoint interval to see if that makes a noticeable difference. It's worth noting, however, that when a checkpoint fails Flink will try again at the next trigger (up to the allowed failures threshold). It is common for a checkpoint to fail occasionally and for the following attempt to succeed, so no state is lost.
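
As a rough sketch of the interval and failure threshold mentioned above, the settings below could go into the Flink configuration (for example `flink-conf.yaml`, or the equivalent configuration section of your Kubernetes deployment). The values are illustrative assumptions, not recommendations taken from the question, so adjust them to how long your source takes to reach RUNNING.

```yaml
# Illustrative values only; tune them for your job.
execution.checkpointing.interval: 60s                    # time between checkpoint triggers
execution.checkpointing.min-pause: 30s                   # minimum gap between the end of one checkpoint and the start of the next
execution.checkpointing.tolerable-failed-checkpoints: 3  # consecutive checkpoint failures tolerated before the job fails over
```

A longer interval also delays the first checkpoint attempt, which gives the source more time to switch from INITIALIZING to RUNNING before a checkpoint is triggered.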




