Andy Bedinger

Replacing CloudFront with Cloudflare for a Blog Site

2/19/2022

Cloudflare explains how to serve static content from an S3 bucket in this article. They offer a free plan, which sounds appealing. There are some caveats with this approach, however:

  1. Cloudflare requires you to use their nameservers.
  2. The source S3 bucket must use static website hosting.
  3. The minimum TTL is two hours.

Cloudflare Requires You to Use Their Nameservers

If you are using Cloudflare's free plan, then you have to use their nameservers to have them proxy content from your origin S3 bucket. Using your own nameservers only becomes an option with the Business plan ($200/mo).

On the other hand, Cloudflare's DNS is free, and Route53 is not. You pay $0.50/mo for each hosted zone on Route53, which comes to $6/yr. The domain I'm using costs $12/yr. Since this is a low-traffic blog site, CloudFront costs nothing, and storage in S3 is basically free since the website is so small. So my yearly cost on AWS is $18 ($6 for the hosted zone plus $12 for the domain), and on Cloudflare, it would be $12 (just the domain). In other words, this site is 50% more expensive to run on AWS than it would be on Cloudflare.
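If you want to check my math, here's the comparison as a quick Python sketch (the prices are the ones quoted above, as of this writing):

    # Back-of-the-envelope yearly cost comparison using the prices above.
    route53_hosted_zone_per_month = 0.50  # USD per hosted zone per month
    domain_per_year = 12.00               # USD per year for the domain

    aws_yearly = route53_hosted_zone_per_month * 12 + domain_per_year  # 18.0
    cloudflare_yearly = domain_per_year                                # 12.0; DNS is free

    print(f"AWS: ${aws_yearly:.2f}/yr, Cloudflare: ${cloudflare_yearly:.2f}/yr")
    print(f"AWS costs {(aws_yearly / cloudflare_yearly - 1) * 100:.0f}% more")  # 50% more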

The Cloudflare Terraform provider supports managing DNS records, so there is no obvious reason why, in my case, I couldn't move off of Route53 and start using Cloudflare's nameservers.
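To make that concrete, here's a minimal sketch of creating a proxied DNS record. I'm using Python and the Cloudflare v4 API directly rather than HCL, since that API is what the Terraform provider drives under the hood; the zone ID, API token, and hostnames are all placeholders:

    # Minimal sketch: create a proxied CNAME record via the Cloudflare v4 API
    # (the same API the Cloudflare Terraform provider manages for you).
    # ZONE_ID, API_TOKEN, and the hostnames are placeholders.
    import requests

    ZONE_ID = "your-cloudflare-zone-id"
    API_TOKEN = "your-cloudflare-api-token"

    resp = requests.post(
        f"https://api.cloudflare.com/client/v4/zones/{ZONE_ID}/dns_records",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json={
            "type": "CNAME",
            "name": "blog.example.com",
            "content": "blog.example.com.s3-website-us-east-1.amazonaws.com",
            "proxied": True,  # serve through Cloudflare's edge
            "ttl": 1,         # 1 = automatic
        },
    )
    resp.raise_for_status()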

Using an S3 Origin With Cloudflare Requires Static Website Hosting

When you're using CloudFront with S3, you don't have to use static website hosting. Without static website hosting, and with public access disabled, how is CloudFront able to fetch content from S3? CloudFront sends authenticated requests to the S3 REST API using an origin access identity (OAI), defined as a "special CloudFront user." This lets you ensure that no one other than CloudFront itself is able to access your origin.
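For reference, here's roughly what that looks like in the bucket policy; a minimal sketch using boto3, where the bucket name and OAI ID are placeholders:

    # Minimal sketch: a bucket policy that lets only a specific CloudFront
    # origin access identity (OAI) read objects from the bucket.
    # The bucket name and OAI ID are placeholders.
    import json
    import boto3

    BUCKET = "example-blog-bucket"
    OAI_ID = "E1EXAMPLE123456"  # the OAI attached to your CloudFront distribution

    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOAIReadOnly",
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity {OAI_ID}"
                },
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{BUCKET}/*",
            }
        ],
    }

    boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))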

Cloudflare doesn't use this method - it doesn't talk to the S3 REST API. Instead, it reverse proxies your site by communicating with your origin over HTTP. So how do you ensure that only your Cloudflare edge is able to talk to your origin? Cloudflare's recommended approach is to put a list of all of their IP addresses in your S3 bucket's bucket policy.
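That bucket policy ends up looking roughly like this; a minimal sketch with boto3, showing only a couple of Cloudflare's published IPv4 ranges (the full, current list is at https://www.cloudflare.com/ips/):

    # Minimal sketch: allow reads only from Cloudflare's published IP ranges.
    # Only two ranges are shown; the full list is at https://www.cloudflare.com/ips/
    # and it can change over time.
    import json
    import boto3

    BUCKET = "blog.example.com"
    CLOUDFLARE_RANGES = [
        "173.245.48.0/20",
        "103.21.244.0/22",
        # ...the rest of Cloudflare's published IPv4 and IPv6 ranges...
    ]

    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudflareEdgeOnly",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{BUCKET}/*",
                "Condition": {"IpAddress": {"aws:SourceIp": CLOUDFLARE_RANGES}},
            }
        ],
    }

    boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))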

There are a few things I don't like about using Cloudflare with S3:

  1. You have to keep your S3 bucket policy up to date.
  2. Data transfer from S3 to Cloudflare is not free.
  3. Requests from Cloudflare are not necessarily requests from your Cloudflare edge.
  4. S3 static website endpoints do not support HTTPS.

Not Great: Keeping Your S3 Bucket Policy Up to Date

Suppose Cloudflare adds a new CIDR block that could be contacting your origin (your S3 bucket), and you don't know about it. Now some of your visitors can't reach your website while others can. Or, perhaps worse, visitors are only sporadically able to reach your website. How will you resolve this? You're a developer, so you probably don't want to manually check the list of Cloudflare CIDRs. Now you have to write a script, maybe a Lambda function, that goes out and fetches the latest IP addresses and updates your bucket policy. It could be done, but do you want to write and maintain this?
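Here's a rough sketch of what that maintenance script might look like, reusing the policy shape from above. The IP list URLs are the ones Cloudflare publishes; the bucket name is a placeholder, and you'd still need to schedule this and handle failures yourself:

    # Rough sketch of the maintenance chore: pull Cloudflare's current IP
    # ranges and rewrite the bucket policy with them. You'd run this on a
    # schedule, e.g. from a Lambda function.
    import json
    import urllib.request
    import boto3

    BUCKET = "blog.example.com"
    IP_LISTS = ("https://www.cloudflare.com/ips-v4", "https://www.cloudflare.com/ips-v6")

    def cloudflare_ranges():
        ranges = []
        for url in IP_LISTS:
            with urllib.request.urlopen(url) as resp:
                ranges.extend(resp.read().decode().split())
        return ranges

    def handler(event=None, context=None):
        policy = {
            "Version": "2012-10-17",
            "Statement": [{
                "Sid": "AllowCloudflareEdgeOnly",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{BUCKET}/*",
                "Condition": {"IpAddress": {"aws:SourceIp": cloudflare_ranges()}},
            }],
        }
        boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))

    if __name__ == "__main__":
        handler()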

Not Great: Data Transfer From S3 to Cloudflare Is Not Free

Data transfer from S3 to CloudFront is free. Data transfer from S3 to Cloudflare, on the other hand, is just S3 to the Internet, which costs $0.09/GB after the first 100 GB each month. For example, 150 GB of transfer in a month would cost about $4.50 with Cloudflare in front, versus nothing with CloudFront. S3 retrieval costs, be they from Cloudflare or from CloudFront, should be the same.

Bad: Requests From Cloudflare Are Not Necessarily From Your Edge

Just because a request originates from a Cloudflare IP address doesn't mean it's coming from your Cloudflare configuration. Another Cloudflare customer could point their own proxied hostname at your origin. Or, someone could use a Cloudflare Worker to have their requests source from a Cloudflare IP address and reach your S3 bucket. How would you ever know the difference? With CloudFront, the OAI you use is yours and is unique to you. No one else has it or can use it.

Awful: S3 Static Website Endpoints Do Not Support HTTPS

This is the worst one, in my opinion. Since S3 static website endpoints do not support HTTPS, every request from Cloudflare to your S3 bucket traverses the Internet unencrypted. For a company that prides itself on security, you would think Cloudflare would plaster warnings about what a bad idea this is all over their article about how to set up Cloudflare in front of S3 - but they don't.

Minimum TTL Is Two Hours for Free Plans

This means that Cloudflare will cache content it retrieves from your origin (the S3 bucket) for a minimum of two hours. This restriction could quickly get frustrating. Imagine you push out a bad update and break your site. If someone visits your site before you fix the issue, it could be up to two hours before they see the fix. Or, let's say you have a great story and want to be first to break the news. It could be two hours before anybody gets to read it. On CloudFront, you can set your TTL to whatever you want it to be, and you can invalidate cached content on demand.
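For comparison, here's a minimal sketch of that kind of control on the AWS side using boto3; the bucket name, object key, and distribution ID are placeholders:

    # Minimal sketch: control caching per object with Cache-Control on upload,
    # and force a refresh with a CloudFront invalidation when something goes
    # wrong. Bucket, key, and distribution ID are placeholders.
    import time
    import boto3

    BUCKET = "example-blog-bucket"
    DISTRIBUTION_ID = "E2EXAMPLE123456"

    s3 = boto3.client("s3")
    cloudfront = boto3.client("cloudfront")

    # Serve this page with a 5-minute TTL instead of a 2-hour minimum.
    with open("index.html", "rb") as f:
        s3.put_object(
            Bucket=BUCKET,
            Key="index.html",
            Body=f,
            ContentType="text/html",
            CacheControl="max-age=300",
        )

    # Pushed out a bad update? Invalidate immediately instead of waiting.
    cloudfront.create_invalidation(
        DistributionId=DISTRIBUTION_ID,
        InvalidationBatch={
            "Paths": {"Quantity": 1, "Items": ["/index.html"]},
            "CallerReference": str(time.time()),
        },
    )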

Conclusion

There's actually only one thing I do like about this approach: it's free, as opposed to free up to a certain point. With AWS, there's always that worry in the back of your mind: what if my site gets a flood of traffic and my AWS bill skyrockets? That could happen with AWS, whereas Cloudflare has a truly free plan. Free DNS also sounds good (although I think you get that only if you're hosting a site on Cloudflare).

However, the brittle, tightly coupled approach of hard-coding IP addresses in a bucket policy, the flimsy security that affords, and the total lack of transport security when serving content over HTTP make putting Cloudflare in front of an S3 bucket origin a genuinely bad idea.