Reading through this, my gut instinct is that a DNS service (I've used Dynect in the past) could work, but you didn't include enough information to confirm whether that's good or bad advice.  What you should probably do is think about your workload and decide your tolerance for errors.  Consider an average request:

- The client does a DNS lookup on the hostname (1)
- The client does whatever else and sends an HTTP POST request to that IP (2)
- Traffic hits the server-side network and routes/proxies to the appropriate backend (3)
- The backend processes the request and returns a status to the client (4)

You could accomplish what you want at any of those 4 points:

1: This is easy because a provider handles the magic.  It's also arguably the least reliable, because even in the best case it's using the source IP, which I could spoof, to determine where traffic comes from

2: The client itself can send location data (this is ideal, especially on mobile, because the app knows best what the location is).  This requires the most work, and of course it can also be spoofed
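To make option 2 concrete, here's a minimal sketch of a client attaching its location to each POST.  The endpoint URL and the `X-Client-Location` header name are made up for illustration; you'd pick your own.

```python
import json
import urllib.request

# Hypothetical ingest endpoint; the real URL is whatever your service exposes.
INGEST_URL = "https://ingest.example.com/report"

def build_report(lat, lon, payload):
    """Build (but don't send) a POST whose headers carry the device's location."""
    body = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(INGEST_URL, data=body, method="POST")
    req.add_header("Content-Type", "application/json")
    # Custom header the server side can route on; the name is an assumption.
    req.add_header("X-Client-Location", f"{lat:.4f},{lon:.4f}")
    return req

req = build_report(33.4484, -112.0740, {"event": "ping"})
# urllib normalizes header names to capitalized form internally.
print(req.get_header("X-client-location"))  # → 33.4484,-112.0740
```

Sending it is just `urllib.request.urlopen(req)`; the point is that the app, which knows its own location, stamps every request with it.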

3: A network-level hook (like BGP) or some logic in a proxy (like nginx + Lua) could route based on the HTTP headers
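In production this would live in nginx + Lua (or similar), but the routing decision itself is tiny.  A language-agnostic sketch in Python, with made-up region names and upstream addresses:

```python
# Hypothetical region -> upstream mapping; real values depend on your deployment.
UPSTREAMS = {
    "us-west": "10.0.1.10:8080",
    "us-east": "10.0.2.10:8080",
}
DEFAULT_UPSTREAM = "10.0.0.10:8080"

def pick_upstream(headers):
    """Choose a backend from a client-supplied region header (name is an assumption)."""
    region = headers.get("X-Client-Region", "").strip().lower()
    return UPSTREAMS.get(region, DEFAULT_UPSTREAM)

print(pick_upstream({"X-Client-Region": "us-west"}))  # → 10.0.1.10:8080
print(pick_upstream({}))                              # → 10.0.0.10:8080
```

The fallback upstream matters: clients that don't send the header (or send garbage) still need somewhere to land.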

4: The backend code could also inspect the headers; this becomes a matter of having one server or cluster with a shared (or sharded) database.
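For the sharded-database version of option 4, the key property is that every request from a given region deterministically lands on the same shard.  A minimal sketch (the shard names are placeholders; in practice they'd be connection strings):

```python
import hashlib

# Hypothetical shard list; in practice these would be DB connection strings.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(region: str) -> str:
    """Pin each geographic region to one shard via a stable hash."""
    h = int(hashlib.sha256(region.encode("utf-8")).hexdigest(), 16)
    return SHARDS[h % len(SHARDS)]

# Every request tagged with the same region hits the same shard,
# so no cross-shard coordination is needed for the common case.
assert shard_for("phoenix") == shard_for("phoenix")
```

A plain lookup table works too since the regions are known up front; hashing just saves you maintaining the table as regions are added.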

All those solutions have tradeoffs; additionally, it's worth considering what the backend might look like.  If you're planning on doing a traditional LAMP (or Python or Ruby) application, that many POSTs will kill you on invoking a fresh PHP/Ruby/Python interpreter per request, no matter how small the workload is.
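A quick way to see why per-request interpreter startup hurts: compare one OS-level interpreter launch against one in-process function call.  Rough demo, not a benchmark:

```python
import subprocess
import sys
import time

def handle(payload):
    """Stand-in for the tiny amount of work each POST actually needs."""
    return len(payload)

# Cost of spawning a fresh interpreter (what CGI-style setups pay per request).
start = time.perf_counter()
subprocess.run([sys.executable, "-c", "pass"], check=True)
spawn_cost = time.perf_counter() - start

# Cost of handling the request inside an already-running process.
start = time.perf_counter()
handle(b"x" * 100)
call_cost = time.perf_counter() - start

print(f"interpreter spawn: {spawn_cost * 1000:.1f} ms, "
      f"in-process call: {call_cost * 1000:.4f} ms")
```

The spawn is orders of magnitude slower, which is why at this volume you want a persistent app server (FPM, a WSGI/Rack server, an event loop) rather than a process per request.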

Good luck!


On Wed, Aug 6, 2014 at 11:17 PM, David Schwartz <newsletters@thetoolwiz.com> wrote:
Here’s something interesting for the infrastructure geeks on the list ...

How would you approach setting up a service that had to sink around, oh … say … 10-20 million small HTTP POST requests per minute throughout the day, from sources geographically distributed around the country?

To do development and get the logic working, a small server is sufficient. But it needs to scale quickly once it’s launched.

There will be a high degree of geo-locality, so servers could be set up to handle requests from different geographic areas.  HTTP requests from a given area would be routed to whatever server is dedicated for that area. I guess their IP address could be used for that purpose?

(How granular is the location data for IP addresses on mobile devices? Are they reliable? We could add a location geotag to the packet headers if that would help.)

Note that the servers don’t need to be physically LOCATED in the area; rather, they're dedicated to SERVING a well-defined geographic area.

There’s no need for cross-talk, either. That is, there’s no need for a server serving, say, the LA area to cross-post with one in San Diego, except in a very small overlapping area which is easy to address.

Can this sort of routing be done with a DNS service?  (e.g., DNSMadeEasy.com is one I'm familiar with)

Or is something more massive needed?

Also note that this would be an automated service. It has a very steady stream of small incoming packets, peaking at various times of the day, with limited responses. No ads, no graphics, no user interactions at all.

I know there are infrastructure services in place to handle this kind of thing, like what Amazon offers, and others. I’m looking for any specific pointers to services that might fit this use case profile.

-David



---------------------------------------------------
PLUG-discuss mailing list - PLUG-discuss@lists.phxlinux.org
To subscribe, unsubscribe, or to change your mail settings:
http://lists.phxlinux.org/mailman/listinfo/plug-discuss



--
Paul Mooring
Operations Engineer
Chef