App Engine?

Trent Shipley trent_shipley at yahoo.com
Wed Jul 14 19:16:56 MST 2010


Do those massive, distributed, and fast Internet platforms give up flexibility?  
An RDBMS is designed as a general solution for storing and querying structured 
data.  If the Internet solutions are general solutions, why haven't they 
displaced the enterprise-scale solutions?





________________________________
From: Joseph Sinclair <plug-discussion at stcaz.net>
To: Main PLUG discussion list <plug-discuss at lists.plug.phoenix.az.us>
Sent: Wed, July 14, 2010 6:45:26 PM
Subject: Re: App Engine?

MySQL IS a single-server environment.  No single MySQL instance spans multiple 
servers.  Clustering doesn't make software distributed; it makes it clustered 
(which is COMPLETELY different).
Cassandra is NOTHING like MySQL.  It actually is a distributed column-oriented 
datastore (and it's NOT an RDBMS).  Cassandra is not clustered either; it's 
*distributed*.
Try this:
  Cluster 50 MySQL instances; randomly pull power (without warning or shutdown) 
on 10.  Is the cluster still able to serve all rows?  Did you lose any data or 
transactions?
  Run a 50-node Cassandra instance (single instance, 50 machines); randomly pull 
power (without warning or shutdown) on 10.  Is the instance still able to serve 
all rows?  Did you lose any data?
That experiment will show you one of the MANY ways distributed systems are 
different from clustered (without having to run 2000 machines to see the 
difference).
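
A rough way to picture that experiment without any hardware is the minimal 
sketch below.  It is only a toy model, not a simulation of MySQL or Cassandra: 
it assumes rows are placed on a ring of 50 nodes, each row is copied to a fixed 
number of consecutive nodes, and 10 nodes fail at random.  The function name, 
the placement scheme, and the replica counts (1 and 3) are all assumptions made 
for illustration.

```python
import random

def surviving_fraction(num_nodes, num_failed, replicas, num_rows, seed=0):
    """Return the fraction of rows that still have at least one live copy."""
    rng = random.Random(seed)
    # Pick the nodes that lose power, with no warning and no shutdown.
    failed = set(rng.sample(range(num_nodes), num_failed))
    alive = 0
    for _ in range(num_rows):
        # Hypothetical placement: a random first node, then the next
        # (replicas - 1) nodes around the ring hold the other copies.
        first = rng.randrange(num_nodes)
        holders = [(first + i) % num_nodes for i in range(replicas)]
        if any(node not in failed for node in holders):
            alive += 1
    return alive / num_rows

unreplicated = surviving_fraction(50, 10, replicas=1, num_rows=100_000)
replicated = surviving_fraction(50, 10, replicas=3, num_rows=100_000)
print(f"1 copy per row:   {unreplicated:.1%} of rows still readable")
print(f"3 copies per row: {replicated:.1%} of rows still readable")
```

The sketch only covers the "can you still serve all rows" half of the question. 
Whether in-flight transactions are lost depends on how writes are acknowledged, 
which this model does not attempt to represent.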

Facebook uses actual distributed software (things like Hadoop, Hive, Cassandra, 
etc.).  They don't run their site off of MySQL (or Oracle, for that matter).
Digg uses distributed systems as well, because scaling to their load is 
"increasingly difficult with MySQL" (http://about.digg.com/node/564).
No clustered solution could handle their scale; in fact, they haven't been 
using a cluster, in the traditional sense, for years.

All of them use things like MySQL for smaller, internal-facing systems, but none 
of them use *any* RDBMS for a user-facing site.
I can show you conclusively that MySQL (and any RDBMS) fails at large scale 
because the n^2 locking problem kills it.
Clustering is fine for an Enterprise application.  It's death for an Internet 
application.
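
The n^2 argument is not spelled out in this message.  One common reading, 
assumed here purely for illustration, is that when any of n concurrent writers 
may contend with any other for the same lock, the number of potentially 
conflicting pairs is n(n-1)/2, which grows quadratically with n.  The function 
name below is hypothetical.

```python
def pairwise_conflicts(n):
    """Number of distinct pairs among n concurrent writers: n*(n-1)/2."""
    return n * (n - 1) // 2

# Ten times the writers means roughly a hundred times the possible pairs.
for writers in (10, 100, 1_000, 10_000):
    pairs = pairwise_conflicts(writers)
    print(f"{writers:>6} writers -> {pairs:>12,} possible pairs")
```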

Amazon runs amazingly fast.  Have you actually used Amazon.com?  (You do realize 
that their cloud offerings are the same infrastructure they use to run their own 
sites?)
Google.com gets search results in <1 second every time.  Try doing that with 
MySQL or Oracle.  Neither is capable of even storing a small part of the index; 
their internal limits won't permit a table that big, much less an indexed table.

If all you've ever built is enterprise apps with fewer than 100,000 users, you'll 
never understand why enterprise solutions don't scale to Internet numbers (100 
million users or more).  Five years ago, I would have agreed with you; that was 
before I had to write software that could process more than 40,000,000 
transactions per day and produce multidimensional analyses of all that data.
There's a completely different world of scale between 100,000 users and 100 
million users, and solutions for the smaller scale are completely useless at the 
larger scale.
There are lots of people who think dumping their LAMP site on EC2 will make it 
fast; they're wrong.
You have to design for scale when you build the software.

AppEngine, BTW, is also good for small low-volume applications, just because it 
can be MUCH cheaper to run a small app on AppEngine than to run a hosted server 
(particularly if you want to write in Java or Python rather than PHP).

I've developed on Google's and Amazon's platforms.  I've also written (and am 
writing) the kind of distributed infrastructure those two use to enable their 
huge sites.  Not many systems require that kind of scale; for those that do 
there's no alternative to real lock-free/contention-free distributed systems.

When was the last time your app generated <100ms response times doing multiple 
PKI operations on 4M-40M files while sustaining >2000 requests/minute on <10 
commodity servers with no special hardware?

==Joseph++

Bryan O'Neal wrote:
> Joseph, they are not single-server environments. You cluster them! It
> is like saying MySQL or Cassandra are single-server environments -
> combine them all and that would be one hell of a server that sites like
> Facebook and Digg run off of ;)
> As for fast - try Google's or Amazon's offering - WOW, that is some
> speed! You'll start to feel like it's running on a cluster of iPhones!
> 
> On Tue, Jul 13, 2010 at 9:05 PM, Joseph Sinclair
> <plug-discussion at stcaz.net> wrote:
>> Apache and Tomcat are not even close to distributed computing environments.
>> They're single-server environments, and neither is even particularly fast in
>> that role.
>> They are both well known and well supported, however.
>> If your application is simple enough to run on a single server (no matter how
>> many users, as long as there aren't too many at one time), then that type of
>> solution is fine (and a lot easier to program).
>> If your application's processing gets more complex as more users log in
>> (relatively few applications do this), then no number of instances of a
>> single-server-model web-server will handle the load, and you'll have to accept
>> harder programming in order to scale beyond a few hundred thousand users.
>>
>>
>> Bryan O'Neal wrote:
>>> Every time I run the analysis, you're better off writing for a
>>> distributable open source app engine, like Apache / Tomcat, and
>>> horizontally spanning as required on commodity hosts, like GoDaddy.
>>>
>>> On Tue, Jul 13, 2010 at 7:00 PM, Doc Media <doc_media at yahoo.com> wrote:
>>>> Anyone had experience (good or bad) with Google's App Engine?  A friend
>>>> of mine was looking to start a project, and we were discussing the finer
>>>> points of a regular hosting company versus something like App Engine.
>>>> Any insights would be helpful.
>>>>
>>>> - Scott
>>
>>
>> ---------------------------------------------------
>> PLUG-discuss mailing list - PLUG-discuss at lists.plug.phoenix.az.us
>> To subscribe, unsubscribe, or to change your mail settings:
>> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>>


      