For web applications in the cloud, CouchDB and MongoDB *are* replacing RDBMSs.

On Wed, Jul 14, 2010 at 7:16 PM, Trent Shipley <trent_shipley@yahoo.com> wrote:
Do those massive, distributed, and fast Internet platforms give up flexibility?  An RDBMS is designed as a general solution for storing and querying structured data.  If the Internet solutions are general solutions, why haven't they displaced the enterprise-scale solutions?



From: Joseph Sinclair <plug-discussion@stcaz.net>
To: Main PLUG discussion list <plug-discuss@lists.plug.phoenix.az.us>
Sent: Wed, July 14, 2010 6:45:26 PM
Subject: Re: App Engine?

MySQL IS a single-server environment.  No single MySQL instance spans multiple servers.  Clustering doesn't make software distributed; it makes it clustered (which is COMPLETELY different).
Cassandra is NOTHING like MySQL.  It actually is a distributed column-oriented datastore (and it's NOT an RDBMS).  Cassandra is not clustered either; it's *distributed*.
Try this:
  Cluster 50 MySQL instances; randomly pull power (without warning or shutdown) on 10.  Is the cluster still able to serve all rows?  Did you lose any data or transactions?
  Run a 50-node Cassandra instance (single instance, 50 machines); randomly pull power (without warning or shutdown) on 10.  Is the instance still able to serve all rows?  Did you lose any data?
That experiment will show you one of the MANY ways distributed systems are different from clustered (without having to run 2000 machines to see the difference).
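
For anyone without 50 spare machines, the experiment above can be roughed out as a toy simulation. This is only a sketch under stated assumptions: rows are placed on a ring of nodes by `row_key % num_nodes`, each row is copied to the next `replication - 1` nodes, and a row counts as unavailable only when every copy sits on a dead node. The names `replicas` and `unavailable_rows` are made up for illustration; none of this is Cassandra's or MySQL's actual placement logic, and it says nothing about lost transactions.

```python
import random


def replicas(row_key, num_nodes, replication):
    """Nodes holding a row: a primary plus the next nodes around the ring.

    Hypothetical placement rule chosen for illustration only.
    """
    primary = row_key % num_nodes
    return [(primary + i) % num_nodes for i in range(replication)]


def unavailable_rows(num_rows, num_nodes, replication, dead):
    """Count rows whose every replica sits on a dead node."""
    return sum(
        1
        for row in range(num_rows)
        if all(node in dead for node in replicas(row, num_nodes, replication))
    )


if __name__ == "__main__":
    rng = random.Random(2010)               # fixed seed so runs are repeatable
    dead = set(rng.sample(range(50), 10))   # pull power on 10 of 50 nodes
    for rf in (1, 3):                       # 1 = no copies, 3 = three copies
        lost = unavailable_rows(10_000, 50, rf, dead)
        print(f"replication={rf}: {lost} of 10000 rows unavailable")
```

With `replication=1` every dead node takes its rows with it; with `replication=3` a row disappears only if three consecutive nodes on the ring all die, which is much rarer but not impossible.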

Facebook uses actual distributed software (things like Hadoop, Hive, Cassandra, etc.).  They don't run their site off of MySQL (or Oracle, for that matter).
Digg uses distributed systems as well, because scaling to their load is "increasingly difficult with MySQL" (http://about.digg.com/node/564).
No clustered solution could handle their scale; in fact, they haven't been using a cluster, in the traditional sense, for years.

All of them use things like MySQL for smaller, internal-facing systems, but none of them use *any* RDBMS for a user-facing site.
I can show you conclusively that MySQL (and any RDBMS) fails at large scale because the n^2 locking problem kills it.
Clustering is fine for an Enterprise application.  It's death for an Internet application.

Amazon runs amazingly fast.  Have you actually used Amazon.com?  (You do realize that their cloud offerings are the same infrastructure they use to run their own sites?)
Google.com gets search results in <1 second every time.  Try doing that with MySQL or Oracle.  Neither is capable of even storing a small part of the index; their internal limits won't permit a table that big, much less an indexed table.

If all you've ever built is enterprise apps with fewer than 100,000 users, you'll never understand why enterprise solutions don't scale to Internet numbers (100 million users or more).  Five years ago, I would have agreed with you; that was before I had to write software that could process more than 40,000,000 transactions per day and produce multidimensional analyses of all that data.
There's a completely different world of scale between 100,000 users and 100 million users, and solutions for the smaller scale are completely useless at the larger scale.
Lots of people think dumping their LAMP site on EC2 will make it fast; they're wrong.
You have to design for scale when you build the software.

AppEngine, BTW, is also good for small low-volume applications, just because it can be MUCH cheaper to run a small app on AppEngine than to run a hosted server (particularly if you want to write in Java or Python rather than PHP).

I've developed on Google's and Amazon's platforms.  I've also written (and am writing) the kind of distributed infrastructure those two use to enable their huge sites.  Not many systems require that kind of scale; for those that do, there's no alternative to real lock-free/contention-free distributed systems.

When was the last time your app generated <100ms response times doing multiple PKI operations on 4M-40M files while sustaining >2000 requests/minute on <10 commodity servers with no special hardware?

==Joseph++

Bryan O'Neal wrote:
> Joseph, they are not a single server environment. You cluster them! It
> is like saying MySQL or Cassandra are single server environments -
> combine them all and that would be one hell of a server that sites like
> Facebook and Digg run off of ;)
> As for Fast - try Google's or Amazon's offering - WOW That is some
> speed! You'll start to feel like it's running on a cluster of iPhones!
>
> On Tue, Jul 13, 2010 at 9:05 PM, Joseph Sinclair
> <plug-discussion@stcaz.net> wrote:
>> Apache and Tomcat are not even close to distributed computing environments.
>> They're single-server environments, and neither is even particularly fast in that role.
>> They are both well known and well supported, however.
>> If your application is simple enough to run on a single server (no matter how many users, as long as there aren't too many at one time), then that type of solution is fine (and a lot easier to program).
>> If your application's processing gets more complex as more users log in (relatively few applications do this), then no number of instances of a single-server-model web-server will handle the load, and you'll have to accept harder programming in order to scale beyond a few hundred thousand users.
>>
>>
>> Bryan O'Neal wrote:
>>> Every time I run the analysis, you're better off writing for a
>>> distributable open source app engine, like Apache / Tomcat, and
>>> horizontally spanning as required on commodity hosts, like GoDaddy.
>>>
>>> On Tue, Jul 13, 2010 at 7:00 PM, Doc Media <doc_media@yahoo.com> wrote:
>>>> Anyone had experience (good or bad) with Google's App Engine? A friend
>>>> of mine was looking to start a project, and we were discussing the finer
>>>> points of a regular hosting company versus something like App Engine.
>>>> Any insights would be helpful.
>>>>
>>>> - Scott
>>
>>
>> ---------------------------------------------------
>> PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
>> To subscribe, unsubscribe, or to change your mail settings:
>> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>>