Microsoft ArcReady: Architecting Scalable and Usable Web Applications

Presenter: Larry Clarkin
Email: larry.clarkin@microsoft.com
Blog: larryclarkin.com
Podcasts: thirstydeveloper.com

Scalable

Discussed the meaning of scalable
Performance = how app behaves with one user
Scalability = how app behaves with multiple users
Bad scalability is not entirely a bad thing; it means you are more popular than your setup allows for
Scalability is a step function rather than a straight line
Basically by increasing our hardware, we can support more users, until we can support no more
Our goal is to minimize the amount of money spent per request
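
A minimal sketch of the step-function idea, not something shown in the talk. The per-server capacity and cost figures are made-up assumptions, and "load" stands in for requests in whatever time window you measure. It shows that capacity is bought in whole-server steps, so cost per request jumps right after each step:

```python
import math

# Hypothetical figures, chosen only to make the steps visible.
CAPACITY_PER_SERVER = 1000   # requests one server can handle per time window
COST_PER_SERVER = 500.0      # cost of one server for the same window


def servers_needed(load: int, capacity: int = CAPACITY_PER_SERVER) -> int:
    """Whole servers required for a load; you cannot buy half a server."""
    return max(1, math.ceil(load / capacity))


def cost_per_request(load: int,
                     capacity: int = CAPACITY_PER_SERVER,
                     server_cost: float = COST_PER_SERVER) -> float:
    """Total hardware cost divided by the number of requests served."""
    return servers_needed(load, capacity) * server_cost / load


if __name__ == "__main__":
    for load in (500, 1000, 1001, 2000):
        print(f"{load:>5} requests -> {servers_needed(load)} server(s), "
              f"{cost_per_request(load):.3f} per request")
```

Cost per request is lowest when each server is running close to full, and worst just after a new server has been added.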
Most websites have just one application server and one database server
This works for most websites
Where can we go from here if we want to improve performance/scalability?
Block and tackle (I didn’t really understand the metaphor)
use basic strategies to improve performance
turn off debugging
ensure that you understand the network architecture to prevent surprises/problems
told story of a slow website because someone incorrectly configured a bridge

Scale up
add hardware to existing hardware (more RAM, etc.) without changing architecture
improving network connections
typically applies to database servers
don’t overlook scaling up with software
going to next version might be faster

Scale out
putting more application servers into the system
offers scalability boost, but starts introducing more complicated issues
problems include session affinity, load balancing, SSL connection problems
reduces/eliminates the single-point-of-failure (SPOF) problem
unless your load balancer goes down
this is unlikely due to simplicity of hardware
plus, you can have a backup sitting around to swap in
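
A rough sketch of the load balancing and session affinity issue mentioned above, under my own assumptions (the server names are hypothetical, and real load balancers do this in hardware or dedicated software). Round robin spreads requests evenly but can send the same user to a different server each time; hashing the session ID keeps a user on one server:

```python
import hashlib
from itertools import cycle

# Hypothetical pool of application servers.
SERVERS = ["app1", "app2", "app3"]

_round_robin = cycle(SERVERS)


def pick_round_robin() -> str:
    """Hand out servers in turn; no memory of which user went where."""
    return next(_round_robin)


def pick_sticky(session_id: str) -> str:
    """Session affinity: the same session ID always maps to the same server."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SERVERS)
    return SERVERS[index]
```

One caveat with this simple modulo scheme: adding or removing a server changes the mapping for most sessions, which is one reason in-memory session state gets awkward once you scale out.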

Specialize
have certain servers be responsible for certain services
might introduce more SPOF
helpful to have image server
doesn’t need to know about session, etc. usually

Split the application
microsoft.com, msdn.microsoft.com, technet.microsoft.com
each section has its own DB
information may be shared between DB/app servers

Split the database 1
reference data (read) vs. transaction data (write)
there are some problems with normalizing the database

Split the database 2
many read databases, fewer write databases
web 2.0 typically has a lot more reads of data than writes
news feeds, wall posts, contact information, etc.
after a write, the data is lazily propagated to the read databases
the small time lag is usually not perceptible to users
“typically if a user refreshes and they see what they expect, they think it’s a browser problem”
write database could be set up as a queue
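
A toy sketch of the read/write split, with in-memory dicts standing in for real databases. The class and method names are my own, not from the talk. Writes go to one primary and into a queue; reads come from replicas, which only see the write after the queue is drained:

```python
import random
from collections import deque


class ReplicatedStore:
    """One write store, several read stores, and a queue in between."""

    def __init__(self, replica_count: int = 2) -> None:
        self.primary = {}                                  # takes all writes
        self.replicas = [{} for _ in range(replica_count)]  # serve all reads
        self.pending = deque()                             # writes not yet copied

    def write(self, key: str, value: str) -> None:
        """Record the write on the primary and queue it for the replicas."""
        self.primary[key] = value
        self.pending.append((key, value))

    def read(self, key: str):
        """Read from a random replica; may be stale until replicate() runs."""
        return random.choice(self.replicas).get(key)

    def replicate(self) -> None:
        """Drain the queue, applying each write to every replica in order."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica[key] = value
```

Between write() and replicate() a reader gets the old value (or nothing), which is the small time lag described above.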

Split the database 3
essentially sharding
have all users A-L on one database, M-Z on another
MySpace has 100,000 users per database, just keeps adding databases
talked about Bloom filters for hashing into the correct database
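
A small sketch of the two sharding schemes in the notes: by first letter (A-L vs. M-Z) and by fixed-size blocks of users. The database names and the use of a numeric user ID are my assumptions, and this does not attempt the Bloom filter approach mentioned in the talk:

```python
USERS_PER_DATABASE = 100_000  # block size taken from the figure in the notes


def shard_by_name(username: str) -> str:
    """Range sharding: names starting with A-L go to one database,
    everything else (M-Z, digits, empty names) goes to the other."""
    first = username[:1].upper()
    return "db_a_l" if "A" <= first <= "L" else "db_m_z"


def shard_by_user_id(user_id: int) -> int:
    """Block sharding: IDs 0-99,999 live in database 0, the next
    100,000 in database 1, and so on. New users fill the newest database."""
    return user_id // USERS_PER_DATABASE
```

Letter ranges are easy to explain but fill unevenly; fixed-size blocks let you keep adding databases as users sign up.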

Geodistribution
have data centers in various places
more redundancy, better performance
obviously this presents new issues to consider

Offload the work
content distribution network
might be expensive, but improves performance
example is Silverlight Streaming
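
A minimal sketch of what offloading to a CDN can look like from the application side, assuming a hypothetical CDN host and a hand-picked list of static file extensions. Static assets get rewritten to the CDN; everything else stays on your own servers:

```python
from urllib.parse import urljoin

# Hypothetical CDN host and extension list, for illustration only.
CDN_BASE = "https://cdn.example.com/"
STATIC_EXTENSIONS = (".png", ".jpg", ".css", ".js", ".wmv")


def asset_url(path: str) -> str:
    """Point static files at the CDN; leave dynamic paths untouched."""
    if path.lower().endswith(STATIC_EXTENSIONS):
        return urljoin(CDN_BASE, path.lstrip("/"))
    return path
```

For example, asset_url("/images/logo.png") points at the CDN host, while asset_url("/account/settings") comes back unchanged.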

Anti-patterns
spending all of your time looking at the code
caching everything
services calling services

Discussion
37signals says to scale later, focus on getting things out quickly
one problem is a fickle customer base, don’t want to alienate customers
Retail seems to have more of a built-in valve on demand than purely information/marketing/viral apps
people can only buy so much stuff

Usability

You could probably read Don’t Make Me Think and get most of the points he made
“A good application can make you cry.”
A good application is:
desirable
usable
useful
adaptive
cost-effective
reliable
Tradeoffs can be made

ProtoXAML = users don’t always respond with good feedback to polished demos

70-20-10 rule
Use the 70-20-10 rule for the home page
70% of information/functionality for new users
20% of information/functionality for returning users
10% of information/functionality for power users

Derek Featherstone is an expert on usability and has a good website