Archive for June, 2010

Doing “Big Science” In Academia

Recently, there has been a lot of handwringing in the systems community about the work that we can do in the age of mega-scale data centers and cloud computing.  The worry is that the really interesting systems today consist of tens of thousands of machines interconnected both within data centers and across the wide area.  Further, appropriate system architectures are heavily dependent on the workloads imposed by millions of users on particular software architectures.  The conclusion drawn is that we in academia cannot perform good research because we have access to neither systems of the appropriate scale nor the application workloads needed to inform appropriate system architectures.

The concern further goes that systems research is increasingly being co-opted by industry, with many (sometimes most) of the papers in top systems and networking conferences being written by our colleagues in industry.

One of my colleagues hypothesized that perhaps the void in the systems community was partially caused by the void in “big funding” that was historically available to the academic systems community from DARPA. Starting in about 2000, DARPA moved toward more focused funding of efforts likely to have direct impact in the near term.  Though it appears that this policy is changing under new DARPA leadership, the effects in the academic community have yet to be felt.

My feeling is that all this worry is entirely misplaced.  I will outline some of the opportunities that go along with the challenges that we currently face in academic research.

First, for me, this may in fact be another golden age in systems research, born out of tremendous opportunity to address a whole new scale of problems collaboratively between industry and academia. Personally, I find interactions with my colleagues in industry to be a terrific source of concrete problems to work on.  For example, our recent work on data center networking could never have happened without detailed understanding of the real problems faced in large-scale network deployments.  While we had to carry out a significant systems building effort as part of the work, we did not need to build a 10,000-node network to carry out interesting work in this space.  Even the terrific work coming out of Microsoft Research on related efforts such as VL2, DCell, and BCube typically employs relatively modest-sized system implementations as proofs of concept for their designs.

A related approach is to draw inspiration from a famous baseball quote by Willie Keeler, “I keep my eyes clear and I hit ’em where they ain’t.” The analog in systems research is to focus on topics that may not currently be addressed by industry.  For example, while there has been tremendous interest and effort in building systems that scale seemingly arbitrarily, there has been relatively little focus on per-node efficiency.  So a recent focus of my group has been on building scalable systems that do not necessarily sacrifice efficiency.  More on this in a subsequent post.

The last, and perhaps best, strategy is to actively seek out collaborations with industry to increase overall impact on both sides. One of the best papers I read in the set of submissions to SIGCOMM 2010 was on DCTCP, a variant of TCP targeting the data center.  This work was a collaboration between Microsoft Research and Stanford, with the protocol deployed live on a cluster consisting of thousands of machines.  The best paper award at IMC 2009 went to work on WhyHigh, a system for diagnosing performance problems in Google’s Content Distribution Network.  This was a multi-way collaboration between Google, UC San Diego, University of Washington, and Stony Brook.  Such examples of fruitful collaborations abound.  Companies like Akamai and AT&T are famous for multiple very successful academic collaborations with actual impact on business operations.  I have personally benefitted from insights and collaborations with HP Labs on topics such as virtualization and system performance debugging.

I think the big thing to note is that industry and academia have long lived in a symbiotic relationship. When I was a PhD student at Berkeley, many of the must-read systems papers came out of industry: the Alto, Grapevine, RPC, NFS, Firefly, Logic of Authentication, Pilot, etc., just as systems such as GFS, MapReduce, Dynamo, PNUTS, and Dryad are heavily influencing academic research today.  At the same time, GFS likely could not have happened without the lineage of academic file systems research, from AFS, Coda, LFS, and Zebra to xFS.  Similarly, Dynamo would not have been as straightforward if it had not been informed by Chord, Pastry, Tapestry, CAN, and all the peer-to-peer systems that came afterward.  The novel consistency model in PNUTS that enables its scalability was informed by decades of research in strong and weak data consistency models.

Sometimes things go entirely full circle multiple times between industry and academia.  IBM’s seminal work on virtual machines in the 1960s lay dormant for a few decades before inspiring some of the top academic work of the 1990s, SimOS and DISCO.  This work in turn led to the founding of VMware, perhaps one of the most influential companies to directly come out of the systems community.  And of course, VMware has helped define part of the research agenda for the systems community in the past decade, through academic efforts like Xen.  Interestingly, academic work on Xen led to a second high-profile company, XenSource.

This is all to say that I believe that the symbiotic relationship between industry and academia in systems and networking will continue.  We in academia do not need a 100,000-node data center to do good research, especially if we focus on direct collaboration with industry where it makes sense and otherwise on topics that industry may not be directly addressing.  And the fact that there are so many great systems and networking papers from industry in top conferences should only serve as inspiration, both to define important areas for further research and to set the bar higher for the quality of our own work in academia.

Finally, and only partially in jest, all the fundamental work in industrial research is perhaps further affirmation of the important role that academia plays, since many of the people carrying out the work were MS and PhD students in academia not so long ago.

SIGCOMM 2010 Travel Grants and VISA Workshop

This year, I had the pleasure of serving on the SIGCOMM 2010 program committee.  I may write more about the experience later, but the short version is that I really enjoyed reading the papers and was particularly impressed by the deep discussions at the two-day program committee last month.  K.K. Ramakrishnan and Geoff Voelker did a terrific job as co-chairs and I believe their efforts are well reflected in a very strong program.

The conference will be held in New Delhi this year and the organizing committee has been fortunate to secure some generous support for travel grants.  This year, grants will be available not just for students, but also for postdocs and junior faculty.  The application deadline has been extended to June 12, 2010.  Full details are available here.  On behalf of the SIGCOMM organizing committee, I encourage everyone interested to apply.

If you do attend SIGCOMM, let me also put in a plug for the VISA workshop.  This is the second workshop on Virtualized Infrastructure Systems and Architecture, building on the successful program we had last year.  I served as program co-chair this year with Guru Parulkar and Cedric Westphal.  Virtualization remains an important topic, and VISA provides a valuable forum for discussing open problems across systems and networks.

Amin Vahdat is a Professor in Computer Science and Engineering at UC San Diego.
