Current Research Projects
Common Sense
To leverage this potential, we are developing sensor-equipped mobile devices that allow everyday citizens to collect environmental data. We are also collaborating with researchers at the University of California, Berkeley on the development of a novel sensor to measure particulate matter, which is a critical air pollutant.
Further, we are developing mobile and Internet-based software applications that allow people and communities to analyze and share the environmental data they collect, so that they can influence environmental regulations and policies. To make environmental sensing useful for practical action, one must do more than just “collect” and “present” data. While mobile sensing is an active research area, as yet little is known of how such systems might fit into the context of real-world environmental action. In order to inform future applications of mobile and pervasive technology, we have conducted design fieldwork on the social and organizational landscape of environmental action – government agencies, public health NGOs, atmospheric scientists, and so on.
We are leveraging our fieldwork in the design and development of a family of devices and software. First, we have developed a personal mobile device that collects data as people go about their daily lives. We are deploying this device and accompanying web software in West Oakland, in collaboration with a community action group. Second, we have developed a vehicular platform. We have collaborated with the City of San Francisco to put this system on the municipal fleet of street sweepers. The goal is to leverage mobile infrastructure to collect street-by-street readings as the vehicles move throughout the city. Our devices currently report GPS, carbon monoxide, ozone, nitrogen oxide, temperature, and humidity data.
Project Team
Intel Labs Berkeley
Paul Aoki, Alan Mainwaring, Allison Woodruff
Collaborators
City of San Francisco
Office of the Mayor, Department of the Environment, Department of Public Works
Isopod Design
Chris Myers
Nokia Research
R.J. Honicky
State University of Nizhny Novgorod
Nikolay Chistyakov, Max Sokolov
UC Berkeley EECS
Maneesh Agrawala, John Canny, Frederick Doering, Prabal Dutta, Neil Kumar, Richard White, Wes Willett, Baladitya Yellapragada
UC Berkeley Atmospheric Science Center
Ron Cohen, Paul Wooldridge
West Oakland Environmental Indicators Project
Brian Beveridge
Confrontational Computing
How do people use the web to help them form beliefs about the world? How do people promote their own opinions to others online? What are the impacts of opinions expressed online, particularly when they are confrontational? Can we build tools that make it easier for a user to know when other people disagree with the opinion that they are reading? Can we build tools that help a user understand why others disagree with them?
Dispute Finder - Disputed Information on the Web
We have built Dispute Finder, a web browser extension that alerts a user when information they read online is disputed by a source that they might trust. Dispute Finder examines the text on every page the user reads and highlights claims and opinions that it believes conflict with information from other sources that the user might trust. If the user clicks on a highlighted phrase then Dispute Finder shows them a list of articles that put forward alternative points of view.
Dispute Finder builds a database of known disputed claims by crawling web sites that already maintain lists of disputed claims, and by allowing users to enter claims that they believe are disputed. Dispute Finder identifies snippets that make known disputed claims by running a simple textual entailment algorithm inside the browser extension.
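As a rough illustration of the matching step, the sketch below flags sentences that contain most of the content words of a known disputed claim. It is not Dispute Finder's entailment algorithm, which this page does not specify: the token-overlap test, the stop-word list, the 0.8 threshold, and the sample claims are all assumptions made for illustration.

```python
import re

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "that", "and"}


def content_words(text):
    """Lower-case the text and return its set of non-stop-word tokens."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS}


def find_disputed(page_text, disputed_claims, threshold=0.8):
    """Return (sentence, claim) pairs where the sentence covers the claim.

    A sentence "covers" a claim when at least `threshold` of the claim's
    content words also appear in the sentence. This is a crude stand-in
    for textual entailment, chosen only to keep the example short.
    """
    matches = []
    for sentence in re.split(r"(?<=[.!?])\s+", page_text):
        words = content_words(sentence)
        for claim in disputed_claims:
            claim_words = content_words(claim)
            if not claim_words:
                continue
            coverage = len(claim_words & words) / len(claim_words)
            if coverage >= threshold:
                matches.append((sentence, claim))
    return matches


# Made-up claims and page text, for illustration only.
claims = ["coffee stunts your growth", "lightning never strikes twice"]
page = "My aunt says coffee stunts your growth. The weather was mild today."
hits = find_disputed(page, claims)
```

Here `hits` holds the first sentence paired with the coffee claim; a browser extension would highlight that sentence and link it to articles presenting other points of view.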
Disputed Information in Conversation
Much of the information we encounter in our lives comes from listening to other people talk. We are building an experimental device that listens to speech and alerts the user when they say or hear anything that is disputed, letting them know that there is another point of view they might want to consider.
Understanding the "how" and the "why" of Online Arguing
In addition to building new tools, this project is pursuing a social research agenda to better identify who is arguing online, why they argue, and how they go about arguing. Preliminary interviews with potential Dispute Finder users revealed a strong and nearly universal goal to "determine the bias" of regularly browsed news sources. This bias detection process generally involves comparing information between news sources and against personal experience, with a focus on which statements are in conflict and which ones are not. In terms of expressing one's (often controversial) opinions, we found that a slight majority of people avoid it. Their reasons are quite idiosyncratic, although they roughly fall into two general categories: "don't have time" or "don't want to get involved". Of the remaining people who do express their opinions online, we have found that they very rarely do so with the genuine hope of changing their readers' beliefs. Rather, they seem to be more motivated by the thrill of the conflict, or by the simple fact that expressing their opinions displays aspects of their own identity or skills. Come talk to us to learn more about what we have found and how we are using the findings to refine Dispute Finder and the other tools we are building.
Project Team
Intel Labs Berkeley
Rob Ennals
Intel Labs Santa Clara
John Mark Agosta
Intel Labs People and Practices Research
Tye Rattenbury, Tad Hirsch
Collaborators
UC Berkeley
Beth Trushkowsky, Michael Armbrust, Wesley Willet, Dan Byler, Nick Kong, Joe Hellerstein, Christine Robson, Jesse Trutna, Nick Lanham, Armando Fox, Maneesh Agrawala
RouterBricks
The RouterBricks project is exploring a simple, but radical solution: that networks be built from general-purpose computers, rather than the narrowly specialized equipment used today. If feasible, this could lead to a whole new approach to building networks – one that leverages the familiarity and flexibility of the PC ecosystem to reshape the world of network equipment and services. By changing how networks are architected, new models for networked applications can be brought to the data center, internet, and networking infrastructure world. In short, what the PC did for computing could be extended to network infrastructure and programming.
But this isn’t just about recreating current networks with new building blocks. A transition from specialized networks to general-purpose ‘open’ infrastructure will pave the way for new network-centric services and business models – just as the PC broke the stranglehold mainframes had on the fledgling computer market. For example: “server-only” datacenters where compute servers do double duty as switches, with a fluid boundary between application and network processing; isolated “virtual” networks that can be leased to (and customized for) different customers and applications (video, gaming), akin to how data-centers used virtualization to launch cloud-computing; ISPs and data centers that can differentiate by quickly reprogramming their networks, deploying new services or implementing new security mechanisms.
In the RouterBricks project we have designed a software router architecture that achieves scalability by parallelizing router functionality both across multiple servers and across multiple cores within a single server. We have built a fully programmable 4-server prototype RouterBricks router (or “RB4”, as we call it) using the familiar Click/Linux environment and only off-the-shelf, general-purpose Intel server hardware. We are currently developing a set of applications to fully exploit the potential of the RouterBricks architecture, including content delivery, power management in enterprises, new data-center architectures and so forth.
Project Team
Intel Labs Berkeley
Kevin Fall, Gianluca Iannaccone, Maziar Manesh, Sylvia Ratnasamy
Collaborators
Ecole Polytechnique Federale de Lausanne (EPFL)
Katerina Argyraki, Mihai Dobrescu
Lancaster University
Norbert Egi
UC Los Angeles
Eddie Kohler
Disaster Response Communications
While governments help to organize response equipment such as neighborhood caches, communication technology is usually left to the basics (e.g., radios that may be unfamiliar or poorly maintained). The goal of the disaster response communications (DRC) project is to enable citizens to continue using familiar Internet applications on their personal devices (e.g., smart phones, laptops) even when the network infrastructure is degraded or barely functioning. There are three big challenges we are addressing in order to achieve this goal.
The first challenge is to provide communications in the face of degraded infrastructure. We propose handling this with Delay/Disruption Tolerant Networking (DTN), a recently emerging network architecture capable of tolerating significant connection disruption. DTN-equipped devices make use of relays that provide a store-carry-forward (SCF) function. An SCF relay can forward messages from one communication device to another when a live network exists, can store messages until network connectivity is restored, and can also physically carry a message from one place to another (e.g., if mounted on or in a vehicle).
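The store-carry-forward behavior can be pictured with a toy relay like the one below. This is a minimal sketch, not a DTN implementation: the `Relay` class, its method names, and the sample messages are hypothetical, and a real relay would also handle message expiry, custody transfer, and limited storage.

```python
from collections import deque


class Relay:
    """Toy store-carry-forward relay (illustrative only)."""

    def __init__(self):
        self.buffer = deque()   # messages waiting for connectivity
        self.link_up = False    # whether a live network currently exists
        self.delivered = []     # stands in for the next hop

    def receive(self, message):
        """Forward immediately if the link is up, otherwise store."""
        if self.link_up:
            self.delivered.append(message)
        else:
            self.buffer.append(message)

    def set_link(self, up):
        """Update link state; flush stored messages in arrival order when it comes up."""
        self.link_up = up
        while self.link_up and self.buffer:
            self.delivered.append(self.buffer.popleft())


relay = Relay()
relay.receive("I'm ok")         # link down: stored
relay.receive("Need water")     # link down: stored
relay.set_link(True)            # connectivity restored, or relay carried into range
relay.receive("Road blocked")   # link up: forwarded at once
```

The "carry" part is implicit here: nothing in the relay cares whether the link came back because the network was repaired or because the device was physically moved to a place with coverage.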
In our vision, regular laptops can serve as relays, thereby augmenting relays provided by disaster services. In order to cope with the stress and anxiety they experience during a disaster, people tend to communicate a great deal by any means possible. This can lead to the creation of an enormous amount of redundant content, such as “I’m ok” messages sent to everyone a person knows. This natural behavior aggravates the problem of limited connectivity, and leads to congestion and inefficient use of shared communication, storage, and power resources. To address this second big challenge we are designing support for sophisticated prioritization mechanisms, including content filtering, content compression, and metadata management. Using their personal devices, individuals can contribute text, images, audio and video to disaster information services or to their local community.
However, relying on information collected from or distributed to individuals for critical decision making creates a third challenge: the supporting communication system needs to be trustworthy. In this context, security refers primarily to an ability to verify the origin and integrity of data, and to provide privacy and access control if requested. Common solutions to these issues require re-consideration in networks with intermittent communications. We are extending DTN with a security framework that supports data use controls. The approach allows the owner of data to specify the way data may be processed, stored, and combined in a format that is cryptographically bound to the data described (e.g., “disclose my location only to first responders”). How such controls may be relaxed or changed, by appropriate authorities, when lives or property are at risk, is a new area of research relevant to communications during disasters and in other extraordinary circumstances.
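One way to picture a usage policy that is cryptographically bound to its data is a single authentication tag computed over both. The sketch below uses an HMAC with a shared key purely for brevity; the page does not describe the project's actual format or primitives, so the capsule layout, the field names, and the choice of HMAC (rather than, say, public-key signatures) are assumptions.

```python
import hashlib
import hmac
import json


def bind(key, data, policy):
    """Return a capsule whose tag covers both the data and its usage policy."""
    policy_bytes = json.dumps(policy, sort_keys=True).encode("utf-8")
    # Length-prefix the policy so the policy/data boundary is unambiguous.
    message = len(policy_bytes).to_bytes(4, "big") + policy_bytes + data
    tag = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {"data": data, "policy": policy, "tag": tag}


def verify(key, capsule):
    """Recompute the tag; False means the data or the policy was altered."""
    expected = bind(key, capsule["data"], capsule["policy"])["tag"]
    return hmac.compare_digest(expected, capsule["tag"])


key = b"demo-key-not-for-real-use"  # hypothetical shared key
capsule = bind(key, b"37.80N,122.27W", {"disclose_to": ["first_responders"]})
intact = verify(key, capsule)

# Swapping in a looser policy invalidates the tag.
tampered = dict(capsule, policy={"disclose_to": ["anyone"]})
tamper_detected = not verify(key, tampered)
```

Because the tag covers the policy as well as the payload, a relay that stores the capsule during a long disconnection cannot quietly loosen the policy without detection.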
DRC is designing and implementing the above communications technologies, and is also interacting with the disaster response community to gain deeper understanding of the “real world” problems that arise during emergencies and disasters.
Project Team
Intel Labs Berkeley
Kevin Fall, Gianluca Iannaccone, Nina Taft
Collaborators
UPMC Paris Universitas
Fernando Silveira, Anna Pietilainen
UC Berkeley
Jayanthkumar Kannan, Megan Finn
Nokia Research and ICSI
Pasi Sarolahti
ESHARC - East Shore Amateur Radio Club
Jordan Hayes
Trinity College, Dublin, Ireland
Alex McMahon, Stephen Farrell
Jet Propulsion Laboratory, NASA
Scott Burleigh
Power Aware Perception
The perception algorithms that make this future possible already exist today, but only as prototypes in universities and research labs. The algorithms are not yet accurate enough, and running them on phones either requires throttling them to impractical speeds or drains the phone's battery extremely fast.
The PAPe project (pronounced "pa-pee") is about migrating these algorithms from research prototypes to compelling mobile applications on phones. We are designing new machine perception algorithms that consume less power, and developing a sophisticated cross layer power management architecture. In addition, given that existing hardware falls short in supporting such goals, we are also building a research prototype smartphone that will help the rest of the research community.
At the heart of most mobile perception systems are a host of compute- (and power-) hungry machine learning algorithms. To reduce their power consumption, we are building machine learning algorithms that run thousands of times faster than existing ones. These include algorithms for supervised learning, automatic clustering, and numerical optimization.
Our power management system is a closed-loop control system that measures the speed, power consumption, and accuracy of running programs and tunes their parameters on the fly to find a sweet spot between speed, power and accuracy. This control system simplifies the way people write programs: instead of carefully tuning every aspect of their applications when they develop their code, programmers can concentrate on functionality and simply expose the parameters of their applications to our run-time environment; the PAPe control system will measure the power footprint of the application and adaptively tune these parameters on the fly.
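The feedback loop described above might look roughly like the sketch below, in which an application exposes a single integer `quality` parameter and a controller adjusts it from measurements. Everything here is assumed for illustration: the `measure` function is a made-up application model, and the step-up/step-down rule is a deliberately simple policy, not the PAPe control system.

```python
def measure(quality):
    """Hypothetical application model: higher quality costs power and raises accuracy."""
    power_mw = 100 + 40 * quality          # assumed linear power cost
    accuracy = 1.0 - 0.5 / (1 + quality)   # assumed diminishing returns
    return power_mw, accuracy


def tune(power_budget_mw, min_accuracy, quality=0, max_quality=10, steps=50):
    """Closed-loop tuning of one exposed parameter.

    Each iteration measures the running program and nudges `quality`:
    down when over the power budget, up when accuracy is too low.
    `ceiling` remembers levels already found too expensive, so the loop
    settles instead of oscillating when both goals cannot be met.
    """
    ceiling = max_quality
    for _ in range(steps):
        power, accuracy = measure(quality)
        if power > power_budget_mw and quality > 0:
            ceiling = quality - 1
            quality -= 1
        elif accuracy < min_accuracy and quality < ceiling:
            quality += 1
        else:
            break
    return quality


feasible = tune(power_budget_mw=300, min_accuracy=0.85)
constrained = tune(power_budget_mw=150, min_accuracy=0.85)
```

With a generous budget the loop stops at the lowest quality level that meets the accuracy target; with a tight budget it settles on the best level the budget allows, giving power priority over accuracy.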
We are also building a mobile perception research platform that will let researchers experiment with these ideas on real hardware. This research platform isn't your typical smartphone: it has about 10x the computational capacity of your desktop; a variety of sensors help it see, hear, and be aware of its position and orientation; a built-in introspection system allows it to measure the power consumption of every piece of hardware inside itself; and unlike your phone, which sits timidly in your pocket, you brandish the supercharged smartphone, so it can see your world the way you see it.
Project Team
Intel Labs Berkeley
Jaideep Chandrashekar, Ling Huang, Ali Rahimi
Intel Labs Seattle
Ben Greenstein
Collaborators
UC Berkeley
Michael Jordan, Ariel Kleiner, Daniel Ting, Steve Dawson-Haggerty, Andrew Krioukov
University of Illinois Urbana Champaign
Rakesh Kumar, Joseph Sloan, David Kesler
Flexible and Secure Distributed Systems for Mobile and Cloud Applications
Dependable systems are secure systems that can tolerate faults during operation and are easy to manage. Building dependable systems is tricky, partly because the designer must anticipate the unexpected, and partly because the bar is high: such systems are expected to remain secure in the face of adverse conditions, which means protecting the privacy of sensitive data they hold, ensuring critical operations can only be performed by those authorized to do so, and that service is uninterrupted. Because of their complexity, dependable systems tend to be overspecialized: after you design, prove correct, build, and deploy a typical dependable system, you can't just go in and replace a component with a different one. If you do, you risk compromising all the desirable quality guarantees made by the designers, with adverse, unpredictable effects. On the other hand, deploying systems in practice requires not only doing the correct thing, but also doing it well given the available network bandwidth, CPUs, storage capacity, and other characteristics. For example, a system that is fast when the network bandwidth is ample may need to be changed significantly to be fast when the network bandwidth is constrained.
MOMMIE
Dependable replicated services (e.g., for high-assurance domains
including banking, finance, defense, health) are notorious for their
complexity and subtle deployment challenges. Our work focuses on
defining a clear, simple, expressive, and intuitive language that
algorithm designers can use to express their distributed algorithms
precisely and correctly, without worrying about deployment details and
optimizations. In parallel, we define a simple yet safe interface that
deployment engineers (or even a mathematical optimizer) can use to plug
in particular optimizations to match a given environment (network, CPUs,
trust assumptions, etc.), without worrying about violating the
correctness of the algorithm expressed by the designer. The system
resulting from the combination of the two independently developed
pieces—the algorithm and the deployment plan—can adapt to significantly
more deployment scenarios than a monolithically designed system would,
without burdening designers with undue complexity and without tying down
deployment engineers to a few, "vetted" optimization options. Our
prototype, MOMMIE (Middleware for Optimized Messaging in Insecure
Environments) demonstrates the ideas and provides us with an
experimentation platform for deeper optimizations and deeper algorithmic
abstractions.
Secure Data Capsules
The second focus of our flexible security work is web services, such as
on-line stores, image and video sharing sites, or brokerage
services. Such services typically contain sensitive information about
their customers, ranging from simple credit card numbers and mailing
addresses to high-volume information such as DVD-watching preferences,
detailed day-trading strategies, and health records. Because this
information lies within service data centers "in the raw," mixed with
complex service software, it is often abused. On one hand,
misconfiguration or software bugs may disclose it accidentally. On the
other hand, malicious insiders may exploit it for profit. Our work on
Secure Data Capsules aims to wrap customers' sensitive information in a
protected access interface that only discloses data in accordance with
the interface properties and the customer's desires. For benign
services, this provides an isolation buffer that protects their
customers' data and their own reputation. For less established services,
this allows customers to require and verify the existence of trusted
hardware or other trusted infrastructure, before yielding their
sensitive data. Depending on the needs of the particular service, the
expected performance, and the level of dependability the customer
requires, secure data capsules offer a variety of implementation choices
for the same logical isolation between customer data and service
code. We explore in particular physical isolation, software isolation
via virtualization, and software isolation via trusted hardware
enforcement. Our work will lead to greater dependability for web
services, greater privacy for customers, and more choice in the right
balance between cost and performance.
CloneCloud
The CloneCloud project takes the concept of flexibility to its extreme. It provides elastic execution for mobile applications by executing them on clouds of clones of the mobile device. CloneCloud specifically improves the performance of applications from resource-starved devices such as smartphones, by opportunistically off-loading them to available cloud resources in nearby datacenters. The idea is simple: clone the entire set of data and applications from the smartphone onto the cloud and selectively execute some operations on the clones, reintegrating the results back into the smartphone. One can have multiple clones for the same smartphone, clones pretending to be more powerful smartphones, etc. We can execute very expensive operations via cloud cloning such as image search, virus scanning, and data leak detection (a) without requiring application designers to explicitly plan for cloning, (b) without eating up the smartphone's battery power, and (c) with significant performance improvement. This same approach is broadly applicable to other weak devices such as tablets, netbooks, and mobile Internet devices.
*-scope
The *-scope project seeks whole-system understanding, to ensure that
applications do what they think they do. This project answers the
question: are the expected security properties provided by the running
system? Those properties include data privacy, availability, and various
performance guarantees. At a "micro" level, *-scope traces data at a
fine granularity as they course through the different components of a
distributed application (e.g., smartphone applications, cloud software,
and enterprise networks). At a "macro" level, *-scope discovers
application and data dependencies, which it mines for property
violations. Our work will lead to better mobile device and cloud
management and greater security and privacy for customers.
Project Team
Intel Labs Berkeley
Petros Maniatis, Byung-Gon Chun
Collaborators
UC Berkeley
Jayanthkumar Kannan, Gunho Lee, Lucian Popa
Brown University
Babi Papamanthou
Rice University
Michael Dietz
Princeton University
Sunghwan Ihm
Intel Labs Seattle
Jaeyeon Jung
Eco-Sense Buildings
This work is performed in collaboration with Intel Labs Hillsboro and the Enjeu Energie Positive consortium in France that brings together key players in the eco-system for construction and operation of smart buildings. Consortium members include construction (Bouygues), building-management systems (Schneider Electric, Siemens), IT (Intel, Lexmark), lighting (Philips), office furniture (Steelcase), food preparation (Sodexo), alternative energy generation (Tenesol) and others.
Recent Research Projects
Yada
Yada aims to feel like existing sequential programming languages: programmers write code using the constructs they are familiar with (objects, loops, arrays, etc), but with explicit indications of parallelism: run this loop in parallel, run these two statements in parallel, etc. Yada guarantees that the parallel executions of these programs behave as if the loops and parallel statements were executed sequentially. For example, the radix sort example (facing page) can be understood as a sequential radix sort by considering that the forall loops are normal C for loops and by ignoring all the other Yada keywords (reduce, scan, barrier).
To ensure sequential-like behavior in a parallel execution, Yada programs must use special "sharing types" to declare data that is accessed in parallel in "interesting" (i.e. not just reads) fashion. For instance, a Yada variable declared with a 'reduce(+)' annotation allows parallel increments. This annotation is used in the declaration of the buckets array in the radix sort example to allow the parallel increments (line 11) used to compute the histogram of the array being sorted. Similarly, the 'scan(+)' annotation allows increments and reads to be performed in parallel. This annotation is used in the declaration of the offsets array (line 5) to allow the parallel execution of the loop at line 18 which distributes elements from the input array x to the output array y, based on the earlier histogram results.
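The sequential reading of such a radix sort can be sketched in plain Python as below. This is not Yada code and does not reproduce the example the text refers to; the function names and the 4-bit digit size are assumptions. The comments mark where, according to the description above, the 'reduce(+)' and 'scan(+)' sharing types would permit parallel execution.

```python
from itertools import accumulate


def radix_pass(x, shift, bits=4):
    """One stable counting pass of radix sort on non-negative integers."""
    radix = 1 << bits
    mask = radix - 1

    # Histogram: the increments here are what a reduce(+) declaration
    # on `buckets` would allow to run in parallel.
    buckets = [0] * radix
    for value in x:
        buckets[(value >> shift) & mask] += 1

    # Exclusive prefix sum turns per-digit counts into starting offsets.
    offsets = [0] + list(accumulate(buckets))[:-1]

    # Distribution: each iteration reads and then increments an offset,
    # the access pattern the text associates with scan(+).
    y = [0] * len(x)
    for value in x:
        digit = (value >> shift) & mask
        y[offsets[digit]] = value
        offsets[digit] += 1
    return y


def radix_sort(x, key_bits=16, bits=4):
    """Sort integers in [0, 2**key_bits) by repeated least-significant-digit passes."""
    for shift in range(0, key_bits, bits):
        x = radix_pass(x, shift, bits)
    return x


data = [170, 45, 75, 90, 802, 24, 2, 66]
result = radix_sort(data)
```

A deterministic parallel version must produce exactly the output of this sequential version, which is the guarantee the sharing types are meant to provide.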
As a result of their sequential-like behavior, Yada programs can be understood, tested and debugged like sequential programs. This makes parallel program development much easier than in the more common, non-deterministic (two executions with the same input may produce different results) threaded and message-passing parallel programming paradigms. The second key element in Yada's design is explicit support for using parallel libraries. Currently, multiple parallel libraries can readily be used only from sequential programs, greatly limiting their applicability. Fixing this problem is crucial to fast and cost-effective software development based on reusing existing libraries and frameworks.
An additional challenge is maintaining Yada's deterministic execution guarantee: if using a library in Yada reintroduces all the usual debugging and correctness problems common to parallel programming, then libraries will not help productivity. Thus, Yada will additionally enforce that programs using libraries remain deterministic, under reasonable assumptions about library behavior. We have built an initial prototype of Yada to help evaluate these ideas and refine our design. Our experience with this prototype on a collection of eight parallel algorithms and four applications shows that it is practical to express realistic algorithms and applications in a deterministic programming language, with few changes from a sequential implementation. Furthermore, our prototype already achieves speedups (see the speedup graph for four sorting algorithms on various input sizes on an eight-core machine) that are competitive with implementations in non-deterministic programming environments.
Project Team
Intel Labs Berkeley
David Gay, Mayur Naik
Collaborators
UC Berkeley
Joel Galenson, Kathy Yelick, Susan Graham, Paul Hilfinger
Intel Mash Maker
As a user views a web page, Mash Maker suggests ways to make the page more useful and shows these suggestions on its toolbar. If the user clicks on the button for such an improvement then Mash Maker will apply it to the current page, potentially using other web sites and remote APIs, and potentially applying widgets that produce new visualizations or compute new data.
Mash Maker suggests improvements based on the meaning of the current page, the meaning of pages that the user has recently browsed, and the behavior of other users.
Data Mining for Anomaly Detection
Endhost and Enterprise Botnet Protection
The PROTEUS project aims to provide protection from botnets by tackling the problem from two vantage points: the end user and the centralized enterprise network control center. Some of our solutions are intended to live on laptops and desktops; other solutions are targeted to help IT departments manage network security more effectively within their enterprise. PROTEUS' guiding principle for helping users is to build rich, user-specific, location-dependent behavioral profiles. These are composed by collecting a variety of data, including network traffic patterns, location context, internet sites visited, and user presence indicators, to name a few. Our behavior-based detectors can successfully uncover covert botnet communication (when a PC communicates with the attack command and control center), and can identify attack activity in progress. A big focus in PROTEUS is reducing the number of false alarms, which are the bane of the mitigation mechanisms prevalent today.
We have designed techniques that can rapidly distinguish malware that is truly new (never seen before) from malware that is a polymorphic variant of existing malware. Because IT departments can observe a few thousand new malware samples each day, this greatly helps human operators sort out which samples require manual inspection and which do not. A tool based on our malware classifier thus improves the productivity and effectiveness of IT security operators.
We also work on protecting enterprise level mechanisms for DoS and scan detection from data poisoning. A key challenge in designing data driven mechanisms is protecting against adversaries that can inject erroneous data into the measurement infrastructure, which leads to an incorrect or inaccurate estimation of the normal behavior. If an algorithm learns the wrong model, the corresponding detector will behave poorly. To provide protection from data poisoning, we design algorithms that draw on methods from robust statistics to guard against such adversaries.
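To see why robust statistics help against poisoning, the sketch below compares two ways of learning an alarm threshold from the same poisoned training data. The page does not say which robust estimators the project uses; the median and median absolute deviation (MAD) shown here are a common textbook choice, and the traffic numbers are invented.

```python
import statistics


def mean_std_threshold(samples, k=3.0):
    """Classical threshold: mean plus k standard deviations."""
    return statistics.mean(samples) + k * statistics.pstdev(samples)


def median_mad_threshold(samples, k=3.0):
    """Robust threshold: median plus k scaled median absolute deviations.

    The factor 1.4826 makes the MAD comparable to a standard deviation
    when the underlying data is normally distributed.
    """
    med = statistics.median(samples)
    mad = statistics.median([abs(s - med) for s in samples])
    return med + k * 1.4826 * mad


# Invented training data: normal traffic volumes plus a few values
# injected by an adversary to inflate the learned baseline.
clean = [100, 102, 98, 101, 99, 103, 97, 100, 101, 99]
poison = [1000, 1200, 1100]
training = clean + poison

attack_volume = 400
missed_by_classical = attack_volume < mean_std_threshold(training)
caught_by_robust = attack_volume > median_mad_threshold(training)
```

Three injected points are enough to push the mean-based threshold far above the attack volume, while the median-based threshold barely moves because fewer than half of the samples were corrupted.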
Diagnosis in Data Centers
Today's large-scale Internet services run on large server clusters in datacenters and cloud computing environments. The scale and complexity of such systems make it very difficult to monitor, debug and maintain the services. However, modern computers have more and more computing cores, and multiple cores can be allocated to monitoring the system itself; moreover, cloud computing makes it easy to use massively parallel infrastructure to process large-scale data for delivering timely monitoring and diagnosis results. In this project, we take advantage of the abundant computing power to mine console logs, the natural tracing information included in almost every software system, for system monitoring, problem detection and diagnosis. Our novel approach for mining console logs integrates source code analysis with text mining to extract structured information from textual console logs. This makes it very easy and flexible for system operators to create a variety of (application-specific) features, so that powerful machine learning methods can be applied to perform high quality pattern mining and accurate problem detection for the system. Our research yielded the first automated log mining process that can not only detect a large portion of runtime anomalies, but also provide easy-to-understand explanations to system operators.
Researchers on these projects include Nina Taft, Jaideep Chandrashekar, Ling Huang, Dina Papagiannaki and Anthony Joseph. We collaborate with the RADlab at UC Berkeley, as well as with Cornell University, CMU, U.C. Irvine and U.C. Davis.
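The console-log mining idea described above for data center diagnosis can be illustrated with a small sketch that turns free-text log lines into per-object count vectors. The message templates are written by hand here as a stand-in for what source-code analysis would recover; the log format, the template names, and the "opened but never closed" check are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical templates, standing in for what analysis of the logging
# statements in the source code would recover automatically.
TEMPLATES = {
    "open": re.compile(r"Opening block (?P<block>\w+)"),
    "write": re.compile(r"Writing block (?P<block>\w+) bytes=(?P<size>\d+)"),
    "close": re.compile(r"Closing block (?P<block>\w+)"),
}


def parse(line):
    """Return (message_type, fields) for a recognized line, else None."""
    for name, pattern in TEMPLATES.items():
        match = pattern.search(line)
        if match:
            return name, match.groupdict()
    return None


def count_vectors(lines):
    """Build one message-type count vector per block identifier."""
    vectors = defaultdict(Counter)
    for line in lines:
        parsed = parse(line)
        if parsed:
            name, fields = parsed
            vectors[fields["block"]][name] += 1
    return vectors


log = [
    "12:00:01 INFO Opening block b1",
    "12:00:02 INFO Writing block b1 bytes=4096",
    "12:00:03 INFO Closing block b1",
    "12:00:04 INFO Opening block b2",
    "12:00:05 INFO Writing block b2 bytes=1024",
]
vectors = count_vectors(log)

# A block that was opened but never closed is a candidate anomaly.
unclosed = [block for block, v in vectors.items() if v["open"] != v["close"]]
```

In a fuller pipeline these count vectors would be the features handed to a machine learning detector; the hand-written check on the last line only hints at the kind of pattern such a detector could surface.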
Millimeter Wavelength Systems
The cost-effective generation, modulation and detection of terahertz energies from 500 GHz to 10 THz. THz energies occupy an interesting space that bridges the traditional domains of radio electronics and optical systems, and there are both electrical and optical approaches to generating these signals. While the bandwidth of wireless systems operating at these frequencies will be very large, e.g., from tens to hundreds of Gbps, there are many other compelling applications for imaging and sensing systems. The realization of passive (blackbody) and active (illuminated) imaging systems, as well as micro-spectroscopy devices, are stepping stones in the development of component technologies leading to the eventual production of terahertz communication systems.
The application of electromagnetic metamaterials to antenna designs and engineering. Metamaterials are artificial structures of growing interest, made readily available by advances in micro- and nano-fabrication, that exhibit interesting properties, such as a negative index of refraction, that do not normally occur in nature. These materials may be characterized both in material terms, such as permittivity and permeability, and in transmission line terms, such as inductance, capacitance, and impedance. Metamaterials have broad implications for antenna design, especially for guiding, reflecting, and refracting electromagnetic waves at very high frequencies.
The design of electrically switched or steerable antenna systems and conformal arrays with controllable radiation patterns. Wireless systems operating at 60 GHz and above require high-gain directional antennas in order to achieve sufficient link margins because of the strong atmospheric attenuation and the limited transmission power achievable with small, low-power consumer devices. Moreover, these antenna systems must be co-designed given an understanding of the enclosure and form-factor of the end system. There are design challenges in both the antenna sub-system and its integration into the host device.
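A back-of-the-envelope link budget shows why antenna gain matters so much at these frequencies. The sketch below uses the standard free-space path loss formula only; it ignores the atmospheric attenuation mentioned above, and the transmit power, antenna gains, and 10 m distance are illustrative assumptions rather than figures from this project.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def fspl_db(distance_m, freq_hz):
    """Free-space path loss in dB: 20 * log10(4 * pi * d * f / c)."""
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)


def received_dbm(tx_dbm, tx_gain_dbi, rx_gain_dbi, distance_m, freq_hz):
    """Received power from a simple link budget (free-space loss only)."""
    return tx_dbm + tx_gain_dbi + rx_gain_dbi - fspl_db(distance_m, freq_hz)


loss_60ghz = fspl_db(10, 60e9)
loss_2_4ghz = fspl_db(10, 2.4e9)
extra_loss = loss_60ghz - loss_2_4ghz

# Assumed 10 dBm transmitter at 10 m, with and without 15 dBi antennas.
omni = received_dbm(10, 0, 0, 10, 60e9)
directional = received_dbm(10, 15, 15, 10, 60e9)
```

At the same distance a 60 GHz link loses roughly 28 dB more than a 2.4 GHz link in free space alone, so a pair of 15 dBi directional antennas does little more than win that difference back.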
The exploration of higher-level protocols and algorithms for antenna discovery, alignment and tracking. The presence of high-gain antennas has broader implications for wireless systems. In the simple case, two devices may hear each other while using omni-directional antenna patterns and then iteratively optimize and refine their patterns to have increasingly higher gain and narrower beams. In the more general case, two devices may only hear each other when their narrowly focused beams happen to be aligned, in which case a more general discovery and alignment system is necessary. Both RF techniques and new algorithms may contribute to the new generation of "directional MACs" needed to realize the potential of millimeter wavelength WLANs.