PILOT SEMINAR

Organized by Computer Science Associate DGS (started by Madhusudan Parthasarathy and Darko Marinov)

This seminar series is for practice of academic job talks by Illinois students and postdocs.
We have several students and postdocs going on the job market each year, and this seminar aims to give them feedback from faculty who are outside their area.

What do you need to do to give a talk in this seminar?

Any graduating student or postdoc who is going to be on the academic job market can give a PILOT seminar.
All faculty, postdocs, and students will be invited to attend and give comments on the talk.

These talks are meant to be the penultimate talk before the interview, where the presenter seeks feedback from faculty outside the area.

We expect the presenter to have already given some practice talks earlier (we encourage at least two earlier talks, one to the advisor's group, and another to the relevant area).

To ensure there will be some faculty who attend the talks, we expect the following:

The presenter, with the help of the advisor, must invite at least five faculty personally to attend their talk (consider inviting seminar founders, DGS, and Associate DGS), and ensure that at least three faculty can come to the talk. These faculty should be outside the primary area of the presenter. When you ask for available time, plan for 90min slots (60min for talk + 30min for feedback), and if you create a poll, enable the "maybe" option because some people may attend only a part of the talk and meet you later.
The time for the seminar should be fixed based on the availability of these faculty. To find a time that doesn't overlap with the job talks or departmental seminars, please ask your advisor to check the available dates on the departmental calendar or faculty Wiki that lists all upcoming job talks (e.g., in Spring 2022 departmental talks take place on Mondays and Wednesdays at 3:30 pm, so it is recommended to avoid scheduling your seminar on these dates/times).

More faculty may attend the talk as it will be publicly announced, but we would like to see some effort by the presenter and advisor in ensuring at least some people come to the talk.

Once you have a list of three faculty outside your area who have promised to attend your talk at some agreed time, please:

edit this page to enter the date/time of your talk (sorted by date), your name, the talk title and abstract, and short bio;
email Erin Henkelman (speakerseries@cs.illinois.edu) and Darko Marinov (marinov@illinois.edu) so that the department can schedule a physical room (hopefully we don't go fully online ever again!) and announce your talk; and
fill out the form https://forms.illinois.edu/sec/3480516 no later than Thursday the week before your presentation date so the Speakers Series team can have sufficient time to set up and advertise your talk (materials sent after Thursday may not be included in departmental advertising). A member of the Speakers Series team will try to be at the beginning of your talk to help you get set up.

If you want to be hybrid and use Zoom, create a room on your own (so you get the video faster than if the department created a room for you). The room should be reserved for at least 90 minutes (60 minutes to present and at least 30 minutes to get feedback). If you use Zoom, please ask one of your attending faculty members to serve as a question moderator for your talk. They can help you manage the chat/questions during your seminar.

Please put slide numbers (in a visible place) on your slides during practice job talks.

You may find it useful to read these guidelines about academic job interviews:

Getting an academic job by Michael Ernst - https://homes.cs.washington.edu/~mernst/advice/academic-job.html
Computer Science Grad Student Job Application & Interview Guide by Westley Weimer, Claire Le Goues, and Zak Fry - http://web.eecs.umich.edu/~weimerw/grad-job-guide/guide
How to get a faculty job, Part 2: The interview by Matt Welsh - http://matt-welsh.blogspot.com/2012/12/how-to-get-faculty-job-part-2-interview.html
Tips on the Interview Process by Jeannette M. Wing - https://www.cs.cmu.edu/afs/cs/usr/wing/www/talks/tips.pdf
Five Surprises from My Computer Science Academic Job Search by Arvind Narayanan - https://33bits.wordpress.com/2012/10/01/five-surprises-from-the-computer-science-academic-job-search
Welcome to the Job Market by Elizabeth Bondi-Kelly - https://sites.google.com/view/elizabethbondi/blog
Tips for Computer Science Faculty Applications by Yisong Yue - https://yisongyue.medium.com/checklist-of-tips-for-computer-science-faculty-applications-9fd2480649cc
Reflections on the CS academic and industry job markets by Rowan Zellers - http://rowanzellers.com/blog/rowan-job-search
Fantastic Faculty Jobs and How to Get Them by Jia-Bin Huang - https://dropbox.com/s/avkflol8mx99c7e/2022_12_05%20Academic%20Job%20workshop.pptx?dl=0
Faculty Application Advice by Sylvia Herbert - https://sylviaherbert.com/faculty-application-advice
UPenn has a lot of resources, e.g., https://cdn.uconnectlabs.com/wp-content/uploads/sites/74/2019/08/Faculty-job-application-guide.pdf linked from https://careerservices.upenn.edu/resources/guide-to-faculty-job-applications

If you are going on the job market soon, please add your info to https://cs.illinois.edu/about/people/graduating-phd-students or https://cs.illinois.edu/about/people/postdocs

2022-2023 Schedule

Date and Time	Room	Speaker	Title and Abstract
Jan 4th (Wednesday) 10am-11:30am	2405 SC, zoom (https://illinois.zoom.us/my/manling2?pwd=SzM5Wk5neWlEK3VVTXBoa2ZMUXduZz09)	Manling Li	Title: From Entity-Centric to Event-Centric Multimodal Knowledge Acquisition Abstract: Events (what happened, who, when, where, why) describe fundamental human activities and are the core knowledge communicated through multiple forms of information, such as text, images, videos, or other data modalities. Our minds represent events at various levels of granularity and abstraction, which allows us to quickly access historical scenarios and reason about the future. Traditionally, multimodal information consumption has been entity-centric with a focus on concrete concepts (such as objects, object types, physical relations), or oversimplifying event understanding to be single-modal (text-only or vision-only), local, sequential and flat. Real events are multimodal, structured and probabilistic. Hence, I focus on Multimodal Information Extraction, and propose Event-Centric Multimodal Knowledge Acquisition to transform traditional entity-centric single-modal knowledge into event-centric multi-modal knowledge. Such a transformation poses two significant challenges: (1) understanding multimodal semantic structures that are abstract (such as events and semantic roles of objects): I will present a novel framework, CLIP-Event, to learn visual semantic structures via a zero-shot cross-modal transfer; (2) understanding temporal dynamics: I will introduce Event Graph Schema to capture complex timelines, intertwined participant relations and multiple possible outcomes. Such Event-Centric Multimodal Knowledge opens up the next generation of information access for deep semantic understandings behind the multimodal information. I will also show its positive results on long-standing open problems, such as timeline generation, meeting summarization, and question answering. Bio: Manling Li is a Ph.D. candidate at the Computer Science Department of University of Illinois Urbana-Champaign. Her work on multimodal knowledge extraction won the ACL'20 Best Demo Paper Award, and the work on scientific information extraction from COVID literature won NAACL'21 Best Demo Paper Award. She was a recipient of Microsoft Research PhD Fellowship in 2021. She was selected as a DARPA Riser in 2022, and a EE CS Rising Star in 2022. She was awarded C.L. Dave and Jane W.S. Liu Award, and has been selected as a Mavis Future Faculty Fellow. She led 19 students to develop the UIUC information extraction system and ranked 1st in DARPA AIDA TA1 evaluation each year. She has more than 30 publications on multimodal knowledge extraction and reasoning, and gave tutorials about event-centric multimodal knowledge at ACL'21, AAAI'21, NAACL'22, AAAI'23, etc. Additional information is available at https://limanling.github.io/.
Jan 16th (Monday) 1pm-2:30pm	2405 SC, zoom https://illinois.zoom.us/j/85237726901?pwd=SXYxTUloOHAwaGIxa3pvbW93SUdRZz09	Saikat Dutta	Title: Randomness-Aware Testing of Machine Learning-based Systems Abstract: Machine Learning is rapidly revolutionizing the development of many modern-day systems. However, testing Machine Learning-based systems is challenging due to 1) the presence of non-determinism in internal components (e.g., stochastic algorithms) and external factors (e.g., execution environment) and 2) the lack of accuracy specifications. Most traditional software testing techniques, while widely used to improve software reliability, cannot tackle these challenges since they predominantly rely on an assumption of determinism and lack domain knowledge. The goal of my research is to develop novel testing techniques and tools to make Machine Learning-based systems more reliable. In this talk, I will present my work on automatically detecting bugs in Machine Learning-based systems and improving the quality of developer-written tests in such systems. My research exploits the fundamental principle that we can systematically reason about non-determinism and accuracy using rigorous statistical and probabilistic reasoning. I develop novel static and dynamic analyses for testing ML-based systems that build on this principle. My research exposed more than 50 bugs and improved the quality of hundreds of tests in more than 60 popular Machine Learning libraries, some of which are used in large-scale software ecosystems at companies like Microsoft, Google, Meta, Uber, and DeepMind. Bio: Saikat Dutta is a PhD Candidate in the Computer Science Department at UIUC, advised by Prof. Sasa Misailovic. Saikat’s research interests are at the intersection of Software Engineering and Machine Learning. Saikat’s research focuses on improving the reliability of Machine-learning based systems by developing novel testing techniques and tools. Saikat is the recipient of the Facebook PhD Fellowship, 3M Foundation Fellowship, and the Mavis Future Faculty Fellowship. More information at https://saikatdutta.web.illinois.edu.
Jan 26th (Thursday) 1:30 pm to 3pm	Online on Zoom - https://illinois.zoom.us/j/89767697233?pwd=TlcxanpMWDlmMEVZSk1xem1UOUp5Zz09	Pubali Datta	Title: Looking Past the Abstractions: Characterizing Information Flow in Real-World Systems Abstract: Abstractions have proven essential for us to manage computing systems that are constantly growing in size and complexity. However, as core design primitives are obscured, these abstractions can also engender new security challenges. My research investigates these abstractions and the underlying core functionalities to identify the implicit flow violations in modern computing systems. In this talk, I will detail my efforts in characterizing flow violations and investigating attacks leveraging them. I will first describe how the “stateless” abstraction of serverless computing platforms masks a reality in which functions are cached in memory for long periods of time, enabling attackers to gain quasi-persistence and how such attacks can be investigated through building serverless-aware provenance collection mechanisms. Then I will further investigate how IoT automation platforms (i.e., Trigger-Action Platforms) abstracts the underlying information flows among rules installed within a smart home. I will present my findings on modeling and discovering inter-rule flow violations through building an information flow graph for smart homes. These efforts demonstrate how practical and widely deployable secure systems can be built through understanding the requirements of systems as well as identifying the root cause of violations of these requirements. Bio: Pubali Datta is a PhD candidate at the University of Illinois Urbana-Champaign where she is advised by Professor Adam Bates in the study of system security and privacy. Pubali has conducted research on a variety of security topics, including IoT security, serverless cloud security, system auditing and provenance. Her dissertation is in the area of serverless cloud security, particularly in designing information flow control, access control and auditing mechanisms for serverless platforms – tailored to meet the design and operational requirements of such systems. Pubali has participated in graduate internships at Samsung Research America, SRI International and VMware. She will earn her Ph.D in Computer Science from the University of Illinois Urbana-Champaign in the Spring of 2023.
Jan 30th (Monday) 2pm-3:30pm	2405 SC, zoom: https://illinois.zoom.us/j/87476498465?pwd=UGV4b2dGU3ZFZ3dCckFDVEkwbzd3dz09	Riccardo Paccagnella	Title: Software Security Challenges in the Era of Modern Hardware Abstract: Today’s hardware cannot keep secrets. Indeed, the past two decades have seen the discovery of a slew of attacks where an adversary exploits hardware features to leak software’s sensitive data. These attacks have shaken the foundations of computer security and caused a major disruption in the software industry. Fortunately, there has been a saving grace, namely the widespread adoption of models that have enabled developers to build secure software while comprehensively preventing hardware vulnerabilities. In this talk, I will present two new classes of vulnerabilities that fundamentally undermine these prevailing models for building secure software. In the first part, I will demonstrate that the current constant-time programming model is insufficient to guarantee constant-time execution. In the second part, I will demonstrate that the current resource partitioning model is insufficient to guarantee software isolation. Finally, I will provide an overview of my future research plans for enabling the design of more secure software and hardware systems. Bio: Riccardo Paccagnella is a PhD candidate in Computer Science at the University of Illinois Urbana-Champaign. His research is in system and hardware security. Riccardo is a recipient of a Distinguished Reviewer Award at the IEEE S&P 2021 Shadow PC, a Siebel Scholars Award, and a Chirag Foundation Graduate Fellowship. His work has been covered by national and international press — including Ars Technica, New Scientist, and Wired — and recognized with prestigious awards, including the Pwnie 2022 Award for Best Cryptographic Attack, the CSAW 2022 Applied Research Competition Best Paper Runner-up Award, a Pwnie 2021 Nomination for Most Innovative Research, and a CSLSC 2022 Best Presentation Award. In light of his research, the cryptographic community and several companies (including Cloudflare, Microsoft, Intel, AMD, Ampere, ARM) have taken action that includes patching cryptographic libraries, issuing security advisories, and creating new guidance for writing secure cryptographic code.
Feb 6th (Monday) 11am-12:30pm	2405 SC, Zoom: https://illinois.zoom.us/j/5494764956?pwd=MDNnaE5CWG0yRVlEZWl5bldoRnErZz09 passcode if asked: 021795	Xiaohong Chen	Title: Matching Logic: Foundation of a Trustworthy Programming Language Framework Abstract: We write programs in programming languages and use various language tools to perform various computing/analyzing tasks. For example, we use a compiler or an interpreter to execute programs, a symbolic executer to execute programs with symbolic input, and a formal verifier to verify programs. However, these language tools work like a "black box" and produce no correctness certificates for the tasks they perform. Therefore, we have to trust them for what they claim about our programs, which creates a very large "trust base" in today's computing space. My research aims at reducing the trust base of language execution and analysis tools using a trustworthy programming language framework. In this framework, programming languages are rigorously and completely defined using logical axioms and mathematical notations. Language tools are automatically generated by the framework, and their correctness is certified by complete, rigorous, transparent, machine-checkable, and human-accessible proof certificates. Most importantly, these proof certificates can be automatically checked using a very small proof checker, serving as the minimal trust base of the framework. In this talk, I will present matching logic as the unifying logical foundation of such a trustworthy programming language framework. I will present the basics of matching logic and show how various program properties and programming languages can be uniformly specified using matching logic formulas and axioms. I will show how to generate matching logic proofs to certify the correctness of program interpreters and formal verifiers, and how to check those proofs using the matching logic proof checker, which has only 240 lines of code. Finally, I will provide an overview of my future research plans for enabling the design and implementation of more transparent and trustworthy programming language tools. Bio: Xiaohong Chen is a Ph.D. candidate in the computer science department at UIUC, advised by Prof. Grigore Rosu. Xiaohong's research interests are in logic, formal methods, and programming languages, with a focus on using rigorous machine-checkable proof certificates to reduce the trust base of various programming language tools. Xiaohong's research on matching logic (http://matching-logic.org) as a unifying foundation for programming has helped improve the safety and reliability of the K language framework (https://kframework.org). Xiaohong is the recipient of the Yunni and Maxine Pao Memorial Fellowship, the Mavis Future Faculty Fellowship, and the Graduate School Dissertation Completion Fellowship. His research proposal has been funded by the Ethereum Foundation for its potential to make smart contracts more trustworthy and transparent. More information at http://xchen.page/.
Feb 20th (Monday) 11am-12:15pm	Online on Zoom: https://illinois.zoom.us/j/7030162755?pwd=QkY2OHI2K1ZFdjY3S3FwcU5FT05tUT09	Jiaxin Huang	Title: Label-Efficient Textual Knowledge Extraction and Utilization Abstract: With tremendous amounts of texts across the Internet nowadays, various Natural Language Processing (NLP) systems are built to help people seek for valuable knowledge from massive corpora, by performing knowledge-intensive tasks like text retrieval, concept organization, commonsense reasoning, and question answering. Despite the remarkable success, most existing NLP systems still rely on large amounts of task-specific training data, which are costly to obtain. My research designs principled approaches for label-efficient, knowledge-based NLP applications which rely on minimal human supervision. In this talk, I will introduce a general framework for textual knowledge extraction and utilization: (1) concept ontology construction by transforming generic linguistic knowledge encoded in pre-trained language models into hierarchical structures connecting entities; (2) entity extraction by replacing manual prompt template designs with automatic soft verbalizer learning; (3) commonsense reasoning via entity knowledge prompting and iteratively optimizing reasoning paths generated by language models. Bio: Jiaxin Huang is a final-year Ph.D. candidate in the Department of Computer Science at University of Illinois, Urbana-Champaign, fortunately advised by Prof. Jiawei Han. Jiaxin's research interests lie in text mining and natural language processing with minimal human supervision. Her recent research focuses on (1) using pre-trained language models to automatically extract domain-specific hierarchical concepts and entities for structured knowledge construction; (2) extracting human actionable knowledge such as commonsense reasoning by prompting and training language models via machine-generated explicit reasoning paths. She is a recipient of the Microsoft Research PhD Fellowship (2021-2023).
Feb 23rd (Thursday) 11am-12:30pm	Online on Zoom: https://illinois.zoom.us/j/2432644784?pwd=M0NYZkpIUThXM0I2bCtpcUxYbHJjZz09 password if asked: 209453	Linyi Li	Title: Certifying Trustworthy Deep Learning Systems at Scale Abstract: Along with the wide deployment of deep learning (DL) systems, their lack of trustworthiness (robustness, fairness, numerical reliability, etc) is raising serious social concerns, especially in safety-critical scenarios such as autonomous driving, aircraft navigation, and facial recognition. Hence, a rigorous and accurate evaluation of the trustworthiness of DL systems is critical before their large-scale deployment. In this talk, I will introduce my research on certifying critical trustworthiness properties of large-scale DL systems. Inspired by techniques in optimization, cybersecurity, and software engineering, my work computes rigorous worst-case bounds to characterize the degree of trustworthiness for a given DL system and further improve such bounds via strategic training. Specifically, I will introduce two representative frameworks: (1) DSRS is the first framework with theoretically optimal certification tightness. DSRS along with our training method DRT and accompanying open-source tools (VeriGauge and alpha-beta-CROWN) is the state-of-the-art and award-winning solution for achieving DL robustness against constrained perturbations. (2) TSS is the first framework for building and certifying large DL systems with high accuracy against semantic transformations. TSS opens a series of subsequent research on guaranteeing semantic robustness for various downstream DL and AI applications. I will conclude this talk with a roadmap that outlines several core research questions and future directions on trustworthy machine learning. Bio: Linyi Li is a Computer Science PhD candidate advised by Prof. Bo Li and co-advised by Prof. Tao Xie at UIUC. Prior to his PhD, Linyi Li earned his bachelor’s degree in Computer Science from Tsinghua University in 2018. His research lies in the intersection of computer security, machine learning, and software engineering. He focuses on building certifiably trustworthy deep learning systems at scale by proposing state-of-the-art certification and training methods for various trustworthy properties such as robustness, fairness, and numerical reliability. He has published over 20 papers at S&P, CCS, ICML, NeurIPS, ICLR, ICSE, FSE, etc. He is the main developer or key contributor of several widely-known and award-winning deep learning certification tools including alpha-beta-CROWN (winner of VNN-COMP 2022), VeriGauge, CROP, and COPA. Linyi is a recipient of Adversarial Machine Learning Rising Star Award, Rising Star in Data Science Award, the Wing Kai Cheng Fellowship, and a finalist of Qualcomm Innovation Fellowship and Two Sigma PhD Fellowship.

2021-2022 Schedule

Date and Time	Room	Speaker	Title and Abstract
February 9th (Wednesday) 4pm-5:30pm	Zoom set up by presenter	Xinya Du	Title: Towards More Intelligent Extraction of Information from Documents Abstract: Large amounts of text are written and published daily. As a result, applications such as reading through the documents to automatically extract useful and structured information from the text have become increasingly needed for people’s efficient absorption of information. They are essential for applications such as answering user questions, information retrieval, and knowledge base population. In this talk, I will focus on the challenges of finding and organizing information about events and introduce my research on leveraging knowledge and reasoning for document-level information extraction. In the first part, I’ll introduce methods for better modeling the knowledge from context: (1) generative learning of output structures that better model the dependency between extracted events to enable more coherent extraction of information (i.e., event A happening in the earlier part of the document is usually correlated with event B in the later part). (2) How to utilize information retrieval to enable memory-based learning with even longer context. In the second part, to better access relevant external knowledge encoded in large models for reducing the cost of human annotations, we propose a new question-answering formulation for the extraction problem. I will conclude by outlining a research agenda for building the next generation of efficient and intelligent machine reading systems with close to human-level reasoning capabilities. Bio: Xinya Du is a Postdoctoral Research Associate at the University of Illinois at Urbana-Champaign working with Prof. Heng Ji. He earned a Ph.D. degree in Computer Science from Cornell University, advised by Prof. Claire Cardie. Before Cornell, he received a bachelor's degree in Computer Science from Shanghai Jiao Tong University. His research is on natural language processing, especially methods that leverage knowledge & reasoning skills for document-level information extraction. His work has been published in leading NLP conferences such as ACL, EMNLP, NAACL and has been covered by major media like New Scientist. He has received awards including the CDAC Spotlight Rising Star award and SJTU National Scholarship.
February 25th (Friday) 11:30am-1pm	Zoom set up by presenter	Suraj Jog	Title: Scalable Next-Generation Wireless Networks Abstract: The next generation of wireless technologies will provide unprecedented capabilities -- gigabyte communication speeds at ultra-low latencies, hyper-precise localization, and vision-like perception. This will enable a plethora of new applications like wireless virtual and augmented reality, self-driving cars, space communications, precision agriculture, high-performance computing, and more. However, while these performance leaps have been demonstrated in the context of constrained networks with single users and controlled environments, the question of scaling these next-gen wireless technologies to large networks in the wild consisting of multiple heterogeneous nodes remains unsolved. In this talk, I will present three examples of my research that addresses these scalability challenges across different applications, each with unique objectives and constraints. First, I will talk about enabling extreme dense spatial packing of users for untethered wireless streaming in multi-user VR and AR applications, where we can scale the wireless network data rate with the number of clients without suffering interference. Second, I will discuss the challenges of scaling hyper-precise localization enabled by the high bandwidth 5G cellular technologies to ubiquitously deployed low power IoT nodes in the wild. I will show how we can leverage RF-acoustics microsystems to design new kinds of RF filters that can preserve the high localization resolution on narrowband IoT devices that sample 16x below Nyquist. Finally, I will also discuss interdisciplinary research avenues where chip-scale millimeter-wave wireless networks promise to revolutionize new application domains like High-Performance Computing. I will demonstrate how we can leverage deep reinforcement learning and AI tools to learn and generate new networking protocols for the wireless interconnects on multicore processors, which in turn will enable multicore processors to scale to hundreds and thousands of cores. I will conclude the talk with future directions in next-gen cellular and wireless research, both in terms of core methods as well as applications. Bio: Suraj Jog is a Ph.D. candidate in Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign (UIUC), working with Haitham Hassanieh. His research is focused on next-generation wireless networking and wireless sensing. Through his research, he has designed and built systems that can deliver seamless scalability in multiple application domains for millimeter-wave technology, such as gigabit-speed wireless communications, localization and imaging, and wireless networks-on-chip. His research has been recognized with the Qualcomm Innovation Fellowship, Joan and Lalit Bahl Fellowship, Mavis Future Faculty Fellowship, M.E Van Valkenburg Fellowship, Rambus Computer Engineering Fellowship, and more.
March 2nd (Wednesday) 9am-10:30am	Zoom set up by presenter	Xuan Wang	Title: Automated Scientific Knowledge Extraction from Massive Text Data Abstract: Text mining is promising for advancing human knowledge in many fields, given the rapidly growing volume of text data (e.g., scientific articles, medical notes, and news reports) we are seeing nowadays. In this talk, I will present my work on automatically extracting knowledge from massive text data to enable and accelerate scientific discovery. First, I will talk about my work on information extraction with minimum human supervision. With the growing volume of text data and the breadth of information, it is inefficient or nearly impossible for humans to manually find, integrate, and digest useful information. To address the above challenge, I have developed methods that automatically extract entity and relation information from massive text data with minimum human supervision. Second, I will talk about my work on literature-based scientific knowledge discovery. This research direction aims to enable and accelerate real-world knowledge discovery with the rich information we automatically extracted from scientific text. I have collaborated with domain experts in various scientific disciplines (e.g., chemistry, biomedicine, and health) to achieve this goal. Last, I will conclude my talk with future directions on using text mining to address open scientific problems, such as to assist chemical and biological molecule design and to support clinical drug discovery. Bio: Xuan Wang is a fifth-year Ph.D. student in the Computer Science Department at the University of Illinois at Urbana-Champaign (UIUC). She is working in the Data Mining Group under the supervision of Prof. Jiawei Han. Xuan received M.S. in Statistics (2017) and M.S. in Biochemistry (2015) from UIUC. She received B.S. in Biological Science (2013) from Tsinghua University, China. Her research interests are in text mining and natural language processing, emphasizing applications to biological and health sciences. Her current research theme is developing effective and scalable algorithms and systems for automatically understanding massive text data to enable and accelerate scientific discovery. Xuan has published about 20 research/demo papers in top NLP conferences (e.g., ACL and EMNLP) and biomedical informatics journals (e.g., Bioinformatics) and conferences (e.g., ACM-BCB and IEEE-BIBM). She is the recipient of the YEE Fellowship Award in 2020-2021 from UIUC.
March 2nd (Wednesday) 2:30pm-4pm	Zoom set up by presenter	Jing Liu	Title: Robust Learning & Inference with Applications in Distributed Learning and IoT Abstract: Robustness is of paramount importance in modern, scalable, and distributed machine learning (ML) and artificial intelligence (AI), particularly for safety-critical applications. On the one hand, distributed learning (e.g., Federated Learning) has emerged as a communication efficient, privacy-enhancing, and scalable approach for training without explicit centralized data collection. Unfortunately, training models with distributed data and computation further increases vulnerability to adversarial corruptions. This talk will outline modern solutions to fundamental estimation problems such as certifiable Robust Linear Regression, Robust PCA, and High-dimensional Robust Mean Estimation. Using these tools as building blocks, I will present recent work on Robust Distributed Learning & Inference. I will conclude the talk with future directions in efficient and trustworthy Artificial Intelligence of Things (AIoT). Bio: Jing Liu is an Illinois Future Faculty fellow in computer science at the University of Illinois at Urbana Champaign. His research interests include Data Science, the Internet of Things (IoT), and Distributed Learning & Inference. Liu was a postdoc in the Coordinated Science Lab and obtained his Ph.D. from UCSD. Liu is the recipient of several awards, including the Shannon Graduate Fellowship nomination award and Frontiers of Innovation Fellowship in UCSD, Guanghua Fellowship in Tsinghua University, National Fellowships of China, as well as Silver Medal, Young Mentor award in Beijing Institute of Technology, and a prize of Beijing Science & Technology Award.

2020-2021 Schedule

Date and Time	Room	Speaker	Title and Abstract
February 1st (Monday) 11AM-12:30PM	Zoom set up by department	Wing Lam	Title: Taming Flaky Tests in a Non-Deterministic World Abstract: As software evolves, developers typically perform regression testing to ensure that their code changes do not break existing functionality. During regression testing, developers often waste time debugging their code changes because of spurious failures from flaky tests, which are tests that nondeterministically pass or fail on the same code. These spurious failures mislead developers because the failures are due to bugs that existed before the code changes. My work on characterizing flaky tests has helped open the research topic of flaky tests, and many companies (e.g., Facebook, Google, Microsoft) have since highlighted flaky tests as a major challenge in their software development. In this talk, I will describe my recent work on taming flaky tests. Two prominent kinds of flaky tests are order-dependent flaky tests, which pass when run in one order but fail when run in a different order, and async-wait flaky tests, which pass if an asynchronous call finishes on time but fail if it finishes too late. My results include the first automated techniques to (1) fix order-dependent flaky tests, fixing 92% of such flaky tests in a public dataset; (2) reduce the number of spurious failures from order-dependent flaky tests, reducing such failures by 73%; and (3) speed up async-wait flaky tests while also reducing their spurious failures, speeding up such tests by 38%. Overall, my work has helped detect more than 2000 flaky tests and fix more than 500 flaky tests in over 150 open-source projects. Bio: Wing Lam is a PhD candidate in the Computer Science department at the University of Illinois at Urbana-Champaign where he is co-advised by Professors Tao Xie and Darko Marinov. He works on several topics in software engineering, with a focus on software testing. Wing's research improves software dependability by characterizing bugs and developing novel techniques to detect and tame bugs. He has published in top-tier conferences such as ESEC/FSE, ICSE, ISSTA, OOPSLA, and TACAS. His techniques have helped detect and fix bugs in open-source projects and have impacted how Microsoft and Tencent developers test their code. Wing has been awarded several fellowships and scholarships, including a Google - CMD-IT Dissertation Fellowship Award. More information is available on his web page: http://mir.cs.illinois.edu/winglam
February 2nd (Tuesday) 11:30AM-1PM	Zoom set up by department	Wajih Ul Hassan	Title: Detecting and Investigating System Intrusions with Provenance Analytics Abstract: Stories of devastating data breaches continue to dominate headlines around the world. Equifax, Target, and Office of Personnel Management are just a few examples of high-profile data breaches over the past decade. Despite a panoply of security products and increasing investment in data security, attackers are continually finding new ways to outsmart defenses to gain access to valuable data, indicating that current security approaches are ineffective. Data provenance describes the detailed history of system execution, allowing us to understand how system objects came to exist in their present state and providing means to identify the root cause of system intrusions. My research leverages provenance analytics to empower system defenders to quickly and effectively detect and investigate malicious behaviors. In this talk, I will first present a provenance-based solution for combatting the “Threat Alert Fatigue” problem that currently plagues enterprise security. Next, I will describe an approach for performing accurate and high-fidelity attack forensics using a novel adaptation of program analysis techniques. I will conclude by discussing the promise of provenance analytics to address open security and auditing problems in complex computing systems and emerging technologies. Speaker Bio: Wajih Ul Hassan is a doctoral candidate advised by Professor Adam Bates in the Department of Computer Science at the University of Illinois at Urbana-Champaign. His research focuses on securing complex networked systems by leveraging data provenance approaches and scalable system design. He has collaborated with NEC Labs and Symantec Research Labs to integrate his defensive techniques into commercial security products. He received a Symantec Research Labs Graduate Fellowship, a Young Researcher in Heidelberg Laureate Forum, an RSA Security Scholarship, a Mavis Future Faculty Fellowship, a Sohaib and Sara Abbasi Fellowship, and an ACM SIGSOFT Distinguished Paper Award.
February 23 (Tuesday) 11:00AM-12:30PM	Zoom set up by department	Yunan Luo	Presentation title: Machine learning for large- and small-data biomedical discovery Abstract: In modern biomedicine, the role of computation becomes more crucial in light of the ever-increasing growth of biological data, which requires effective computational methods to integrate them in a meaningful way and unveil previously undiscovered biological insights. In this talk, I will discuss my research on machine learning for large- and small-data biomedical discovery. First, I will describe a representation learning algorithm for the integration of large-scale heterogeneous data to disentangle out non-redundant information from noises and to represent them in a way amenable to comprehensive analyses; this algorithm has enabled several successful applications in drug repurposing. Next, I will present a deep learning model that utilizes evolutionary data and unlabeled data to guide protein engineering in a small-data scenario; the model has been integrated into lab workflows and enabled the engineering of new protein variants with enhanced properties. I will conclude my talk with future directions of using data science methods to assist biological design and to support decision making in biomedicine. Bio: Yunan Luo (http://yunan.cs.illinois.edu/) is a Ph.D. student advised by Prof. Jian Peng in the Department of Computer Science, University of Illinois at Urbana-Champaign. Previously, he received his Bachelor’s degree in Computer Science from Tsinghua University in 2016. His research interests are in computational biology and machine learning. His research has been recognized by a Baidu Ph.D. Fellowship and a CompGen Ph.D. Fellowship.
March 16th (Tuesday) 1PM-2:30PM CT Alternative Time: March 18th (Thursday) 7PM-8:30PM CT	Zoom set up by department	Liyuan Liu	Title: Towards Easy-to-Use Deep Learning: Effort-Light Transformer Training as an Example Abstract: Deep learning methods stand out with their ability to handle complicated data and tasks. However, successfully applying cutting-edge deep learning methods usually requires lots of extra care (e.g., heuristic tricks, excessive tuning on hyper-parameters, and data annotation costs). Given the inherent resource limitation of real-world applications, the demand of these efforts has hindered various applications and research. Bearing this in mind, I strive to build productive algorithms that can effectively make deep learning effort-light and easy-to-use. In this talk, I will address the extra care required to train Transformer networks (the backbone of many recent breakthroughs like BERT). First, my analyses reveal that unbalanced gradients are not the root cause of the unstable Transformer training and uncover a long-overlooked issue, i.e., model sensitivity to parameter updates. In light of these analyses, I successfully stabilize Transformer training and achieve the new state of the art without introducing any additional hyper-parameters. Secondly, I identify a problem of the adaptive learning rate, which not only provides guidance on training configurations and further stabilizes model training, but also sheds insights on the mystery of why the learning rate warmup is necessary. Putting these two aspects together forms a comprehensive inspection on extra care required for Transformer and simplifies Transformer training by reducing those extra care. In closing, I present a broader overview of my research and discuss how it can benefit biostatistics and biomedical informatics research. Bio: Liyuan Liu is a Ph.D. candidate in Computer Science at the University of Illinois at Urbana-Champaign, advised by Prof. Jiawei Han. He received his B.Eng. in Computer Science and Engineering at the University of Science and Technology of China in 2016. In his research, he strives to develop productive algorithms that can effectively reduce the resource consumption of deep learning, including expert efforts for data annotation and computation resources for tuning and training. Liyuan has published more than 20 papers in Top-Tier Conferences during his Ph.D. study. Liyuan has been awarded several fellowships and scholarships, including 2020 Yee Fellowship and 2015 Guo Moruo Scholarship. More information is available on his web page: http://liyuanlucasliu.github.io/

2019-2020 Schedule

If you want to present in 2405 SC (recommended), preferred times this semester are Tuesday, Thursday, and Friday afternoons. You should try to avoid times with talks already scheduled at /wiki/spaces/dls/pages/52953149 or Featured Lectures.

Date and Time	Room	Speaker	Title and Abstract
Feb 4 (Tuesday) 1:00 PM	2405 SC	Raghavendra Pothukuchi	Title: Intelligent Computing Systems for Extreme-Efficiency and Security Abstract: We are at an exciting time when computing is becoming personalized and ubiquitous. Vital to the continued success of computing are systems – from internet-of-things (IoT) devices to datacenters – that can deliver on many fronts like performance, energy use, reliability and security, with only limited resources to expend, like energy, storage and time. Building such systems is hard, particularly when the operating environment is becoming more dynamic, and systems are becoming heterogeneous. Unfortunately, we continue to build systems with ad hoc heuristic policies that are suboptimal, unsafe, and non-composable. My vision is to develop a new generation of computing systems that deliver extreme efficiency, together with reliability and security. Each component in the computing system continuously senses its execution and configures itself using intelligent control derived from principled methods like formal control and machine learning. In my talk, I will describe the techniques and prototypes I developed so far, which cover multiple system layers and heterogeneous hardware, and present the remarkable benefits of systems built with intelligent control. Bio: Raghavendra (Raghav) is a PhD candidate with Prof. Josep Torrellas at the University of Illinois. His research is on building intelligent systems for extreme-efficiency and security. He has interdisciplinary interests in computer system architecture, OS and runtimes, distributed systems, machine learning, formal control, and security. He is a winner of the W. J. Poppelbaum Award at Illinois for architecture design creativity, a Mavis Future Faculty Fellowship at Illinois, an ACM SRC competition, and was chosen as a rising star in computer architecture in 2018. He collaborates with AMD on modular control for heterogeneous systems, and this work resulted in a joint patent. He received his Masters in CS from Illinois in 2014 and worked at Nvidia before graduate school. He has a Bachelors (Hons.) from the Birla Institute of Technology & Science (BITS) Pilani, India where he was the university topper. https://pothukuchi.web.illinois.edu/
Feb 6 (Thursday) 2:00 PM	2405 SC	August Shi	Title: Mitigating Flaky Tests Abstract: Software is an important part of our society, and we depend on software developers to implement and maintain the quality of software. In turn, developers rely on regression testing, the practice of running tests after every software change, to check that they do not introduce bugs as they make changes. However, regression testing is plagued by flaky tests that non-deterministically pass and fail independently of software changes. The outcomes of flaky tests often mislead developers about the validity of their changes. Flaky tests are also quite prevalent, e.g., Google once reported 16% of their 4.2 million tests are flaky and that 84% of their changes with test failures actually involve a flaky test. To mitigate the negative effects of flaky tests, I have developed several techniques to automatically fix and detect different types of flaky tests. My techniques have fixed and detected hundreds of flaky tests in open-source software. This talk will focus on my work in fixing and detecting a specific type of flaky tests, namely order-dependent flaky tests, which pass when run in one order of the test suite but fail when run in a different order. First, I will present my technique, iFixFlakies, for automatically fixing order-dependent flaky tests. Next, I will present my technique, PolDet, for proactively detecting root causes that lead to order-dependent flaky tests before such tests can lead to flaky test failures. Bio: August Shi is a PhD candidate in Computer Science at the University of Illinois at Urbana-Champaign. August works with Professor Darko Marinov in the area of Software Engineering, with a focus on Software Testing. His research aims to improve developer productivity by making regression testing (1) more reliable with respect to flaky tests and (2) faster without loss in quality of testing. August has published in ICSE, ESEC/FSE, ASE, ISSTA, OOPSLA, ICST, and ISSRE, and his work on improving regression testing received an ACM SIGSOFT Distinguished Paper Award at ICSE 2017. More information is available on his web page: http://mir.cs.illinois.edu/awshi2.
Feb 10 (Monday) 3:00 pm	2405 SC	Radha Venkatagiri	Abstract: We live in a world were errors in computation will become ubiquitous and come from a wide variety of sources -- from unintentional soft errors in shrinking transistors to deliberate errors introduced by approximation or malicious attacks. Guaranteeing perfect functionality across a wide range of future systems will be prohibitively expensive. Error-Efficient computing offers a promising solution by allowing the system to make controlled errors and only preventing those errors that it absolutely must to ensure an acceptable user experience. Allowing the system to intelligently make errors can lead to significant resource (time, energy, bandwidth, etc.) savings. Error-efficient computing can transform the way we design hardware and software to exploit new sources of compute efficiency; however, excessive programmer burden and a lack of principled design methodologies have thwarted its adoption. My research addresses these limitations through foundational contributions that enable the adoption of error-efficiency as a first-class design principle by a variety of users and application domains. In this talk, I will show how my work (1) enables an understanding of how errors affect program execution by providing a suite of automated and scalable error analysis tools, (2) demonstrates how such an understanding can be exploited to build customized error-efficiency solutions targeted to low-cost hardware resiliency and approximate computing and (3) develops methodologies for principled integration of error-efficiency into the software and hardware design workflow. Finally, I will discuss future research avenues in error-efficient computing with multi-disciplinary implications in core disciplines (programming languages, software engineering, hardware design, systems) and emerging application areas (AI, VR, robotics, edge computing). Bio: Radha is a doctoral candidate in Computer Science at the University of Illinois at Urbana-Champaign. Her research interests lie in the area of Computer Architecture and Systems. Radha’s dissertation work aims to build efficient computing systems that redefine “correctness” as producing results that are good enough to ensure an acceptable user experience. Radha’s research work has been nominated to the IBM Pat Goldberg Memorial Best Paper Award for 2019. She was among 20 people invited to participate in an exploratory workshop on error-efficient computing systems initiated by the Swiss National Science Foundation and is one of 200 young researchers in Math and Computer Science worldwide to be selected for the prestigious 2018 Heidelberg Laureate Forum. Radha was selected for the Rising Stars in EECS and the Rising Stars in Computer Architecture (RISC-A) workshops for the year 2019. Before joining the University of Illinois, Radha was a CPU/Silicon validation engineer at Intel where her work won a divisional award for key contributions in validating new industry standard CPU features. Prior to that, she worked briefly at Qualcomm on architectural verification of the Snapdragon processor.
Feb 18 (Tuesday) 12:30 pm	2405 SC	Umang Mathur	Title: Algorithmic Advances for Dynamic Concurrency Bug Detection Abstract: Concurrency is indispensable in modern software applications, but it can be extremely tricky to get right. Concurrency bugs, such as data races, occur in production despite rigorous development-time testing. Thus, techniques for detecting concurrency bugs have gained a lot of attention in the last three decades. However, existing techniques still suffer from significant problems: they do not scale, they report false alarms, or they miss many bugs. In this talk, I will discuss my work on fundamental algorithmic advances to solve these problems in the context of dynamic data race prediction, which infers data races from allowable reorderings of concurrent executions. In the first part of my talk, I will describe a new partial order, Weak Causal Precedence (WCP), and a race prediction algorithm based on WCP that reports no false alarms, reports more races than existing techniques, and, most importantly, achieves the holy grail of dynamic race prediction---linear running time. In the second part of my talk, I will discuss data race detection from compressed execution traces. Extremely large traces from applications like web browsers and high runtime overhead of dynamic analyses warrants offline analysis of stored traces. However, large traces require expensive warehousing and companies commonly store these traces in compressed form. I will discuss the first algorithms that directly detect data races from compressed traces without first decompressing the traces. These algorithms exploit compositionality and run in time that is linear in the size of the compressed trace. Bio: Umang Mathur is a PhD candidate in the CS Department of the University of Illinois at Urbana Champaign, where he is advised by Mahesh Viswanathan. His research interests are in Formal Methods and Logic with applications to Programming Languages and Software Engineering. During his PhD, he developed techniques for detecting concurrency bugs (including data races, deadlocks, atomicity violations), decidable program verification, program synthesis and analysis of stochastic and real-time cyber-physical systems. Umang is a recipient of the 2019 Google PhD Fellowship for Programming Technology and Software Engineering, C. W. Gear Outstanding Graduate Student Award, a Mavis Future Faculty Fellowship at Illinois, and an ACM Distinguished Paper Award at ESEC/FSE 2018. More information can be found on his website: http://umathur3.web.engr.illinois.edu/

2018-2019 Schedule

Date and Time	Room	Speaker	Title and Abstract
Feb 1 (Fri) 12:30 pm	3401SC	Qi Li	Title: Pattern-Based Mining of Entity/Relation Structures from Massive Text Abstract: Majority of information nowadays is carried by massive and unstructured text, in the form of news, articles, reports, or social media messages. This poses a major research challenge on mining entity/relation structures from unstructured text. Manual curation or labeling cannot be scalable to match the rapid growth of text. Most existing information extraction approaches rely on heavy human annotations, which can be too expensive to tune and not adaptable to new domains. In this talk, I will present a pattern-based methodology that conducts information extraction from the massive corpora using existing resources with little human effort. The first component, WW-PIE, discovers meaningful textual patterns that contain the entities of interest. The second component, TruePIE, discovers high quality textual patterns for target relation types. I will demonstrate how semi-supervised methods can empower information extraction for broad applications and provide explainable results. Bio: Qi Li is currently a postdoc researcher and adjunct professor at Department of Computer Science, University of Illinois at Urbana-Champaign, working with Prof. Jiawei Han. Her research interests lie in the area of data mining with a focus on the extraction and aggregation of information from multiple data sources. Qi obtained her PhD in Computer Science and Engineering from the State University of New York at Buffalo in 2017 advised by Prof. Jing Gao, and MS in Statistics from University of Illinois at Urbana-Champaign in 2012. She has received several awards including the Presidential Fellowship of University at Buffalo, the Best CSE Graduate Research Award and the CSE Best Dissertation Award at Department of Computer Science and Engineering, University at Buffalo. More information can be found at https://publish.illinois.edu/qili5/.
Feb 1 (Fri) 3:30pm	4405SC	Owolabi Legunsen	Title: Evolution-Aware Runtime Verification Abstract: The risk posed by software bugs has increased significantly as software is now essential to many areas of our daily lives. Runtime verification can help find bugs by monitoring program executions against formally specified properties. Over the last two decades, tremendous research progress has improved the performance of runtime verification. However, there has been very little focus on the benefits and challenges of using runtime verification during software testing. Yet, testing generates many executions on which properties can be monitored. In this talk, I will describe my work on studying and improving runtime verification during testing. My large-scale study was the first to show that runtime verification during testing is beneficial for finding many important bugs from tests that developers already have. However, my study also showed that runtime verification still incurs high overhead, both in machine time to monitor properties and in developer time to inspect violations of the properties. Moreover, all prior runtime verification techniques consider only one program version and would wastefully re-monitor unaffected properties and code as software evolves. To reduce the overhead across multiple program versions, I proposed the first evolution-aware runtime verification techniques. My techniques exploit the key insight that software evolves in small increments and reduce the accumulated runtime verification overhead by up to 10x, without missing new violations. Bio: Owolabi LegunsenisaPhD candidate in Computer Science at the University of Illinois at Urbana-Champaign, where he works with Darko Marinov and Grigore Rosu. Owolabi's interests are in Software Engineering and Applied Formal Methods, with a focus on Software Testing and Runtime Verification. His research on runtime verification during software testing received an ACM SIGSOFT Distinguished Paper Award at ASE 2016. More information is available on his web page: http://mir.cs.illinois.edu/legunsen
Feb 5 (Tue) 10:00am	SC 2405	Mengjia Yan	Title: Secure Computer Hardware in the Age of Pervasive Security Attacks Abstract: Recent attacks such as Spectre and Meltdown have shown how vulnerable modern computer hardware is. The root cause of the problem is that computer architects have traditionally focused on performance and energy efficiency. Security has never been a first-class requirement. Moving forward, however, this has to radically change: we need to rethink computer architecture from the ground-up for security. As an example of this vision, in this talk, I will focus on speculative execution in out-of-order processors --- a core computer architecture technology that is the target of the recent attacks. I will describe InvisiSpec, the first robust hardware defense mechanism against speculative (a.k.a transient) execution attacks. The idea is to make loads invisible in the cache hierarchy, and only reveal their presence at the point when they are safe. Once an instruction is deemed safe, our hardware is able to cheaply modify the cache coherence state in a consistent manner. Further, to reduce the cost of InvisiSpec and increase its protection coverage, I propose Speculative Taint Tracking (STT). This is a novel form of information flow tracking that is specifically designed for speculative execution. It reduces cost by allowing tainted instructions to become safe early, and by effectively leveraging the predictor hardware that is ubiquitous in modern processors. Further improvements of InvisiSpec-STT can be attained with new compiler techniques. Finally, I will conclude my talk by describing ongoing and future directions towards designing secure processors. BIO: Mengjia Yan is a Ph.D. student at the University of Illinois at Urbana-Champaign (UIUC), working with Professor Josep Torrellas. Her research interest lies in the areas of computer architecture and hardware security, with a focus on defenses against transient execution attacks and cache-based side channel attacks. Her work has appeared in some of the top venues in computer architecture and security, and has sparked a large research collaboration initiative between UIUC and Intel. Mengjia received the UIUC College of Engineering Mavis Future Faculty Fellow, the Computer Science W.J. Poppelbaum Memorial Award, a MICRO TopPicks in Computer Architecture Honorable Mention, and was invited to participate in two Rising Stars workshops.
Feb 8 (Fri) 3:30 pm	SC 4405	Sangeetha Abdu Jyothi	Title: Automated Resource Management in Large-Scale Networked Systems Abstract: The multitude of Internet applications relies on large-scale networked environments such as the cloud for their backend support. In these multi-tenanted environments, various stakeholders have diverse goals. The objective of the infrastructure provider is to increase revenue by utilizing the resources efficiently. Applications, on the other hand, want to meet their performance requirements at minimal cost. However, estimating the exact amount of resources required to meet the application needs is a difficult task, even for expert users. Easy workarounds employed for tackling this problem, such as resource over-provisioning, negatively impact the goals of the provider, applications, or both. In this talk, I will discuss the design of application-aware self-optimizing systems through automated resource management that helps meet the varied goals of the provider and applications in large-scale networked environments. The key steps in closed-loop resource management include learning of application resource needs, efficient scheduling of resources, and adaptation to variations in real-time. I will describe how I apply this high-level approach in two distinct environments using (a) Morpheus in enterprise clusters, and (b) Patronus in cellular provider networks with geo-distributed micro data centers. I will also touch upon my related work in application-specific context at the intersection of network scheduling and deep learning. I will conclude with my vision for self-optimizing systems including fully automated clouds and an elastic geo-distributed platform forthousandsofmicro data centers. Bio: Sangeetha Abdu Jyothi is a Ph.D. candidate at the University of Illinois at Urbana-Champaign advised by Brighten Godfrey. Her research interests lie in the areas of computer networking and systems with a focus on building application-aware self-optimizing systems through automated resource management. She is a winner of the Facebook Graduate Fellowship (2017-2019) and the Mavis Future Faculty Fellowship (2017-2018). She was invited to attend the Rising Stars in EECS workshop at MIT (2018). Website: http://abdujyo2.web.engr.illinois.edu
Feb 26 (Tue) 3:00 pm	SC 3403	Motahhare Eslami	Title: Communicating Opaque Algorithmic Processes in Socio-Technical Systems Abstract: Algorithms play a vital role in curating online information in socio-technical systems, however, they are usually housed in black-boxes that limit users’ understanding of how an algorithmic decision is made. While this opacity partly stems from protecting intellectual property and preventing malicious users from gaming the system, it is also designed to provide users with seamless, effortless system interactions. However, this opacity can result in misinformed behavior among users, particularly when there is no clear feedback mechanism for users to understand the effects of their own actions on an algorithmic system. The increasing prevalence and power of these opaque algorithms coupled with their sometimes biased and discriminatory decisions raise questions about how knowledgeable users are and should be about the existence, operation and possible impacts of these algorithms. In this talk, I will address these questions by exploring ways to investigate users’ behavior around opaque algorithmic systems. I will then present new design techniques that communicate opaque algorithmic processes to users and provide them with a more informed, satisfying, and engaging interaction. In doing so, I will add new angles to the old idea of understanding the interaction between users and automation by designing around algorithm sensemaking and algorithm transparency. Bio: Motahhare Eslami is a Ph.D. Candidate in Computer Science at the University of Illinois at Urbana-Champaign, where she is advised by Karrie Karahalios. Motahhare’s research develops new communication techniques between users and opaque algorithmic socio-technical systems to provide users a more informed, satisfying, and engaging interaction. Her work has been recognized with a Google PhD Fellowship, Best Paper Award at ACM CHI, and has been covered in mainstream media such as Time, The Washington Post, Huffington Post, the BBC, Fortune, and Quartz. Motahhare is also a Facebook and Adobe PhD fellowship finalist, and a recipient of C.W. Gear Outstanding Graduate Student Award, Saburo Muroga Endowed Fellowship, Feng Chen Memorial Award, Young Researcher in Heidelberg Laureate Forum and Rising Stars in EECS.
Mar 5 (Tue) 4:00 pm	SC 3403	Jingbo Shang	Title: AutoNet: Automated Network Construction from Massive Text Corpora Abstract: Mining structured knowledge from massive unstructured text data is a key challenge in data science. In this talk, I will discuss my proposed framework, AutoNet, that transforms unstructured text data into structured heterogeneous information networks, on which actionable knowledge can be further uncovered flexibly and effectively. AutoNet is a data-driven approach using distant supervision instead of human curation and labeling. It consists of four essential steps: (1) quality phrase mining; (2) entity recognition and typing; (3) relation extraction; and (4) taxonomy construction. Along this line, I have developed a number of state-of-the-art distantly-supervised/unsupervised methods and published them in top conferences and journals. Specifically, I will present my work about phrase mining, entity recognition, and taxonomy construction in details, while touching the other work slightly. Finally, I will summarize the AutoNet framework with a demo video and conclude by discussing future work collaborating with other disciplines. Bio: Jingbo Shang is a Ph.D. candidate in Department of Computer Science, the University of Illinois at Urbana-Champaign. He received his B.E. from the Computer Science Department, Shanghai Jiao Tong University, China. His research focuses on mining and constructing structured knowledge from massive text corpora with minimum human effort. His research has been recognized by many prestigious awards, including Computer Science Excellence Scholarship from CS@Illinois, Grand Prize of Yelp Dataset Challenge in 2015, and Google Ph.D. Fellowship in Structured Data and Database Management in 2017, and C.W. Gear Outstanding Graduate Award in 2018.

2017-2018 Schedule

Date and
Time

Room

Speaker

Title and Abstract

2/15 Thursday 1:30PM

SC2405

Wei Yang

Title: Adversarial-Resilience Assurance for Mobile Security Systems

Abstract: For too long, researchers have often tackled security in an attack-driven, ad hoc, and reactionary manner with large manual efforts devoted by security analysts. In order to make substantial progress in security, I advocate to shift such manner to be automated, intelligent, and adversarial resilient. Over the course of my Ph.D. research, I have built security systems incorporating intelligent security techniques based on program analysis, natural language processing, and machine learning, and I have developed corresponding defenses and testing methodologies to guard against emerging attacks specifically adversarial to these newly-proposed security techniques. In this talk, I will first highlight two of these systems for mobile security: AppContext and WHYPER. Then I will show how to generate adversarial inputs for testing and further strengthening these systems. I will conclude by discussing how future research efforts can leverage the interplay between AI and security techniques toward a defense-driven security ecosystem.

2/23 Friday 10am

SC3403

Chao Zhang

Title: Knowledge Cube Construction from Massive Social Sensing Data
Abstract: Social sensing data are massive and ubiquitous. Effective and scalable analytics of social sensing data can be game changing for urban science, business, healthcare, and homeland security. However, such data pose great challenges to computer science research since they are often unstructured, fragmented, noisy, and intermingled with rich contexts. In this talk, I will introduce a systematic framework, KnowCube, that addresses the above challenges by turning unstructured, noisy social sensing data into a structured, multidimensional knowledge cube. In particular, I will discuss in detail how to solve two key problems for knowledge cube construction: (1) how to extract events from noisy social sensing data; and (2) how to organize unstructured events into a multidimensional cube structure without supervision. KnowCube serves as a versatile and easy-to-use knowledge engine that can harness the power of social sensing for many applications. Finally, I will share some future research directions on better knowledge cube construction and building next-generation intelligent systems with the knowledge cube.

3/9 Friday 10am

CSL301

Izzat El Hajj

Title: Building Programming Systems in a World of Increasing Heterogeneity

Abstract:

The breakdown of Dennard Scaling and slowing down of Moore's Law has led to an explosion of new processor and memory technologies which is making computing systems evolve to become increasingly heterogeneous. We are seeing GPUs, FPGAs, and special purpose accelerators become central parts of systems, as well as a growing interest in persistent byte-addressable memories and near-memory acceleration. While these technologies provide massive performance gains and energy savings that are not possible on traditional systems, they tend to be very tedious to program which introduces a heavy burden on software developers and presents a significant barrier to adoption. It is therefore critical that these hardware innovations be met with software innovations that facilitate programmability.

In this talk, I will discuss my work on building programming systems (languages, compilers, runtimes, OS support) for emerging processor and memory technologies. My talk will focus on two particular systems: (1) a compiler and runtime for improving performance and programmability of irregular applications on GPUs, and (2) a novel programmable accelerator and compiler that leverage analog computing via memristive crossbars to accelerate deep learning workloads. I will also discuss my future directions in both lines of work.

Bio:

Izzat is a PhD candidate in Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign, and a member of the IMPACT Research Group working with Prof. Wen-mei Hwu. Izzat's research interests are in building programming systems for emerging processor and memory technologies. He has worked on programming systems for GPUs tackling issues of performance portability (CGO'15, MICRO'16), irregular application optimization (MICRO'16), and collaborative execution via unified virtual memory (ISPASS'17). For his work on GPU programming systems, he holds the Dan Vivoli Endowed Fellowship '17-'18. He has also worked on programming systems for emerging resistive memory technologies, tackling the issue of persistent object representation (ASPLOS'16, OOPSLA'17) and the use of memristive crossbars for accelerating deep learning workloads (in submission). For the former, he received the HiPEAC paper award and has submitted multiple patent applications. Izzat received his BE in Electrical and Computer Engineering in 2011 at the American University of Beirut (AUB), where he graduated with high distinction and received the Distinguished Graduate Award.

2016-2017 Schedule

Date and Time	Room	Speaker	Title and Abstract
February 9 (Thursday) 330 PM	2405 SC	Matt Sinclair	Title: Efficient Coherence and Consistency for Specialized Memory Hierarchies Abstract: As the benefits from transistor scaling slow down, specialized accelerators and heterogeneous computing are becoming increasingly important because they can significantly improve performance and energy efficiency for specific applications. An efficient and easy-to-program memory and communication architecture is critical in achieving the promise of such systems. Traditionally, accelerators in heterogeneous systems used discrete address spaces and employed specialized memories, e.g., scratchpads, for specific access patterns. These attributes make these systems difficult to program and inefficient in the presence of high data reuse and fine-grained synchronization -- traits that are common in emerging applications such as graph analytics workloads. My thesis resolves these inefficiencies with cross-cutting research that rethinks the software, hardware, and hardware-software interface of heterogeneous systems. Underlying my work is the efficient support of a global address space across all accelerator memories (for easier programming), an efficient cache coherence protocol (for efficient hardware), and a familiar memory consistency model (for an appropriate hardware-software interface). First, I consider heterogeneous systems that have recently started to support global address spaces. I expose the imbalance between coherence and consistency in such systems. Current systems use simple, software-based coherence protocols that require heavyweight actions at synchronization points. To deal with this, industry has moved to complex consistency models that use scoped synchronization, making consistency models for heterogeneous systems even more complex than the already complicated CPU consistency models. I introduce a low overhead cache coherence protocol, DeNovo, that adjusts the imbalance and enables heterogeneous systems to use the standard, simpler data-race-free (DRF) consistency model. Second, I explore a further source of complexity in consistency models: relaxed atomics. These have been the Achilles heel of CPU consistency models -- they have the promise of higher performance but have no known formal semantics. Heterogeneous systems' inefficient support for atomics makes using relaxed atomics particularly tempting. I extend the DRF consistency model to retain the efficiency benefits of relaxed atomics and provide better semantics for the common use cases of relaxed atomics in heterogeneous systems. Third, current systems continue to support specialized memories in private address spaces, which negate some of the benefits of specialization. I integrate these specialized memories into the global address space while retaining the benefits they provide. Overall, my research introduces a more efficient and easier-to-program heterogeneous memory hierarchy that significantly improves both performance and energy compared to the state-of-the-art heterogeneous systems. Bio: Matt Sinclair is a doctoral candidate in the Department of Computer Science at the University of Illinois at Urbana-Champaign. He is interested in computer architecture and systems, with a current focus on building efficient memory hierarchies for heterogeneous systems. His papers at the 2015 International Symposium on Computer Architecture (ISCA) and 2015 International Symposium on Microarchitecture (MICRO) were recognized as 2016 IEEE Micro Top Picks Honorable Mentions. He is the recipient of a Qualcomm Innovation Fellowship, two Mavis Future Faculty Fellowships, the Feng Chen Memorial Award, the W.J. Poppelbaum Award, and a Saburo Muroga Fellowship. He was also selected to attend and present his research at the 2016 Heidelberg Laureate Forum. He received a BS in Computer Science with Honors & Computer Engineering (2009) and a MS in Electrical Engineering (2011) from the University of Wisconsin-Madison.
February 16 (Thursday) 3.30 PM	2405 SC	Renato Mancuso	Title: Safe, Real-Time Software Reference Architectures for Cyber-Physical Systems Abstract: There has been an uptrend in the demand and need for complex Cyber-Physical Systems (CPS), such as self-driving cars, unmanned aerial vehicles (UAVs), and smart manufacturing systems for Industry 4.0. CPS often need to accurately sense the surrounding environment by using high-bandwidth acoustic, imaging and other types of sensors; and to take coordinated decisions and issue time critical actuation commands. Hence, temporal predictability in sensing, communication, computation, and actuation is a fundamental attribute. Additionally, CPS must operate safely even in spite of software and hardware misbehavior to avoid catastrophic failures. To satisfy the increasing demand for performance, modern computing platforms have substantially increased in complexity; for instance, multi-core systems are now mainstream, and partially re-programmable system-on-chip (SoC) have just entered production. Unfortunately, extensive and unregulated sharing of hardware resources directly undermines theabilityofguaranteeingstrongtemporal determinism in modern computing platforms. Novel software architectures are needed to restore temporal correctness of complex CPS when using these platforms. My research vision is to design and implement software architectures that can serve as a reference for the development of high-performance CPS, and that embody two main requirements: temporal predictability and robustness. In this talk, I will address the following questions concerning modern multi-core systems: Why application timing can be highly unpredictable? What techniques can be used to enforce safe temporal behaviors on multi-core platforms? I will also illustrate possible approaches for time-aware fault tolerance to maximize CPS functional safety. Finally, I will review the challenges faced by the embedded industry when trying to adopt emerging computing platforms, and I will highlight some novel directions that can be followed to accomplish my research vision. Bio: Renato Mancuso is a doctoral candidate in the Department of Computer Science at the University of Illinois at Urbana-Champaign. He is interested in high-performance cyber-physical systems, with a specific focus on techniques to enforce strong performance isolation and temporal predictability in multi-core systems. He has publishedaround20 papers in major conferences and journals. His papers were awarded a best student paper awardandabestpresentationaward at the Real-Time and Embedded Technology and Applications Symposium (RTAS) in 2013 and 2016, respectively. He was the recipient of a Computer Science Excellence Fellowship, and a finalist for the Qualcomm Innovation Fellowship. Some of the design principles for real-time multi-core computing proposed in his research have been officially incorporated in recent certification guidelines for avionics systems. They have also been endorsed by government agencies,industriesandresearch institutions worldwide. He received a B.S. in Computer Engineering with honors (2009)andaM.S.in Computer Engineering with honors (2012) from the University of Rome "Tor Vergata".
February 17 11am-12pm	2405 SC	Xiang Ren	Title: Effort-Light StructMine: Turning Massive Corpora into Structures Abstract: The real-world data, though massive, are hard for machines to resolve as they are largely unstructured and in the form of natural-language text. One of the grand challenges is to turn such massive corpora into machine-actionable structures. Yet, most existing systemshaveheavy reliance on human effort in the process of structuring various corpora, slowing down the development of downstream applications. In this talk, I will introduce a data-driven framework, Effort-Light StructMine, that extracts structured facts from massive corpora without explicit human labeling effort. In particular, I will discuss how to solve three StructMine tasks under Effort-Light StructMine framework: from identifying typed entitiesintext,tofine-grained entity typing, to extracting typed relationships between entities. Together, these threesolutionsformaclear roadmap for turning a massive corpus into a structured network to represent its factual knowledge. Finally, I will share some directions towards mining corpus-specific structured networks for knowledge discovery. Bio: Xiang Ren is a Computer SciencePhD candidate atUniversity of Illinois at Urbana-Champaign, working with Jiawei Han and the Data and Information System（DAIS）Research Lab. Xiang’s research develops data-driven methods for turning unstructured text data into machine-actionable structures. More broadly, his research interests span data mining, machine learning, and natural language processing, with a focus on making sense of massive text corpora. His research has been recognized with a GooglePhD Fellowship, Yahoo!-DAIS Research Excellence Award, C. W. Gear Outstanding Graduate Student Award, and has been transferred to US Army Research Lab, NIH,Microsoft,Yelp and TripAdvisor.
February 24 (Friday) 1:30 PM	2405 SC	Man-Ki Yoon	Title: CPS Security: Algorithms, Analysis and Experimental Validation Abstract: The increased computational power and connectivity in modern Cyber-Physical Systems (CPS) inevitably introduce more security vulnerabilities. CPS poses unique security challenges, as such systems are required to meet stringent requirements such as timing constraints as well as strong safety requirements. On the other hand, it provides defenders with an opportunity to take advantage of the design and implementation constraints and the tight coupling of cyber and physical components to deter attackers. In this talk, we will discuss how such intrinsic characteristics of CPS can be used as an asymmetric advantage to detect security attacks to safety-critical CPS. We will particularly discuss (a) modeling and reasoning about the logical (temporal and spatial) and physical behaviors of CPS, (b) architecturalandoperating-systemsupports for trusted, efficient run-time behavior monitoring, and (c) attack-resilient architectures. Bio: Man-Ki Yoon isaPhD candidate in Computer Science at the University of Illinois at Urbana-Champaign. His research interests are in analytic tools and system design principles for secure cyber-physical and real-time embedded systems, applying computer architecture, real-time scheduling, and statistical learning techniques. He is a recipient of Qualcomm Innovation Fellowship, Qualcomm Roberto Padovani Scholarship, and IntelPhD fellowship.
March 8 (Wednesday) 1:00 PM	2405 SC	Snigdha Chaturvedi	Title: Structured Approaches to Natural Language Understanding Abstract: Despite recent advancements in Natural Language Processing, computers today cannot understand text in the ways that humans can. My research aims at creating computational methods that not only read but also interpret and reason about text. To accomplish this, I develop machine-learning methods that incorporate different sources of social context, linguistic structure and semantic knowledge while processing text. In this talk, I exhibit this by discussing two specific applications of natural language understanding that focus on comprehension of narratives: (i) Choosing correct endings to stories, and (ii) Identifying inter-personal relationships from narratives. Automatic narrative comprehension is a fundamental challenge in Natural Language Understanding, and can enable computers to understand social norms, human behavior and common sense by processing large corpora of such texts. In the first part of the talk, I present a model that attempts to understand a story on three semantic axes: (i) its sequence of events, (ii) its emotional trajectory, and (iii) its plot consistency. We judge the model’s understanding by inquiring if, like humans, it can develop an expectation of what will happen next in a story and predict the correct ending from possible alternatives. In the next part of the talk I address another important aspect of Natural Language Understanding: identifying social relationships from unstructured text. Understanding such relationships is essential for developing an understanding of people's goals, actions and expected behavior in stories. We develop structured models that incorporate linguistic as well as contextual cues for capturing the evolving nature of human relationships. We automatically discover various types of relationships in a data-driven manner. I conclude with a discussion of future directions, and some real-world scenarios that would gain from such advancements in natural language understanding, including social networks, discussion fora, intelligent virtual assistants, and artificial tutors. Bio: Snigdha Chaturvedi is a postdoctoral fellow at University of Illinois, Urbana Champaign, working with Professor Dan Roth. She specializes in the field of Natural Language Processing with emphasis on developing methods for natural language understanding. Her research has been recognized with the IBM Ph.D. Fellowship (twice), a best paper award at NAACL, and first prize at ACM student research competition held at Grace Hopper Conference. She completed her Ph.D. in Computer Science at University of Maryland, College Park in 2016 and Bachelors in Technology from Indian Institute of Technology, Kanpur in 2009. She has previously held a position as a Blue Scholar at IBM Research, India.
March 9 (Thursday) 4:00pm-5:30pm	2405 SC	Yingyan Lin	Title: Energy-efficient Systems for Information Processing and Transfer Abstract: Machine learning (ML) algorithms are increasingly pervasive in tackling the data deluge of the 21st Century. Current ML systems adopt either a centralized cloud computing or a distributed mobile computing paradigm. In both paradigms, the challenge of energy efficiency is drawing increased attention. In cloud computing, data transfer due to inter-chip, inter-board, inter-shelf and inter-rack communications (I/O interface) within data centers is one of the dominant energy costs. This will only intensify with the growing demand for increased I/O bandwidth for high-performance computing in data centers. On the other hand, in mobile computing, energy efficiency is the primary design challenge, as mobile devices have limited energy, computation and storage resources. This challenge is being exacerbated by the need to embed ML algorithms for enabling local inference capabilities. In this talk, I will present system-to-circuit approaches for addressing these energy efficiency challenges. First, I will describe the design of a 4-bit, 4 GS/s bit-error-rate optimal analog-to-digital converter in 90nm CMOS and its use in realizing an energy-efficient 4Gb/s serial link receiver for I/O interface in data centers. Next, I will describe two techniques that can potentially enable on-device deployment of convolutional neural networks (CNNs) by significantly reducing the energy consumption via algorithmic/architectural innovation. Finally, I will identify future research directions in the emerging area of machine learning on resource-constrained silicon platforms. Bio: Yingyan Lin is a Ph.D. candidate in the Electrical and Computer Engineering Department at the University of Illinois at Urbana-Champaign under the advisement of Professor Naresh Shanbhag. She expects to receive her Ph.D. degree in June 2017. Her research includes analog and mixed signal circuits for I/O interfaces, error resiliency techniques, VLSI circuits and architectures for machine learning on resource-constrained silicon platforms. She has 11 peer-reviewed publications as the first author on the subject and designed three high-speed interface circuit IPs for large flat panel display applications that were acquired by TOSHIBA Microelectronics Corporation in Japan. She received the 2016 IEEE International Workshop on Signal Processing Systems second place Best Student Paper Award and is the recipient of the 2016-2017 Robert T. Chien Memorial Award for Excellence in Research at UIUC.

2015-2016 Schedule

Organized by Madhusudan Parthasarathy and Darko Marinov

Date and Time

Room

Speaker

Title and Abstract

Feb 8 (Monday)

4PM

3405SC

Parisa Kordjamshidi

Title: Declarative Learning based Programming for Structured Machine Learning in Natural Language Processing

Developing intelligent problem solving systems that deal with real world messy data requires addressing a range of scientific and engineering challenges. Conventional programming languages offer no help to application programmers that attempt to make use of real world data, and reason about it in a way that involves learning interdependent concepts from data, incorporating existing models, and reasoning about them. Over the last few years the research community has tried to address these problems from multiple perspectives, most notably various approaches based on Probabilistic programming, Logical programming and integrated paradigms. In this talk I present Saul, a new declarative learning based programming (DeLBP) language that aims at facilitating the design and development of intelligent real world applications that use machine learning and reasoning. Our new language addresses the following challenges: Interaction with messy data; Specifying the problem at a high level i.e. application level; Dealing with uncertainty in data and knowledge; Supporting structured learning and reasoning while considering expert knowledge that is represented declaratively. An additional advantage of such a paradigm is generating easily reusable models and code, henceforth increasing the replicability of research results. I exemplify the flexibility and the expressive power of this language using a number of applications in natural language processing domain.

Feb 19 (Friday)

2PM

2405SC

Jia-Bin Huang

Visual Analysis and Synthesis with Physically Grounded Constraints

The past decade has witnessed remarkable progress in image-based, data-driven vision and graphics. However, existing approaches often treat the images as pure 2D signals and not as a 2D projection of the physical 3D world. As a result, a lot of training examples are required to cover sufficiently diverse appearances and inevitably suffer from limited generalization capability. In this talk, I will present "inference-by-composition" approaches to overcome these limitations by modeling and interpreting visual signals in terms of physical surface, object, and scene. I will show how we can incorporate physically grounded constraints in a non-parametric optimization framework for (1) revealing the missing parts of an image due to removal of a foreground or background element, (2) recovering high spatial frequency details that are not resolvable in low-resolution observations, and (3) discovering multiple approximately linear structures in extremely noisy videos with an ecological application to bird migration monitoring at night. The resulting algorithms are simple and intuitive while achieving state-of-the-art performance without the need of training on an exhaustive set of visual examples. I will end my talk with a brief discussion of some key challenges and opportunities in visual learning with weak supervision.

Bio: Jia-Bin Huang is a Ph.D. candidate in the Department of Electrical and Computer Engineering at University of Illinois, Urbana-Champaign advised by Prof. Narendra Ahuja. His research interests include computer vision, computer graphics, and machine learning with a focus on visual analysis and synthesis with physically grounded constraints. His research received the best student paper award in IAPR International Conference on Pattern Recognition (ICPR) in 2012 for the work on computational modeling of visual saliency and the best paper award in the ACM Symposium on Eye Tracking Research and Applications (ETRA) in 2014 for work on learning-based eye gaze tracking. Huang is the recipient of the UIUC Graduate College Dissertation Completion Fellowship (2015), Thomas and Margaret Huang Award for Graduate Research (2015), Beckman Cognitive Science/Artificial Intelligence Award (2015), Sundaram Seshu Fellowship (2014), MOE Technologies Incubation Scholarship (2014), and the PURE Best Research Mentor Award (2012).

Personal website: http://www.jiabinhuang.com

2014-2015 Schedule

Organized by Madhusudan Parthasarathy and Darko Marinov

Date and Time	Room	Speaker	Title and Abstract
Feb 4 (Wed) 10am	2405	Milos Gligoric	Regression Testing: Theory and Practice Developers often build regression test suites that are automatically run for each code revision to check that code changes did not break any functionality. While regression testing is important, it is also expensive due to both the number of revisions and the number of tests. For example, Google recently reported that they observed a quadratic increase in daily test-suite run time (a linear increase in the number of revisions per day and a linear increase in the number of tests per revision). In this talk, I present a technique, called Ekstazi, to substantially reduce test-suite run time. Ekstazi introduces a novel approach to regression test selection, which runs only a subset of tests whose dependencies may be affected by the latest changes; Ekstazi keeps file dependencies for each test. Ekstazi also speeds up test-suite runs for software that uses modern distributed version-control systems; by modeling different branch and merge commands directly, Ekstazi computes test sets that can be significantly smaller than the entire test suite. I developed Ekstazi for JVM languages and evaluated it on several hundred revisions of 32 open-source projects (totaling 5M lines of code). Ekstazi can reduce test-suite run time an order of magnitude, including runs for merge revisions. Finally, only a few months after the initial release, Ekstazi was adopted and used daily by many developers from several open-source projects, including Apache Camel, Commons Math, and CXF. Bio: Milos Gligoric is a PhD candidate in Computer Science at the University of Illinois at Urbana-Champaign (UIUC). His research interests are in software engineering and formal methods, especially in designing techniques and tools that improve software quality and developers' productivity. His PhD work has explored test input generation, test quality assessment, testing concurrent code, and regression testing. He won an ACM SIGSOFT Distinguished Paper Award (ICSE 2010), and three of his papers were invited for a journal submission. He was awarded the Saburo Muroga Fellowship (2009), the C.L. and Jane W-S. Liu Award (2012), and the C. W. Gear Outstanding Graduate Award (2014) from the UIUC Department of Computer Science, and the Mavis Future Faculty Fellowship (2014) from the UIUC College of Engineering. He did internships at NASA Ames, Intel, Max Planck Institute for Software Systems, and Microsoft Research. Milos holds a BS (2007) and MS (2009) from the University of Belgrade, Serbia.
Feb 10 (Tue) 10 am	2405	Kai-Wei Chang	Practical Learning Algorithms for Structured Prediction Models The desired output in many machine learning tasks is a structured object such as a tree, a clustering of nodes, or a sequence. Learning accurate prediction models for such problems requires training on large amounts of data, making use of expressive features and performing global inference that simultaneously assigns values to all interrelated nodes in the structure. All these contribute to significant scalability problems. We describe a collection of results that address several aspects of these problems – by carefully selecting and caching samples,structures, or latent items. Our results lead to efficient learning algorithms for structured prediction models and for online clustering models which, in turn, support reduction in problem size, improvements in training and evaluation speed and improved performance. We have used our algorithms to learn expressive models from large amounts of annotated data and achieve state-of-the art performance on several natural language processing tasks. Bio: Kai-Wei Chang is a doctoral candidate advised by Prof. Dan Roth in the Department of Computer Science, University of Illinois at Urbana-Champaign. His research interests lie in designing practical machine learning techniques for large and complex data and applying them to real world applications. He has been working on various topics in Machine learning and Natural Language Processing, including large-scale learning, structured learning, coreference resolution, and relation extraction. Kai-Wei was awarded the KDD Best Paper Award in 2010 and won the Yahoo! Key Scientific Challenges Award in 2011. He was one of the main contributors of a popular linear classification library, LIBLINEAR.
Feb 13 (Fri) 12:30 pm	2405	Benjamin Raichel	Fast geometric algorithms via netting, pruning, and sketching Abstract: The scale of modern geometric data sets necessitates fast algorithms. In this talk I will discuss several optimal linear (or near linear) time algorithms, which work by quickly throwing out and summarizing data, creating a compact sketch of the input. In the first part of the talk I will present a general framework called Net and Prune, which provides linear time approximation algorithms for a large class of well studied geometric optimization problems, such as k-center clustering and farthest nearest neighbor. The new approach is robust to variations in the input problem, and yet it is simple, elegant, and practical. In particular, many of these well studied problems which easily fit into our framework, either previously had no linear time approximation algorithms, or required rather involved algorithms and analysis. In the second part of the talk I will discuss contour trees, which provide a compact description of the level set behavior of structured geometric data. These trees are used in HPC applications such as combustion, chemical and fluid mixing simulations, where they are used to both summarize and explore the significantly larger simulation data. Here I will discuss an instance optimal algorithm for their computation, which runs in linear time when the tree is balanced. Bio: Benjamin Raichel is a PhD student in the Computer Science Department at the University of Illinois, Urbana-Champaign. His research interests are in algorithms and their applications. In particular he has developed fast and practical algorithms for a variety of geometric problems. He is currently funded by the UIUC Dissertation Completion Fellowship, and previously was awarded the Andrew and Shana Laursen Fellowship (2011-12) from the Department of Computer Science. Benjamin holds an MS degree in Computer Science (2011), as well as a BS degree with highest distinction in both Math and Physics (2009), from the University of Illinois.
Feb 18 (Wed) 3:00pm	2405	Yangqiu Song	Machine Learning with World Knowledge Machine learning algorithms have become pervasive in multiple domains and have started to have impact in applications. Nonetheless, a key obstacle in making learning protocol realistic in applications is the need to supervise them, a costly process that often requires hiring domain experts. However, while annotated data is difficult to get, we have available large amounts of data from the Web. In this talk, I will introduce learning paradigms which use existing world knowledge to “supervise” machine learning algorithms. By “world knowledge” we refer to general-purpose knowledge collected from the Web, and that can be used to extract both common sense knowledge and diverse domain specific knowledge and thus help supervise machine learning algorithms. I will discuss two projects, demonstrating that we can perform better machine learning and text data analytics by adapting general-purpose knowledge to domain specific tasks. For the first project, I will introduce the dataless classification algorithm which requires no labeled data to perform completely unsupervised text classification. In this case, the Wikipedia knowledge is embed to represent the text documents and the category labels into the same semantic space. For the second project, I will discuss how to perform hierarchical clustering of domain-specific short texts, e.g., Web queries and tweets, using a probabilistic concept based knowledge base, Probase. In both cases, we provide realistic and scalable algorithms to address large scale and fundamental text analytics problems. Bio: Dr. Yangqiu Song is a post-doctoral researcher at the Cognitive Computation Group at the University of Illinois at Urbana-Champaign. Before that, he was a post-doctoral fellow at Hong Kong University of Science and Technology and visiting researcher at Huawei Noah's Ark Lab, Hong Kong (2012-2013), an associate researcher at Microsoft Research Asia (2010-2012) and a staff researcher at IBM Research China (2009-2010) respectively. He received his B.E. and Ph.D. degrees from Tsinghua University, China, in July 2003 and January 2009, respectively. His current research focuses on using machine learning and data mining to extract and infer insightful knowledge from big data. The knowledge helps users better enjoy their daily living and social activities, or helps data scientists do better data analytics. He is particularly interested in working on large scale learning algorithms, on natural language understanding, text mining and visual analytics, and on knowledge engineering for domain applications.
Feb 25 (Wed) 2:00 pm	3403	Parasara Sridhar Duggirala	Dynamic Analysis of Cyber-Physical Systems Progress in computation and communication technologies has made it easier to integrate software in all walks of life. The social, economical, and environmental benefits of integrating software into avenues such as avionics, automotives, power grid, and medicine lead to the rise of CPS as an important area of research. However, bugs in software systems deployed in such safety-critical scenarios can lead to loss of property and in some cases life. In this talk, I will present dynamic analysis technique for formally verifying annotated Cyber-Physical Systems and prove the absence of bugs. The annotations, called discrepancy functions, are extensions of proof certificates for analyzing convergence or divergence of systems. One of the key advantages of dynamic analysis is that it leverages the testing procedures which are the only known scalable way of ensuring the system specification. I have developed a tool C2E2 that implements this technique and verifies temporal properties of CPS. C2E2 has been applied to verify alerting mechanisms in parallel aircraft landing protocol developed by NASA and to verify specification of powertrain control system presented as a verification challenge problem by Toyota. Bio: Parasara Sridhar Duggirala is a PhD Candidate in Computer Science Department at the University of Illinois at Urbana Champaign (UIUC). His main research interests are in Cyber-Physical Systems, Formal Methods, and Control Theory. His paper on Safety Verification of Linear Control Systems won the best paper award at the International Conference on Embedded Software (conducted as part of ESWeek) 2013. He also received Feng Chen Memorial Award in Software Engineering from the Department of Computer Science at UIUC, and has also been selected as a Young Research to attend Heidelberg Laureate Forum. He did internships at NEC Labs America and SRI International. He received his B.Tech (2009) from Indian Institute of Technology Guwahati.
Feb 26 (Thurs) 4:00pm	2405	Pranav Garg	Learning Invariants for Software Reliability and Security The central problem in software verification today is in establishing invariants that prove the system reliable or secure. Current technology requires invariants to be specified manually and is the bottleneck for adoption of verification in mainstream programming. I will describe recent advances in synthesizing invariants using machine learning techniques, embracing an inductive rather than a deductive approach to this problem. In particular, I propose a new learning model called ICE (involving Implication Counter-Examples) and develop new machine learning ICE algorithms that effectively synthesize invariants. I'll also describe ways of specifying infinite enumerated invariants using finite representations of it. These invariant generation techniques are applied to security and reliability domains including ExpressOS, a secure mobile operating system, GPU kernels, cloud systems, and the verification of the responsiveness of the full USB Windows phone driver. Bio: Pranav Garg is a PhD candidate in the Department of Computer Science at the University of Illinois at Urbana-Champaign. His research interests span areas at the intersection of programming languages, formal methods, and software engineering. His PhD research, in particular, focuses on automating verification for building reliable and secure software systems. He received his B.Tech (2009) in Computer Science and Engineering from the Indian Institute of Technology Kanpur.