PILOT SEMINAR
Organized by the Computer Science Associate DGS (started by Madhusudan Parthasarathy and Darko Marinov)
This seminar series gives Illinois students and postdocs a venue to practice their academic job talks.
We have several students and postdocs going on the job market each year, and this seminar aims to give them feedback from faculty who are outside their area.
What do you need to do to give a talk in this seminar?
Any graduating student or postdoc who is going to be on the academic job market can give a PILOT seminar.
All faculty, postdocs, and students will be invited to attend and give comments on the talk.
Each talk is meant to be the penultimate run before the actual interview talk, where the presenter seeks feedback from faculty outside the area.
We expect the presenter to have already given some practice talks earlier (we encourage at least two: one to the advisor's group and another to the relevant area).
To ensure there will be some faculty who attend the talks, we expect the following:
- The presenter, with the help of the advisor, must personally invite at least five faculty to attend the talk (consider inviting the seminar founders, the DGS, and the Associate DGS) and ensure that at least three faculty can come. These faculty should be outside the primary area of the presenter. When you ask for available times, plan for 90-minute slots (60 minutes for the talk + 30 minutes for feedback), and if you create a poll, enable the "maybe" option because some people may attend only part of the talk and meet you later.
- The time for the seminar should be fixed based on the availability of these faculty. To find a time that doesn't overlap with job talks or departmental seminars, please ask your advisor to check the available dates on the departmental calendar or the faculty wiki that lists all upcoming job talks (e.g., in Spring 2022, departmental talks take place on Mondays and Wednesdays at 3:30 pm, so avoid scheduling your seminar at those dates/times).
More faculty may attend since the talk will be publicly announced, but we would like to see some effort by the presenter and advisor to ensure that at least some people come to the talk.
Once you have a list of three faculty outside your area who have promised to attend your talk at some agreed time, please:
- edit this page to enter the date/time of your talk (sorted by date), your name, the talk title and abstract, and short bio;
- email Erin Henkelman (speakerseries@cs.illinois.edu) and Darko Marinov (marinov@illinois.edu) so that the department can schedule a physical room (hopefully we don't go fully online ever again!) and announce your talk; and
- fill out the form https://forms.illinois.edu/sec/3480516 no later than Thursday the week before your presentation date so the Speakers Series team can have sufficient time to set up and advertise your talk (materials sent after Thursday may not be included in departmental advertising). A member of the Speakers Series team will try to be at the beginning of your talk to help you get set up.
If you want to be hybrid and use Zoom, create a room on your own (so you get the video faster than if the department created a room for you). The room should be reserved for at least 90 minutes (60 minutes to present and at least 30 minutes to get feedback). If you use Zoom, please ask one of your attending faculty members to serve as a question moderator for your talk. They can help you manage the chat/questions during your seminar.
Please put slide numbers (in a visible place) on your slides during practice job talks.
You may find it useful to read these guidelines about academic job interviews:
- Getting an academic job by Michael Ernst - https://homes.cs.washington.edu/~mernst/advice/academic-job.html
- Computer Science Grad Student Job Application & Interview Guide by Westley Weimer, Claire Le Goues, and Zak Fry - http://web.eecs.umich.edu/~weimerw/grad-job-guide/guide
- How to get a faculty job, Part 2: The interview by Matt Welsh - http://matt-welsh.blogspot.com/2012/12/how-to-get-faculty-job-part-2-interview.html
- Tips on the Interview Process by Jeannette M. Wing - https://www.cs.cmu.edu/afs/cs/usr/wing/www/talks/tips.pdf
- Five Surprises from My Computer Science Academic Job Search by Arvind Narayanan - https://33bits.wordpress.com/2012/10/01/five-surprises-from-the-computer-science-academic-job-search
- Welcome to the Job Market by Elizabeth Bondi-Kelly - https://sites.google.com/view/elizabethbondi/blog
- Tips for Computer Science Faculty Applications by Yisong Yue - https://yisongyue.medium.com/checklist-of-tips-for-computer-science-faculty-applications-9fd2480649cc
- Reflections on the CS academic and industry job markets by Rowan Zellers - http://rowanzellers.com/blog/rowan-job-search
- Fantastic Faculty Jobs and How to Get Them by Jia-Bin Huang - https://dropbox.com/s/avkflol8mx99c7e/2022_12_05%20Academic%20Job%20workshop.pptx?dl=0
- Faculty Application Advice by Sylvia Herbert - https://sylviaherbert.com/faculty-application-advice
- UPenn has a lot of resources, e.g., https://cdn.uconnectlabs.com/wp-content/uploads/sites/74/2019/08/Faculty-job-application-guide.pdf linked from https://careerservices.upenn.edu/resources/guide-to-faculty-job-applications
If you are going on the job market soon, please add your info to https://cs.illinois.edu/about/people/graduating-phd-students or https://cs.illinois.edu/about/people/postdocs
2022-2023 Schedule
Date and Time | Room | Speaker | Title and Abstract |
---|---|---|---|
Jan 4th (Wednesday) 10am-11:30am | 2405 SC, zoom (https://illinois.zoom.us/my/manling2?pwd=SzM5Wk5neWlEK3VVTXBoa2ZMUXduZz09) | Manling Li | Title: From Entity-Centric to Event-Centric Multimodal Knowledge Acquisition Abstract: Events (what happened, who, when, where, why) describe fundamental human activities and are the core knowledge communicated through multiple forms of information, such as text, images, videos, or other data modalities. Our minds represent events at various levels of granularity and abstraction, which allows us to quickly access historical scenarios and reason about the future. Traditionally, multimodal information consumption has been entity-centric, with a focus on concrete concepts (such as objects, object types, physical relations), or has oversimplified event understanding to be single-modal (text-only or vision-only), local, sequential and flat. Real events are multimodal, structured and probabilistic. Hence, I focus on Multimodal Information Extraction, and propose Event-Centric Multimodal Knowledge Acquisition to transform traditional entity-centric single-modal knowledge into event-centric multi-modal knowledge. Such a transformation poses two significant challenges: (1) understanding multimodal semantic structures that are abstract (such as events and semantic roles of objects): I will present a novel framework, CLIP-Event, to learn visual semantic structures via zero-shot cross-modal transfer; (2) understanding temporal dynamics: I will introduce Event Graph Schema to capture complex timelines, intertwined participant relations and multiple possible outcomes. Such Event-Centric Multimodal Knowledge opens up the next generation of information access for deep semantic understanding behind multimodal information. I will also show its positive results on long-standing open problems, such as timeline generation, meeting summarization, and question answering. Bio: Manling Li is a Ph.D. candidate in the Computer Science Department at the University of Illinois Urbana-Champaign. Her work on multimodal knowledge extraction won the ACL'20 Best Demo Paper Award, and her work on scientific information extraction from COVID literature won the NAACL'21 Best Demo Paper Award. She was a recipient of the Microsoft Research PhD Fellowship in 2021. She was selected as a DARPA Riser in 2022 and an EECS Rising Star in 2022. She was awarded the C.L. Dave and Jane W.S. Liu Award, and has been selected as a Mavis Future Faculty Fellow. She led 19 students to develop the UIUC information extraction system, which ranked 1st in the DARPA AIDA TA1 evaluation each year. She has more than 30 publications on multimodal knowledge extraction and reasoning, and gave tutorials about event-centric multimodal knowledge at ACL'21, AAAI'21, NAACL'22, AAAI'23, etc. Additional information is available at https://limanling.github.io/. |
Jan 16th (Monday) 1pm-2:30pm | 2405 SC, zoom https://illinois.zoom.us/j/85237726901?pwd=SXYxTUloOHAwaGIxa3pvbW93SUdRZz09 | Saikat Dutta | Title: Randomness-Aware Testing of Machine Learning-based Systems Abstract: Machine Learning is rapidly revolutionizing the development of many modern-day systems. However, testing Machine Learning-based systems is challenging due to 1) the presence of non-determinism in internal components (e.g., stochastic algorithms) and external factors (e.g., execution environment) and 2) the lack of accuracy specifications. Most traditional software testing techniques, while widely used to improve software reliability, cannot tackle these challenges since they predominantly rely on an assumption of determinism and lack domain knowledge. The goal of my research is to develop novel testing techniques and tools to make Machine Learning-based systems more reliable. In this talk, I will present my work on automatically detecting bugs in Machine Learning-based systems and improving the quality of developer-written tests in such systems. My research exploits the fundamental principle that we can systematically reason about non-determinism and accuracy using rigorous statistical and probabilistic reasoning. I develop novel static and dynamic analyses for testing ML-based systems that build on this principle. My research exposed more than 50 bugs and improved the quality of hundreds of tests in more than 60 popular Machine Learning libraries, some of which are used in large-scale software ecosystems at companies like Microsoft, Google, Meta, Uber, and DeepMind. Bio: Saikat Dutta is a PhD Candidate in the Computer Science Department at UIUC, advised by Prof. Sasa Misailovic. Saikat’s research interests are at the intersection of Software Engineering and Machine Learning, with a focus on improving the reliability of Machine Learning-based systems by developing novel testing techniques and tools. Saikat is the recipient of the Facebook PhD Fellowship, 3M Foundation Fellowship, and the Mavis Future Faculty Fellowship. More information at https://saikatdutta.web.illinois.edu. |
Jan 26th (Thursday) 1:30 pm to 3pm | Online on Zoom - https://illinois.zoom.us/j/89767697233?pwd=TlcxanpMWDlmMEVZSk1xem1UOUp5Zz09 | Pubali Datta | Title: Looking Past the Abstractions: Characterizing Information Flow in Real-World Systems Abstract: Abstractions have proven essential for us to manage computing systems that are constantly growing in size and complexity. However, as core design primitives are obscured, these abstractions can also engender new security challenges. My research investigates these abstractions and the underlying core functionalities to identify the implicit flow violations in modern computing systems. In this talk, I will detail my efforts in characterizing flow violations and investigating attacks leveraging them. I will first describe how the “stateless” abstraction of serverless computing platforms masks a reality in which functions are cached in memory for long periods of time, enabling attackers to gain quasi-persistence, and how such attacks can be investigated through building serverless-aware provenance collection mechanisms. Then I will further investigate how IoT automation platforms (i.e., Trigger-Action Platforms) abstract the underlying information flows among rules installed within a smart home. I will present my findings on modeling and discovering inter-rule flow violations through building an information flow graph for smart homes. These efforts demonstrate how practical and widely deployable secure systems can be built through understanding the requirements of systems as well as identifying the root cause of violations of these requirements. Bio: Pubali Datta is a PhD candidate at the University of Illinois Urbana-Champaign, where she is advised by Professor Adam Bates in the study of system security and privacy. Pubali has conducted research on a variety of security topics, including IoT security, serverless cloud security, system auditing and provenance. Her dissertation is in the area of serverless cloud security, particularly in designing information flow control, access control and auditing mechanisms for serverless platforms – tailored to meet the design and operational requirements of such systems. Pubali has participated in graduate internships at Samsung Research America, SRI International and VMware. She will earn her Ph.D. in Computer Science from the University of Illinois Urbana-Champaign in the Spring of 2023. |
Jan 30th (Monday) 2pm-3:30pm | 2405 SC, zoom: https://illinois.zoom.us/j/87476498465?pwd=UGV4b2dGU3ZFZ3dCckFDVEkwbzd3dz09 | Riccardo Paccagnella | Title: Software Security Challenges in the Era of Modern Hardware Abstract: Today’s hardware cannot keep secrets. Indeed, the past two decades have seen the discovery of a slew of attacks where an adversary exploits hardware features to leak software’s sensitive data. These attacks have shaken the foundations of computer security and caused a major disruption in the software industry. Fortunately, there has been a saving grace, namely the widespread adoption of models that have enabled developers to build secure software while comprehensively preventing hardware vulnerabilities. In this talk, I will present two new classes of vulnerabilities that fundamentally undermine these prevailing models for building secure software. In the first part, I will demonstrate that the current constant-time programming model is insufficient to guarantee constant-time execution. In the second part, I will demonstrate that the current resource partitioning model is insufficient to guarantee software isolation. Finally, I will provide an overview of my future research plans for enabling the design of more secure software and hardware systems. Bio: Riccardo Paccagnella is a PhD candidate in Computer Science at the University of Illinois Urbana-Champaign. His research is in system and hardware security. Riccardo is a recipient of a Distinguished Reviewer Award at the IEEE S&P 2021 Shadow PC, a Siebel Scholars Award, and a Chirag Foundation Graduate Fellowship. His work has been covered by national and international press — including Ars Technica, New Scientist, and Wired — and recognized with prestigious awards, including the Pwnie 2022 Award for Best Cryptographic Attack, the CSAW 2022 Applied Research Competition Best Paper Runner-up Award, a Pwnie 2021 Nomination for Most Innovative Research, and a CSLSC 2022 Best Presentation Award. In light of his research, the cryptographic community and several companies (including Cloudflare, Microsoft, Intel, AMD, Ampere, ARM) have taken action that includes patching cryptographic libraries, issuing security advisories, and creating new guidance for writing secure cryptographic code. |
Feb 6th (Monday) 11am-12:30pm | 2405 SC, Zoom: https://illinois.zoom.us/j/5494764956?pwd=MDNnaE5CWG0yRVlEZWl5bldoRnErZz09 passcode if asked: 021795 | Xiaohong Chen | Title: Matching Logic: Foundation of a Trustworthy Programming Language Framework Abstract: We write programs in programming languages and use various language tools to perform computing and analysis tasks. For example, we use a compiler or an interpreter to execute programs, a symbolic executor to execute programs with symbolic input, and a formal verifier to verify programs. However, these language tools work like a "black box" and produce no correctness certificates for the tasks they perform. Therefore, we have to trust them for what they claim about our programs, which creates a very large "trust base" in today's computing space. My research aims at reducing the trust base of language execution and analysis tools using a trustworthy programming language framework. In this framework, programming languages are rigorously and completely defined using logical axioms and mathematical notations. Language tools are automatically generated by the framework, and their correctness is certified by complete, rigorous, transparent, machine-checkable, and human-accessible proof certificates. Most importantly, these proof certificates can be automatically checked using a very small proof checker, serving as the minimal trust base of the framework. In this talk, I will present matching logic as the unifying logical foundation of such a trustworthy programming language framework. I will present the basics of matching logic and show how various program properties and programming languages can be uniformly specified using matching logic formulas and axioms. I will show how to generate matching logic proofs to certify the correctness of program interpreters and formal verifiers, and how to check those proofs using the matching logic proof checker, which has only 240 lines of code. Finally, I will provide an overview of my future research plans for enabling the design and implementation of more transparent and trustworthy programming language tools. Bio: Xiaohong Chen is a Ph.D. candidate in the Computer Science Department at UIUC, advised by Prof. Grigore Rosu. Xiaohong's research interests are in logic, formal methods, and programming languages, with a focus on using rigorous machine-checkable proof certificates to reduce the trust base of various programming language tools. Xiaohong's research on matching logic (http://matching-logic.org) as a unifying foundation for programming has helped improve the safety and reliability of the K language framework (https://kframework.org). Xiaohong is the recipient of the Yunni and Maxine Pao Memorial Fellowship, the Mavis Future Faculty Fellowship, and the Graduate School Dissertation Completion Fellowship. His research proposal has been funded by the Ethereum Foundation for its potential to make smart contracts more trustworthy and transparent. More information at http://xchen.page/. |
Feb 20th (Monday) 11am-12:15pm | Online on Zoom: https://illinois.zoom.us/j/7030162755?pwd=QkY2OHI2K1ZFdjY3S3FwcU5FT05tUT09 | Jiaxin Huang | Title: Label-Efficient Textual Knowledge Extraction and Utilization Abstract: With the tremendous amount of text across the Internet nowadays, various Natural Language Processing (NLP) systems are built to help people seek valuable knowledge from massive corpora, by performing knowledge-intensive tasks like text retrieval, concept organization, commonsense reasoning, and question answering. Despite the remarkable success, most existing NLP systems still rely on large amounts of task-specific training data, which are costly to obtain. My research designs principled approaches for label-efficient, knowledge-based NLP applications which rely on minimal human supervision. In this talk, I will introduce a general framework for textual knowledge extraction and utilization: (1) concept ontology construction by transforming generic linguistic knowledge encoded in pre-trained language models into hierarchical structures connecting entities; (2) entity extraction by replacing manual prompt template designs with automatic soft verbalizer learning; (3) commonsense reasoning via entity knowledge prompting and iteratively optimizing reasoning paths generated by language models. Bio: Jiaxin Huang is a final-year Ph.D. candidate in the Department of Computer Science at the University of Illinois Urbana-Champaign, fortunately advised by Prof. Jiawei Han. Jiaxin's research interests lie in text mining and natural language processing with minimal human supervision. Her recent research focuses on (1) using pre-trained language models to automatically extract domain-specific hierarchical concepts and entities for structured knowledge construction; (2) extracting human actionable knowledge such as commonsense reasoning by prompting and training language models via machine-generated explicit reasoning paths. She is a recipient of the Microsoft Research PhD Fellowship (2021-2023). |
Feb 23rd (Thursday) 11am-12:30pm | Online on Zoom: https://illinois.zoom.us/j/2432644784?pwd=M0NYZkpIUThXM0I2bCtpcUxYbHJjZz09 password if asked: 209453 | Linyi Li | Title: Certifying Trustworthy Deep Learning Systems at Scale Abstract: Along with the wide deployment of deep learning (DL) systems, their lack of trustworthiness (robustness, fairness, numerical reliability, etc.) is raising serious social concerns, especially in safety-critical scenarios such as autonomous driving, aircraft navigation, and facial recognition. Hence, a rigorous and accurate evaluation of the trustworthiness of DL systems is critical before their large-scale deployment. In this talk, I will introduce my research on certifying critical trustworthiness properties of large-scale DL systems. Inspired by techniques in optimization, cybersecurity, and software engineering, my work computes rigorous worst-case bounds to characterize the degree of trustworthiness for a given DL system and further improve such bounds via strategic training. Specifically, I will introduce two representative frameworks: (1) DSRS is the first framework with theoretically optimal certification tightness. DSRS, along with our training method DRT and accompanying open-source tools (VeriGauge and alpha-beta-CROWN), is the state-of-the-art and award-winning solution for achieving DL robustness against constrained perturbations. (2) TSS is the first framework for building and certifying large DL systems with high accuracy against semantic transformations. TSS opens a series of subsequent research on guaranteeing semantic robustness for various downstream DL and AI applications. I will conclude this talk with a roadmap that outlines several core research questions and future directions on trustworthy machine learning. Bio: Linyi Li is a Computer Science PhD candidate advised by Prof. Bo Li and co-advised by Prof. Tao Xie at UIUC. Prior to his PhD, Linyi Li earned his bachelor’s degree in Computer Science from Tsinghua University in 2018. His research lies in the intersection of computer security, machine learning, and software engineering. He focuses on building certifiably trustworthy deep learning systems at scale by proposing state-of-the-art certification and training methods for various trustworthy properties such as robustness, fairness, and numerical reliability. He has published over 20 papers at S&P, CCS, ICML, NeurIPS, ICLR, ICSE, FSE, etc. He is the main developer of or a key contributor to several widely-known and award-winning deep learning certification tools, including alpha-beta-CROWN (winner of VNN-COMP 2022), VeriGauge, CROP, and COPA. Linyi is a recipient of the Adversarial Machine Learning Rising Star Award, the Rising Star in Data Science Award, and the Wing Kai Cheng Fellowship, and a finalist for the Qualcomm Innovation Fellowship and the Two Sigma PhD Fellowship. |
2021-2022 Schedule
Date and Time | Room | Speaker | Title and Abstract |
---|---|---|---|
February 9th (Wednesday) 4pm-5:30pm | Zoom set up by presenter | Xinya Du | Title: Towards More Intelligent Extraction of Information from Documents Abstract: Large amounts of text are written and published daily. As a result, applications that automatically read through documents and extract useful, structured information from the text have become increasingly needed for people’s efficient absorption of information. They are essential for applications such as answering user questions, information retrieval, and knowledge base population. In this talk, I will focus on the challenges of finding and organizing information about events and introduce my research on leveraging knowledge and reasoning for document-level information extraction. In the first part, I’ll introduce methods for better modeling the knowledge from context: (1) generative learning of output structures that better model the dependency between extracted events to enable more coherent extraction of information (e.g., event A happening in the earlier part of the document is usually correlated with event B in the later part); (2) how to utilize information retrieval to enable memory-based learning with even longer context. Bio: Xinya Du is a Postdoctoral Research Associate at the University of Illinois at Urbana-Champaign working with Prof. Heng Ji. He earned a Ph.D. degree in Computer Science from Cornell University, advised by Prof. Claire Cardie. Before Cornell, he received a bachelor's degree in Computer Science from Shanghai Jiao Tong University. His research is on natural language processing, especially methods that leverage knowledge & reasoning skills for document-level information extraction. His work has been published in leading NLP conferences such as ACL, EMNLP, and NAACL, and has been covered by major media like New Scientist. He has received awards including the CDAC Spotlight Rising Star award and the SJTU National Scholarship. |
February 25th (Friday) 11:30am-1pm | Zoom set up by presenter | Suraj Jog | Title: Scalable Next-Generation Wireless Networks Abstract: The next generation of wireless technologies will provide unprecedented capabilities -- gigabit communication speeds at ultra-low latencies, hyper-precise localization, and vision-like perception. This will enable a plethora of new applications like wireless virtual and augmented reality, self-driving cars, space communications, precision agriculture, high-performance computing, and more. However, while these performance leaps have been demonstrated in the context of constrained networks with single users and controlled environments, the question of scaling these next-gen wireless technologies to large networks in the wild consisting of multiple heterogeneous nodes remains unsolved. Bio: Suraj Jog is a Ph.D. candidate in Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign (UIUC), working with Haitham Hassanieh. His research is focused on next-generation wireless networking and wireless sensing. Through his research, he has designed and built systems that can deliver seamless scalability in multiple application domains for millimeter-wave technology, such as gigabit-speed wireless communications, localization and imaging, and wireless networks-on-chip. His research has been recognized with the Qualcomm Innovation Fellowship, Joan and Lalit Bahl Fellowship, Mavis Future Faculty Fellowship, M.E. Van Valkenburg Fellowship, Rambus Computer Engineering Fellowship, and more. |
March 2nd (Wednesday) 9am-10:30am | Zoom set up by presenter | Xuan Wang | Title: Automated Scientific Knowledge Extraction from Massive Text Data Abstract: Text mining is promising for advancing human knowledge in many fields, given the rapidly growing volume of text data (e.g., scientific articles, medical notes, and news reports) we are seeing nowadays. In this talk, I will present my work on automatically extracting knowledge from massive text data to enable and accelerate scientific discovery. First, I will talk about my work on information extraction with minimum human supervision. With the growing volume of text data and the breadth of information, it is inefficient or nearly impossible for humans to manually find, integrate, and digest useful information. To address the above challenge, I have developed methods that automatically extract entity and relation information from massive text data with minimum human supervision. Second, I will talk about my work on literature-based scientific knowledge discovery. This research direction aims to enable and accelerate real-world knowledge discovery with the rich information we automatically extracted from scientific text. I have collaborated with domain experts in various scientific disciplines (e.g., chemistry, biomedicine, and health) to achieve this goal. Last, I will conclude my talk with future directions on using text mining to address open scientific problems, such as to assist chemical and biological molecule design and to support clinical drug discovery. Bio: Xuan Wang is a fifth-year Ph.D. student in the Computer Science Department at the University of Illinois at Urbana-Champaign (UIUC). She is working in the Data Mining Group under the supervision of Prof. Jiawei Han. Xuan received an M.S. in Statistics (2017) and an M.S. in Biochemistry (2015) from UIUC, and a B.S. in Biological Science (2013) from Tsinghua University, China. Her research interests are in text mining and natural language processing, emphasizing applications to biological and health sciences. Her current research theme is developing effective and scalable algorithms and systems for automatically understanding massive text data to enable and accelerate scientific discovery. Xuan has published about 20 research/demo papers in top NLP conferences (e.g., ACL and EMNLP) and biomedical informatics journals (e.g., Bioinformatics) and conferences (e.g., ACM-BCB and IEEE-BIBM). She is the recipient of the YEE Fellowship Award in 2020-2021 from UIUC. |
March 2nd (Wednesday) 2:30pm-4pm | Zoom set up by presenter | Jing Liu | Title: Robust Learning & Inference with Applications in Distributed Learning and IoT Abstract: Robustness is of paramount importance in modern, scalable, and distributed machine learning (ML) and artificial intelligence (AI), particularly for safety-critical applications. On the one hand, distributed learning (e.g., Federated Learning) has emerged as a communication efficient, privacy-enhancing, and scalable approach for training without explicit centralized data collection. Unfortunately, training models with distributed data and computation further increases vulnerability to adversarial corruptions. This talk will outline modern solutions to fundamental estimation problems such as certifiable Robust Linear Regression, Robust PCA, and High-dimensional Robust Mean Estimation. Using these tools as building blocks, I will present recent work on Robust Distributed Learning & Inference. I will conclude the talk with future directions in efficient and trustworthy Artificial Intelligence of Things (AIoT). Bio: Jing Liu is an Illinois Future Faculty fellow in computer science at the University of Illinois at Urbana-Champaign. His research interests include Data Science, the Internet of Things (IoT), and Distributed Learning & Inference. Liu was a postdoc in the Coordinated Science Lab and obtained his Ph.D. from UCSD. Liu is the recipient of several awards, including the Shannon Graduate Fellowship nomination award and the Frontiers of Innovation Fellowship at UCSD, the Guanghua Fellowship at Tsinghua University, National Fellowships of China, a Silver Medal and a Young Mentor award at Beijing Institute of Technology, and a prize of the Beijing Science & Technology Award. |
2020-2021 Schedule
Date and Time | Room | Speaker | Title and Abstract |
---|---|---|---|
February 1st (Monday) 11AM-12:30PM | Zoom set up by department | Wing Lam | Title: Taming Flaky Tests in a Non-Deterministic World Abstract: As software evolves, developers typically perform regression testing to ensure that their code changes do not break existing functionality. During regression testing, developers often waste time debugging their code changes because of spurious failures from flaky tests, which are tests that nondeterministically pass or fail on the same code. These spurious failures mislead developers because the failures are due to bugs that existed before the code changes. My work on characterizing flaky tests has helped open the research topic of flaky tests, and many companies (e.g., Facebook, Google, Microsoft) have since highlighted flaky tests as a major challenge in their software development. In this talk, I will describe my recent work on taming flaky tests. Two prominent kinds of flaky tests are order-dependent flaky tests, which pass when run in one order but fail when run in a different order, and async-wait flaky tests, which pass if an asynchronous call finishes on time but fail if it finishes too late. My results include the first automated techniques to (1) fix order-dependent flaky tests, fixing 92% of such flaky tests in a public dataset; (2) reduce the number of spurious failures from order-dependent flaky tests, reducing such failures by 73%; and (3) speed up async-wait flaky tests while also reducing their spurious failures, speeding up such tests by 38%. Overall, my work has helped detect more than 2000 flaky tests and fix more than 500 flaky tests in over 150 open-source projects. Bio: Wing Lam is a PhD candidate in the Computer Science Department at the University of Illinois at Urbana-Champaign, where he is co-advised by Professors Tao Xie and Darko Marinov. He works on several topics in software engineering, with a focus on software testing. Wing's research improves software dependability by characterizing bugs and developing novel techniques to detect and tame bugs. He has published in top-tier conferences such as ESEC/FSE, ICSE, ISSTA, OOPSLA, and TACAS. His techniques have helped detect and fix bugs in open-source projects and have impacted how Microsoft and Tencent developers test their code. Wing has been awarded several fellowships and scholarships, including a Google - CMD-IT Dissertation Fellowship Award. More information is available on his web page. |
February 2nd (Tuesday) 11:30AM-1PM | Zoom set up by department | Wajih Ul Hassan | Title: Detecting and Investigating System Intrusions with Provenance Analytics Abstract: Stories of devastating data breaches continue to dominate headlines around the world. Equifax, Target, and the Office of Personnel Management are just a few examples of high-profile data breaches over the past decade. Despite a panoply of security products and increasing investment in data security, attackers are continually finding new ways to outsmart defenses to gain access to valuable data, indicating that current security approaches are ineffective. Data provenance describes the detailed history of system execution, allowing us to understand how system objects came to exist in their present state and providing means to identify the root cause of system intrusions. My research leverages provenance analytics to empower system defenders to quickly and effectively detect and investigate malicious behaviors. In this talk, I will first present a provenance-based solution for combatting the “Threat Alert Fatigue” problem that currently plagues enterprise security. Next, I will describe an approach for performing accurate and high-fidelity attack forensics using a novel adaptation of program analysis techniques. I will conclude by discussing the promise of provenance analytics to address open security and auditing problems in complex computing systems and emerging technologies. Bio: Wajih Ul Hassan is a doctoral candidate advised by Professor Adam Bates in the Department of Computer Science at the University of Illinois at Urbana-Champaign. His research focuses on securing complex networked systems by leveraging data provenance approaches and scalable system design. He has collaborated with NEC Labs and Symantec Research Labs to integrate his defensive techniques into commercial security products. He received a Symantec Research Labs Graduate Fellowship, a Young Researcher invitation to the Heidelberg Laureate Forum, an RSA Security Scholarship, a Mavis Future Faculty Fellowship, a Sohaib and Sara Abbasi Fellowship, and an ACM SIGSOFT Distinguished Paper Award. |
February 23 (Tuesday) 11:00AM-12:30PM | Zoom set up by department | Yunan Luo | Title: Machine learning for large- and small-data biomedical discovery Abstract: In modern biomedicine, the role of computation becomes more crucial in light of the ever-increasing growth of biological data, which requires effective computational methods to integrate them in a meaningful way and unveil previously undiscovered biological insights. In this talk, I will discuss my research on machine learning for large- and small-data biomedical discovery. First, I will describe a representation learning algorithm for the integration of large-scale heterogeneous data to disentangle out non-redundant information from noises and to represent them in a way amenable to comprehensive analyses; this algorithm has enabled several successful applications in drug repurposing. Next, I will present a deep learning model that utilizes evolutionary data and unlabeled data to guide protein engineering in a small-data scenario; the model has been integrated into lab workflows and enabled the engineering of new protein variants with enhanced properties. I will conclude my talk with future directions of using data science methods to assist biological design and to support decision making in biomedicine. Bio: Yunan Luo (http://yunan.cs.illinois.edu/) is a Ph.D. student advised by Prof. Jian Peng in the Department of Computer Science, University of Illinois at Urbana-Champaign. Previously, he received his Bachelor’s degree in Computer Science from Tsinghua University in 2016. His research interests are in computational biology and machine learning. His research has been recognized by a Baidu Ph.D. Fellowship and a CompGen Ph.D. Fellowship. |
March 16th (Tuesday) 1PM-2:30PM CT Alternative Time: March 18th (Thursday) 7PM-8:30PM CT | Zoom set up by department | Liyuan Liu | Title: Towards Easy-to-Use Deep Learning: Effort-Light Transformer Training as an Example Abstract: Deep learning methods stand out with their ability to handle complicated data and tasks. However, successfully applying cutting-edge deep learning methods usually requires lots of extra care (e.g., heuristic tricks, excessive tuning on hyper-parameters, and data annotation costs). Given the inherent resource limitations of real-world applications, the demand for these efforts has hindered various applications and research. Bearing this in mind, I strive to build productive algorithms that can effectively make deep learning effort-light and easy-to-use. Bio: Liyuan Liu is a Ph.D. candidate in Computer Science at the University of Illinois at Urbana-Champaign, advised by Prof. Jiawei Han. He received his B.Eng. in Computer Science and Engineering at the University of Science and Technology of China in 2016. In his research, he strives to develop productive algorithms that can effectively reduce the resource consumption of deep learning, including expert efforts for data annotation and computation resources for tuning and training. Liyuan has published more than 20 papers in top-tier conferences during his Ph.D. study. Liyuan has been awarded several fellowships and scholarships, including the 2020 Yee Fellowship and the 2015 Guo Moruo Scholarship. More information is available on his web page: http://liyuanlucasliu.github.io/ |
2019-2020 Schedule
If you want to present in 2405 SC (recommended), preferred times this semester are Tuesday, Thursday, and Friday afternoons. You should try to avoid times with talks already scheduled at /wiki/spaces/dls/pages/52953149 or Featured Lectures.
Date and Time | Room | Speaker | Title and Abstract |
---|---|---|---|
Feb 4 (Tuesday) 1:00 PM | 2405 SC | Raghavendra Pothukuchi | Title: Intelligent Computing Systems for Extreme-Efficiency and Security Abstract: My vision is to develop a new generation of computing systems that deliver extreme efficiency, together with reliability and security. Each component in the computing system continuously senses its execution and configures itself using intelligent control derived from principled methods like formal control and machine learning. In my talk, I will describe the techniques and prototypes I developed so far, which cover multiple system layers and heterogeneous hardware, and present the remarkable benefits of systems built with intelligent control. Bio: |
Feb 6 (Thursday) 2:00 PM | 2405 SC | August Shi | Title: Mitigating Flaky Tests Abstract: To mitigate the negative effects of flaky tests, I have developed several techniques. Bio: |
Feb 10 (Monday) 3:00 pm | 2405 SC | Radha Venkatagiri | Abstract: We live in a world where errors in computation will become ubiquitous and come from a wide variety of sources -- from unintentional soft errors in shrinking transistors to deliberate errors introduced by approximation or malicious attacks. Guaranteeing perfect functionality across a wide range of future systems will be prohibitively expensive. Error-Efficient computing offers a promising solution by allowing the system to make controlled errors and only preventing those errors that it absolutely must to ensure an acceptable user experience. Allowing the system to intelligently make errors can lead to significant resource (time, energy, bandwidth, etc.) savings. Error-efficient computing can transform the way we design hardware and software to exploit new sources of compute efficiency; however, excessive programmer burden and a lack of principled design methodologies have thwarted its adoption. My research addresses these limitations through foundational contributions that enable the adoption of error-efficiency as a first-class design principle by a variety of users and application domains. In this talk, I will show how my work (1) enables an understanding of how errors affect program execution by providing a suite of automated and scalable error analysis tools, (2) demonstrates how such an understanding can be exploited to build customized error-efficiency solutions targeted to low-cost hardware resiliency and approximate computing and (3) develops methodologies for principled integration of error-efficiency into the software and hardware design workflow. Finally, I will discuss future research avenues in error-efficient computing with multi-disciplinary implications in core disciplines (programming languages, software engineering, hardware design, systems) and emerging application areas (AI, VR, robotics, edge computing). Bio: Radha is a doctoral candidate in Computer Science at the University of Illinois at Urbana-Champaign. Her research interests lie in the area of Computer Architecture and Systems. Radha’s dissertation work aims to build efficient computing systems that redefine “correctness” as producing results that are good enough to ensure an acceptable user experience. Radha’s research work has been nominated for the IBM Pat Goldberg Memorial Best Paper Award for 2019. She was among 20 people invited to participate in an exploratory workshop on error-efficient computing systems initiated by the Swiss National Science Foundation and is one of 200 young researchers in Math and Computer Science worldwide to be selected for the prestigious 2018 Heidelberg Laureate Forum. Radha was selected for the Rising Stars in EECS and the Rising Stars in Computer Architecture (RISC-A) workshops for the year 2019. Before joining the University of Illinois, Radha was a CPU/Silicon validation engineer at Intel where her work won a divisional award for key contributions in validating new industry standard CPU features. Prior to that, she worked briefly at Qualcomm on architectural verification of the Snapdragon processor. |
Feb 18 (Tuesday) 12:30 pm | 2405 SC | Umang Mathur | Title: Algorithmic Advances for Dynamic Concurrency Bug Detection Abstract: Concurrency is indispensable in modern software applications. In the first part of my talk, I will describe a new partial order. Bio: Umang Mathur is a PhD candidate in the CS Department of the University of Illinois at Urbana-Champaign. |
2018-2019 Schedule
Date and Time | Room | Speaker | Title and Abstract |
---|---|---|---|
Feb 1 (Fri) | 3401SC | Qi Li | Title: Pattern-Based Mining of Entity/Relation Structures from Massive Text Abstract: In this talk, I will present a pattern-based methodology that conducts information extraction from massive corpora using existing resources with little human effort. The first component, WW-PIE, discovers meaningful textual patterns that contain the entities of interest. The second component, TruePIE, discovers high quality textual patterns for target relation types. I will demonstrate how semi-supervised methods can empower information extraction for broad applications and provide explainable results. Bio: Qi Li is currently a postdoctoral researcher and adjunct professor in the Department of Computer Science, University of Illinois at Urbana-Champaign, working with Prof. Jiawei Han. Her research interests lie in the area of data mining with a focus on the extraction and aggregation of information from multiple data sources. Qi obtained her PhD in Computer Science and Engineering from the State University of New York at Buffalo in 2017, advised by Prof. Jing Gao, and an MS in Statistics from the University of Illinois at Urbana-Champaign in 2012. She has received several awards, including the Presidential Fellowship of the University at Buffalo, and the Best CSE Graduate Research Award and the CSE Best Dissertation Award from the Department of Computer Science and Engineering, University at Buffalo. More information can be found at https://publish.illinois.edu/qili5/. |
Feb 1 (Fri) 3:30pm | 4405SC | Owolabi Legunsen | Title: Evolution-Aware Runtime Verification Abstract: In this talk, I will describe my work on studying and improving runtime verification during testing. My large-scale study was the first to show that runtime verification during testing is beneficial for finding many important bugs from tests that developers already have. However, my study also showed that runtime verification still incurs high overhead, both in machine time to monitor properties and in developer time to inspect violations of the properties. Moreover, all prior runtime verification techniques consider only one program version and would wastefully re-monitor unaffected properties and code as software evolves. To reduce the overhead across multiple program versions, I proposed the first evolution-aware runtime verification techniques. My techniques exploit the key insight that software evolves in small increments and reduce the accumulated runtime verification overhead by up to 10x, without missing new violations. |
Feb 5 (Tue) 10:00am | SC 2405 | Mengjia Yan | Title: Secure Computer Hardware in the Age of Pervasive Security Attacks Abstract: Bio: |
Feb 8 (Fri) 3:30 pm | SC 4405 | Sangeetha Abdu Jyothi | Title: Abstract: In this talk, I will discuss the design of application-aware self-optimizing systems through automated resource management that helps meet the varied goals of the provider and applications in large-scale networked environments. The key steps in closed-loop resource management include learning of application resource needs, efficient scheduling of resources, and adaptation to variations in real-time. I will describe how I apply this high-level approach in two distinct environments using (a) Morpheus in enterprise clusters, and (b) Patronus in cellular provider networks with geo-distributed micro data centers. I will also touch upon my related work in application-specific context at the intersection of network scheduling and deep learning. I will conclude with my vision for self-optimizing systems including fully automated clouds and an elastic geo-distributed platform for thousands of micro data centers. Bio: Sangeetha Abdu Jyothi is a Ph.D. candidate at the University of Illinois at Urbana-Champaign advised by Brighten Godfrey. Her research interests lie in the areas of computer networking and systems with a focus on building application-aware self-optimizing systems through automated resource management. She is a winner of the Facebook Graduate Fellowship (2017-2019) and the Mavis Future Faculty Fellowship (2017-2018). She was invited to attend the Rising Stars in EECS workshop at MIT (2018). |
Feb 26 (Tue) 3:00 pm | SC 3403 | Motahhare Eslami | Title: Communicating Opaque Algorithmic Processes in Socio-Technical Systems Abstract: Algorithms play a vital role in curating online information in socio-technical systems; however, they are usually housed in black-boxes that limit users’ understanding of how an algorithmic decision is made. While this opacity partly stems from protecting intellectual property and preventing malicious users from gaming the system, it is also designed to provide users with seamless, effortless system interactions. However, this opacity can result in misinformed behavior among users, particularly when there is no clear feedback mechanism for users to understand the effects of their own actions on an algorithmic system. The increasing prevalence and power of these opaque algorithms, coupled with their sometimes biased and discriminatory decisions, raise questions about how knowledgeable users are and should be about the existence, operation and possible impacts of these algorithms. In this talk, I will address these questions by exploring ways to investigate users’ behavior around opaque algorithmic systems. I will then present new design techniques that communicate opaque algorithmic processes to users and provide them with a more informed, satisfying, and engaging interaction. In doing so, I will add new angles to the old idea of understanding the interaction between users and automation by designing around algorithm sensemaking and algorithm transparency. Bio: Motahhare Eslami is a Ph.D. Candidate in Computer Science at the University of Illinois at Urbana-Champaign, where she is advised by Karrie Karahalios. Motahhare’s research develops new communication techniques between users and opaque algorithmic socio-technical systems to provide users a more informed, satisfying, and engaging interaction. Her work has been recognized with a Google PhD Fellowship, Best Paper Award at ACM CHI, and has been covered in mainstream media such as Time, The Washington Post, Huffington Post, the BBC, Fortune, and Quartz. Motahhare is also a Facebook and Adobe PhD fellowship finalist, and a recipient of the C.W. Gear Outstanding Graduate Student Award, Saburo Muroga Endowed Fellowship, Feng Chen Memorial Award, Young Researcher in Heidelberg Laureate Forum and Rising Stars in EECS. |
Mar 5 (Tue) 4:00 pm | SC 3403 | Jingbo Shang | Title: AutoNet: Automated Network Construction from Massive Text Corpora Abstract: Mining structured knowledge from massive unstructured text data is a key challenge in data science. In this talk, I will discuss my proposed framework, AutoNet, that transforms unstructured text data into structured heterogeneous information networks, on which actionable knowledge can be further uncovered flexibly and effectively. AutoNet is a data-driven approach using distant supervision instead of human curation and labeling. It consists of four essential steps: (1) quality phrase mining; (2) entity recognition and typing; (3) relation extraction; and (4) taxonomy construction. Along this line, I have developed a number of state-of-the-art distantly-supervised/unsupervised methods and published them in top conferences and journals. Specifically, I will present my work about phrase mining, entity recognition, and taxonomy construction in detail, while only briefly touching on the other work. Finally, I will summarize the AutoNet framework with a demo video and conclude by discussing future work collaborating with other disciplines. Bio: Jingbo Shang is a Ph.D. candidate in the Department of Computer Science, the University of Illinois at Urbana-Champaign. He received his B.E. from the Computer Science Department, Shanghai Jiao Tong University, China. His research focuses on mining and constructing structured knowledge from massive text corpora with minimum human effort. His research has been recognized by many prestigious awards, including the Computer Science Excellence Scholarship from CS@Illinois, the Grand Prize of the Yelp Dataset Challenge in 2015, the Google Ph.D. Fellowship in Structured Data and Database Management in 2017, and the C.W. Gear Outstanding Graduate Award in 2018. |
2017-2018 Schedule
Date and Time | Room | Speaker | Title and Abstract |
---|---|---|---|
2/15 Thursday 1:30PM | SC2405 | Wei Yang | Title: Adversarial-Resilience Assurance for Mobile Security Systems Abstract: For too long, researchers have often tackled security in an attack-driven, ad hoc, and reactive manner, with large manual efforts devoted by security analysts. To make substantial progress in security, I advocate shifting this practice to be automated, intelligent, and adversarially resilient. Over the course of my Ph.D. research, I have built security systems incorporating intelligent security techniques based on program analysis, natural language processing, and machine learning, and I have developed corresponding defenses and testing methodologies to guard against emerging attacks specifically adversarial to these newly-proposed security techniques. In this talk, I will first highlight two of these systems for mobile security: AppContext and WHYPER. Then I will show how to generate adversarial inputs for testing and further strengthening these systems. I will conclude by discussing how future research efforts can leverage the interplay between AI and security techniques toward a defense-driven security ecosystem. |
2/23 Friday 10am | SC3403 | Chao Zhang | Title: Knowledge Cube Construction from Massive Social Sensing Data |
3/9 Friday 10am | CSL301 | Izzat El Hajj | Title: Building Programming Systems in a World of Increasing Heterogeneity Abstract: The breakdown of Dennard scaling and the slowing down of Moore's Law have led to an explosion of new processor and memory technologies which is making computing systems evolve to become increasingly heterogeneous. We are seeing GPUs, FPGAs, and special purpose accelerators become central parts of systems, as well as a growing interest in persistent byte-addressable memories and near-memory acceleration. While these technologies provide massive performance gains and energy savings that are not possible on traditional systems, they tend to be very tedious to program, which introduces a heavy burden on software developers and presents a significant barrier to adoption. It is therefore critical that these hardware innovations be met with software innovations that facilitate programmability. In this talk, I will discuss my work on building programming systems (languages, compilers, runtimes, OS support) for emerging processor and memory technologies. My talk will focus on two particular systems: (1) a compiler and runtime for improving performance and programmability of irregular applications on GPUs, and (2) a novel programmable accelerator and compiler that leverage analog computing via memristive crossbars to accelerate deep learning workloads. I will also discuss my future directions in both lines of work. Bio: Izzat is a PhD candidate in Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign, and a member of the IMPACT Research Group working with Prof. Wen-mei Hwu. Izzat's research interests are in building programming systems for emerging processor and memory technologies. He has worked on programming systems for GPUs tackling issues of performance portability (CGO'15, MICRO'16), irregular application optimization (MICRO'16), and collaborative execution via unified virtual memory (ISPASS'17). For his work on GPU programming systems, he holds the Dan Vivoli Endowed Fellowship '17-'18. He has also worked on programming systems for emerging resistive memory technologies, tackling the issue of persistent object representation (ASPLOS'16, OOPSLA'17) and the use of memristive crossbars for accelerating deep learning workloads (in submission). For the former, he received the HiPEAC paper award and has submitted multiple patent applications. Izzat received his BE in Electrical and Computer Engineering in 2011 at the American University of Beirut (AUB), where he graduated with high distinction and received the Distinguished Graduate Award. |
2016-2017 Schedule
Date and Time | Room | Speaker | Title and Abstract |
---|---|---|---|
February 9 (Thursday) 330 PM | 2405 SC | Matt Sinclair | Title: Efficient Coherence and Consistency for Specialized Memory Hierarchies Abstract: As the benefits from transistor scaling slow down, specialized accelerators and heterogeneous computing are becoming increasingly important because they can significantly improve performance and energy efficiency for specific applications. An efficient and easy-to-program memory and communication architecture is critical in achieving the promise of such systems. Traditionally, accelerators in heterogeneous systems used discrete address spaces and employed specialized memories, e.g., scratchpads, for specific access patterns. These attributes make these systems difficult to program and inefficient in the presence of high data reuse and fine-grained synchronization -- traits that are common in emerging applications such as graph analytics workloads. My thesis resolves these inefficiencies with cross-cutting research that rethinks the software, hardware, and hardware-software interface of heterogeneous systems. Underlying my work is the efficient support of a global address space across all accelerator memories (for easier programming), an efficient cache coherence protocol (for efficient hardware), and a familiar memory consistency model (for an appropriate hardware-software interface). First, I consider heterogeneous systems that have recently started to support global address spaces. I expose the imbalance between coherence and consistency in such systems. Current systems use simple, software-based coherence protocols that require heavyweight actions at synchronization points. To deal with this, industry has moved to complex consistency models that use scoped synchronization, making consistency models for heterogeneous systems even more complex than the already complicated CPU consistency models. I introduce a low overhead cache coherence protocol, DeNovo, that adjusts the imbalance and enables heterogeneous systems to use the standard, simpler data-race-free (DRF) consistency model. Second, I explore a further source of complexity in consistency models: relaxed atomics. These have been the Achilles heel of CPU consistency models -- they have the promise of higher performance but have no known formal semantics. Heterogeneous systems' inefficient support for atomics makes using relaxed atomics particularly tempting. I extend the DRF consistency model to retain the efficiency benefits of relaxed atomics and provide better semantics for the common use cases of relaxed atomics in heterogeneous systems. Third, current systems continue to support specialized memories in private address spaces, which negate some of the benefits of specialization. I integrate these specialized memories into the global address space while retaining the benefits they provide. Overall, my research introduces a more efficient and easier-to-program heterogeneous memory hierarchy that significantly improves both performance and energy compared to the state-of-the-art heterogeneous systems. Bio: Matt Sinclair is a doctoral candidate in the Department of Computer Science at the University of Illinois at Urbana-Champaign. He is interested in computer architecture and systems, with a current focus on building efficient memory hierarchies for heterogeneous systems. His papers at the 2015 International Symposium on Computer Architecture (ISCA) and 2015 International Symposium on Microarchitecture (MICRO) were recognized as 2016 IEEE Micro Top Picks Honorable Mentions. He is the recipient of a Qualcomm Innovation Fellowship, two Mavis Future Faculty Fellowships, the Feng Chen Memorial Award, the W.J. Poppelbaum Award, and a Saburo Muroga Fellowship. He was also selected to attend and present his research at the 2016 Heidelberg Laureate Forum. He received a BS in Computer Science with Honors & Computer Engineering (2009) and an MS in Electrical Engineering (2011) from the University of Wisconsin-Madison. |
February 16 (Thursday) 3:30 PM | 2405 SC | Renato Mancuso | Title: Safe, Real-Time Software Reference Architectures for Cyber-Physical Systems Abstract: There has been an uptrend in the demand and need for complex Cyber-Physical Systems (CPS), such as self-driving cars, unmanned aerial vehicles (UAVs), and smart manufacturing systems for Industry 4.0. CPS often need to accurately sense the surrounding environment using high-bandwidth acoustic, imaging, and other types of sensors; to take coordinated decisions; and to issue time-critical actuation commands. Hence, temporal predictability in sensing, communication, computation, and actuation is a fundamental attribute. Additionally, CPS must operate safely even in the presence of software and hardware misbehavior to avoid catastrophic failures. To satisfy the increasing demand for performance, modern computing platforms have substantially increased in complexity; for instance, multi-core systems are now mainstream, and partially re-programmable systems-on-chip (SoC) have just entered production. Unfortunately, extensive and unregulated sharing of hardware resources directly undermines the ability to guarantee strong temporal determinism on modern computing platforms. Novel software architectures are needed to restore the temporal correctness of complex CPS when using these platforms. My research vision is to design and implement software architectures that can serve as a reference for the development of high-performance CPS, and that embody two main requirements: temporal predictability and robustness. In this talk, I will address the following questions concerning modern multi-core systems: Why can application timing be highly unpredictable? What techniques can be used to enforce safe temporal behaviors on multi-core platforms? I will also illustrate possible approaches for time-aware fault tolerance to maximize CPS functional safety. Finally, I will review the challenges faced by the embedded industry when trying to adopt emerging computing platforms, and I will highlight some novel directions that can be followed to accomplish my research vision. Bio: Renato Mancuso is a doctoral candidate in the Department of Computer Science at the University of Illinois at Urbana-Champaign. He is interested in high-performance cyber-physical systems, with a specific focus on techniques to enforce strong performance isolation and temporal predictability in multi-core systems. He has published around 20 papers in major conferences and journals. His papers were awarded a best student paper award and a best presentation award at the Real-Time and Embedded Technology and Applications Symposium (RTAS) in 2013 and 2016, respectively. He was the recipient of a Computer Science Excellence Fellowship, and a finalist for the Qualcomm Innovation Fellowship. Some of the design principles for real-time multi-core computing proposed in his research have been officially incorporated in recent certification guidelines for avionics systems. They have also been endorsed by government agencies, industries, and research institutions worldwide. He received a B.S. in Computer Engineering with honors (2009) and an M.S. in Computer Engineering with honors (2012) from the University of Rome "Tor Vergata". |
February 17 (Friday) 11:00 AM - 12:00 PM | 2405 SC | Xiang Ren | Title: Effort-Light StructMine: Turning Massive Corpora into Structures Abstract: In this talk, I will introduce a data-driven framework, Effort-Light StructMine, that extracts structured facts from massive corpora without explicit human labeling effort. In particular, I will discuss how to solve three StructMine tasks under the Effort-Light StructMine framework: from identifying typed entities in text, to fine-grained entity typing, to extracting typed relationships between entities. Together, these three solutions form a clear roadmap for turning a massive corpus into a structured network to represent its factual knowledge. Finally, I will share some directions towards mining corpus-specific structured networks for knowledge discovery. Bio: |
February 24 (Friday) 1:30 PM | 2405 SC | Man-Ki Yoon | Title: CPS Security: Algorithms, Analysis and Experimental Validation Abstract: The increased computational power and connectivity in modern Cyber-Physical Systems (CPS) inevitably introduce more security vulnerabilities. CPS pose unique security challenges, as such systems must meet stringent timing constraints as well as strong safety requirements. On the other hand, CPS give defenders an opportunity to take advantage of these design and implementation constraints and the tight coupling of cyber and physical components to deter attackers. In this talk, we will discuss how such intrinsic characteristics of CPS can be used as an asymmetric advantage to detect security attacks on safety-critical CPS. We will particularly discuss (a) modeling and reasoning about the logical (temporal and spatial) and physical behaviors of CPS, (b) architectural and operating-system support for trusted, efficient run-time behavior monitoring (a toy sketch of timing-based monitoring appears after this table), and (c) attack-resilient architectures. Bio: Man-Ki Yoon is a PhD candidate in Computer Science at the University of Illinois at Urbana-Champaign. His research interests are in analytic tools and system design principles for secure cyber-physical and real-time embedded systems, applying computer architecture, real-time scheduling, and statistical learning techniques. He is a recipient of the Qualcomm Innovation Fellowship, the Qualcomm Roberto Padovani Scholarship, and the Intel PhD Fellowship. |
March 8 (Wednesday) 1:00 PM | 2405 SC | Snigdha Chaturvedi | Title: Structured Approaches to Natural Language Understanding Abstract: Despite recent advancements in Natural Language Processing, computers today cannot understand text in the ways that humans can. My research aims at creating computational methods that not only read but also interpret and reason about text. To accomplish this, I develop machine-learning methods that incorporate different sources of social context, linguistic structure, and semantic knowledge while processing text. In this talk, I illustrate this by discussing two specific applications of natural language understanding that focus on comprehension of narratives: (i) choosing correct endings to stories, and (ii) identifying inter-personal relationships from narratives. Automatic narrative comprehension is a fundamental challenge in Natural Language Understanding, and can enable computers to understand social norms, human behavior, and common sense by processing large corpora of such texts. In the first part of the talk, I present a model that attempts to understand a story on three semantic axes: (i) its sequence of events, (ii) its emotional trajectory, and (iii) its plot consistency. We judge the model’s understanding by inquiring if, like humans, it can develop an expectation of what will happen next in a story and predict the correct ending from possible alternatives. In the second part of the talk, I address another important aspect of Natural Language Understanding: identifying social relationships from unstructured text. Understanding such relationships is essential for developing an understanding of people's goals, actions, and expected behavior in stories. We develop structured models that incorporate linguistic as well as contextual cues for capturing the evolving nature of human relationships, and we automatically discover various types of relationships in a data-driven manner. I conclude with a discussion of future directions, and some real-world scenarios that would gain from such advancements in natural language understanding, including social networks, discussion fora, intelligent virtual assistants, and artificial tutors. Bio: Snigdha Chaturvedi is a postdoctoral fellow at the University of Illinois at Urbana-Champaign, working with Professor Dan Roth. She specializes in the field of Natural Language Processing with emphasis on developing methods for natural language understanding. Her research has been recognized with the IBM Ph.D. Fellowship (twice), a best paper award at NAACL, and first prize at the ACM Student Research Competition held at the Grace Hopper Conference. She completed her Ph.D. in Computer Science at the University of Maryland, College Park in 2016 and a Bachelor of Technology from the Indian Institute of Technology, Kanpur in 2009. She has previously held a position as a Blue Scholar at IBM Research, India. |
March 9 (Thursday) 4:00pm-5:30pm | 2405 SC | Yingyan Lin | Title: Energy-efficient Systems for Information Processing and Transfer Abstract: Machine learning (ML) algorithms are increasingly pervasive in tackling the data deluge of the 21st Century. Current ML systems adopt either a centralized cloud computing or a distributed mobile computing paradigm. In both paradigms, the challenge of energy efficiency is drawing increased attention. In cloud computing, data transfer due to inter-chip, inter-board, inter-shelf, and inter-rack communications (I/O interface) within data centers is one of the dominant energy costs. This will only intensify with the growing demand for increased I/O bandwidth for high-performance computing in data centers. On the other hand, in mobile computing, energy efficiency is the primary design challenge, as mobile devices have limited energy, computation, and storage resources. This challenge is being exacerbated by the need to embed ML algorithms for enabling local inference capabilities. In this talk, I will present system-to-circuit approaches for addressing these energy efficiency challenges. First, I will describe the design of a 4-bit, 4 GS/s bit-error-rate optimal analog-to-digital converter in 90 nm CMOS and its use in realizing an energy-efficient 4 Gb/s serial link receiver for I/O interfaces in data centers. Next, I will describe two techniques that can potentially enable on-device deployment of convolutional neural networks (CNNs) by significantly reducing the energy consumption via algorithmic/architectural innovation. Finally, I will identify future research directions in the emerging area of machine learning on resource-constrained silicon platforms. Bio: Yingyan Lin is a Ph.D. candidate in the Electrical and Computer Engineering Department at the University of Illinois at Urbana-Champaign under the advisement of Professor Naresh Shanbhag. She expects to receive her Ph.D. degree in June 2017. Her research includes analog and mixed-signal circuits for I/O interfaces, error resiliency techniques, and VLSI circuits and architectures for machine learning on resource-constrained silicon platforms. She has 11 peer-reviewed publications as the first author on the subject and designed three high-speed interface circuit IPs for large flat panel display applications that were acquired by TOSHIBA Microelectronics Corporation in Japan. She received the 2016 IEEE International Workshop on Signal Processing Systems second place Best Student Paper Award and is the recipient of the 2016-2017 Robert T. Chien Memorial Award for Excellence in Research at UIUC. |
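
To make the relaxed atomics discussed in Matt Sinclair's abstract concrete, here is a minimal, generic C++ sketch of their classic safe use case: an event counter where atomicity matters but ordering does not. This is an illustration of the language feature only, not code from the talk or from DeNovo.

```cpp
// Illustrative only: a classic safe use of relaxed atomics -- an event
// counter where threads need atomicity but no ordering guarantees.
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

std::atomic<long> hits{0};

void worker(int n) {
    for (int i = 0; i < n; ++i) {
        // memory_order_relaxed: the increment is atomic, but imposes no
        // ordering on surrounding memory operations -- cheaper on most
        // hardware than the default sequentially consistent increment.
        hits.fetch_add(1, std::memory_order_relaxed);
    }
}

int main() {
    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t) threads.emplace_back(worker, 1000);
    for (auto& th : threads) th.join();  // join() synchronizes, so the
    std::printf("%ld\n", hits.load());   // final value (4000) is visible
}
```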
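As a companion to Man-Ki Yoon's abstract, the following is a toy sketch of one generic form of run-time behavior monitoring: checking that a task's measured execution time stays within bounds profiled from known-good runs. The task, bounds, and response are all hypothetical; this is not the architecture from the talk.

```cpp
// A toy sketch (not from the talk): flag a task execution whose measured
// run time falls outside bounds profiled from known-good behavior.
#include <chrono>
#include <cstdio>

struct TimingBounds { double min_ms; double max_ms; };  // hypothetical profile

bool check_execution(void (*task)(), TimingBounds b) {
    auto start = std::chrono::steady_clock::now();
    task();
    auto end = std::chrono::steady_clock::now();
    double ms = std::chrono::duration<double, std::milli>(end - start).count();
    // An execution far outside the profiled envelope may indicate a fault
    // or malicious interference and would trigger a recovery action.
    return ms >= b.min_ms && ms <= b.max_ms;
}

void control_task() { /* placeholder for a periodic control computation */ }

int main() {
    if (!check_execution(control_task, {0.0, 5.0}))
        std::printf("timing anomaly detected\n");
}
```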
2015-2016 Schedule
Organized by Madhusudan Parthasarathy and Darko Marinov
Date and Time | Room | Speaker | Title and Abstract |
---|---|---|---|
Feb 8 (Monday) 4PM | 3405SC | Parisa Kordjamshidi | Title: Declarative Learning-Based Programming for Structured Machine Learning in Natural Language Processing Abstract: Developing intelligent problem-solving systems that deal with real-world messy data requires addressing a range of scientific and engineering challenges. Conventional programming languages offer no help to application programmers who attempt to make use of real-world data and reason about it in a way that involves learning interdependent concepts from data, incorporating existing models, and reasoning about them. Over the last few years the research community has tried to address these problems from multiple perspectives, most notably via approaches based on probabilistic programming, logic programming, and integrated paradigms. In this talk I present Saul, a new declarative learning-based programming (DeLBP) language that aims at facilitating the design and development of intelligent real-world applications that use machine learning and reasoning. Our new language addresses the following challenges: interaction with messy data; specifying the problem at a high level, i.e., the application level; dealing with uncertainty in data and knowledge; and supporting structured learning and reasoning while considering expert knowledge that is represented declaratively. An additional advantage of such a paradigm is generating easily reusable models and code, thereby increasing the replicability of research results. I exemplify the flexibility and the expressive power of this language using a number of applications in the natural language processing domain (a schematic sketch of constrained inference appears after this table). |
Feb 19 (Friday) 2PM | 2405SC | Jia-Bin Huang | Title: Visual Analysis and Synthesis with Physically Grounded Constraints Abstract: The past decade has witnessed remarkable progress in image-based, data-driven vision and graphics. However, existing approaches often treat images as pure 2D signals rather than as 2D projections of the physical 3D world. As a result, many training examples are required to cover sufficiently diverse appearances, and the approaches inevitably suffer from limited generalization capability. In this talk, I will present "inference-by-composition" approaches to overcome these limitations by modeling and interpreting visual signals in terms of physical surface, object, and scene. I will show how we can incorporate physically grounded constraints in a non-parametric optimization framework for (1) revealing the missing parts of an image due to removal of a foreground or background element, (2) recovering high spatial frequency details that are not resolvable in low-resolution observations, and (3) discovering multiple approximately linear structures in extremely noisy videos, with an ecological application to bird migration monitoring at night. The resulting algorithms are simple and intuitive while achieving state-of-the-art performance without the need for training on an exhaustive set of visual examples. I will end my talk with a brief discussion of some key challenges and opportunities in visual learning with weak supervision. Bio: Jia-Bin Huang is a Ph.D. candidate in the Department of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign, advised by Prof. Narendra Ahuja. His research interests include computer vision, computer graphics, and machine learning, with a focus on visual analysis and synthesis with physically grounded constraints. His research received the best student paper award at the IAPR International Conference on Pattern Recognition (ICPR) in 2012 for work on computational modeling of visual saliency, and the best paper award at the ACM Symposium on Eye Tracking Research and Applications (ETRA) in 2014 for work on learning-based eye gaze tracking. Huang is the recipient of the UIUC Graduate College Dissertation Completion Fellowship (2015), the Thomas and Margaret Huang Award for Graduate Research (2015), the Beckman Cognitive Science/Artificial Intelligence Award (2015), the Sundaram Seshu Fellowship (2014), the MOE Technologies Incubation Scholarship (2014), and the PURE Best Research Mentor Award (2012). Personal website: http://www.jiabinhuang.com |
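
To make the combination of learning and declarative reasoning in Parisa Kordjamshidi's abstract concrete, here is a schematic sketch of constrained inference: choosing a joint assignment that maximizes classifier scores subject to a declaratively stated constraint. Saul itself is a Scala DSL; the C++ below, including all labels and scores, is an invented toy example, not Saul code.

```cpp
// Schematic sketch of constrained joint inference: combine (hypothetical)
// learned scores for two interdependent decisions with a declarative rule.
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

int main() {
    // Hypothetical classifier scores for an entity and a relation decision.
    std::vector<std::pair<std::string, double>> entity =
        {{"Person", 0.4}, {"Org", 0.6}};
    std::vector<std::pair<std::string, double>> relation =
        {{"works-for", 0.7}, {"located-in", 0.3}};

    double best = -1; std::string bestE, bestR;
    for (const auto& e : entity)
        for (const auto& r : relation) {
            // Declarative constraint: the subject of "works-for" must be a Person.
            if (r.first == "works-for" && e.first != "Person") continue;
            double s = e.second + r.second;  // joint score of this assignment
            if (s > best) { best = s; bestE = e.first; bestR = r.first; }
        }
    // Prints "Person / works-for": the constraint overrides the locally
    // higher-scoring "Org" label, illustrating knowledge-guided inference.
    std::printf("%s / %s (score %.1f)\n", bestE.c_str(), bestR.c_str(), best);
}
```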
2014-2015 Schedule
Organized by Madhusudan Parthasarathy and Darko Marinov
Date and Time | Room | Speaker | Title and Abstract |
---|---|---|---|
Feb 4 (Wed) 10am | 2405 | Milos Gligoric | Title: Regression Testing: Theory and Practice Abstract: Developers often build regression test suites that are automatically run for each code revision to check that code changes did not break any functionality. While regression testing is important, it is also expensive due to both the number of revisions and the number of tests. For example, Google recently reported that they observed a quadratic increase in daily test-suite run time (a linear increase in the number of revisions per day and a linear increase in the number of tests per revision). In this talk, I present a technique, called Ekstazi, to substantially reduce test-suite run time. Ekstazi introduces a novel approach to regression test selection, which runs only a subset of tests whose dependencies may be affected by the latest changes; Ekstazi keeps file dependencies for each test (a minimal sketch of this style of test selection appears after this table). Ekstazi also speeds up test-suite runs for software that uses modern distributed version-control systems; by modeling different branch and merge commands directly, Ekstazi computes test sets that can be significantly smaller than the entire test suite. I developed Ekstazi for JVM languages and evaluated it on several hundred revisions of 32 open-source projects (totaling 5M lines of code). Ekstazi can reduce test-suite run time by an order of magnitude, including runs for merge revisions. Finally, only a few months after the initial release, Ekstazi was adopted and used daily by many developers from several open-source projects, including Apache Camel, Commons Math, and CXF. Bio: Milos Gligoric is a PhD candidate in Computer Science at the University of Illinois at Urbana-Champaign (UIUC). His research interests are in software engineering and formal methods, especially in designing techniques and tools that improve software quality and developers' productivity. His PhD work has explored test input generation, test quality assessment, testing concurrent code, and regression testing. He won an ACM SIGSOFT Distinguished Paper Award (ICSE 2010), and three of his papers were invited for journal submission. He was awarded the Saburo Muroga Fellowship (2009), the C.L. and Jane W-S. Liu Award (2012), and the C. W. Gear Outstanding Graduate Award (2014) from the UIUC Department of Computer Science, and the Mavis Future Faculty Fellowship (2014) from the UIUC College of Engineering. He did internships at NASA Ames, Intel, Max Planck Institute for Software Systems, and Microsoft Research. Milos holds a BS (2007) and MS (2009) from the University of Belgrade, Serbia. |
Feb 10 (Tue) 10 am | 2405 | Kai-Wei Chang | Title: Practical Learning Algorithms for Structured Prediction Models Abstract: The desired output in many machine learning tasks is a structured object such as a tree, a clustering of nodes, or a sequence. Learning accurate prediction models for such problems requires training on large amounts of data, making use of expressive features, and performing global inference that simultaneously assigns values to all interrelated nodes in the structure. All of these contribute to significant scalability problems. We describe a collection of results that address several aspects of these problems -- by carefully selecting and caching samples, structures, or latent items. Our results lead to efficient learning algorithms for structured prediction models and for online clustering models which, in turn, support reductions in problem size, improvements in training and evaluation speed, and improved performance. We have used our algorithms to learn expressive models from large amounts of annotated data and achieve state-of-the-art performance on several natural language processing tasks. Bio: Kai-Wei Chang is a doctoral candidate advised by Prof. Dan Roth in the Department of Computer Science, University of Illinois at Urbana-Champaign. His research interests lie in designing practical machine learning techniques for large and complex data and applying them to real-world applications. He has been working on various topics in machine learning and natural language processing, including large-scale learning, structured learning, coreference resolution, and relation extraction. Kai-Wei was awarded the KDD Best Paper Award in 2010 and won the Yahoo! Key Scientific Challenges Award in 2011. He was one of the main contributors to a popular linear classification library, LIBLINEAR. |
Feb 13 (Fri) 12:30 pm | 2405 | Benjamin Raichel | Title: Fast Geometric Algorithms via Netting, Pruning, and Sketching Abstract: The scale of modern geometric data sets necessitates fast algorithms. In this talk I will discuss several optimal linear (or near-linear) time algorithms, which work by quickly throwing out and summarizing data, creating a compact sketch of the input. In the first part of the talk I will present a general framework called Net and Prune, which provides linear-time approximation algorithms for a large class of well-studied geometric optimization problems, such as k-center clustering and farthest nearest neighbor (for context, the classic greedy approximation for k-center is sketched after this table). The new approach is robust to variations in the input problem, and yet it is simple, elegant, and practical. In particular, many of these well-studied problems, which easily fit into our framework, either previously had no linear-time approximation algorithms or required rather involved algorithms and analysis. In the second part of the talk I will discuss contour trees, which provide a compact description of the level-set behavior of structured geometric data. These trees are used in HPC applications such as combustion and chemical and fluid mixing simulations, where they are used to both summarize and explore the significantly larger simulation data. Here I will discuss an instance-optimal algorithm for their computation, which runs in linear time when the tree is balanced. Bio: Benjamin Raichel is a PhD student in the Computer Science Department at the University of Illinois, Urbana-Champaign. His research interests are in algorithms and their applications. In particular he has developed fast and practical algorithms for a variety of geometric problems. He is currently funded by the UIUC Dissertation Completion Fellowship, and was previously awarded the Andrew and Shana Laursen Fellowship (2011-12) from the Department of Computer Science. Benjamin holds an MS degree in Computer Science (2011), as well as a BS degree with highest distinction in both Math and Physics (2009), from the University of Illinois. |
Feb 18 (Wed) 3:00pm | 2405 | Yangqiu Song | Title: Machine Learning with World Knowledge Abstract: Machine learning algorithms have become pervasive in multiple domains and have started to have real impact in applications. Nonetheless, a key obstacle to making learning protocols realistic in applications is the need to supervise them, a costly process that often requires hiring domain experts. However, while annotated data is difficult to obtain, large amounts of data are freely available from the Web. In this talk, I will introduce learning paradigms that use existing world knowledge to “supervise” machine learning algorithms. By “world knowledge” we refer to general-purpose knowledge collected from the Web that can be used to extract both common-sense knowledge and diverse domain-specific knowledge, and thus help supervise machine learning algorithms. I will discuss two projects, demonstrating that we can perform better machine learning and text data analytics by adapting general-purpose knowledge to domain-specific tasks. For the first project, I will introduce the dataless classification algorithm, which requires no labeled data to perform completely unsupervised text classification. In this case, Wikipedia knowledge is used to embed the text documents and the category labels into the same semantic space (a toy sketch of this idea appears after this table). For the second project, I will discuss how to perform hierarchical clustering of domain-specific short texts, e.g., Web queries and tweets, using a probabilistic concept-based knowledge base, Probase. In both cases, we provide realistic and scalable algorithms to address large-scale and fundamental text analytics problems. Bio: Dr. Yangqiu Song is a post-doctoral researcher in the Cognitive Computation Group at the University of Illinois at Urbana-Champaign. Before that, he was a post-doctoral fellow at Hong Kong University of Science and Technology and a visiting researcher at Huawei Noah's Ark Lab, Hong Kong (2012-2013), an associate researcher at Microsoft Research Asia (2010-2012), and a staff researcher at IBM Research China (2009-2010). He received his B.E. and Ph.D. degrees from Tsinghua University, China, in July 2003 and January 2009, respectively. His current research focuses on using machine learning and data mining to extract and infer insightful knowledge from big data. The knowledge helps users better enjoy their daily living and social activities, or helps data scientists do better data analytics. He is particularly interested in large-scale learning algorithms, natural language understanding, text mining and visual analytics, and knowledge engineering for domain applications. |
Feb 25 (Wed) 2:00 pm | 3403 | Parasara Sridhar Duggirala | Title: Dynamic Analysis of Cyber-Physical Systems Abstract: Progress in computation and communication technologies has made it easier to integrate software into all walks of life. The social, economic, and environmental benefits of integrating software into avenues such as avionics, automotive systems, the power grid, and medicine have led to the rise of CPS as an important area of research. However, bugs in software deployed in such safety-critical scenarios can lead to loss of property and, in some cases, life. In this talk, I will present a dynamic analysis technique for formally verifying annotated Cyber-Physical Systems and proving the absence of bugs. The annotations, called discrepancy functions, are extensions of proof certificates for analyzing convergence or divergence of systems (a toy sketch of this idea appears after this table). One of the key advantages of dynamic analysis is that it leverages testing procedures, which are the only known scalable way of checking that a system meets its specification. I have developed a tool, C2E2, that implements this technique and verifies temporal properties of CPS. C2E2 has been applied to verify alerting mechanisms in a parallel aircraft landing protocol developed by NASA and to verify the specification of a powertrain control system presented as a verification challenge problem by Toyota. |
Feb 26 (Thurs) 4:00pm | 2405 | Pranav Garg | Title: Learning Invariants for Software Reliability and Security |
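
For Milos Gligoric's abstract, here is a minimal sketch of the style of regression test selection it describes: record each test's file dependencies with checksums, and rerun only the tests whose dependencies changed. The data structures and example values are hypothetical, not Ekstazi's actual implementation.

```cpp
// A minimal sketch of dependency-based regression test selection:
// rerun a test only if one of its recorded file dependencies changed.
#include <cstdio>
#include <map>
#include <set>
#include <string>

using Checksums = std::map<std::string, long>;  // file -> content hash

std::set<std::string> select_tests(
    const std::map<std::string, Checksums>& deps,  // test -> recorded deps
    const Checksums& current) {                    // current file hashes
    std::set<std::string> to_run;
    for (const auto& [test, files] : deps)
        for (const auto& [file, hash] : files) {
            auto it = current.find(file);
            // Rerun if a dependency changed or disappeared.
            if (it == current.end() || it->second != hash) {
                to_run.insert(test);
                break;
            }
        }
    return to_run;
}

int main() {
    std::map<std::string, Checksums> deps = {
        {"TestA", {{"Foo.class", 11}, {"Bar.class", 22}}},
        {"TestB", {{"Baz.class", 33}}}};
    Checksums current = {{"Foo.class", 11}, {"Bar.class", 99}, {"Baz.class", 33}};
    for (const auto& t : select_tests(deps, current))
        std::printf("rerun %s\n", t.c_str());  // only TestA: Bar.class changed
}
```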
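For context on the k-center clustering problem named in Benjamin Raichel's abstract, here is the classic greedy 2-approximation (Gonzalez's algorithm). This is a standard textbook algorithm shown only to make the problem concrete; it is not the Net and Prune framework, which achieves linear time by different means.

```cpp
// Gonzalez's greedy heuristic for k-center: repeatedly add the point
// farthest from the centers chosen so far; a classic 2-approximation.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

struct Pt { double x, y; };

double dist(Pt a, Pt b) { return std::hypot(a.x - b.x, a.y - b.y); }

std::vector<int> k_center(const std::vector<Pt>& pts, int k) {
    int n = (int)pts.size();
    std::vector<int> centers = {0};  // arbitrary first center
    std::vector<double> d(n);        // distance to nearest chosen center
    for (int i = 0; i < n; ++i) d[i] = dist(pts[i], pts[0]);
    while ((int)centers.size() < k) {
        int far = 0;
        for (int i = 1; i < n; ++i)
            if (d[i] > d[far]) far = i;  // farthest point becomes a center
        centers.push_back(far);
        for (int i = 0; i < n; ++i)
            d[i] = std::min(d[i], dist(pts[i], pts[far]));
    }
    return centers;
}

int main() {
    std::vector<Pt> pts = {{0, 0}, {1, 0}, {10, 0}, {10, 1}, {5, 5}};
    for (int c : k_center(pts, 2))
        std::printf("center: (%g, %g)\n", pts[c].x, pts[c].y);
}
```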
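For Yangqiu Song's abstract, a toy sketch of dataless classification: embed the document and the candidate category labels in a shared semantic space and choose the most similar label, with no labeled training data. Real systems derive the embeddings from Wikipedia (e.g., explicit semantic analysis); the vectors below are made up.

```cpp
// Toy dataless classification: pick the label whose embedding is most
// similar (by cosine) to the document embedding in a shared space.
#include <cmath>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

double cosine(const std::vector<double>& a, const std::vector<double>& b) {
    double dot = 0, na = 0, nb = 0;
    for (size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
    }
    return dot / (std::sqrt(na) * std::sqrt(nb));
}

int main() {
    std::vector<double> doc = {0.9, 0.1, 0.3};  // hypothetical document embedding
    std::vector<std::pair<std::string, std::vector<double>>> labels = {
        {"sports",   {0.8, 0.2, 0.1}},          // hypothetical label embeddings
        {"politics", {0.1, 0.9, 0.4}}};
    std::string best; double bestSim = -1;
    for (const auto& [name, vec] : labels) {
        double s = cosine(doc, vec);
        if (s > bestSim) { bestSim = s; best = name; }
    }
    std::printf("label: %s\n", best.c_str());   // "sports" for this toy data
}
```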
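Finally, for Parasara Sridhar Duggirala's abstract, a toy sketch of how a discrepancy function turns simulations into proofs: bloat a single simulated trajectory by the discrepancy bound so that it covers every trajectory from a set of initial states, then check the bloated tube against an unsafe region. The system, bound, and threshold below are invented for illustration and are not from C2E2.

```cpp
// Toy sketch: verify a set of initial states using one simulation plus a
// discrepancy function beta satisfying |x(t) - x'(t)| <= beta(|x0 - x0'|, t).
#include <cmath>
#include <cstdio>

// Stable linear system x' = -x; trajectories converge exponentially, so
// beta(d0, t) = d0 * exp(-t) is a valid discrepancy function for it.
double simulate(double x0, double t) { return x0 * std::exp(-t); }
double beta(double d0, double t) { return d0 * std::exp(-t); }

int main() {
    double center = 1.0, radius = 0.2;  // initial set [0.8, 1.2]
    double unsafe = 1.5;                // hypothetical unsafe threshold
    bool safe = true;
    for (double t = 0; t <= 5.0; t += 0.1) {
        double x = simulate(center, t);
        double bloat = beta(radius, t);
        // [x - bloat, x + bloat] over-approximates the reachable set at t.
        if (x + bloat >= unsafe) safe = false;
    }
    std::printf(safe ? "verified safe\n" : "possibly unsafe\n");
}
```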