2013/1 - 2017/1 Co-founder and Advisor, Involvement in/Creation of Start-up
Group/Organization/Business Serviced: Data Tamer
Target Stakeholder: Industry/Business (>500 employees)
Outcome / Deliverable: Co-founded Data Tamer, a startup in the Boston area, along with my former student and collaborators.
Evidence of Uptake/Impact: Data Tamer has raised seed funding from Google Ventures and NEA. Pilot deployments of the platform are underway.
Activity Description: Co-founder and co-inventor of the core technology (licensed to the company). Currently acting as advisor on algorithms and technical details of the platform.
2008/11 - 2016/11 Consultant, Consulting for Industry
Group/Organization/Business Serviced: Primal Fusion Inc
Target Stakeholder: Industry/Business-Medium (100 to 500 employees)
Outcome / Deliverable: Consulting and leading the research activities at Primal since 2008
Evidence of Uptake/Impact: More than 20 patents filed and multiple pilot prototypes.
References / Citations / Web Sites: www.primal.com
Activity Description: Consulting and leading the research team: developing algorithms and solutions for the various technical challenges facing the product team.
International Collaboration Activities
2011/12 - 2013/9 Principal Investigator - MIT QCRI Collaborative Research Agreement, United States
Led the collaborative data analytics research activities at the Qatar Computing Research Institute (QCRI) as part of the collaborative research agreement between MIT and QCRI. Sat on the Technical Advisory Committee of the collaborative research program, which vets and selects funded research projects between QCRI and MIT.
Professor Ihab Ilyas
Presentations
1. (2018). Building Scalable Machine Learning Solutions for Data Curation. University of Michigan - MIDAS, Ann Arbor, United States
Main Audience: General Public Invited?: Yes, Keynote?: No
2. (2018). Building Scalable Machine Learning Solutions for Data Curation. Strata Data Summit, London, United Kingdom
Main Audience: Knowledge User Invited?: Yes, Keynote?: No
3. (2017). Building Scalable Machine Learning Solutions for Data Curation. HKUST University FinTech Education Forum, Hong Kong
Main Audience: General Public Invited?: Yes, Keynote?: Yes
4. (2017). Building Scalable Machine Learning Solutions for Data Curation. University of Hong Kong, Hong Kong
Main Audience: Researcher Invited?: Yes, Keynote?: No
5. (2017). Building Scalable Machine Learning Solutions for Data Curation. Strata Data Summit, New York, United States
Main Audience: Knowledge User Invited?: Yes, Keynote?: No
6. (2017). Building Scalable Machine Learning Solutions for Data Curation. University of Grenoble, France
Main Audience: Researcher Invited?: Yes, Keynote?: No
7. (2017). Building Scalable Machine Learning Solutions for Data Curation. HKUST, Hong Kong
Main Audience: Researcher Invited?: Yes, Keynote?: No
8. (2017). Building Scalable Machine Learning Solutions for Data Curation. DaQuaTa, Internal Workshop on Data Quality, Lyon, France
Main Audience: Researcher Invited?: Yes, Keynote?: Yes
9. (2017). Building Scalable Machine Learning Solutions for Data Curation. Tulane University, New Orleans, United States
Main Audience: Researcher Invited?: Yes, Keynote?: No
10. (2017). Data Cleaning from Theory to Practice. TU Berlin, Berlin, Germany
Invited?: Yes, Keynote?: No
11. (2017). Building Scalable Machine Learning Solutions for Data Curation. Huawei Noah's Ark Lab, Hong Kong
Main Audience: Researcher Invited?: Yes, Keynote?: No
States
Main Audience: Knowledge User Invited?: Yes, Keynote?: No
13. (2016). Data Cleaning from Theory to Practice. DaQuaTa, Internal Workshop on Data Quality, Lyon, France
Main Audience: Researcher Invited?: Yes, Keynote?: Yes
14. (2016). Data Cleaning from Theory to Practice. Brown University, United States
Invited?: Yes, Keynote?: No
15. (2015). The Case for Continuous and Proactive Data Curation. Thomson Reuters C-Level Panel, New York, United States
Main Audience: Decision Maker Invited?: Yes, Keynote?: No
16. (2015). Tackling Machine Learning Challenges in Data Cleaning. Strata Hadoop Summit, New York, United States
Main Audience: Knowledge User Invited?: Yes, Keynote?: No
17. (2014). Data Cleaning from Theory to Practice. Two Sigma (Investment Company), New York, United States
Main Audience: Researcher Invited?: Yes, Keynote?: No
18. (2014). Data Cleaning from Theory to Practice. Pivotal (Greenplum), California, United States
Main Audience: Researcher Invited?: Yes, Keynote?: No
19. (2014). Data Cleaning from Theory to Practice. Smart Nations Data Works, Singapore, Singapore
Main Audience: Knowledge User Invited?: Yes, Keynote?: Yes
20. (2014). Data Cleaning from Theory to Practice. Microsoft Research, Redwood, United States
Main Audience: Researcher Invited?: Yes, Keynote?: No
21. (2014). Data Cleaning from Theory to Practice. INFOS 2014, Egypt
Main Audience: Researcher Invited?: Yes, Keynote?: Yes
22. (2014). Data Cleaning from Theory to Practice. Yahoo Research Barcelona, Barcelona, Spain
Main Audience: Researcher Invited?: Yes, Keynote?: No
23. (2014). Data Cleaning from Theory to Practice. MIT, CSAIL Big Data Lecture Series, Boston, United States
Main Audience: Researcher Invited?: Yes, Keynote?: Yes
24. (2013). UClean: Non-Destructive Data Cleaning. IBM Almaden Research Center, Almaden, United States
Main Audience: Researcher Invited?: Yes, Keynote?: No
25. (2013). On the Relative Trust between Inconsistent Data and Inaccurate Constraints. The IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Brisbane, Australia
Main Audience: Researcher Invited?: No, Keynote?: No
26. (2013). Interpreting Keyword Queries over Web Knowledge Bases. New York University, Abu-Dhabi, Abu Dhabi, United Arab Emirates
Main Audience: Researcher Invited?: Yes, Keynote?: No
27. (2012). UClean: Probabilistic Data Cleaning. The 10th International Workshop on Quality in Databases (QDB) in conjunction with VLDB 2012, Istanbul, Istanbul, Turkey
Main Audience: Researcher Invited?: Yes, Keynote?: Yes
Publications
Journal Articles
1. Liu X*, Golab L, Golab W, Ilyas I, Jin S*. (2017). Smart Meter Data Analytics: Systems, Algorithms and Benchmarking. ACM Transactions on Database Systems (TODS). 42(1): 2:1-2:39.
Published Refereed?: Yes
2. Rekatsinas T*, Chu X*, Ilyas I, Ré C. (2017). HoloClean: Holistic Data Repairs with Probabilistic Inference. PVLDB - Proceedings of the VLDB Endowment. 10(11): 1190-1201.
Published Refereed?: Yes
3. Ilyas I. (2016). Effective Data Cleaning with Continuous Evaluation. IEEE Data Engineering Bulletin. 39(2): 38-46.
Published Refereed?: No
4. Chu X*, Ilyas I. (2016). Distributed Data Deduplication. PVLDB - Proceedings of the VLDB Endowment. 9(11): 864-875.
Published Refereed?: Yes
5. Khabsa M*, Elmagarmid A, Ilyas I, Hammady H, Ouzzani M. (2016). Learning to Identify Relevant Studies for Systematic Reviews Using Random Forest and External Information. Machine Learning. 102(3): 465-482.
Published Refereed?: Yes
6. Abedjan Z*, Chu X*, Deng D*, Fernandez R*, Ilyas I, Ouzzani M, Papotti P, Stonebraker M, Tang N. (2016). Detecting Data Errors: Where are we and what needs to be done? PVLDB - Proceedings of the VLDB Endowment. 9(12): 993-1004.
Published Refereed?: Yes
7. Chu X*, Ouzzani M, Morcos J*, Ilyas I, Papotti P, Tang N, Ye Y. (2015). KATARA: Reliable Data Cleaning with Knowledge Bases and Crowdsourcing. PVLDB - Proceedings of the VLDB Endowment. 8(12): 1952-1963.
Published Refereed?: Yes
8. Ilyas I, Chu X*. (2015). Trends in Cleaning Relational Data: Consistency and Deduplication. Foundations and Trends in Databases (FnTDB). 5(4): 281-393.
Published Refereed?: Yes
9. Dallachiesa M*, Palpanas T, Ilyas I. (2014). Top-k Nearest Neighbor Search in Uncertain Data Series. PVLDB - Proceedings of the VLDB Endowment. 8(1): 13-24.
Published Refereed?: Yes
10. Beskales G*, Ilyas I, Golab L, Galiullin A*. (2014). Sampling from Repairs of Conditional Functional Dependency Violations. The VLDB Journal: The International Journal on Very Large Databases. 23(1): 103-128.
Published Refereed?: Yes
11. Chu X*, Ilyas I, Papotti P. (2013). Discovering Denial Constraints. PVLDB - Proceedings of the VLDB Endowment. 6(13): 1498-1509.
Published Refereed?: Yes