Provenance documentation to enable explainable and trustworthy AI
Kale, Amruta. (2023-05). Provenance documentation to enable explainable and trustworthy AI. Theses and Dissertations Collection, University of Idaho Library Digital Collections. https://www.lib.uidaho.edu/digital/etd/items/kale_idaho_0089e_12560.html
- Title:
- Provenance documentation to enable explainable and trustworthy AI
- Author:
- Kale, Amruta
- Date:
- 2023-05
- Program:
- Computer Science
- Subject Category:
- Artificial intelligence
- Abstract:
-
Although Artificial Intelligence/Machine Learning (AI/ML) systems have outperformed humans in a variety of sectors, the inability to explain their autonomous decisions and actions has created a new challenge for the research community. The need for explainability has shifted the focus of AI research from complex black-box models to explainable and interpretable models. Recently, the topics of Explainable AI (XAI) and Trustworthy AI (TAI) have become research hotspots and are widely acknowledged by academia, industry, and government. XAI aims to make AI/ML results more understandable and explainable to humans. While there are a variety of explainability approaches and methodologies designed to provide explanations and support user-friendly decisions, each has its own benefits, drawbacks, and unsolved challenges. Through this Ph.D. research, our objective is to analyze the inter-relationship between provenance, XAI, and TAI; build a software package to document provenance and extend the reproducibility of AI/ML workflows; and test the package in real-world applications to support XAI and TAI. We aim to demonstrate that provenance holds great promise for new state-of-the-art AI/ML solutions, and that adopting provenance documentation is increasingly important for illustrating the details of AI/ML workflows and guiding human decision-making.
To achieve our objective, we proposed five research topics and corresponding activities:

1. Identify the inter-relationship between provenance, XAI, and TAI through a literature review.
2. Study different software tools/applications and workflow management systems (WfMS) to understand how provenance is documented.
3. Highlight the importance of workflow standardization, such as the Common Workflow Language (CWL), which provides a standardized framework to describe AI/ML workflows and enables computational reproducibility and portability.
4. Build a software package to document provenance and describe AI/ML workflows in a CWL-compliant format to extend the reproducibility of workflows.
5. Test the package in real-world applications to support XAI and TAI.

To address the first research topic, we investigated a variety of research papers, techniques, tools, and WfMS that support provenance, XAI, and TAI together. An extensive literature study was carried out with the Scopus database, covering 2010 to 2020, to discover records indicating provenance, XAI, and TAI, and to identify the inter-relationship between them. To address the second research topic, we examined and demonstrated various WfMS, packages, and software applications that capture provenance to make AI/ML models transparent, explainable, and understandable. To address the third and fourth research topics, we developed a Python package called geoweaver_cwl, which translates Geoweaver AI/ML workflows into the standardized CWL format. The package not only ensures that all essential details of the workflows are documented but also enhances the computational reproducibility and portability of workflows. To demonstrate the practical application of geoweaver_cwl under the fifth research topic, we conducted a series of tests on various use cases, ranging from simple to complex, drawn from Geoweaver and other domains.
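For illustration only, the sketch below shows the kind of CWL-compliant document that describing a single AI/ML workflow step produces; it is not the geoweaver_cwl API (whose interface is not described here), and the training script, input, and output names are hypothetical placeholders. It assumes PyYAML is available.

```python
# Minimal sketch of a CWL v1.2 CommandLineTool describing one AI/ML
# workflow step. Assumptions: PyYAML is installed; "train.py",
# "training_data", "trained_model", and "model.pkl" are placeholders,
# not names taken from geoweaver_cwl or Geoweaver.
import yaml

training_step = {
    "cwlVersion": "v1.2",
    "class": "CommandLineTool",
    # The command this step runs; recording it explicitly is what lets
    # any CWL-aware workflow management system re-execute the step.
    "baseCommand": ["python", "train.py"],
    "inputs": {
        "training_data": {
            "type": "File",
            "inputBinding": {"position": 1},
        }
    },
    "outputs": {
        "trained_model": {
            "type": "File",
            "outputBinding": {"glob": "model.pkl"},
        }
    },
}

if __name__ == "__main__":
    # Write the step description to a .cwl file (CWL documents are YAML).
    with open("train_step.cwl", "w") as handle:
        yaml.safe_dump(training_step, handle, sort_keys=False)
```

A CWL runner such as cwltool can then execute the generated file against concrete inputs, which is what makes a workflow documented this way reproducible and portable across computing environments.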
The need for explainability in AI/ML models has attracted great attention in recent years. However, it is not sufficient to explain AI/ML models using post-hoc explanations alone. Provenance documentation is one means of accomplishing transparency, traceability, explainability, and reproducibility in AI/ML models. In this research, we provide a community-driven solution, geoweaver_cwl, which addresses the current struggles in attaining portability, reproducibility, transparency, and scalability of AI/ML workflows. We evaluated geoweaver_cwl using various use cases from different domains. The study indicates that the geoweaver_cwl package can greatly assist students, researchers, and the geoscience community in translating their AI/ML workflows into CWL-compliant WfMS software applications.
We hope this Ph.D. research serves not only as a starting point for future research advances but also as reference material that encourages experts and professionals from all disciplines to embrace the benefits of reproducible workflows, provenance, XAI, and TAI.
- Description:
- doctoral, Ph.D., Computer Science -- University of Idaho - College of Graduate Studies, 2023-05
- Major Professor:
- Ma, Xiaogang
- Committee:
- Xian, Min; Song, Jia; Nguyen, Tin; Soule, Terence
- Defense Date:
- 2023-05
- Identifier:
- Kale_idaho_0089E_12560
- Type:
- Text
- Format:
- application/pdf
- Rights:
- In Copyright - Educational Use Permitted. For more information, please contact University of Idaho Library Special Collections and Archives Department at libspec@uidaho.edu.
- Standardized Rights:
- http://rightsstatements.org/vocab/InC-EDU/1.0/