LF AI & Data Foundation
Joined 17 Sep 2020
LF AI & Data is an umbrella foundation of the Linux Foundation that supports open source innovation in artificial intelligence (AI) and data. LF AI & Data was created to support open source AI and data, and to create a sustainable open source AI and data ecosystem that makes it easy to create AI and data products and services using open source technologies. We foster collaboration in a neutral environment with open governance, supporting the harmonization and acceleration of open source technical projects.
Cleveland Big Data Meetup - November 18, 2024
Paco Nathan! Paco is an O'Reilly author on AI and Machine Learning.
"Catching Bad Guys using open data, open models in AI: a tour through anti-fraud use cases with graphs and entity resolution"
GraphRAG is a popular way to use knowledge graphs to ground AI apps in facts. Most GraphRAG tutorials use LLMs to build a graph automatically from unstructured data. However, what if you're working on use cases such as investigative journalism and sanctions compliance -- "catching bad guys" -- where transparency about decisions and evidence is required?
This talk explores how to leverage open data and open models for AI apps -- using entity resolution to build investigative graphs which are accountable, exploring otherwise hidden relations in the data that indicate fraud or corruption. Professionals who work in sanctions compliance, tax fraud, counter-terrorism, etc., -- which our team helps support -- generally don't present a lot in public. However, we can use open data and open source to illustrate where machine learning assists in these kinds of use cases.
For this talk we'll construct an investigative graph about potential money laundering, using ER to merge open data from ICIJ Offshore Leaks, Open Ownership, and OpenSanctions. We'll explore techniques used in production use cases for anti-money laundering (AML), ultimate beneficial owner (UBO), rapid movement of funds (RMF), and other areas of sanctions compliance.
First we'll build a "backbone" for the graph in ways which preserve evidence and allow for audits. Next we'll use spaCy pipelines to parse related news articles, using `GLiNER` to extract entities, then the new `spacy-lancedb-linker` to link them into the graph. Finally, we'll show graph analytics that make use of the results -- tying into what's needed for use cases such as GraphRAG.
This approach uses Python open source libraries, and all of the code is provided on GitHub organized in Jupyter notebooks. For each NLP task we use state-of-the-art open models (mostly not LLMs) emphasizing how to tune for a domain context: named entity recognition, relation extraction, textgraph, entity linking, as well as entity resolution to merge structured data and produce a semantic overlay that organizes the graph.
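As a rough illustration of the "backbone" idea described above (merging structured records from several datasets while keeping the evidence for each merge), here is a minimal sketch using only the Python standard library. It is not the code from the talk: the `Record` type, the `normalize` and `resolve` helpers, the dataset names, and the sample records are all hypothetical, and exact matching on normalized names is a deliberately naive stand-in for real entity resolution.

```python
import unicodedata
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str     # hypothetical dataset label
    record_id: str  # identifier within that dataset
    name: str       # name exactly as written in the source


def normalize(name: str) -> str:
    """Fold case, strip accents and punctuation so trivial spelling variants match."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(kept.lower().split())


def resolve(records):
    """Group records by normalized name; every entity keeps its source records as evidence."""
    entities = defaultdict(list)
    for rec in records:
        entities[normalize(rec.name)].append(rec)
    return dict(entities)


# Made-up sample records standing in for rows from three open datasets.
records = [
    Record("dataset_a", "a-1", "ACME Holdings, Ltd."),
    Record("dataset_b", "b-7", "Acme Holdings Ltd"),
    Record("dataset_c", "c-3", "Borealis Trading S.A."),
]

entities = resolve(records)
for key, evidence in entities.items():
    # Each merged entity can be traced back to the records that produced it.
    print(key, "<-", [(r.source, r.record_id) for r in evidence])
```

Because each resolved entity carries the list of source records behind it, a merge decision can be audited later, which is the property the abstract emphasizes for investigative and compliance work.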
Views: 80
Videos
IBM TechXchange: The Linux Foundation's Impact on Advancing Open Source AI Development & Innovation
67 views, 5 months ago
Strategic AI Adoption: Leadership Insights for the Modern Enterprise - Trustmark Webinar
19 views, 7 months ago
The Importance of Openness in Generative AI
70 views, 8 months ago
Open Voice Interoperability Sandbox Webinar February 8th, 2024
25 views, 11 months ago
Cleveland Big Data Meetup - January 30, 2024
35 views, 11 months ago
Cleveland Big Data Meetup - November 13, 2023
148 views, 1 year ago
Open Voice Trustmark Webinar - 'Empowering Trust'
49 views, 1 year ago
Cleveland Big Data Meetup - January 23, 2023
74 views, 1 year ago
Cleveland Big Data Meetup - November 14, 2022
203 views, 2 years ago
Video Interview with Dr Ibrahim Haddad
70 views, 2 years ago
Cleveland Big Data Meetup - May 13, 2022
51 views, 2 years ago
Cleveland Big Data Meeting - March 21, 2022
25 views, 2 years ago
Trusted AI Principles: Tools and Techniques
55 views, 2 years ago
Cleveland Big Data Meetup - January 24, 2022
108 views, 2 years ago
Cleveland Big Data Meetup - November 18th, 2021
16 views, 3 years ago
014 ONNX 20211021 Schmuelling Chen Huang ONNX SIG Converters
75 views, 3 years ago
013 ONNX 20211021 Karzynski ONNX SIG Operators
36 views, 3 years ago
006 ONNX 20211021 Kuah ONNX and OneAPI for xPU
110 views, 3 years ago
000 ONNX 20211021 ONNX SC Welcome Progress Roadmap Release
101 views, 3 years ago
016 ONNX 20211021 Anton ONNX WG Preprocessing
84 views, 3 years ago
015 ONNX 20211021 Li ONNX SIG Model ZooNHub and Tutorials
413 views, 3 years ago
012 ONNX 20211021 Khade ONNX SIG Architecture and Infrastructure
42 views, 3 years ago
011 ONNX 20211021 Salehi ONNX Runtime and Triton
792 views, 3 years ago
010 ONNX 20211021 Lyalin ONNX and the OpenVINO Ecosystem
117 views, 3 years ago
009 ONNX 20211021 Knight ONNX TVM for dynamic shapes, control flow, quantization compiler OctoML
200 views, 3 years ago
008 ONNX 20211024 Krishnamurthy ONNX and Audit Considerations and benchmarking with QuSandbox
23 views, 3 years ago
How do you talk to an AI, and how do you communicate abstract ideas to it? Do you start with language, or do you teach it to improvise? Or do you teach it to improvise its own language? This is an abstract idea, but how do you teach an AI to develop an infinitely adaptive language, one that expresses complex ideas through infinitely customizable symbols and codes representing chunks of data and complete thoughts?
Great insights into this important space
This rambles far too much